Junior Data Engineer

Job Title: Junior Data Engineer
Location: Cardiff, Wales
Salary: Market Rate
Department: Development
Reports To: Senior Data Engineer

Role
You will work closely with the Data Vault team, building systems that facilitate the transition from traditional data processing to a Data Vault based approach. In this role you will define and build data pipelines that improve data-informed decision-making within the business. This is an opportunity to work with large volumes of data and gain exposure to big data architectures.

Required Skills
• Experience of reading and writing data using Python and SQL.
• Understanding of Agile development methodologies.
• Experience with Python unit testing frameworks such as pytest and nose.
• Familiarity with cloud technology, preferably AWS.
• Good understanding of Git.
• Knowledge of automated delivery processes.

Desirable Skills
• Experience in implementing data pipelines using Apache Airflow.
• Understanding of MPP data platforms such as Apache Hive, Presto, Spark or Redshift.
• Experience presenting data via BI/dashboard solutions.
• Experience of working with large datasets.

Primary Responsibilities
• Play a hands-on role as part of an Agile team to develop, test and maintain high-quality systems that fulfil business needs.
• Extract data from files, systems, cloud sources, databases and APIs by writing and executing code (SQL, Python and similar).
• Clean and combine offline, online and mixed sources into datasets, building in manual or automated validation and accuracy checks using Python, SQL or specialist big data frameworks.
• Help the team maintain existing software and data infrastructure.
• Maintain a strong focus on quality, applying practices such as continuous integration and test-driven development to enable the rapid delivery of working code.
• Document new processes and products you have developed so that knowledge is shared.
• Design and build pattern-based data pipelines using Python and SQL, in accordance with guidelines set by the Senior Data Engineers.
• Help to design, build and launch new Data Vault models.
• Create functioning data pipelines using industry-standard loading patterns.

Experience
• 2+ years development experience within a commercial environment.
• Knowledge of Agile development methodologies.
• Some experience of working with data sources and Python.
• Knowledge of SQL programming and code optimisation.
• Awareness of cloud technology, particularly AWS.
• Knowledge of automated delivery processes.
• Some experience designing and building data pipelines.
• Understanding of engineering best practices (error handling and logging, system monitoring, and building applications that tolerate human error).
• Ability to write efficient code, and comfort undertaking system optimisation and performance-tuning tasks.
• Experience working within a Unix-based environment.
• Comfortable working with relational databases such as PostgreSQL, MySQL, MariaDB or Redshift.

Behavioural skills

• Teamwork – encourages cooperation, collaboration and partnerships.
• Quality Improvement – strives for high-quality performance.
• Problem Solving – identifies problems and seeks the best solutions through creativity and innovation.

Autonomy
Works under direction of the Senior Data Engineers within a clear framework of accountability. Exercises personal responsibility and autonomy. Plans own work to meet given objectives and processes.

Influence
Participates in external activities related to own specialism. Contributes to decisions which influence the success of projects and team objectives.

Complexity
Performs a range of work, sometimes complex and non-routine, in a variety of environments. Applies a methodical approach to issue definition and resolution.

Business skills
Selects appropriately from applicable standards, methods, tools and applications. Communicates fluently, orally and in writing, and can present information to both technical and non-technical audiences. Plans, schedules and monitors work to meet time and quality targets. Absorbs new information and applies it effectively. Maintains an awareness of developing technologies and how they could be applied to improve their solution.