Junior Data Engineer

Job Title: Junior Data Engineer
Location: Cardiff Bay, Wales
Salary: £25,000.00 per annum
Department: Data Engineering
Reports To: Senior Data Engineer

You will work closely with the Data Vault team, building systems that support the transition from traditional data processing approaches to a Data Vault-based approach. You will define and build data pipelines that improve data-informed decision-making within the business. This is an opportunity to work with large volumes of data and gain exposure to big data architectures.

Required Skills
• Experience of reading and writing data using Python and SQL
• Understanding of Agile development methodologies
• Experience with Python unit testing frameworks such as pytest and nose
• Familiarity with cloud technology, preferably AWS
• Good understanding of Git
• Knowledge of automated delivery processes

Desirable Skills
• Experience implementing data pipelines using Apache Airflow
• Experience executing data transformations in SQL via dbt
• Understanding of MPP data platforms such as Apache Hive, Presto, Spark and Redshift
• Experience of working with large datasets

Primary Responsibilities
• Play a hands-on role as part of an Agile team to develop, test and maintain high-quality systems that fulfil business needs
• Extract data from various files, systems, cloud sources, databases and APIs by writing and executing code (SQL, Python and similar)
• Clean and combine offline, online or mixed sources into datasets, building in manual or automatic validation and accuracy checks and making use of Python, SQL or specialist Big Data frameworks
• Help support the team in maintaining existing software and data infrastructure
• Maintain a strong focus on quality, applying practices such as continuous integration and test-driven development to enable the rapid delivery of working code
• Document new processes and products you've developed so that knowledge is shared
• Create pattern-based data pipelines in Python and SQL, using industry-standard loading patterns in accordance with guidelines set by the Senior Data Engineers
• Help to design, build and launch new data models

• 2+ years' development experience within a commercial environment
• Knowledge of Agile development methodologies
• Some experience of working with data sources and Python
• Knowledge of SQL programming and code optimisation
• Awareness of cloud technology, particularly AWS
• Knowledge of automated delivery processes
• Some experience designing and building data pipelines
• Understanding of best engineering practices (handling and logging errors, system monitoring and building human-fault-tolerant applications)
• Able to write efficient code and comfortable undertaking system optimisation and performance-tuning tasks
• Experience working within a Unix-based environment
• Comfortable working with relational databases such as PostgreSQL, MySQL, MariaDB or Redshift

Behavioural skills
Teamwork – Encourages cooperation, collaboration and partnerships
Quality Improvement – Strives for high-quality performance
Problem Solving – Identifies problems and seeks best solutions by being creative and innovative

Works under direction of the Senior Data Engineers within a clear framework of accountability. Exercises personal responsibility and autonomy. Plans own work to meet given objectives and processes.

Participates in external activities related to own specialism. Contributes to decisions that influence the success of projects and team objectives.

Performs a range of work, sometimes complex and non-routine, in a variety of environments. Applies a methodical approach to issue definition and resolution.

Business skills
Selects appropriately from applicable standards, methods, tools and applications. Communicates fluently, orally and in writing, and can present information to both technical and non-technical audiences. Plans, schedules and monitors work to meet time and quality targets. Absorbs new information and applies it effectively. Maintains an awareness of developing technologies and how they could be applied to improve the team's solutions.