Responsibilities:
Design and implement ETL processes to ensure the proper functioning of analytics and of the client's reporting environments
Design and extend data models and metadata
Support the software development life cycle
Ensure code is maintained in a version control system (e.g. Git)
Monitor and analyze the performance of the client's infrastructure and propose optimizations
Monitor and tune data loads and queries
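As a rough illustration of the monitoring-and-tuning responsibility above (a sketch only — real monitoring would hook into the scheduler or Spark metrics; the threshold and load function here are hypothetical):

```python
import time

# Hypothetical threshold for flagging a slow load, for illustration only.
SLOW_LOAD_THRESHOLD_SECONDS = 2.0

def run_load(rows):
    """Stand-in for a data load step; returns the number of rows loaded."""
    loaded = 0
    for _ in rows:
        loaded += 1
    return loaded

def monitored_load(rows):
    """Run a load, measure its duration, and flag it if it exceeds the threshold."""
    start = time.monotonic()
    count = run_load(rows)
    duration = time.monotonic() - start
    status = "SLOW" if duration > SLOW_LOAD_THRESHOLD_SECONDS else "OK"
    return {"rows": count, "seconds": round(duration, 3), "status": status}

result = monitored_load(range(10_000))
```

In practice the same pattern (measure, compare against a baseline, alert) is applied to scheduled loads and long-running queries rather than an in-process function call.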
Required skills:
Experienced in designing and implementing ELT flows and framework components, using programming languages such as:
SQL
Python
Scala
Java
Hands-on experience with:
Hadoop
Hive/Impala/Spark
Data Management with Cloudera Data Hub (or similar product)
Analyzing and fixing bugs and incidents
Project description:
The DWH team is responsible for automating the ETL process for data from different systems (mainly done with Spark), including integrating new metadata, ensuring and monitoring the proper functioning of reporting tools (i.e. Cognos), and providing run support for the client's infrastructure.
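To give a feel for the extract-transform-load work described above, here is a minimal sketch. The real pipelines run on Spark against Hadoop/Hive; stdlib sqlite3 stands in so the example is self-contained, and all table and column names are made up for illustration:

```python
import sqlite3

def etl(conn):
    """Minimal ETL pass: extract raw rows, clean them, load a reporting table."""
    cur = conn.cursor()
    # Extract: read raw rows from a hypothetical source table.
    rows = cur.execute("SELECT id, amount FROM raw_sales").fetchall()
    # Transform: drop non-positive amounts and convert cents to a currency value.
    cleaned = [(i, amount / 100.0) for i, amount in rows if amount > 0]
    # Load: write the cleaned rows into a hypothetical reporting table.
    cur.executemany("INSERT INTO sales_report (id, amount) VALUES (?, ?)", cleaned)
    conn.commit()
    return len(cleaned)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE sales_report (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)",
                 [(1, 1250), (2, -300), (3, 480)])
loaded = etl(conn)  # loads the two valid rows
```

On the project itself, the equivalent transform and load steps would be expressed as Spark DataFrame operations writing to Hive/Impala tables, with the new metadata integrated into the data model rather than hard-coded.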