Key Responsibilities:
- Design, develop, and optimize scalable data pipelines and ETL processes to ingest, transform, and load large volumes of structured and unstructured data from various sources into our data warehouse.
- Build and maintain data integration workflows using tools such as Apache Spark, Apache Airflow, Kafka, and AWS Glue (a minimal orchestration sketch follows this list).
- Implement data models, schemas, and architectures to support business analytics, reporting, and data visualization requirements (a schema sketch also follows the list).
- Collaborate with data scientists and analysts to understand data requirements, perform data wrangling and preprocessing, and support machine learning model development and deployment.
- Stay current with emerging technologies and best practices in data engineering, big data processing, and cloud computing to drive continuous improvement and innovation.
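
To make the pipeline-orchestration responsibility concrete, here is a minimal sketch of a daily extract-transform-load workflow, assuming Apache Airflow 2.x. The DAG id, task bodies, retry settings, and scheduling are invented placeholders for illustration, not a description of any actual pipeline in production here.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw records from a source system (an API, a Kafka
    # topic, or an object store) into a staging location.
    print("extracting raw orders to staging")


def transform():
    # Placeholder: clean, deduplicate, and conform the staged records,
    # e.g. by submitting a Spark job.
    print("transforming staged orders")


def load():
    # Placeholder: load the transformed data into the warehouse table.
    print("loading orders into the warehouse")


with DAG(
    dag_id="daily_orders_etl",  # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # The tasks run in sequence; Airflow retries each step independently
    # on failure, which is what makes the workflow operationally robust.
    extract_task >> transform_task >> load_task
```

In practice each task would delegate to a Spark job, an AWS Glue job, or a Kafka consumer rather than a print statement; the DAG's value is the dependency graph, scheduling, and retry behavior around those steps.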
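And for the data-modeling responsibility, a sketch of an explicitly declared schema feeding a date-partitioned fact table, assuming Spark 3.x with PySpark. The column names, bucket paths, and table layout are hypothetical, chosen only to illustrate the technique.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import (
    StructType, StructField, StringType, TimestampType, DoubleType,
)

spark = SparkSession.builder.appName("orders_model").getOrCreate()

# Declaring the schema up front (rather than inferring it) keeps ingestion
# deterministic and surfaces malformed records early.
order_schema = StructType([
    StructField("order_id", StringType(), nullable=False),
    StructField("customer_id", StringType(), nullable=False),
    StructField("order_ts", TimestampType(), nullable=True),
    StructField("amount", DoubleType(), nullable=True),
])

raw = spark.read.schema(order_schema).json("s3://example-raw/orders/")

# Conform the raw events into a fact table partitioned by date, a common
# warehouse layout for analytics and reporting queries.
fact_orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

fact_orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-warehouse/fact_orders/"
)
```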
Objectives:
- Data Infrastructure Excellence: Design, build, and manage a cutting-edge data infrastructure that ensures scalability, reliability, and security to meet current and future business demands.
- Data Integration and Insight Generation: Develop and maintain state-of-the-art ETL processes and analytics models that integrate data from diverse sources, ensuring high data accuracy and timeliness, and providing actionable insights for strategic decision-making.
- Collaboration, Innovation, and Continuous Improvement: Foster a culture of innovation and continuous improvement in data practices, collaborating closely with key stakeholders to leverage data in driving business solutions and shaping the company’s data and analytics strategy.