Senior Data Engineer (Python) - Remote

Apex Systems (Nezahualcoyotl MEX, Mexico) - 1 day ago

Apex Systems is a leading Data and Digital Transformation professional services organization focused on providing solutions with real business value. We provide a customer-focused approach to building authentic partnerships with our clients, with objective counsel from concept to deployment, for a consistent voice through the dynamic IT environment.

Apex Systems Mexico is seeking a Senior Data Engineer to design, build, and optimize enterprise-scale data pipelines and replication frameworks within modern lakehouse architectures. This role will focus on implementing scalable ingestion and transformation pipelines using Databricks, Delta Lake, and Spark, supporting large-scale data platform modernization initiatives.

The ideal candidate has strong hands-on experience building production-grade pipelines, implementing Change Data Capture (CDC) and replication frameworks, and optimizing distributed data workloads. This role requires strong collaboration with cross-functional engineering, analytics, and governance teams to ensure high-quality, reliable, and scalable data solutions.

Data Engineering & Pipeline Development
- Build and configure table-level replication pipelines using Airbyte or similar data replication tools.
- Implement incremental load strategies and CDC-based ingestion frameworks for enterprise data pipelines.
- Develop and maintain data pipelines aligned with Medallion architecture standards (raw, source, core, mart layers).
- Implement data transformations and validation frameworks using SQL and Python.

Data Platform Optimization
- Optimize Delta Lake table design, partitioning strategies, and storage layout for performance and scalability.
- Tune Spark workloads and cluster performance to improve pipeline efficiency and resource utilization.
- Enhance batch scheduling, workload parallelization, and pipeline orchestration to support large-scale data processing.

Data Quality, Security & Governance
- Develop automated data validation scripts to ensure source-to-target data accuracy and completeness.
- Implement data masking transformations and security controls aligned with enterprise data governance standards.
- Implement logging, monitoring, failure handling, and retry logic to ensure resilient and reliable pipelines.
- Support Power BI validation queries and migration testing during data platform transitions.

- Collaborate with engineering, analytics, and governance teams to deliver high-quality data solutions.
- Provide guidance and mentoring to junior engineers, supporting their technical and professional development.
- Proactively identify opportunities to improve data processes, architectures, and operational efficiency.
- Deliver high-quality solutions while meeting project timelines and ensuring engagement success.

- 5+ years of experience in Data Engineering building production data pipelines.
- 3+ years of hands-on experience with Databricks and Apache Spark.
- Experience implementing data replication frameworks using Airbyte or similar tools.
- Strong expertise in SQL and Python for data transformation and validation scripting.
- Experience implementing Medallion architecture and modern lakehouse data platforms.
- Experience optimizing Spark workloads and cluster performance.
- English proficiency level C1 or higher is a requirement.
