Position Details:
Job Title: Senior Data Engineer with DevOps (J0626-2289)
Duration: Contract to Hire
Location: Pittsburgh, PA; Cleveland, OH; or Dallas, TX
Work Mode: 5 Days Onsite
Years of Experience: 8+ Years
Job Description:
We are seeking a Data Engineer with 5 years of experience to design and maintain scalable data pipelines supporting analytics, reporting, and operational needs. The role involves collaborating with cross-functional teams to ensure data aligns with business requirements and enterprise standards.
Duties and Responsibilities:
Design and build scalable data pipelines aligned with business needs
Process large datasets (batch and, at times, near-real-time)
Ensure data quality, consistency, and governance standards across systems
Support data integration and transformation efforts for analytics and reporting platforms
Maintain data dictionaries, metadata, and documentation
Participate in data architecture reviews and model validation processes
Support analytics, reporting, and risk platforms
Required Qualifications:
5+ years of experience in data engineering and big data processing
Strong expertise in Apache Spark (Spark Core, Spark SQL) and PySpark for large-scale batch processing
Experience working with structured and semi-structured data, including complex transformations and performance tuning
Proficiency in data ingestion and integration from sources such as Oracle, SQL Server, Hive, HDFS, and S3, and in transforming data into curated data models
Experience writing data to Hive tables, Data Lakes (Iceberg), and downstream reporting systems
Strong knowledge of SQL and data modeling concepts
Hands-on experience with Apache Airflow for workflow orchestration (DAG design, scheduling expectations, monitoring)
Proficiency in shell scripting for job automation: triggering Spark jobs, performing file checks and validation, archiving and purging data, and managing job dependencies, logging, and error handling
Strong understanding of batch processing and batch job scheduling frameworks
Experience migrating from CA7/Control-M to Airflow (daily, hourly, weekly schedules)
CI/CD for data pipelines
Linux and networking fundamentals
Containerization with Docker, OpenShift Container Platform (OCP), and Kubernetes
Knowledge of CI/CD pipeline tools such as Jenkins, GitHub Actions, Azure DevOps, GitLab CI, Maven, and Gradle
Automate operational tasks using Python, Bash/Shell, and PowerShell
Implement monitoring and alerting (for example, with Application Insights) and enable centralized logging with tools such as ELK
Experience ensuring data quality, reliability, and compliance in regulated environments
Good communication, documentation, and collaboration skills in cross-functional teams
Agile/SAFe methodologies
Must-Have Skills:
Airflow
Containerization
DevOps
Elastic Stack & Elasticsearch
GitHub
Hadoop Hive
Jenkins
JSON Web Token (JWT)
Kubernetes
OpenShift
Oracle
Python
Shell Script
SQLite