PySpark/Python Data Engineer

Full-time 120,000 USD

Role - PySpark/Python Data Engineer

Location - Irving, TX / Tampa, FL / Edison, NJ

Duration: Full Time (Permanent)

Key Skills - PySpark, Python, ETL, AWS, Snowflake

Job Description

Must Have Technical/Functional Skills

We are looking for a skilled PySpark Data Engineer with strong hands-on experience in PySpark and Python to design, build, and optimize scalable data processing pipelines. The ideal candidate will have practical experience with distributed data processing and a solid foundation in writing efficient, production-grade Python code.

Required Technical Skills

- Strong hands-on experience in PySpark (Spark SQL, DataFrame API)

- Advanced proficiency in Python (data processing, performance tuning, modular coding)

- Solid understanding of ETL design patterns and data pipeline architecture

- Good working knowledge of SQL for data transformation and analysis

- Experience with data processing in distributed environments

Preferred Skills (Good to Have)

- Experience with cloud platforms (AWS preferred: S3, Glue, EMR, or equivalent services)

- Familiarity with workflow orchestration tools such as Airflow or similar schedulers

- Exposure to data warehousing concepts (e.g., Snowflake or similar platforms)

- Knowledge of code versioning (Git) and CI/CD practices

Experience

3-8 years of experience in Data Engineering / PySpark development

Proven hands-on project experience in PySpark + Python

Roles & Responsibilities

- Design, develop, and maintain ETL/ELT pipelines using PySpark

- Write optimized and scalable PySpark transformations using DataFrames and Spark SQL

- Develop reusable and efficient Python-based data processing components

- Ensure data quality, integrity, and performance across pipelines

- Perform debugging, performance tuning, and optimization of PySpark jobs

- Collaborate with cross-functional teams (Data Analysts, Architects, DevOps)

- Contribute to CI/CD pipelines and deployment workflows for data applications

- Monitor and troubleshoot data workloads in production environments
