We are seeking a motivated and detail‑oriented Python / PySpark Developer to support the development and maintenance of scalable data processing solutions. The ideal candidate should have foundational experience in Python and exposure to Apache Spark (PySpark), along with a strong willingness to learn and grow in a distributed data engineering environment.

You will work under the guidance of senior engineers and collaborate with data teams to build reliable data pipelines and contribute to analytics and reporting solutions.


# Key Responsibilities

## Development & Engineering

* Assist in developing and maintaining data pipelines using Python and PySpark
* Support ETL/ELT workflows for batch data processing
* Write clean, readable, and well‑structured Python code following best practices
* Perform basic data transformations, aggregations, and validations
* Debug and troubleshoot pipeline issues with guidance from senior developers

## Data & Platform

* Work with structured and semi‑structured data formats (CSV, JSON, Parquet, etc.)
* Assist in integrating data from databases, APIs, and cloud storage systems
* Help ensure data quality and consistency within pipelines
* Support migration of legacy scripts to modern data platforms

## Learning & Collaboration

* Collaborate with team members on development tasks and code reviews
* Participate in knowledge‑sharing and training sessions
* Learn and adopt new tools, frameworks, and best practices
* Assist in documenting data workflows and technical processes

# Required Skills & Qualifications

## Technical Skills

* Basic to intermediate proficiency in Python
* 4-7 years of experience
* Exposure to Apache Spark / PySpark (internship or project experience is acceptable)
* Understanding of programming fundamentals and data structures
* Basic knowledge of SQL and relational databases
* Familiarity with data processing concepts and ETL fundamentals
* Familiarity with the Linux/Unix command line is a plus

## Engineering Fundamentals

* Understanding of coding best practices and version control (Git)
* Basic debugging and problem‑solving skills
* Exposure to unit testing concepts is a plus

# Nice to Have (Preferred Skills)

* Exposure to big data tools (Hive, Hadoop ecosystem, or similar)
* Familiarity with cloud platforms (AWS / Azure / GCP)
* Basic knowledge of job orchestration tools (Airflow, etc.)
* Understanding of the data pipeline and workflow lifecycle
* Academic or project experience with data engineering or analytics

# Ideal Candidate Traits

* Strong willingness to learn and grow in a fast‑paced environment
* Good analytical and problem‑solving skills
* Effective communication and teamwork abilities
* Attention to detail and commitment to quality


------------------------------------------------------

Job Family Group: Technology

Job Family: Applications Development

Time Type: Full time

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.