Job Title: Scala Developer
Location: New York, NY (3 days onsite minimum)
Duration: 6 months
Years of Experience: 8+
Please look for local candidates. If you are seeing challenges with rates, feel free to submit at higher rates.
Job Summary:
Key Responsibilities
- Develop, test, and maintain scalable data processing pipelines using Scala and Apache Spark.
- Implement data transformation and ETL workflows to handle structured and unstructured data.
- Use Python for data processing, scripting, and integration with Spark workflows.
- Optimize performance of Spark applications through tuning, partitioning, and caching strategies.
- Participate in code reviews and design discussions, and ensure adherence to best practices.
- Document workflows, architecture, and solutions for internal knowledge sharing.
Required Qualifications
- Strong experience in Scala and Apache Spark (Spark Core, Spark SQL, DataFrames, RDDs).
- Proficiency in Python for data manipulation, scripting, and automation.
- Proven experience with distributed computing concepts and large-scale data processing.
- Knowledge of ETL development, data warehousing, and data modeling techniques.
- Familiarity with Big Data ecosystems such as Hadoop, Hive, Kafka, or HBase is a plus.
- Excellent communication and collaboration skills.
Preferred Qualifications
- Experience with cloud platforms such as AWS, GCP, or Azure (particularly with Spark services).
- Familiarity with streaming data processing using Spark Streaming or Structured Streaming.
- Knowledge of data governance, security, and compliance practices.
- Basic understanding of machine learning workflows.