A Python Developer with Apache Flink expertise specializes in building high-performance, real-time data processing applications.
Core Responsibilities
- Pipeline Development: Design, develop, and maintain real-time and batch data processing pipelines using Apache Flink.
- PyFlink Implementation: Build Flink applications using Python, leveraging both the Table/SQL API and DataStream API.
- System Integration: Connect Flink with event streaming platforms like Kafka, as well as various databases (Redis, MongoDB) and cloud services (AWS, Azure).
- Optimization: Tune streaming jobs for low latency and high throughput, while managing state, checkpointing, and "exactly-once" processing guarantees.
- Collaboration: Work with data scientists and engineers to translate complex analytical requirements into scalable production-grade code.
Required Technical Skills
- Python Proficiency: Deep knowledge of Python (PEP 8, asynchronous services, and data structures).
- Stream Processing: Strong understanding of Flink concepts like event-time semantics, watermarks, and windowing strategies.
- Infrastructure & Tools:
  - Orchestration: Airflow.
  - Containerization: Docker.
  - Monitoring: Grafana.
  - Data Systems: Familiarity with Amazon Managed Service for Apache Flink.
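
For candidates gauging the Stream Processing requirement above, the following is a minimal plain-Python sketch of how event-time tumbling windows and watermarks interact. It does not use PyFlink, and it simplifies Flink's actual firing rules. The names `WINDOW_SIZE_MS`, `MAX_OUT_OF_ORDERNESS_MS`, and `process` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 10_000          # hypothetical: 10-second tumbling windows
MAX_OUT_OF_ORDERNESS_MS = 2_000  # hypothetical: tolerated event lateness


def window_start(event_time_ms):
    """Return the start of the tumbling window that contains the event."""
    return event_time_ms - (event_time_ms % WINDOW_SIZE_MS)


def process(events):
    """Count events per (window, key) using event time, not arrival order.

    events: iterable of (event_time_ms, key) tuples in arrival order.
    Returns (results, dropped), where results is a list of
    (window_start_ms, key, count) in firing order and dropped holds
    events that arrived after their window had already fired.
    """
    open_windows = defaultdict(lambda: defaultdict(int))
    watermark = float("-inf")
    results, dropped = [], []

    for event_time_ms, key in events:
        start = window_start(event_time_ms)
        if start + WINDOW_SIZE_MS <= watermark:
            # The watermark already passed this window's end: too late.
            dropped.append((event_time_ms, key))
            continue
        open_windows[start][key] += 1

        # The watermark trails the highest event time seen by a fixed bound.
        watermark = max(watermark, event_time_ms - MAX_OUT_OF_ORDERNESS_MS)

        # Fire every window whose end is at or before the watermark.
        ready = sorted(s for s in open_windows if s + WINDOW_SIZE_MS <= watermark)
        for s in ready:
            for k, count in sorted(open_windows.pop(s).items()):
                results.append((s, k, count))

    # End of input: flush the windows that are still open.
    for s in sorted(open_windows):
        for k, count in sorted(open_windows[s].items()):
            results.append((s, k, count))
    return results, dropped
```

Calling `process` on a short list of out-of-order events shows the trade-off: an event that is slightly late still lands in its window, while one that arrives after the watermark has passed the window's end is dropped. Tuning that bound is part of the latency versus completeness work described under Optimization.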