Title: Senior AI/ML Platform Engineer, AWS DevOps & Infrastructure Automation
Key Responsibilities
- Design, develop, and manage AWS-based cloud infrastructure and platform services
- Build and maintain Infrastructure as Code (IaC) solutions using Terraform
- Develop AI/ML-powered automation solutions for infrastructure, operations, and platform engineering workflows
- Implement and optimize CI/CD pipelines to support efficient software delivery and infrastructure deployments
- Enhance observability and monitoring capabilities using Dynatrace, Grafana, Splunk, and related tools
- Build intelligent automation for alert triage, incident response, root cause analysis, and operational workflows
- Collaborate with DevOps, SRE, Engineering, Product, and Operations teams to identify automation opportunities and improve platform reliability
- Develop and support MLOps processes including model deployment, monitoring, governance, and lifecycle management
- Create automation scripts using Python, Bash, Shell, or Groovy
- Support cloud modernization initiatives and platform transformation efforts
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, Information Technology, or a related field
- 7+ years of experience in Cloud Engineering, DevOps, Platform Engineering, or Infrastructure Engineering
- Strong hands-on experience with AWS cloud services and infrastructure management
- Expertise in Terraform and Infrastructure as Code (IaC)
- 3+ years of experience with observability and monitoring tools such as Dynatrace, Grafana, and Splunk
- Strong experience building and maintaining CI/CD pipelines
- 2-3+ years of hands-on AI/ML engineering experience in production environments
- Strong programming and automation skills using Python
- Experience with scripting languages such as Bash, Shell Script, or Groovy
- Experience with AI/ML frameworks such as TensorFlow, PyTorch, or Scikit-learn
- Experience implementing MLOps practices and AI-driven automation solutions
- Strong understanding of cloud architecture, automation, monitoring, and operational excellence
Preferred Qualifications
- Experience with Generative AI, LLMs, Agentic AI, RAG, or AI-assisted development tools
- Experience with Kubernetes, Docker, Ansible, or CloudFormation
- Java development experience
- Experience building AI-powered observability or operational automation platforms
- Experience working in large-scale enterprise or financial services environments
- Knowledge of SRE practices, incident management, and platform reliability engineering
Required Technical Skills
- AWS Cloud Services
- Terraform
- CI/CD Pipelines
- DevOps & Platform Engineering
- Python
- AI/ML Engineering
- MLOps
- Dynatrace
- Grafana
- Splunk
- Bash/Shell/Groovy