Key Requirements: Azure OpenAI, Kubernetes, ArgoCD, Jenkins, Azure AI Search, Langfuse, LLMOps, RAG, CI/CD, Observability, Python.
Must be willing to work onsite in Atlanta, GA.
Job Description: This role owns the operational foundation for George and other generative AI applications. The person should make the AI platform reliable, observable, cost-aware, secure, and release-ready across development, QA, and production environments.
Azure AI / Azure OpenAI: Model deployments, quotas, TPM/RPM planning, rate-limit troubleshooting, deployment configuration, model upgrade support, fallback planning, and cost/performance monitoring.
Azure AI Search / RAG Infrastructure: Index, indexer, skillset, embedding, semantic search, hybrid search, retrieval quality support, and performance troubleshooting for George knowledge sources.
CI/CD and Release Engineering: Jenkins pipelines, build promotion, environment readiness, automated gates, release checklists, rollback planning, and deployment validation.
Kubernetes / Argo CD: Argo CD sync health, manifest drift, Kubernetes pod health, environment promotion, scaling, config maps, secrets, and rollback operations.
Langfuse / Observability: Trace ingestion, dashboards, prompt/version visibility, datasets support, experiment troubleshooting, latency, token usage, cost, and failure analysis.
Production Support: Incident triage, root-cause analysis, operational runbooks, cross-team issue resolution, and executive-ready status reporting.
George-Specific Responsibilities
Responsibilities for Broader Agentic Applications Capability
Candidate Requirements
Must-Have
Strongly Preferred
Expected Deliverables
30 / 60 / 90 Day Expectations
Cleo Consulting is an equal opportunity employer (Minorities/Women/Veterans/Disabled)
For applications and inquiries, contact: [email protected]