Role: AI Systems Architect
Location: San Francisco, CA and San Leandro, CA (Hybrid)
Duration: Contract, 12+ months
Job Description
We are seeking an experienced AI Systems Architect to design, build, and scale high-performance distributed AI systems. The ideal candidate will have deep expertise in GenAI, LLMs, and cloud-native architectures, along with hands-on experience building enterprise-scale AI/ML platforms and agent-based systems.
Must-Have Skills
- Strong experience in designing and implementing high-performance, large-scale distributed systems
- Proven experience in implementing and deploying AI/ML platforms at scale
- Expertise in building agent-based architectures, evaluation frameworks, and prompt/context engineering
- Knowledge of MCP (Model Context Protocol) servers
- Hands-on experience in LLM inference optimization, including batching and caching strategies
- Strong experience with Kubernetes and cloud infrastructure (AWS/Azure/Google Cloud Platform)
- Proficiency in at least one programming language (Python, Java, Go, etc.)
- Expertise in designing agent data stacks and retrieval systems, including vector databases, hybrid search, data freshness strategies, memory systems, graph reasoning, and BM25 and other advanced retrieval techniques
Key Responsibilities
- Architect and deliver scalable, high-performance distributed systems
- Design and deploy AI/ML and GenAI platforms at enterprise scale
- Build and manage agent-based architectures, including prompt and context engineering, MCP servers, and evaluation frameworks
- Optimize LLM inference pipelines for latency, throughput, and efficiency
- Design and implement agent data & retrieval systems (vector DBs, hybrid search, memory, graph-based reasoning)
- Lead Kubernetes-based, cloud-native deployments
- Provide technical leadership, architecture governance, and hands-on mentoring to engineering teams
Nice to Have
- Experience with RAG (Retrieval-Augmented Generation) frameworks
- Familiarity with multi-agent systems and orchestration frameworks
- Exposure to real-time data pipelines and streaming architectures
Thanks & Regards,
OpenKyber
For applications and inquiries, contact: [email protected]