About the job
As a Senior Fullstack Software Engineer on the CosmicAC team, you will play a critical role in building and scaling our GPU‑accelerated cloud services platform. You'll work on both the backend infrastructure that powers AI/ML workloads at scale and the frontend interfaces that make these capabilities accessible to developers and data scientists. This position requires deep technical expertise in distributed systems, strong backend development skills, and the ability to create intuitive user experiences.
You’ll be joining a team that’s pushing the boundaries of decentralized AI infrastructure, working on everything from Kubernetes orchestration and managed inference services to API design and real‑time monitoring dashboards. Your work will directly impact how developers interact with and deploy AI models in production environments.
Responsibilities
Backend Development: design and implement robust backend services and APIs that handle AI model inference, resource orchestration, and workload distribution across distributed GPU infrastructure
Frontend Implementation: build responsive and intuitive web interfaces for training job management, model deployment workflows, and real‑time monitoring dashboards using modern JavaScript frameworks
Distributed Systems Architecture: contribute to the design and implementation of distributed systems using peer‑to‑peer technologies (Holepunch stack)
API Design & Integration: develop and maintain APIs that support both synchronous and asynchronous inference patterns, ensuring compatibility with industry standards
Platform Reliability: implement monitoring, logging, and telemetry solutions to ensure high availability and performance of platform services
Cross‑functional Collaboration: work closely with DevOps, AI/ML engineers, and product teams to deliver integrated solutions that meet technical and business requirements
Code Quality & Best Practices: maintain high standards for code quality through peer reviews, testing, and documentation while championing security best practices
Requirements
5+ years of experience in full‑stack development with a strong emphasis on backend systems
Expert‑level proficiency in Node.js/JavaScript for backend development and in React for frontend development
Proven experience building and scaling distributed systems or event‑driven architectures
Strong understanding of API design and implementation, including authentication, rate limiting, and versioning
Experience with containerization technologies (Docker) and orchestration platforms (Kubernetes)
Proficiency with databases and a deep understanding of data modeling and optimization
Solid understanding of networking, security principles, and best practices for production systems
Experience with real‑time data streaming and RPC implementations
Ability to work independently in a remote environment and communicate effectively across time zones
Preferred
Experience with peer‑to‑peer technologies (Hyperswarm, libp2p, WebRTC) or similar distributed communication protocols
Familiarity with AI/ML inference APIs and OpenAI‑compatible endpoints
Previous experience building AI SaaS or PaaS platforms
Knowledge of GPU resource management and ML framework infrastructure
Experience with message queuing systems (Redis, RabbitMQ, Kafka)
Familiarity with observability tools (Prometheus, Grafana, ELK stack)
Understanding of WebAssembly or edge computing paradigms
Contributions to open‑source projects in relevant domains