Role Title: Distributed Cache & DevOps Engineer
Must Have: Hazelcast, Redis / Redis Cluster, Apache Ignite
Role Summary
We are looking for a technically strong Distributed Cache Support Engineer to provide operational, technical, and production support for distributed caching solutions.
The ideal candidate should have a strong Java background, prior experience with enterprise caching or in-memory data grid platforms, and hands-on exposure to DevOps and Kubernetes-based environments.
This role requires a hybrid profile combining Java backend engineering, distributed cache troubleshooting, platform operations, performance analysis, and production support.
Key Responsibilities
Cache Platform Support
Provide L2/L3 support for distributed cache clusters and related application integrations.
Monitor cluster health, member status, partition distribution, memory utilization, latency, and throughput.
Support cache configuration, including distributed maps, Near Cache, eviction policies, TTL, backup count, serialization, and cluster discovery.
Troubleshoot cache-related incidents such as high latency, memory pressure, node restarts, split-brain scenarios, data inconsistency, and degraded performance.
Assist in capacity planning, performance tuning, and operational improvements for distributed cache environments.
Coordinate with vendor support teams for product-level issues, patches, upgrades, and escalations.
Java / Application Support
Analyze Java application behavior related to distributed cache platform integration.
Troubleshoot JVM-level issues including heap usage, garbage collection, thread dumps, memory leaks, and serialization overhead.
Work with application teams to identify cache misuse, inefficient access patterns, and performance bottlenecks.
Support Spring Boot / Java microservices that interact with the distributed cache platform.
Review and validate application-side configurations and integration patterns.
DevOps / Kubernetes Operations
Support distributed cache platform deployments running on Kubernetes or containerized environments.
Work with Kubernetes objects such as pods, services, namespaces, configmaps, secrets, deployments, and stateful workloads.
Analyze pod restarts, resource limits, liveness/readiness probe failures, service discovery issues, and container logs.
Support configuration management and deployment activities through CI/CD pipelines.
Assist with TLS/mTLS certificate-related troubleshooting where applicable.
Work with infrastructure and platform teams on network, DNS, storage, compute, and security-related issues.
Monitoring, Incident & RCA Management
Monitor platform and application metrics using tools such as AppDynamics, Splunk, Prometheus, Grafana, ELK, or similar.
Participate in incident management, troubleshooting calls, war-room support, and issue triage.
Prepare root cause analysis reports for production incidents.
Recommend preventive actions, operational improvements, and automation opportunities.
Maintain runbooks, SOPs, known-error documents, and support knowledge base articles.
Required Skills & Experience
Mandatory Skills
Strong hands-on experience in Java backend development or Java platform support.
Good understanding of JVM internals, memory management, garbage collection, thread dumps, and heap analysis.
Prior experience with distributed caching or in-memory data grid solutions.
Hands-on experience with at least one of the following: Hazelcast, Redis / Redis Cluster, Apache Ignite.
Experience supporting applications in production or near-production environments.
Preferred Skills
Direct hands-on experience with a distributed cache platform and an understanding of distributed maps, Near Cache, eviction and expiry policies, partitioning, backup/replication, split-brain protection, serialization, and cluster discovery.
Experience with Spring Boot and microservices architecture.
Prior production support experience in enterprise environments.