Senior Site Reliability Engineer - (GE)

IT Scout (Buenos Aires, Argentina) Posted 32 days ago
Remote Friendly DevOps & Sysadmin Americas Full-time

Only available for applicants from Buenos Aires, Argentina
Our client is a global community of over 100 million people with the common purpose of helping one another. Their mission is to help people help each other by making it safe and easy for people to ask for help and support causes—for themselves, each other, and their communities. In 2022, our client joined together with Classy, a leading nonprofit fundraising software company that enables nonprofits to connect supporters with the causes they care about. Together, the company and Classy have empowered people and organizations to raise more than $25 billion since 2010. Their vision is to become the most helpful place in the world.

We are looking for a Senior Site Reliability Engineer...

We're searching for an experienced Site Reliability Engineer (SRE). You will be responsible for the full system lifecycle including infrastructure provisioning, system configuration, monitoring, and incident response in production environments. The SRE uses technical analysis to assess the availability, latency, scalability, and efficiency of a product or infrastructure and builds reliability into systems. To ensure the highest level of application performance and availability, the reliability engineer works closely with development teams, relevant functional operations teams, network engineers, database administrators, technology vendors and partners. The successful reliability engineer effectively guides incident responses, helps identify root causes and provides recommendations or solutions to mitigate and resolve issues.

This is a remote position for the first 6 to 12 months, then hybrid.

The Job…

  • Design and build out our cloud infrastructure (we run everything in AWS).
  • Participate in software and system performance analysis, tuning, and service capacity planning.
  • Manage the availability, scalability, security, and performance of our platform and applications.
  • Diagnose bottlenecks across the full stack and recommend interim workarounds while long-term solutions are investigated.
  • Periodically assess all monitoring requirements and implement enhancements to meet or exceed changing business needs.
  • Proactively review, recommend, and implement changes to the live infrastructure after ensuring the right validation has been carried out.
  • Use data analysis to pick up trends before they become major problems.
  • Perform 24/7 on-call duties.

You…

  • 5+ years of experience in operating high-traffic SaaS environments.
  • Deep expertise in the mentality, processes, and tools needed to deliver high availability.
  • Skills to build a fully automated, highly elastic cloud orchestration framework on AWS.
  • Expertise running containerized infrastructure in production (Kubernetes using EKS).
  • Experience implementing configuration management and automation solutions using Infrastructure as Code, CI/CD, and GitOps (Ansible, Terraform, Jenkins, ArgoCD, GitHub Actions).
  • Strong working knowledge of Linux and its underlying components, system statistics, performance tuning, filesystems and IO.
  • Solid scripting skills (e.g. Bash, Python).
  • Development experience (e.g. Python, PHP, Java, Kotlin).
  • Experience with performance diagnostics, performance tuning, capacity planning, and monitoring.
  • BS in Computer Science or equivalent.
  • Good verbal and written communication skills.

Technologies you are likely to be working with...

AWS, Docker, Kubernetes, Helm, ArgoCD, Terraform, Ansible, MySQL/Aurora, Nginx, Loft, DevSpace, Elasticsearch, Kafka, Memcached, Redis, RabbitMQ, Jenkins, GitHub, Bash, Python, PHP, Java, Kotlin, Sumo Logic, New Relic, PagerDuty

Why you'll love it here...

  • Market competitive pay.
  • Rich healthcare benefits and supportive time off policies.
  • Monetary support for new hire setup, hybrid work & wellbeing, and family planning.
  • A variety of mental and wellness programs to support employees.
  • Learning & development and recognition programs.
  • “Gives Back” Program where employees can nominate a fundraiser every week for a donation from the company.
  • Inclusion, diversity, equity, and belonging are vital to our priorities and we continue to evolve our strategy to ensure DEI is embedded in all processes and programs. Our Diversity, Equity, and Inclusion team is always finding new ways for our company to uphold and represent the experiences of all of the people in our organization.
  • Employee resource groups.
  • Your work has a real purpose and will help change lives on a global scale.
  • You'll be a part of a fun, supportive team that works hard and celebrates accomplishments together.
  • We live by our core values: impatient to be great, find a way, earn trust every day, fueled by purpose.
  • We are a certified Great Place to Work, are growing fast and have incredible opportunities ahead!

The company is proud to be an equal opportunity employer that actively pursues candidates of diverse backgrounds and experiences. They are committed to providing diversity, equity, and inclusion training to all employees, and they do not discriminate on the basis of race, color, religion, ethnicity, nationality or national origin, sex, sexual orientation, gender, gender identity or expression, pregnancy status, marital status, age, medical condition, mental or physical disability, or military or veteran status.