
Senior Site Reliability Engineer

Site Reliability Engineering, DevOps, CI/CD, Amazon Web Services, Terraform, Docker, Kubernetes, Python, Google Cloud Platform, Bash, PowerShell, Microsoft Azure
Hyderabad, Pune, Bangalore, Gurgaon, Chennai
We are seeking a talented and motivated Senior Site Reliability Engineer to join our team. As a key member of our multi-disciplined team, you will play a crucial role in ensuring the reliability, performance, and security of our complex distributed systems. If you are passionate about operational risk management, have a deep understanding of Kubernetes and Containers, and possess strong problem-solving skills, this role offers an exciting opportunity to contribute to the success of our operations.
  • Rapidly and effectively understand and translate requirements into technical solutions.
  • Reason about performance, security, and process interactions in complex distributed systems.
  • Passionate about managing operational risk.
  • Work effectively as part of a diverse multi-disciplined team.
  • Motivated and self-organized, with good time and work management skills.
  • Identify, craft, and maintain SLIs and SLOs for teams, as well as metrics such as MTTR, Lead Time for Changes, Deployment Frequency, and Change Failure Rate.
  • Work with application teams to set up observability and telemetry.
  • Familiarity with any cloud provider (especially GCP or Azure).
  • Experience with any SRE tool, preferably Grafana, Dynatrace, or Splunk.
  • 5-9 years of experience as a Systems Engineer with a development background and an understanding of Kubernetes and containers.
  • Good knowledge of Infrastructure (networking, operating systems).
  • Good knowledge of Linux.
  • Good knowledge of Kubernetes and Docker.
  • Good debugging skills and ability to handle operational issues.
  • Strong problem-solving and analytical skills, and a solid grasp of algorithms.
  • Familiarity with cloud monitoring and an understanding of the SLI concept.
  • Ability to communicate technical concepts effectively, both in writing and orally.
  • Strong interpersonal skills required to collaborate effectively with colleagues across diverse technology teams and locations.
  • Strong proficiency in at least one of Python, Bash, or PowerShell.
  • Experience in the Travel & Hospitality domain is preferred.
Nice to have
  • Package management solutions like Nix, Apt, Yum
  • Experience working with Windows
  • Knowledge of CI/CD (especially Azure DevOps)
  • Knowledge of Kubernetes
  • Knowledge of Istio
  • Knowledge of GitOps tools (like ArgoCD)


For you
  • Insurance Coverage 
  • Paid Leaves – including maternity, bereavement, paternity, and special COVID-19 leaves. 
  • Financial assistance for medical crisis 
  • Retiral Benefits – VPF and NPS 
  • Customized Mindfulness and Wellness programs 
  • EPAM Hobby Clubs
For your comfortable work
  • Hybrid Work Model 
  • Soft loans to set up workspace at home 
  • Stable workload 
  • Relocation opportunities with ‘EPAM without Borders’ program

For your growth
  • Certification trainings for technical and soft skills 
  • Access to unlimited LinkedIn Learning platform 
  • Access to internal learning programs set up by world class trainers 
  • Community networking and idea creation platforms 
  • Mentorship programs 
  • Self-driven career progression tool
