Lead Site Reliability Engineer
Site Reliability Engineering, DevOps, Amazon Web Services, Terraform, Docker, Kubernetes, Python, Google Cloud Platform, Bash, PowerShell, Microsoft Azure
Hyderabad, Pune, Bangalore, Gurgaon, Chennai
We are seeking a talented and motivated Lead Site Reliability Engineer to join our team. As a key member of our multi-disciplined team, you will play a crucial role in ensuring the reliability, performance, and security of our complex distributed systems. If you are passionate about operational risk management, have a deep understanding of Kubernetes and containers, and possess strong problem-solving skills, this role offers an exciting opportunity to contribute to the success of our operations.
Responsibilities
- Ability to rapidly and effectively understand and translate requirements into technical solutions.
- Ability to reason about performance, security, and process interactions in complex distributed systems, with a passion for managing operational risk.
- Ability to work effectively as part of a diverse multi-disciplined team.
- Motivated and self-organized, with good time and work management skills.
Requirements
- Should have 8 to 12 years of experience as a Site Reliability Engineer.
- Must have expert/intermediate level knowledge of Azure (preferred) or AWS/GCP cloud infrastructure, networking, security, and storage. (GCP will be decommissioned soon, so Azure-only experience is also fine.)
- Must have intermediate level core Python skills.
- Must have expert/intermediate level Python/cloud/Windows admin debugging skills.
- Must have intermediate level knowledge of Windows or Linux administration. (Linux-only experience is also acceptable; two weeks of Windows administration training can be provided.)
- Good to have expert/intermediate level knowledge of infrastructure and application monitoring and related tools (ELK/Opsbridge/Dynatrace).
- Good to have Observability & Centralized Logging experience.
- Good to have knowledge of incident management (PagerDuty/Opsgenie/VictorOps).
- Good to have knowledge of change management.
- Good to have knowledge of SLO, SLI, SLA.
- Good to have knowledge of Kubernetes and Docker.
- Good to have knowledge of CI/CD (especially Azure DevOps).