Lead Site Reliability Engineer (Azure)
Office in Pune
Site Reliability Engineering & others
We are seeking a highly skilled and experienced Lead Site Reliability Engineer with a focus on Azure environments to join our team.
In this role, you will improve the reliability and scalability of our cloud-based platforms and keep them running efficiently. You will work closely with cross-functional teams to migrate existing services to the OpenShift platform and make our infrastructure cloud-agnostic. As a leader, you’ll guide your team in building resilient systems and processes that support the internal and external customers who rely on our desktop applications and services.
Responsibilities
- Oversee the migration of services to OpenShift and work towards making our infrastructure cloud-agnostic
- Run pipelines using Azure DevOps for environment configuration and application deployment
- Use Python, Bash, and PowerShell to automate routine and complex tasks
- Implement and manage Kubernetes and container-based environments
- Monitor cloud resources and improve system performance against service level indicators (SLIs)
- Debug and resolve operational issues quickly and effectively
- Collaborate with development and operations teams to ensure system reliability and security
- Mentor team members and lead by example in maintaining best practices for site reliability
- Continuously assess and optimize existing system architecture and applications
- Stay up to date with technological advancements and integrate new tools and techniques
Requirements
- 5+ years of experience as a Systems Engineer with a development background
- 1+ years of relevant leadership experience
- Proficiency in Linux and Docker with hands-on experience in Kubernetes
- Ability to use at least one of the following scripting languages: Python, Bash, PowerShell
- Background in infrastructure management including networking and operating systems
- Familiarity with monitoring tools in cloud environments and understanding of SLI concepts
- Familiarity with Azure and/or GCP as cloud service providers
Nice to have
- Experience working with Windows
- Knowledge of CI/CD pipelines, particularly Azure DevOps
- Understanding of Istio and GitOps tools like Argo CD