
Lead Site Reliability Engineer – SRE, Python

Site Reliability Engineering, DevOps, Jenkins, Docker, Kubernetes, Datadog, New Relic, Splunk, Grafana
Bangalore

We are seeking a talented and motivated Lead Site Reliability Engineer (SRE) to join our organization.

The selected candidate will be instrumental in capacity planning and in ensuring the reliability, scalability, and performance of our infrastructure and applications.

Responsibilities
  • Ensure system reliability and scalability through effective monitoring and performance tuning
  • Develop and manage SLAs, SLOs, and SLIs for services
  • Design, implement, and oversee microservices architecture
  • Streamline alert management and incident response processes
  • Write and maintain code, including designing the underlying logic
  • Facilitate network communication across services in a microservices environment
  • Create and optimize database queries for efficiency
  • Create and maintain YAML files
  • Architect and maintain continuous integration and continuous deployment (CI/CD) pipelines
Requirements
  • 8 to 12 years of hands-on experience in site reliability engineering
  • Proficiency in monitoring, observability, and system performance tuning
  • Background in configuring and managing SLAs, SLOs, and SLIs
  • Competency in microservices architecture and its lifecycle management
  • Experience in alert management and incident resolution
  • Strong skills in coding and in designing program logic
  • Understanding of network communication in microservices contexts
  • Expertise in writing and optimizing DB queries
  • Familiarity with writing and handling YAML files
  • Capability to design and implement effective CI/CD pipelines