Senior AIOps Solution Architect
Office in Chennai, Pune, Bangalore, Hyderabad
Solution Architecture & others
We are seeking a highly experienced Senior AIOps Solution Architect with exceptional expertise in Gen-AI-enabled Cloud Engineering, Observability, Operational Intelligence, and AI-driven automation. The ideal candidate will bring 10+ years of enterprise-level architecture experience, with a focus on building innovative Gen-AI-enabled platforms, data-driven automation frameworks, and enterprise-grade AIOps solutions to advance operational efficiency.
Responsibilities
- Design and deliver scalable Gen-AI-powered AIOps solutions for large enterprise platforms to improve MTTR, achieve automated incident resolution, and drive operational excellence
- Architect and implement Gen-AI & LLM Engineering solutions using tools such as Amazon Bedrock, Azure OpenAI, Vertex AI, Anthropic, and LangChain
- Develop and optimize MLOps pipelines and model deployment workflows on SageMaker and Azure ML, applying techniques such as clustering, topic modeling, and anomaly detection
- Implement RAG, vector databases, and advanced semantic search across platforms using pgvector, Elasticsearch, and Amazon Bedrock Knowledge Bases
- Create and automate solutions for Cloud Platforms and Infrastructure with AWS, Azure, GCP, Terraform, CloudFormation, and Helm, alongside Python and Shell Scripting
- Lead Kubernetes-based container orchestration and DevSecOps initiatives, including CI/CD pipelines, Istio, and KEDA deployment strategies
- Design and integrate serverless and cloud-native architectures using API Gateway, Lambda, Step Functions, DynamoDB, S3, and Kinesis
- Implement end-to-end Observability solutions using Datadog, OpenTelemetry, Dynatrace, New Relic, Splunk, Moogsoft, and BigPanda
- Ensure seamless ITSM and ServiceNow integration for AI-driven operations and automation
- Work with ITSM tools like ServiceNow, Jira Service Management, and ManageEngine to streamline incident management workflows
- Provide thought leadership in AIOps, automation, and AI-powered operational intelligence to leadership and engineering teams
Requirements
- 19+ years of overall IT experience
- 10+ years of professional experience in Enterprise Cloud, Infrastructure Engineering, SRE, Automation, and Architecture roles
- Proven track record of delivering Gen-AI-powered AIOps solutions in production environments, driving efficiencies like MTTR improvement and operational automation
- Expertise in Gen-AI and LLM Engineering tools such as Amazon Bedrock, Azure OpenAI, Vertex AI, Anthropic, LangChain, and Bedrock Agents
- Proficiency in RAG, vector databases, and semantic search solutions like pgvector, Elasticsearch, and Amazon Bedrock Knowledge Bases
- Background in MLOps and model development on SageMaker and Azure ML, with machine learning techniques such as clustering, topic modeling, and anomaly detection
- Skills in cloud engineering and automation technologies, including AWS, Azure, GCP, Terraform, CloudFormation, Helm, Python, and Shell Scripting
- Capability to design and operate Kubernetes-based infrastructure, CI/CD pipelines, security automation, Istio, and KEDA
- Familiarity with serverless computing and cloud-native tools like API Gateway, Lambda, Step Functions, DynamoDB, S3, and Kinesis
- Knowledge of Observability platforms such as Datadog, OpenTelemetry, Dynatrace, New Relic, Splunk, Moogsoft, and BigPanda
- Understanding of ITSM platforms, including ServiceNow, Jira Service Management, and ManageEngine
- Demonstrated AI and Machine Learning expertise in areas like anomaly detection, GenAI implementation, and agentic AI solutions
- Ability to communicate effectively in both written and spoken English (B2 level or higher)
Nice to have
- Experience leading AIOps/Cloud Practices or platform engineering organizations
- Certifications in AWS ML, Cloud Architecture, or AI Leadership