Lead Platform Engineer (DevOps & AI/ML/Gen-AI)
Office in Chennai
Platform Engineering
We are seeking a highly skilled Lead Platform Engineer to join our Automation Engineering team and spearhead development efforts in cloud infrastructure automation, DevOps, and generative AI (Gen-AI).
The ideal candidate will have 7+ years of experience, combining deep proficiency in cloud automation, scripting, and Infrastructure-as-Code (IaC) tools with a strong background in AI/ML, especially generative AI and AIOps. Your contributions will drive innovations in AI-powered automation, enhance system efficiency, and deliver transformative solutions.
Responsibilities
- Design and maintain automated workflows for cloud infrastructure provisioning using IaC tools like Terraform and related technologies
- Develop robust automation frameworks for infrastructure deployment, configuration, and multi-cloud management
- Manage service catalog components integrated with platforms like Backstage for seamless cloud automation workflows
- Implement GenAI-based solutions, enabling automated service catalog creation and enhancing code quality across processes
- Build and optimize CI/CD pipelines to support comprehensive automation use cases across cloud environments
- Create and maintain scripts in Python, Bash, or other languages to support automation and deployment tasks
- Design GenAI solutions such as RAG pipelines and agentic flows, leveraging frameworks like LangChain and platforms like Bedrock, Vertex AI, or Azure AI
- Construct vector databases and document sources using tools like OpenSearch, Amazon Kendra, or equivalent technologies for AI training
- Prepare and label data, enabling optimized vector sources for AI/ML model applications
- Build agentic workflows using patterns like ReAct or cloud GenAI tools, integrating multiple operational systems effectively
- Design MLOps pipelines to deploy and maintain RAG and Agentic workflows, ensuring model performance through prompt engineering and vector source updates
- Collaborate with teams to integrate generative AI innovations into AIOps platforms, improving automation and system capabilities
- Research and recommend emerging technologies, tools, and practices to enhance infrastructure automation and AI workflow efficiency
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline
- 7+ years of experience in cloud automation, IaC tools like Terraform or CloudFormation, and multi-cloud platforms
- Expertise in Python for automation and AI/ML development
- Experience with generative AI approaches, including RAG and agentic workflows, and their implementation with frameworks like LangChain
- Familiarity with cloud platforms like AWS (Bedrock), Google Cloud (Vertex AI), or Microsoft Azure (Azure AI), along with associated automation tools
- Knowledge of data engineering, including data streaming, preparation, and labeling, and familiarity with cloud data lakes and vector databases
- Skills in building vector document sources using technologies like OpenSearch, Amazon Kendra, or equivalent
- Proficiency in creating and optimizing CI/CD pipelines to support cloud-based automation
Nice to have
- Familiarity with state-of-the-art MLOps practices to ensure efficient AI/ML model deployment and monitoring
- Background in developing highly scalable workflows for predictive maintenance and anomaly detection using AI
- Understanding of advanced orchestration techniques for integrating AI/ML pipelines with AIOps platforms
- Capability to evaluate and adapt large language models (LLMs) for domain-specific use cases
- Track record of innovative solutions that combine generative AI with operational insights for enhanced decision-making