Senior Site Reliability Engineer
Pune, Maharashtra, India
Posted on Friday, April 14, 2023
Aera Technology is the Decision Intelligence company. We deliver innovation and services that enable enterprises to operate sustainably, intelligently, and efficiently. Our platform, Aera Decision Cloud™, integrates with your existing systems to digitize, augment, and automate decisions in real time. Aera helps enterprises around the world transform decision-making – delivering millions of recommendations that have resulted in significant revenue gains and cost savings for some of the world’s best-known brands.
The SRE team supports the development, enhancement, and maintenance of the cloud infrastructure that our applications and services run on. Aera's SRE group manages the architecture and engineering of all environments from production and acceptance to sandbox and sales. The team develops infrastructure as code, monitoring solutions for the health, performance, and reliability of the Aera stack, and in general, "keeps the lights on" by providing tier III support for our 24/7 Platform Operations team. The SRE team is also on the front line of adopting and developing state-of-the-art infrastructure to continuously evolve the platform.
In this role, you will use your expertise in Kubernetes and cloud-native infrastructure to work closely with our product engineering teams from the early stages of design through deployment, and to identify and resolve production issues that relate to infrastructure. You will work with our security teams to develop solutions that adequately protect Aera intellectual property and customer data, and you will serve as an escalation point that others consult and trust, as well as a mentor for other team members.
- Design and development for the running and monitoring of Aera’s production infrastructure, including acting as a primary engineering contributor for our transformation into a GitOps-driven, Kubernetes-based platform
- Explore, evaluate, and integrate the latest, best-of-breed tooling and components used for modern Kubernetes deployment and management
- Triaging and troubleshooting complex production issues to ensure reliability and performance
- Identifying and automating manual processes
- Continuously evolving our monitoring tools and platform
- Promoting and applying best practices for building scalable and reliable services across engineering
- Developing and maintaining technical documentation, runbooks, and procedures
- Tier III support for a 24/7 online environment as part of an on-call rotation, responding to production incidents and participating in root cause analysis and problem management
- Bachelor's degree or higher in Information Technology, Engineering, or a related field is desired but not required
- 5+ years of SRE/DevOps/infrastructure experience
- 5+ years of experience deploying, operating and/or debugging server software on Linux at scale
- 2+ years of hands-on experience with the setup, configuration and maintenance of Kubernetes, either in a managed service model such as EKS/AKS/GKE or as bare metal Kubernetes
- 2+ years of hands-on experience deploying, configuring and troubleshooting Kubernetes for production workloads
- Demonstrable experience using one or more of Crossplane, Kustomize, Helm and/or vCluster for deploying virtual clusters is required
- Practical experience with Kubernetes on a private cloud platform is highly desired
- Experience automating and running large scale production Java services in AWS, Azure or other cloud providers
- Advanced knowledge of configuration management and orchestration tools (Ansible, Terraform), and of automating and streamlining tasks in an SRE/Operations engineering context using scripting languages such as Python, Go, or Ruby
- Experience with the use, maintenance and configuration of monitoring, metrics and logging infrastructure (ELK, Prometheus/Grafana, Nagios, etc.)
- Comfortable working with modern databases and big data platforms (SQL, etc.); MySQL automation is a big plus
If you share our passion for building a sustainable, intelligent, and efficient world, you’re in the right place. Established in 2017 and headquartered in Mountain View, California, we're a series D start-up, with teams in Mountain View, San Francisco (California), Bucharest and Cluj-Napoca (Romania), Paris (France), Munich (Germany), London (UK), Pune (India), and Sydney (Australia). So join us, and let’s build this!
Aera Technology is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status.
At Aera Technology, we strive to support our Aeranauts and their loved ones through different stages of life with a variety of attractive benefits and great perks. In addition to offering a competitive salary and company stock options, we have other great benefits available. You’ll find comprehensive medical coverage, group medical insurance, term insurance, accidental insurance, paid time off, maternity leave, and much more. We offer unlimited access to online courses for both professional and personal development, coupled with people manager development programs. We believe in a flexible working environment that allows our Aeranauts to perform at their best while maintaining a healthy work-life balance. When you’re working from the office, you’ll also have access to a fully stocked kitchen with a selection of snacks and beverages.