At Auditoria.ai, we are building the next generation of AI-enabled systems for Enterprise Finance professionals. As enterprises progress through their digital transformation, Finance teams need better visibility into business processes, forecast accuracy, and the ability to audit the digital business at the touch of a button. Founded in 2019 and backed by Neotribe Ventures, Engineering Capital, and Firebolt Ventures, we build intelligent automation by combining fine-grained analytical orchestration of a company’s typical financial and audit workflows with conversational AI, delivering rapid value to the finance/audit back office.
We are seeking a passionate and driven SRE/DevOps Lead Engineer to own our Production environments and to work within a broader team on a variety of SRE/DevOps activities. You must have a passion for service quality and uptime, and a strong customer focus. In this role, you will be responsible for provisioning infrastructure, deploying containers, and setting up monitoring and alerting systems for our innovative multi-tenant financial SaaS product, which is built on cutting-edge AI and NLP technologies.
- In this highly hands-on role, you will design, develop, and implement end-to-end infrastructure solutions for a multi-tenant, microservices-based SaaS application. You will own system reliability, scalability, performance, and security.
- Implement and continuously improve CI/CD pipelines.
- Set up monitoring and alerting at every layer of the service (application, network, and OS).
- Ensure the accessibility, security, reliability, availability, and performance of the infrastructure.
- Support and maintain production systems, troubleshoot production issues and alerts, and participate in 24/7 on-call production support rotations.
- 10+ years of experience in SRE (Site Reliability Engineering) and DevOps roles, with ownership of large-scale enterprise SaaS services in production environments.
- Significant experience with AWS public cloud technologies, including large-scale container clusters (EKS), Infrastructure as Code (Terraform), container technologies (Docker and Kubernetes), and IAM.
- Strong programming/scripting skills in one or more languages (Python, Go, Ruby, Bash, etc.), plus strong Linux OS and networking fundamentals.
- Experience building monitoring systems to ensure high availability, performance, and security integrity (e.g., ELK Stack, Pingdom, Opsgenie/PagerDuty, Kiali, Weave Scope, CloudWatch, CloudTrail).
- Hands-on experience operating microservices-based SaaS products, REST web services, SSO (Okta, Auth0), EC2-RDS, MySQL, and Elasticsearch.
- Experience with backup strategies and Disaster Recovery for RDS and Elasticsearch.
- AWS Solutions Architect certification strongly preferred.
- Experience with capacity sizing to meet requirements and SLAs, both in the target state and during transition, as applicable.
- Self-motivated and excited about the ambiguity, opportunity, and self-direction required at an early-stage start-up.
Auditoria.AI offers competitive startup salaries, early-stage stock, unlimited vacation, medical/dental benefits, and a flexible work environment.