About Us
As enterprises progress through their digital transformation, Finance teams need better visibility into business processes, forecast accuracy, and the ability to audit the digital business at the touch of a button. At Auditoria.AI, we build the next generation of AI-enabled systems for Enterprise Finance teams. Founded in 2019 and backed by Venrock, Neotribe Ventures, Workday Ventures, and Engineering Capital, we build cognitive applications for Corporate Finance by combining fine-grained analytical orchestration of a company’s typical financial and audit workflows with conversational AI, delivering rapid value to the finance/audit back office.
We’ve earned industry recognition, including the Intelligent Apps Top 40 List, SSON’s Shared Services & Outsourcing Impact Awards, the Constellation Research ShortList for AI-Driven Cognitive Applications, HFS Research Hot Vendors, the 2021 CRN Emerging Vendors List, and the TiE50 Award, and we won the inaugural Pitch Event by Constellation Research.
Opportunity
We are seeking a passionate and driven SRE/CloudOps Lead to own our Production environments and take part in SRE/CloudOps activities as a member of a broader team. You must have a passion for service quality and uptime, and a strong customer focus. In this role, you will be responsible for provisioning infrastructure, deploying containers, and setting up monitoring and alerting systems for our innovative multi-tenant financial SaaS product, which is built on cutting-edge AI and NLP technologies.
Responsibilities:
- In this highly hands-on role, you will design, develop, and implement end-to-end infrastructure solutions for a multi-tenant, microservices-based SaaS application, and you will own system reliability, scalability, performance, and security
- Implement and continuously improve CI/CD pipelines
- Own monitoring and alerting at various layers (App, Network, and OS levels) of the service
- Ensure accessibility, security, reliability, availability, and performance of infrastructure
- Support and maintain production systems, troubleshoot production issues and alerts, and participate in 24/7 on-call production support rotations
Qualifications:
- 10+ years of experience in SRE (Site Reliability Engineering) and DevOps roles, owning responsibility for large-scale enterprise SaaS services in production environments
- Significant experience implementing large-scale container clusters with AWS public cloud technologies: EKS, Infrastructure as Code (Terraform), container technologies (Docker and Kubernetes), and IAM
- Strong programming/scripting skills with one or more scripting languages (Python, Go, Ruby, Bash, etc.) and strong Linux OS and networking fundamentals
- Experience building monitoring systems to ensure high availability, performance, and security integrity (e.g., ELK Stack, Pingdom, Opsgenie/PagerDuty, Kiali, Weave Scope, CloudWatch, CloudTrail, etc.)
- Hands-on experience operating microservices-based SaaS products, REST web services, SSO (Okta, Auth0), EC2-RDS, MySQL, and Elasticsearch
- Experience with backup strategies and Disaster Recovery for RDS and Elasticsearch
- AWS Solutions Architect certification is strongly preferred
- Experience with capacity sizing to meet the requirements and SLAs of the target state and, where applicable, of the transition to it
- Self-motivated and excited about the ambiguity, opportunity, and self-direction required at an early-stage startup
Auditoria.AI offers competitive startup salaries, early-stage stock, unlimited vacation, medical/dental benefits, a 401(k), and a flexible work environment.