Role
SENIOR ENGINEER: AUTOMATION (US)
Engineering | US (All Locations)
About Role
Responsibilities:
- Design, develop, and implement end-to-end infrastructure solutions for our multi-tenant, microservices-based SaaS applications. You will own system reliability, scalability, performance, and security.
- Implement and continuously improve CI/CD pipelines.
- Set up monitoring and alerting across the application, network, and OS layers of the service.
- Ensure the accessibility, security, reliability, availability, and performance of our infrastructure.
- Support and maintain production systems, troubleshoot production issues and alerts, and participate in 24/7 on-call production support rotations.
Skill Details
Technical Skills
- 10+ years of experience in DevOps or Site Reliability Engineering (SRE) roles, with ownership of large-scale enterprise SaaS services in production environments.
- Significant experience with AWS public cloud technologies and the implementation of large-scale container clusters: EKS, IAM, Infrastructure as Code (Terraform), and containers (Docker and Kubernetes).
- Strong programming skills in one or more scripting languages (Python, Go, Ruby, Bash, etc.), along with strong Linux OS and networking fundamentals.
- Experience building monitoring systems to ensure high availability, performance, and security (e.g., ELK stack, Pingdom, Opsgenie/PagerDuty, Kiali, Weave Scope, CloudWatch, CloudTrail).
- Hands-on experience operating microservices-based SaaS products, REST web services, SSO (Okta, Auth0), EC2, RDS, MySQL, and Elasticsearch.
- Understanding of backup strategies and disaster recovery for RDS and Elasticsearch.
- AWS Certified Solutions Architect certification strongly preferred.
- Experience with capacity sizing to meet the requirements and SLAs of the target state and, where applicable, of the transition to it.
- Self-motivated and excited about the ambiguity, opportunity, and self-direction required at an early-stage startup.