- You will help us expand our monitoring capabilities by identifying valuable data and ensuring it is parsed and searchable by our engineering teams.
- We use a combination of logs, system metrics, and application performance metrics to help debug interesting challenges.
- As our services grow, these challenges become more interesting, so we will rely on you to help us identify, resolve, and predict them.
- You will build infrastructure inside of AWS via code.
- All of our environments are expected to be scripted and checked in, so familiarity with tools such as Terraform or CloudFormation will come in handy here.
- You will architect secure and robust solutions with regional disaster recovery in mind.
- You will help design ‘self-healing’ solutions to help ensure the stability and security of our services, in addition to helping control costs.
- We regularly write custom code to help perform monotonous tasks that would normally require human intervention.
- This requires careful consideration and lots of testing, but tends to be pretty fun.
- Deep knowledge of and production experience in designing, deploying, and administering complex AWS cloud applications (API Gateway, Lambda, ECS, ALB, WAF, EC2, RDS, ElastiCache, Elasticsearch, SQS, IAM, VPC, CloudFormation)
- Experience working with configuration management tools (Puppet, Chef, Ansible)
- Production experience with Docker
- An in-depth knowledge of Linux troubleshooting, including networking, file systems, security, and the kernel
- Strong knowledge of TCP/IP networking, including both hardware and host-based routing, VLANs, firewalls, subnetting, and load balancing
- In-depth knowledge of troubleshooting tools for debugging and tuning RESTful APIs
- Excellent knowledge of Git best practices (Git Flow)
- Good understanding of modern microservices architectures
- Production experience gathering, digesting, and improving monitoring and performance metrics
- Experience designing and enforcing disaster recovery plans and business continuity contingencies
- Experience being on-call in a 24/7 production environment
- Meticulous attention to detail and strong organization skills
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related discipline, and 5+ years of experience in IT infrastructure services or a related field, including at least 5 years of RHEL, CentOS, Ubuntu, or Debian Linux experience
- Additional training, technical certification, and/or years of experience may be substituted for a degree
What We Offer:
- Competitive compensation and benefit packages
- A quickly growing, great work environment that supports growth and development
- A company that enjoys having fun: holiday and summer parties, an annual global company off-site, a private Star Wars pre-opening-day viewing, and lots of other great stuff