Description:
The Manager of Cloud Systems will be responsible for the client’s cloud-based infrastructure and operations supporting the hosting of our SaaS-based software solutions. This is both a leadership and a hands-on role: it includes managing and leading a small team of experienced Systems Administrators and being actively involved in architecting, provisioning, securing, configuring, automating, monitoring and maintaining highly available systems on AWS.
Essential Duties and Responsibilities:
- Manage and lead the efforts of 2-3 other Systems Administrators, providing tasking, technical direction and prioritization of efforts to ensure coverage and execution of 24×7 operations, regular systems maintenance, service enhancements and growth projects
- Design, build, monitor, evaluate and modify systems and their performance to ensure SLAs are met
- Analyze and isolate issues causing system inefficiencies and recurring problems
- Evaluate application infrastructure, identifying and fixing single points of failure
- Research and develop innovative and, where possible, automated approaches for system administration tasks
- Repair and recover from hardware or software failures
- Coordinate and communicate with impacted departments
- Perform periodic analysis and reporting to support capacity planning
- Ensure data backups, disaster recovery and information security are consistent with standards
- Develop and maintain configuration management software with Chef
- Establish and lead the rollout of standard operating procedures and services
- Work with engineering and quality assurance organizations to establish the client’s DevOps culture and processes
Competencies:
- Self-driven, with strong sense of end-to-end ownership for infrastructure SLAs and issue resolution
- Natural problem-solver and change leader with drive for continuous improvement
- Ability to effectively prioritize and lead multiple, time-sensitive projects
- Leader by example and team player with strong interpersonal and collaboration skills
- Outstanding written and verbal communication skills, with the ability to explain technical concepts in a way that a non-technical audience can understand
- Very strong attention to detail
- Aptitude for learning and applying new technologies
Education and Experience:
- B.S. in Computer Science or related Engineering discipline desired
- 7-10 years of Systems Administration experience in a LAMP environment, minimum 3 years in a lead role
- Proficiency in programming and scripting using Bash and at least one of Python or Ruby
- Competency with Chef or similar configuration management software
- AWS certification, expertise and operations leadership across AWS services in mission-critical environments with at least 200 servers
- Expertise with containerization of legacy and new systems using Docker
- Proven experience supporting cloud-based environments for a SaaS-based business