Duties:
● Perform day-to-day management of cloud platform operations, identify issues and risks, and recommend mitigation strategies for the project.
● Lead the operations team in providing tier 1 and tier 2 support for cloud platforms.
● Design, develop, and maintain incident management workflows.
● Monitor issues, provide resolutions, and keep status reports up to date.
● Coordinate emergency response for severity 1 and severity 2 incidents.
● Monitor and coordinate all system operations, including security procedures, and liaise with infrastructure, security, DevOps, data, and application teams.
● Ensure that necessary system backups are performed and that backups are stored and rotated.
● Monitor and maintain records of system performance and capacity in order to arrange vendor services or other reconfiguration actions and to anticipate requirements for system expansion.
● Coordinate major and minor software installations, upgrades, and patch management.
● Must demonstrate a broad understanding of client IT environmental issues and solutions and be a recognized expert within the IT industry.
● Must demonstrate advanced teamwork and mentoring abilities and excellent written and verbal communication skills.
Education:
● A Bachelor's Degree from an accredited college or university with a major in Computer Science, Information Systems, Engineering, Business, or other related scientific or technical discipline. A Master's Degree is preferred.
Experience:
● At least eight (8) years of experience managing on-premises and cloud-based multi-user environments, with expertise in planning, designing, building, and implementing IT systems.
● At least eight (8) years of product administration/management experience in a RHEL-based Linux or Windows Server environment.
● 5+ years of experience leading tier 1 and tier 2 support for cloud platforms.
● 3+ years of experience operating a large platform of more than 2,000 instances.
● Experience working with incident management and change management tools.
● Experience developing and maintaining incident management and change management workflows.
● Experience working with incident triage and knowledge management.
● Experience working with security teams to perform security monitoring, audits and remediation.