Site Reliability Engineering Manager
Summary Description
Granicus is seeking an experienced and highly skilled Senior Site Reliability Engineering (SRE) Manager to join our SRE team. As a Manager, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts to build and maintain a robust infrastructure, automate processes, and guide the team in implementing site reliability best practices.
Essential Functions
• On-call Production Support: Manage a team of engineers who provide production support in shifts according to the team's on-call roster.
• Work on tickets raised by customers and by internal engineering/implementation teams when not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface.
• Work on SRE backlog items.
• Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.
• Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual intervention.
• Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.
• System Improvements: Participate in the design and implementation of system improvements to enhance reliability, scalability, and performance.
• Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.
• Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing within the team.
• Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.
• Security: Implement and adhere to security best practices to protect our systems and data.
Knowledge/Skills/Abilities
Other Job Info
Security Requirement: