Job Description
We are strengthening our incident management team. You will be at the helm, managing incidents and leading the way. Your role at Dynatrace is crucial in ensuring best-in-class reliability and shaping incident response for our customers. Your detailed responsibilities in this new team will be:
Prepare for Effective Incident Response:
- Response Coverage: Join a new global team of Incident Commanders coordinating incidents 24/7 in a follow-the-sun model.
- Training and Preparedness: Train teams on incident response protocols and ensure readiness for critical incidents.
- Process Improvement: Ensure our incident management process is best-in-class and aligned with industry standards as well as company and customer needs.
Navigate Critical Incidents with Success:
- Incident Coordination: Manage high-severity incidents, leading temporary response teams to ensure timely resolution and minimal business impact.
- Analysis and Mitigation: Coordinate the team to understand impacts, perform forensics, categorize and mitigate incidents, ensuring the right experts are engaged.
- Communications: Ensure all personnel know their roles during incidents. Keep teams aligned and ensure regular updates to customers and internal stakeholders.
Continuously Learn and Improve:
- Postmortem Management: Lead blameless postmortem sessions, review incident response and resilience, and track the execution of improvement actions.
- Metrics and KPIs: Define and track key metrics to measure the effectiveness of incident management and leverage them for data-driven improvement planning.
- Customer Interaction: Prepare detailed postmortem write-ups for customers, providing clear and actionable insights. Monitor and report on SLAs.
- Stakeholder Communication: Maintain a holistic view of production status and communicate updates to internal stakeholders and customers.