Zak Kartz

Senior Incident Manager · Cloud Operations · SecOps
Open to relocation or remote

Senior incident management and cloud operations professional with 7+ years at Epic Systems, a major healthcare SaaS provider. Experienced leading cross-functional teams through high-severity incidents, driving process improvements, and managing 24/7 operations for platforms serving 50,000+ users at 99.999% uptime. Holds dual ITIL 4 certifications and AWS CCP; actively expanding into security operations and AI-assisted workflows.

Epic Systems
2018 – Present
Cloud Operations · Incident Management · Security Operations · Madison, WI
Senior Incident Manager 2021 – Present
  • Served as primary incident commander for Epic's SaaS EHR platform, coordinating resolution across engineering, operations, and communications for 50,000+ users in mission-critical healthcare environments.
  • Reduced incident response times by 50% through redesigned triage flows and escalation paths; became process owner for RCAs, improving policy compliance and reducing incident recurrence.
  • Took ownership of the RCA process end-to-end: cut the rate of missed RCA deadlines from 80% to 50% (a reduction of nearly 40%), led quarterly RCA review meetings with cross-functional stakeholders, and presented process improvements and team goals to an audience of 400 at an internal all-hands.
  • Took over and overhauled the new-hire RCA training curriculum; trained approximately 100 new hires, improving team-wide consistency and outcomes.
  • Delivered executive-facing post-incident reports and action plans; managed scheduling for the incident manager team and maintained a weekly on-call shift.
  • Partnered with compliance to identify and close coverage gaps in documentation, expanding policy scope to a previously unaddressed employee subset.
Cloud Service Desk — Customer Service Lead 2020 – Present
  • Directed 25 technicians across a 24/7 Hosting Operations Center supporting 99.999% uptime for SaaS EHR infrastructure.
  • Developed shift schedules, SOPs, and performance feedback processes; oversaw CAB approval workflows for critical infrastructure changes.
  • Collaborated with engineering and SREs to improve alert fidelity and reduce alert fatigue at scale.
Security Operations 2024 – Present
  • Monitored and triaged threat activity using Splunk Enterprise Security and SOAR; executed incident response workflows for high-priority alerts.
  • Co-led a cross-training program enabling operations technicians to fill in at the Security Operations Center, expanding team coverage and capability.
  • Validated false positives and coordinated containment actions with the broader cybersecurity team.
Hosting Operations Center Technician 2018 – 2020
  • Monitored alerting infrastructure via Splunk ITSI; escalated and communicated incidents to affected teams in a 24/7 environment.
Additional responsibilities: Change Approval Board member · Disaster Recovery Coordinator · JIT and screen-share access approver
Kiriworks
2016 – 2018
Support Engineer · Milwaukee, WI
  • Provided Tier 2 support for enterprise content management platforms; assisted clients with infrastructure integrations and performance tuning.
QPS Employment Group
2015 – 2016
System Support Technician · Brookfield, WI
  • Delivered hardware/software support across 20+ branch offices; deployed an automated patching system that reduced ticket volume by 30%.
ITIL 4 Practitioner: Incident Management
July 2025
ITIL 4 Foundation
January 2025
AWS Certified Cloud Practitioner
November 2024
M.S. Information Technology Management
Western Governors University
2021
B.S. Information Systems and Technology
University of Wisconsin–Milwaukee
2018
Incident Command · SRE Collaboration · RCA / Problem Management · Runbook Development · Splunk ITSI · Splunk Enterprise Security · SOAR · CAB / Change Management · Cross-Functional Leadership · Disaster Recovery · GitLab · Grafana · AI-Assisted Workflows · Executive Communication · ITIL 4