Job Description
About the role
The Senior Data Center Operations professional is responsible for the availability, reliability, and operational excellence of mission-critical data center infrastructure. The role serves as the on-site technical lead, owning complex operational activities, incident response, and advanced troubleshooting while mentoring junior technicians and supporting continuous improvement initiatives.
The role requires deep hands-on expertise, sound judgment in high-pressure situations, and the ability to operate independently within a 24/7 critical environment.
Technical Authority & Ownership
- Senior-level ownership of data center operations and infrastructure stability
- Authority to lead incident response and complex troubleshooting activities
- Recognition as a subject-matter expert within the operations team
- Direct influence on operational standards, procedures, and improvements
Professional Standing & Growth
- Positioning as a senior technical reference within the data center
- Opportunity to mentor technicians and shape operational best practices
Your responsibilities
Senior Operations & Infrastructure Management
- Oversee installation, configuration, testing, and maintenance of critical data center hardware and systems
- Ensure operational readiness and compliance with availability, security, and safety standards
- Act as escalation point for complex or high-impact operational issues
Monitoring, Incident Leadership & Troubleshooting
- Lead response to infrastructure incidents, outages, and performance degradation
- Perform advanced troubleshooting across hardware, OS, networking, and storage layers
- Coordinate with engineering, network, facilities, and vendor teams during incidents
- Drive root cause analysis (RCA) and corrective actions
Preventive Maintenance & Reliability
- Own and improve preventive and predictive maintenance programs
- Validate maintenance procedures and execution quality
- Identify risks, single points of failure, and reliability gaps
Project Execution & Change Management
- Lead or support complex operational projects such as:
  - Data center expansions
  - Hardware refresh programs
  - Infrastructure upgrades
- Execute changes in line with change management and risk controls
Documentation, Standards & Mentorship
- Own and maintain senior-level operational documentation and SOPs
- Contribute to audits, compliance reviews, and operational assessments
- Mentor and support junior and mid-level technicians
- Promote a strong culture of safety, discipline, and continuous improvement
Your key competencies
Education
- Degree in Computer Science, Information Technology, Engineering, or equivalent experience
Experience
- 8–12+ years of experience in data center operations or mission-critical IT environments
- Proven experience leading operational activities in 24/7 critical facilities
- Demonstrated ownership of incident management and reliability initiatives
Technical Expertise
- Deep hands-on expertise with:
  - Server, storage, and rack infrastructure
  - Networking fundamentals and connectivity troubleshooting
  - Linux
- Strong understanding of monitoring, DCIM, ticketing, and change management tools
Leadership & Personal Attributes
- Strong decision-making under pressure
- High sense of ownership and accountability
- Ability to mentor and guide less experienced technicians
- Excellent written and verbal communication skills in English
- Proactive, detail-oriented, and reliability-focused mindset
Company Info
DataCrunch
European AI cloud infrastructure provider offering GPU-accelerated virtual machines and computing cl...
