About Nscale
Nscale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI start-ups and large enterprise customers. Nscale enables AI-focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.
We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.
AI Operations Strategist
Role Overview
The AI Operations Strategist will lead the development and management of Nscale’s operational analytics and reporting function, delivering the visibility, insights, and predictive intelligence required to scale a high-performance AI Infrastructure organization.
This role partners closely with Support Leadership, Engineering, Datacenter Operations, and Customer Success teams to define and operationalize KPIs, dashboards, and reporting frameworks that measure service performance, infrastructure reliability, customer experience, and operational efficiency.
The successful candidate will transform operational data from platforms such as Jira, ServiceNow, monitoring systems, and internal tooling into actionable insights that improve decision-making, optimize support investments, and strengthen customer outcomes.
As Nscale’s AI Infrastructure organization scales, there is a growing need for a centralized operations analytics capability that can consistently deliver reporting and insights across three critical stakeholder groups:
- Operational metrics for internal support management
- Executive reporting for Nscale leadership
- Customer-facing reporting on service and support performance
Today, many operational questions around capacity planning, SLA adherence, reliability trends, and support effectiveness are addressed reactively through fragmented or ad hoc analysis. This creates gaps in visibility, inconsistent decision-making, and limited ability to proactively manage infrastructure performance at scale.
The AI Operations Strategist will establish a structured, scalable reporting and analytics function that enables proactive operational management, faster identification of risks and inefficiencies, and data-driven decision-making across cost, performance, reliability, and customer experience.
This is a highly impactful role at the intersection of AI infrastructure, operations strategy, analytics, and customer success.
Key Responsibilities
Reporting & Dashboard Strategy
- Design, build, and maintain operational dashboards and reporting frameworks for AI Cloud Support and Infrastructure Operations.
- Develop executive, operational, and customer-facing reporting that communicates:
  - Service performance
  - SLA/SLO adherence
  - Reliability trends
  - Operational health
  - Customer experience metrics
- Create scalable visualizations and dashboards using Jira, Power BI, Tableau, Looker, and internal analytics platforms.
- Standardize KPI definitions, reporting structures, and measurement methodologies across teams and customers.
KPI & Operational Metrics Management
Define, track, and continuously evolve operational KPIs, including:
- SLA / SLO compliance
- MTTR (Mean Time to Repair)
- Incident trends and severity analysis
- Problem management effectiveness
- Change success and failure rates
- Capacity utilization and forecasting
- Support efficiency and productivity metrics
- Customer satisfaction and experience indicators
Additional responsibilities include:
- Ensuring data integrity, consistency, and governance across reporting systems
- Establishing metric ownership, refresh cadences, and reporting standards
- Partnering with leadership to mature operational measurement frameworks as the organization scales
Data Analysis & Operational Insights
- Analyze operational and infrastructure data to identify trends, risks, inefficiencies, and opportunities for optimization.
- Deliver actionable recommendations that improve reliability, scalability, support efficiency, and customer outcomes.
- Conduct deep-dive analyses into recurring incidents, performance degradation, and operational bottlenecks.
- Support Problem Management and RCA initiatives with data-driven insights and trend analysis.
- Provide predictive guidance for support capacity planning and operational scaling.
Tooling, Automation & Data Integration
- Leverage Jira reporting capabilities, including JQL, dashboards, filters, and automation workflows.
- Develop automated pipelines and workflows to extract, transform, and visualize data across multiple systems.
- Integrate operational data sources such as Jira, ServiceNow, monitoring platforms, observability tooling, and internal systems into unified reporting views.
- Continuously improve reporting automation and self-service analytics capabilities to increase scalability and reduce manual effort.
Cross-Functional Collaboration
- Partner with Support, Engineering, Datacenter Operations, Product, and Customer Success teams to align on reporting requirements and operational priorities.
- Collaborate directly with customer-facing teams to tailor reporting outputs for strategic accounts and enterprise customers.
- Support leadership reviews, operational business reviews (OBRs), and executive presentations with clear, data-driven insights and dashboards.
Governance & Continuous Improvement
- Establish operational reporting governance, including KPI definitions, ownership models, and reporting standards.
- Drive consistency and scalability across global reporting practices.
- Continuously refine dashboards, reporting frameworks, and operational analytics based on stakeholder feedback and evolving business needs.
- Champion a culture of data-driven operational excellence across the AI Infrastructure organization.
Impact & Value
The AI Operations Strategist will play a foundational role in building a centralized operations analytics capability within Nscale’s AI Infrastructure organization.
By transforming infrastructure telemetry, support operations, and customer service data into meaningful operational intelligence, this role will enable:
- Proactive capacity planning
- Faster detection of operational risks and inefficiencies
- Improved SLA performance and service reliability
- Data-driven prioritization and investment decisions
- Enhanced customer transparency and trust
- Scalable operational governance across a rapidly growing AI platform
This role is critical to ensuring that strategic and operational decisions are grounded in measurable outcomes, enabling Nscale to scale infrastructure operations with precision, consistency, and confidence.
Qualifications
Education & Experience
- Bachelor’s degree in Information Systems, Data Analytics, Computer Science, Engineering, or a related field.
- 3–7+ years of experience in operations analytics, reporting, business intelligence, or technical operations within a cloud, infrastructure, or enterprise technology environment.
- Experience supporting cloud infrastructure, AI platforms, technical support organizations, or IT operations teams.
- Strong hands-on experience with Jira reporting, dashboard development, and workflow analytics.
- Experience with BI and visualization platforms such as Power BI, Tableau, or Looker.
- Experience with SQL, Python, JQL, or other scripting/query languages preferred.
Knowledge & Skills
Technical & Analytical Expertise
- Strong expertise in operational analytics, reporting, and data visualization.
- Advanced knowledge of Jira, including JQL, dashboarding, and reporting customization.
- Experience working with large operational datasets and integrating multiple data sources.
- Familiarity with ITIL practices, service operations metrics, and infrastructure support models.
- Understanding of cloud infrastructure operations, observability, and reliability metrics preferred.
Problem Solving & Insight Generation
- Ability to transform complex operational data into clear, actionable insights.
- Strong analytical thinking with exceptional attention to detail.
- Proven ability to identify operational trends, inefficiencies, and optimization opportunities.
- Experience supporting data-driven operational decision-making in fast-paced environments.
Communication & Stakeholder Management
- Strong communication and presentation skills with the ability to engage both technical and executive audiences.
- Experience developing executive-level dashboards, reporting, and operational reviews.
- Ability to influence cross-functional stakeholders through data-driven storytelling and operational insights.
Automation & Operational Excellence
- Experience automating reporting workflows and improving analytics scalability.
- Continuous improvement mindset focused on operational efficiency and process optimization.
- Ability to build scalable reporting frameworks that evolve alongside organizational growth.
What We Can Offer You
At Nscale, you'll find a collaborative, supportive, and innovative environment where your contributions spark real impact. We're building something extraordinary, and we want you at the core.
- Highly competitive package (base + equity) with reviews every 12 months.
- Join the fastest-growing tech startup: this is your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting-edge AI.
- Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
- Human-First Flexibility: We treat you as humans first. Our flexible workplace trusts Nscalers to deliver, giving you the autonomy to shape your day around life's moments.
- Join our thriving remote-first team. Geography is no barrier to impact or connection. We build seamless virtual collaboration, empowering you, wherever you work.
Equal Opportunities Statement
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.
If there’s anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice.