Infrastructure & Platform Engineer
Engineers in this role architect and operate the systems that power AI research and product development at scale. They design distributed infrastructure for training, serving, and orchestrating AI workloads across GPU clusters, build internal platforms that accelerate developer velocity, and optimize the critical path from code to production. The role bridges deep systems engineering expertise (in areas like Kubernetes, build systems, data pipelines, and performance tuning) with the unique demands of AI workloads. It combines hands-on infrastructure work with close partnership with researchers and product teams to eliminate the bottlenecks that slow down innovation.
Skills
What companies are looking for in this role.
Designing and implementing highly scalable distributed systems and control planes
Implementing observability, monitoring, and telemetry systems at scale
Building and operating reliable, high-performance storage and compute infrastructure
Managing infrastructure automation and infrastructure-as-code practices
Designing cloud-native infrastructure and multi-tenant platform architectures
Troubleshooting and optimizing complex distributed systems in production
Leading technical architecture decisions and setting platform direction
Developing and optimizing real-time, low-latency serving systems and APIs
Designing network architecture, SDN solutions, and advanced packet processing
Building data pipelines and storage systems, and optimizing databases
Designing and implementing autoscaling systems for varying workload patterns
Implementing privacy-by-design and security automation practices
Managing data center expansion, retrofitting, and facility operations
Optimizing systems for high-density, power-constrained computing environments
Building AI-native development tools and CI/CD infrastructure
Building real-time voice and audio streaming infrastructure
Designing and implementing agentic diagnostic and automation systems
Collaborating cross-functionally with research, product, and operations teams
Driving incident response, post-incident learning, and operational excellence
Mentoring engineers and building high-performing technical teams
Open Jobs
493 open Infrastructure & Platform Engineer jobs across 63 companies.
Other Engineering roles
General-purpose software engineering roles focused on building and maintaining software systems. Covers generalist SWE positions that don't clearly fall into frontend, backend, fullstack, or other specialized tracks.
Engineers focused on server-side systems, APIs, services, and data processing pipelines. Includes roles explicitly labeled as backend or server-side development.
Engineers specializing in user-facing interfaces, web applications, and client-side development. Includes UI/UX engineering and web development roles.
Engineers working across the entire application stack, handling both frontend and backend responsibilities.
Engineers embedded with customers or deployed on-site to solve domain-specific technical problems. Combines engineering skills with direct client interaction.