
Fundamental

Senior Applied Research Engineer

Research · Barcelona · on-site · full-time · posted 2 weeks ago

About Fundamental

Fundamental is an AI company pioneering the future of enterprise decision-making. Founded by DeepMind alumni, Fundamental has developed NEXUS – the world's most powerful Large Tabular Model (LTM) – purpose-built for the structured records that actually drive enterprise decisions. Backed by world-class investors and trusted by Fortune 100 companies, Fundamental unlocks trillions of dollars of value by giving businesses the Power to Predict.

At Fundamental, you'll work on unprecedented technical challenges in foundation model development and build technology that transforms how the world's largest companies make decisions. This is your opportunity to be part of a category-defining company from the ground up. Join the team defining the future of enterprise AI.

Key responsibilities

  • Profile end-to-end distributed training runs to identify bottlenecks across compute, GPU memory, and inter-GPU communication

  • Contribute to architectural decisions that improve the efficiency and reliability of large-scale training jobs, including developing Triton/CUDA kernels when needed

  • Design and implement model scaling, parallelization, and memory optimization techniques for training workloads with very large context sizes

  • Collaborate closely with ML Researchers to diagnose architectural inefficiencies, ensure new research ideas scale efficiently in practice, and spread internal knowledge about model efficiency and optimization

  • Drive the productionization and serving of our models from the research side, including improving inference efficiency through techniques such as quantization

Must have

  • Strong understanding of modern ML architectures and large-scale training pipelines

  • Experience running distributed training jobs on multi-GPU systems

  • Advanced profiling and debugging skills across CPU, GPU, memory usage, latency, and inter-GPU communication

  • Strong programming skills in Python

  • Experience with model scaling and parallelization strategies, including tensor and pipeline parallelism

Nice to have

  • Familiarity with NCCL, MPI, and distributed communication primitives

  • Knowledge of PyTorch and Triton internals

  • Programming experience with C++ and CUDA

Benefits

  • Competitive compensation with salary and equity

  • Comprehensive benefits, including medical, dental, and vision coverage, plus a 401(k)

  • Paid parental leave for all new parents, inclusive of adoptive and surrogate journeys

  • Relocation support for employees moving to join the team in one of our office locations

  • A mission-driven, low-ego culture that values diversity of thought, ownership, and bias toward action