We build world models that simulate manipulation scenes faithfully enough to validate policies, and one day train them, without touching a robot. You'll develop the generative models that make this possible, with the controllability and physical fidelity to match real-robot behavior.
What you'll do:
Train video and dynamics models: Develop action-conditioned world models for manipulation policies.
Push long-horizon coherence: Develop architectures and training methods that keep rollouts coherent over longer horizons on hard physical tasks.
Own training infrastructure: Run multi-GPU clusters, write custom CUDA kernels, and debug at scale.
Build the world-model data engine: Design, implement, and improve a data engine that lets the world model compound learning across customers and manipulation tasks.
Requirements:
Coding: Very strong Python and PyTorch (or a similar framework).
Video generation: Deep experience training image or video generation models end-to-end.
Large-scale training: A track record of operating training runs at cluster scale.
3D vision: Working knowledge of multi-view geometry, scene reconstruction, and physical priors.