Robots struggle with complex, multi-step tasks and typically need huge amounts of training data. Cosmos Policy helps them plan ahead and act more capably from far less.

Robots often struggle with complex tasks because their control systems are fragmented. Traditional approaches require separate perception, planning, and control modules, along with massive amounts of labeled data for each robot and environment. This makes building reliable multi-step behaviors slow, costly, and inflexible.
Cosmos Policy from NVIDIA addresses this by combining perception, planning, and control into a single system. Rather than designing a control model from scratch, it builds on Cosmos Predict, a video-based world model trained on large-scale video data. By post-training this model on robot demonstration data, Cosmos Policy learns both how the physical world evolves and how a robot's actions produce outcomes, allowing it to predict the next action and the resulting state within a single architecture.
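To make the unified design concrete, here is a minimal Python/PyTorch sketch of the general idea: one shared network with two output heads, one for the next action and one for the predicted resulting state. The class name, dimensions, and two-head layout are illustrative assumptions, not the actual Cosmos Policy architecture (which post-trains a video world model).

```python
import torch
import torch.nn as nn

class UnifiedPolicyWorldModel(nn.Module):
    """Toy two-head network: a shared backbone emits both the next action
    and a latent prediction of the state that action should produce.
    All names and sizes here are assumptions for illustration only."""

    def __init__(self, obs_dim: int = 256, action_dim: int = 7, hidden_dim: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.action_head = nn.Linear(hidden_dim, action_dim)  # "what to do next"
        self.state_head = nn.Linear(hidden_dim, obs_dim)      # "what the world will look like"

    def forward(self, obs_latent: torch.Tensor):
        h = self.backbone(obs_latent)
        # In a real system the state prediction would also condition on the
        # chosen action; that coupling is omitted here for brevity.
        return self.action_head(h), self.state_head(h)

# One forward pass on a random observation embedding.
model = UnifiedPolicyWorldModel()
obs_latent = torch.randn(1, 256)
action, predicted_state = model(obs_latent)
print(action.shape, predicted_state.shape)  # torch.Size([1, 7]) torch.Size([1, 256])
```

The point of the shared backbone is that the same learned representation serves both prediction problems, which is what lets world-model pretraining transfer to action selection.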
For robot developers and research labs, this reduces the need for task-specific engineering and extensive training datasets. It allows robots to learn faster from fewer demonstrations, saving time and resources, particularly in real-world scenarios where collecting data is expensive.
The framework also supports planning at inference time. Instead of only choosing the next immediate move, it can evaluate multiple action sequences and their likely outcomes, enabling robots to make longer-term, strategic decisions. Physical tests show that even complex multi-step tasks such as bimanual manipulation can be completed directly from visual input, demonstrating transfer from simulation to real-world environments.
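One common way to realize this kind of inference-time planning is random-shooting model-predictive control: sample several candidate action sequences, simulate each with the learned world model, and keep the best one. The sketch below assumes a generic learned `dynamics` function and a goal embedding; it illustrates the pattern, not Cosmos Policy's actual planning procedure.

```python
import torch

def rollout_score(dynamics, state, action_seq, goal):
    """Roll one candidate action sequence through the learned dynamics and
    score it by how close the predicted final state lands to the goal."""
    for action in action_seq:
        state = dynamics(state, action)
    return -torch.norm(state - goal)

def plan_first_action(dynamics, state, goal, horizon=5, num_candidates=64, action_dim=7):
    """Random-shooting MPC sketch: sample candidate action sequences,
    evaluate each predicted outcome, and execute only the first action
    of the best-scoring sequence."""
    candidates = torch.randn(num_candidates, horizon, action_dim)
    scores = torch.stack([rollout_score(dynamics, state, seq, goal) for seq in candidates])
    return candidates[scores.argmax(), 0]

# Toy stand-in dynamics: a fixed linear model mapping (state, action) -> next state.
state_dim, action_dim = 16, 7
A = 0.1 * torch.randn(state_dim, state_dim)
B = 0.1 * torch.randn(action_dim, state_dim)
dynamics = lambda s, a: s + s @ A + a @ B

first_action = plan_first_action(dynamics, torch.randn(state_dim), torch.randn(state_dim))
print(first_action.shape)  # torch.Size([7])
```

Executing only the first action and then replanning at the next step is what lets the robot stay reactive while still reasoning several moves ahead.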
By simplifying robot control and leveraging knowledge from large video models, Cosmos Policy benefits anyone building autonomous systems that need robust, flexible decision-making, including researchers, industrial robotics teams, and labs working on advanced manipulation.
Cosmos Policy is part of NVIDIA’s larger Cosmos ecosystem, which aims to create shared world models for robots and autonomous systems. The broader goal is to give machines a general understanding of the physical world so they can act intelligently without hand-engineered, task-specific rules.