openpi

openpi holds open-source models and packages for robotics, published by the Physical Intelligence team.

Currently, this repo contains three types of models:

  • the π₀ model, a flow-based vision-language-action model (VLA).
  • the π₀-FAST model, an autoregressive VLA based on the FAST action tokenizer.
  • the π₀.₅ model, an upgraded version of π₀ with better open-world generalization, trained with knowledge insulation. Note that, in this repository, we currently only support the flow matching head for both $\pi_{0.5}$ training and inference.

For all models, we provide base model checkpoints, pre-trained on 10k+ hours of robot data, and examples for using them out of the box or fine-tuning them to your own datasets.

This is an experiment: $\pi_0$ was developed for our own robots, which differ from widely used platforms such as ALOHA and DROID, and though we are optimistic that researchers and practitioners will be able to run creative new experiments adapting $\pi_0$ to their own platforms, we do not expect every such attempt to be successful. All this is to say: $\pi_0$ may or may not work for you, but you are welcome to try it and see!

Updates

  • [Sept 2025]: We released PyTorch support in openpi.
  • [Sept 2025]: We released pi05, an upgraded version of pi0 with better open-world generalization.
  • [Sept 2025]: We added an improved idle filter for DROID training.
  • [Jun 2025]: We added instructions for using openpi to train VLAs on the full DROID dataset. This is an approximate open-source implementation of the training pipeline used to train pi0-FAST-DROID.

Requirements

To run the models in this repository, you will need an NVIDIA GPU with at least the following specifications. These estimates assume a single GPU, but you can also use multiple GPUs with model parallelism to reduce per-GPU memory requirements by configuring fsdp_devices in the training config (see the sketch below the table). Please also note that the current training script does not yet support multi-node training.

Mode | Memory Required | Example GPU
Inference | > 8 GB | RTX 4090
Fine-Tuning (LoRA) | > 22.5 GB | RTX 4090
Fine-Tuning (Full) | > 70 GB | A100 (80GB) / H100
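
As a sketch of the model-parallelism option mentioned above, you can override fsdp_devices on an existing training config (this assumes the config object is a dataclass; check the training config module for the authoritative definition):

import dataclasses

from openpi.training import config as _config

# Shard the model across 2 GPUs to reduce per-GPU memory
# (assumes TrainConfig is a dataclass with the fsdp_devices field named above).
cfg = _config.get_config("pi05_libero")
cfg = dataclasses.replace(cfg, fsdp_devices=2)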

The repo has been tested with Ubuntu 22.04; we do not currently support other operating systems.

Installation

When cloning this repo, make sure to update submodules:

git clone --recurse-submodules [email protected]:Physical-Intelligence/openpi.git

# Or if you already cloned the repo:
git submodule update --init --recursive

We use uv to manage Python dependencies. See the uv installation instructions to set it up. Once uv is installed, run the following to set up the environment:

GIT_LFS_SKIP_SMUDGE=1 uv sync
GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .

NOTE: GIT_LFS_SKIP_SMUDGE=1 is needed to pull LeRobot as a dependency.

Docker: As an alternative to uv installation, we provide instructions for installing openpi using Docker. If you encounter issues with your system setup, consider using Docker to simplify installation. See Docker Setup for more details.

Model Checkpoints

Base Models

We provide multiple base VLA model checkpoints. These checkpoints have been pre-trained on 10k+ hours of robot data, and can be used for fine-tuning.

Model | Use Case | Description | Checkpoint Path
$\pi_0$ | Fine-Tuning | Base π₀ model for fine-tuning | gs://openpi-assets/checkpoints/pi0_base
$\pi_0$-FAST | Fine-Tuning | Base autoregressive π₀-FAST model for fine-tuning | gs://openpi-assets/checkpoints/pi0_fast_base
$\pi_{0.5}$ | Fine-Tuning | Base π₀.₅ model for fine-tuning | gs://openpi-assets/checkpoints/pi05_base

Fine-Tuned Models

We also provide "expert" checkpoints for various robot platforms and tasks. These models are fine-tuned from the base models above and are intended to run directly on the target robot. They may or may not work on your particular robot. Since these checkpoints were fine-tuned on relatively small datasets collected with more widely available robots, such as ALOHA and the DROID Franka setup, they might not generalize to your particular setup, though we found some of these, especially the DROID checkpoint, to generalize quite broadly in practice.

Model | Use Case | Description | Checkpoint Path
$\pi_0$-FAST-DROID | Inference | $\pi_0$-FAST model fine-tuned on the DROID dataset: can perform a wide range of simple table-top manipulation tasks 0-shot in new scenes on the DROID robot platform | gs://openpi-assets/checkpoints/pi0_fast_droid
$\pi_0$-DROID | Fine-Tuning | $\pi_0$ model fine-tuned on the DROID dataset: faster inference than $\pi_0$-FAST-DROID, but may not follow language commands as well | gs://openpi-assets/checkpoints/pi0_droid
$\pi_0$-ALOHA-towel | Inference | $\pi_0$ model fine-tuned on internal ALOHA data: can fold diverse towels 0-shot on ALOHA robot platforms | gs://openpi-assets/checkpoints/pi0_aloha_towel
$\pi_0$-ALOHA-tupperware | Inference | $\pi_0$ model fine-tuned on internal ALOHA data: can unpack food from a tupperware container | gs://openpi-assets/checkpoints/pi0_aloha_tupperware
$\pi_0$-ALOHA-pen-uncap | Inference | $\pi_0$ model fine-tuned on public ALOHA data: can uncap a pen | gs://openpi-assets/checkpoints/pi0_aloha_pen_uncap
$\pi_{0.5}$-LIBERO | Inference | $\pi_{0.5}$ model fine-tuned for the LIBERO benchmark: achieves state-of-the-art performance (see the LIBERO README) | gs://openpi-assets/checkpoints/pi05_libero
$\pi_{0.5}$-DROID | Inference / Fine-Tuning | $\pi_{0.5}$ model fine-tuned on the DROID dataset with knowledge insulation: fast inference and good language following | gs://openpi-assets/checkpoints/pi05_droid

By default, checkpoints are automatically downloaded from gs://openpi-assets and cached in ~/.cache/openpi when needed. You can override the download path by setting the OPENPI_DATA_HOME environment variable.
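
For example, to cache checkpoints on a larger disk, set the variable before triggering a download (the cache path below is hypothetical):

import os

# Redirect the checkpoint cache to a hypothetical larger disk before downloading.
os.environ["OPENPI_DATA_HOME"] = "/mnt/data/openpi_cache"

from openpi.shared import download

checkpoint_dir = download.maybe_download("gs://openpi-assets/checkpoints/pi05_base")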

Running Inference for a Pre-Trained Model

Our pre-trained model checkpoints can be run with a few lines of code (here, our $\pi_{0.5}$-DROID model):

from openpi.training import config as _config
from openpi.policies import policy_config
from openpi.shared import download

config = _config.get_config("pi05_droid")
checkpoint_dir = download.maybe_download("gs://openpi-assets/checkpoints/pi05_droid")

# Create a trained policy.
policy = policy_config.create_trained_policy(config, checkpoint_dir)

# Run inference on a dummy example.
example = {
    "observation/exterior_image_1_left": ...,
    "observation/wrist_image_left": ...,
    ...
    "prompt": "pick up the fork"
}
action_chunk = policy.infer(example)["actions"]
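
As a rough sketch, a dummy observation for the snippet above could be built from zeroed arrays (the image shapes and dtypes here are assumptions, and the policy's input transforms may require additional observation keys beyond the ones shown):

import numpy as np

# Dummy DROID-style observation; shapes and dtypes are assumptions, and keys
# elided with "..." above may also be required by the input transforms.
example = {
    "observation/exterior_image_1_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/wrist_image_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "prompt": "pick up the fork",
}
action_chunk = policy.infer(example)["actions"]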

You can also test this out in the example notebook.

We provide detailed step-by-step examples for running inference with our pre-trained checkpoints on DROID and ALOHA robots.

Remote Inference: We provide examples and code for running inference with our models remotely: the model can run on a different server and stream actions to the robot via a websocket connection. This makes it easy to use more powerful GPUs off-robot and keep robot and policy environments separate.
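
A minimal client-side sketch, assuming the websocket client exposed by the openpi client package as described in the remote inference docs (the module, class, and argument names below are assumptions; check those docs for the exact API):

# Query a remote policy server over a websocket (names are assumptions).
from openpi_client import websocket_client_policy

client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)
action_chunk = client.infer(example)["actions"]  # `example`: observation dict as above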

Test inference without a robot: We provide a script for testing inference without a robot. This script will generate a random observation and run inference with the model. See here for more details.

Fine-Tuning Base Models on Your Own Data

We will fine-tune the $\pi_{0.5}$ model on the LIBERO dataset as a running example for how to fine-tune a base model on your own data. We will walk through three steps:

  1. Converting your data to a LeRobot dataset (which we use for training)
  2. Defining training configs and running training
  3. Spinning up a policy server and running inference

1. Converting your data to a LeRobot dataset

We provide a minimal example script for converting LIBERO data to a LeRobot dataset in examples/libero/convert_libero_data_to_lerobot.py. You can easily modify it to convert your own data! You can download the raw LIBERO dataset from here, and run the script with:

uv run examples/libero/convert_libero_data_to_lerobot.py --data_dir /path/to/your/libero/data

Note: If you just want to fine-tune on LIBERO, you can skip this step, because our LIBERO fine-tuning configs point to a pre-converted LIBERO dataset. This step is merely an example that you can adapt to your own data.

2. Defining training configs and running training

To fine-tune a base model on your own data, you need to define configs for data processing and training. We provide example configs with detailed comments for LIBERO below, which you can modify for your own dataset:

  • LiberoInputs and LiberoOutputs: Define the data mapping from the LIBERO environment to the model and vice versa. Used for both training and inference.
  • LeRobotLiberoDataConfig: Defines how to process raw LIBERO data from the LeRobot dataset for training.
  • TrainConfig: Defines fine-tuning hyperparameters, data config, and weight loader.

We provide example fine-tuning configs for π₀, π₀-FAST, and π₀.₅ on LIBERO data.
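
To see how these pieces fit together, you can load and inspect one of the provided configs (the attribute names below are assumptions based on the descriptions above; the training config module is the ground truth):

from openpi.training import config as _config

# Load the LIBERO fine-tuning config and inspect its parts
# (attribute names are assumptions; see the training config module).
cfg = _config.get_config("pi05_libero")
print(cfg.name)  # the config name used on the command line
print(cfg.data)  # the data config, e.g. a LeRobotLiberoDataConfig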

Before we can run training, we need to compute the normalization statistics for the training data. Run the script below with the name of your training config:

uv run scripts/compute_norm_stats.py --config-name pi05_libero

Now we can kick off training with the following command (the --overwrite flag is used to overwrite existing checkpoints if you rerun fine-tuning with the same config):

XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi05_libero --exp-name=my_experiment --overwrite

The command will log training progress to the console and save checkpoints to the checkpoints directory. You can also monitor training progress on the Weights & Biases dashboard. To make maximal use of GPU memory, set XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 before running training -- this allows JAX to use up to 90% of GPU memory (vs. the default of 75%).
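
If you launch training from Python instead of the shell, the same setting can be applied programmatically; it must happen before JAX initializes its GPU allocator:

import os

# Equivalent to prefixing the training command with the environment variable;
# set this before importing/initializing JAX so the allocator picks it up.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "0.9"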

Note: We provide functionality for reloading normalization statistics for state / action normalization from pre-training. This can be beneficial if you are fine-tuning to a new task on a robot that was part of our pre-training mixture. For more details on how to reload normalization statistics, see the norm_stats.md file.

3. Spinning up a policy server and running inference

Once training is complete, we can run inference by spinning up a policy server and then querying it from a LIBERO evaluation script. Launching a model server is easy (we use the checkpoint for iteration 20,000 for this example; modify as needed):

uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi05_libero --policy.dir=checkpoints/pi05_libero/my_experiment/20000

This will spin up a server that listens on port 8000 and waits for observations to be sent to it. We can then run an evaluation script (or robot runtime) that queries the server.

For running the LIBERO eval in particular, we provide (and recommend using) a Dockerized workflow that handles both the policy server and the evaluation script together. See the LIBERO README for more details.

If you want to embed a policy server call in your own robot runtime, we have a minimal example of how to do so in the remote inference docs.

More Examples

We provide more examples of how to fine-tune and run inference with our models on the ALOHA platform in the corresponding READMEs.

PyTorch Support

openpi now provides PyTorch implementations of the π₀ and π₀.₅ models alongside the original JAX versions! The PyTorch implementation has been validated on the LIBERO benchmark (both inference and fine-tuning). A few features are currently not supported (this may change in the future):

  • The π₀-FAST model
  • Mixed precision training
  • FSDP (fully-sharded data parallelism) training
  • LoRA (low-rank adaptation) training
  • EMA (exponential moving average) weights during training

Setup

  1. Make sure that you have the latest version of all dependencies installed: uv sync

  2. Double-check that you have transformers 4.53.2 installed: uv pip show transformers

  3. Apply the transformers library patches:

    cp -r ./src/openpi/models_pytorch/transformers_replace/* .venv/lib/python3.11/site-packages/transformers/

This overwrites several files in the transformers library with necessary model changes: 1) supporting AdaRMS, 2) correctly controlling the precision of activations, and 3) allowing the KV cache to be used without being updated.

WARNING: With the default uv link mode (hardlink), this will permanently affect the transformers library in your uv cache, meaning the changes will survive reinstallations of transformers and could even propagate to other projects that use transformers. To fully undo this operation, you must run uv cache clean transformers.

Converting JAX Models to PyTorch

To convert a JAX model checkpoint to PyTorch format:

uv run examples/convert_jax_model_to_pytorch.py \
    --checkpoint_dir /path/to/jax/checkpoint \
    --config_name <config name> \
    --output_path /path/to/converted/pytorch/checkpoint

Running Inference with PyTorch

The PyTorch implementation uses the same API as the JAX version -- you only need to change the checkpoint path to point to the converted PyTorch model:

from openpi.training import config as _config
from openpi.policies import policy_config
from openpi.shared import download

config = _config.get_config("pi05_droid")
checkpoint_dir = "/path/to/converted/pytorch/checkpoint"

# Create a trained policy (automatically detects PyTorch format)
policy = policy_config.create_trained_policy(config, checkpoint_dir)

# Run inference (same API as JAX; construct `example` as in the JAX snippet above)
action_chunk = policy.infer(example)["actions"]

Policy Server with PyTorch

The policy server works identically with PyTorch models -- just point to the converted checkpoint directory:

uv run scripts/serve_policy.py policy:checkpoint \
    --policy.config=pi05_droid \
    --policy.dir=/path/to/converted/pytorch/checkpoint

Fine-Tuning with PyTorch

To fine-tune a model in PyTorch:

  1. Convert the JAX base model to PyTorch format:

    uv run examples/convert_jax_model_to_pytorch.py \
        --config_name <config name> \
        --checkpoint_dir /path/to/jax/base/model \
        --output_path /path/to/pytorch/base/model
  2. Specify the converted PyTorch model path in your config using pytorch_weight_path (see the sketch after the training commands below)

  3. Launch training using one of these modes:

# Single GPU training:
uv run scripts/train_pytorch.py <config_name> --exp_name <run_name> --save_interval <interval>

# Example:
uv run scripts/train_pytorch.py debug --exp_name pytorch_test
uv run scripts/train_pytorch.py debug --exp_name pytorch_test --resume  # Resume from latest checkpoint

# Multi-GPU training (single node):
uv run torchrun --standalone --nnodes=1 --nproc_per_node=<num_gpus> scripts/train_pytorch.py <config_name> --exp_name <run_name>

# Example:
uv run torchrun --standalone --nnodes=1 --nproc_per_node=2 scripts/train_pytorch.py pi0_aloha_sim --exp_name pytorch_ddp_test
uv run torchrun --standalone --nnodes=1 --nproc_per_node=2 scripts/train_pytorch.py pi0_aloha_sim --exp_name pytorch_ddp_test --resume

# Multi-Node Training:
uv run torchrun \
    --nnodes=<num_nodes> \
    --nproc_per_node=<gpus_per_node> \
    --node_rank=<rank_of_node> \
    --master_addr=<master_ip> \
    --master_port=<port> \
    scripts/train_pytorch.py <config_name> --exp_name=<run_name> --save_interval <interval>
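
For step 2 above, a hedged sketch of pointing a config at the converted weights (this assumes the config object is a dataclass; the pytorch_weight_path field name comes from step 2):

import dataclasses

from openpi.training import config as _config

# Point an existing config at the converted PyTorch base model
# (assumes TrainConfig is a dataclass; field name from step 2 above).
cfg = _config.get_config("pi0_aloha_sim")
cfg = dataclasses.replace(cfg, pytorch_weight_path="/path/to/pytorch/base/model")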

Precision Settings

JAX and PyTorch implementations handle precision as follows:

JAX:

  1. Inference: most weights and computations are in bfloat16, with a few computations in float32 for stability.
  2. Training: defaults to mixed precision: weights and gradients in float32, (most) activations and computations in bfloat16. You can switch to full float32 training by setting dtype to float32 in the config.

PyTorch:

  1. Inference: matches JAX -- most weights and computations in bfloat16, with a few weights converted to float32 for stability.
  2. Training: supports either full bfloat16 (the default) or full float32. You can change this by setting pytorch_training_precision in the config. bfloat16 uses less memory but exhibits higher losses than float32. Mixed precision is not yet supported.

With torch.compile, inference speed is comparable between JAX and PyTorch.
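
For example, switching PyTorch training to full float32 could look like this (a sketch; it assumes the config object is a dataclass with the pytorch_training_precision field named above):

import dataclasses

from openpi.training import config as _config

# Trade extra memory for lower training loss by using full float32
# (field name from the paragraph above; treat this as an assumption).
cfg = _config.get_config("pi0_aloha_sim")
cfg = dataclasses.replace(cfg, pytorch_training_precision="float32")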

Troubleshooting

We will collect common issues and their solutions here. If you encounter an issue, please check here first. If you can't find a solution, please file an issue on the repo (see here for guidelines).

Issue | Resolution
uv sync fails with dependency conflicts | Try removing the virtual environment directory (rm -rf .venv) and running uv sync again. If issues persist, check that you have the latest version of uv installed (uv self update).
Training runs out of GPU memory | Make sure you set XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 (or higher) before running training to allow JAX to use more GPU memory. You can also use --fsdp-devices <n>, where <n> is your number of GPUs, to enable fully-sharded data parallelism, which reduces memory usage in exchange for slower training (the amount of slowdown depends on your particular setup). If you are still running out of memory, you may want to consider disabling EMA.
Policy server connection errors | Check that the server is running and listening on the expected port. Verify network connectivity and firewall settings between client and server.
Missing norm stats error when training | Run scripts/compute_norm_stats.py with your config name before starting training.
Dataset download fails | Check your internet connection. For HuggingFace datasets, ensure you're logged in (huggingface-cli login).
CUDA/GPU errors | Verify NVIDIA drivers are installed correctly. For Docker, ensure nvidia-container-toolkit is installed. Check GPU compatibility. You do NOT need CUDA libraries installed at a system level --- they will be installed via uv. You may even want to try uninstalling system CUDA libraries if you run into CUDA issues, since system libraries can sometimes cause conflicts.
Import errors when running examples | Make sure you've installed all dependencies with uv sync. Some examples may have additional requirements listed in their READMEs.
Action dimension mismatch | Verify your data processing transforms match the expected input/output dimensions of your robot. Check the action space definitions in your policy classes.
Diverging training loss | Check the q01, q99, and std values in norm_stats.json for your dataset. Certain dimensions that are rarely used can end up with very small q01, q99, or std values, leading to huge states and actions after normalization. You can manually adjust the norm stats as a workaround (see the sketch below).
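
For the diverging-loss row above, a hypothetical way to eyeball the computed statistics (the file path and JSON layout are assumptions; consult norm_stats.md and scripts/compute_norm_stats.py for the real format):

import json

# Hypothetical inspection of computed norm stats; the path and layout are
# assumptions -- see norm_stats.md for the actual format.
with open("assets/pi05_libero/norm_stats.json") as f:
    stats = json.load(f)

for name, entry in stats.items():
    print(name, entry)  # look for suspiciously small q01/q99/std values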
