Export Meta AI's Segment Anything Model 3 (SAM3) to ONNX, then build a TensorRT engine for real-time segmentation. This repo includes a CUDA inference library and demo apps for semantic and instance segmentation.
- Python tooling to export SAM3 to a clean ONNX graph.
- TensorRT-ready workflows for building optimized engines.
- A C++/CUDA library for high-performance inference with demo apps.
- Support for Promptable Concept Segmentation (PCS), the latest feature in SAM3.
- Zero-copy support on unified-memory platforms (Jetson, DGX Spark). Great for robotics/real-time interaction.
- Everything runs inside a reproducible Docker environment (x86, Jetson, Spark).
- MIT license for the love of everything nice :)
Semantic segmentation produced by the C++ demo app (prompt='dog')
Instance segmentation results (prompt='box')
- `python/` – ONNX export and visualization scripts.
- `cpp/` – C++/CUDA library and apps (TensorRT inference).
- `docker/` – Container setup (`Dockerfile.x86` and `Dockerfile.aarch64`).
- `demo/` – Example outputs from the C++ demo app.
- Request access to the gated model
  - Visit https://huggingface.co/facebook/sam3 and request access.
  - Ensure your `HF_TOKEN` has permission.
  - Set `HF_TOKEN` as an environment variable on the host. Docker will pick it up from there.
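Before launching the container, you can optionally sanity-check the token. This is a minimal sketch, not a script in this repo; it only assumes the `huggingface_hub` package (installed alongside `transformers`) and the `HF_TOKEN` variable from the step above.

```python
# Optional: verify HF_TOKEN is set and accepted by the Hugging Face Hub.
import os
from huggingface_hub import whoami

info = whoami(token=os.environ["HF_TOKEN"])  # raises if the token is invalid
print("Authenticated as:", info["name"])
```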
- Build the Docker container for your platform (all commands below run inside it)

```bash
docker build -t sam3-trt -f docker/Dockerfile.x86 .
```

For aarch64 platforms with shared CPU/GPU memory, the C++ library in this repo supports zero-copy inference paths. Build and run the aarch64 container:

```bash
docker build -t sam3-trt-aarch64 -f docker/Dockerfile.aarch64 .
```

- Export `HF_TOKEN` and run the Docker container
```bash
export HF_TOKEN=<YOUR TOKEN>
docker run -it --rm \
    --network=host \
    --gpus all \
    --ipc=host \
    --ulimit memlock=-1 \
    --ulimit stack=67108864 \
    --runtime=nvidia \
    --env HF_TOKEN \
    -v "$PWD":/workspace \
    -w /workspace \
    sam3-trt bash
```

- Export to ONNX
```bash
python python/onnxexport.py
```

This produces `onnx_weights/sam3_static.onnx` plus external weight shards.
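If you want to confirm the export before moving on, a quick structural check with the `onnx` Python package works. This is an optional sketch, not part of the repo; it assumes the default output path from the export step.

```python
# Optional: validate the exported graph and list its I/O tensors.
import onnx

ONNX_PATH = "onnx_weights/sam3_static.onnx"

# Pass the path (not a loaded proto) so the checker can resolve the
# external weight shards stored next to the .onnx file.
onnx.checker.check_model(ONNX_PATH)

# Load only the graph, skipping the large external weights.
model = onnx.load(ONNX_PATH, load_external_data=False)
for tensor in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_param or d.dim_value for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)
```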
- Build a TensorRT engine

```bash
trtexec --onnx=onnx_weights/sam3_static.onnx --saveEngine=sam3_fp16.plan --fp16 --verbose
```
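To double-check that the plan deserializes outside of the C++ app, you can inspect it with TensorRT's Python bindings, if they are installed in your container. A minimal sketch, assuming TensorRT 8.5+ (named-tensor API) and the engine path used above:

```python
# Optional: deserialize the engine and print its I/O tensor names and shapes.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("sam3_fp16.plan", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())

for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(engine.get_tensor_mode(name),    # INPUT or OUTPUT
          name,
          engine.get_tensor_shape(name),
          engine.get_tensor_dtype(name))
```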
- Build the C++/CUDA library and sample app

```bash
mkdir cpp/build && cd cpp/build
cmake ..
make
```

- Run the demo app
```bash
./sam3_pcs_app <image_dir> <engine_path.engine>
```

Results are written to a `results/` folder.
This is a very raw project: it provides the crucial TensorRT/CUDA backend bits you need before building anything on top. From here, please feel free to fan out into any application you like. Pull requests are very welcome! Here are some ideas I can think of:
- ROS2 wrapper for real-time robotics pipelines.
- Interactive voice-based segmentation app. Have someone speak into a microphone, use a speech-to-text model to transcribe the prompt and feed it into the engine, which then produces the segmentation mask live. I don't have the time to build it but I hope you can.
- Live camera input and overlays. You will need a beefy GPU; SAM3 doesn't run in real time on a Jetson Nano.
- Access errors: Make sure your `HF_TOKEN` has access to `facebook/sam3`.
- ONNX export fails: Install `transformers` from source if SAM3 is missing.
- TensorRT parse errors: Ensure the full `onnx_weights/` directory is copied (external data is required).
- C++ build errors: Confirm CUDA, TensorRT, and OpenCV are installed and discoverable via `pkg-config`.
- The shared library target is `sam3_trt`.
- Demo app: `sam3_pcs_app` (semantic/instance visualization modes).
- Outputs include semantic segmentation and instance segmentation mask logits. If you choose `SAM3_VISUALIZATION::VIS_NONE` in your application, you need to apply sigmoid yourself (see the sketch after this list).
- The library does not support building engines. Use `trtexec` instead.
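Purely as an illustration of that last sigmoid point (the exact output layout of `sam3_trt` is not documented here, so the shape below is an assumption), converting raw mask logits to binary masks looks like this in numpy:

```python
# Illustrative sketch: apply sigmoid to raw mask logits and threshold them.
# Assumes the logits have been copied to host memory, e.g. with shape
# (num_instances, H, W); adapt to the actual output layout.
import numpy as np

def logits_to_masks(mask_logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    probs = 1.0 / (1.0 + np.exp(-mask_logits))  # sigmoid
    return probs > threshold                    # boolean masks
```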
- Default export runs on CPU for compatibility (switch `device` to `cuda` if desired).
- SAM3 is large and exports with external weight shards; keep the entire `onnx_weights/` directory together.
- Use `trtexec` for quick engine builds and benchmarking.
- FP16 is the usual starting point; INT8/FP8/INT4 require calibration or compatible tooling.
- MIT (see `LICENSE`).
If this saved you time, drop a ⭐ so others can find it and ship SAM3 faster.


