This repository segments a VR headset in videos and images.
- Install prerequisites (see Installation below).
- Run one of the actions via `main.py` (examples below).
Example — run video processing (default action):

```shell
python main.py --cuda --action video --source_video ./video/test.mp4 --output_video ./video/out.mp4
```

Example — process a single image:

```shell
python main.py --no-cuda --action image --source_video ./images/sample.jpg
```

Example — extract frames from a video:

```shell
python main.py --action frames --source_video ./video/test.mp4
```

The commands use CUDA by default. Use `--no-cuda` if you want the model to run on the CPU.
Two common ways to prepare the environment are described below.
Option A — Use the provided Nix flake (recommended if you use Nix) 🐚
The repository contains `flake.nix`, which defines a development shell with Python and many dependencies. This is convenient on NixOS or on any system with Nix installed.
- Make sure Nix is installed and you have enabled flakes (see Nix docs).
- (Optional) If the flake references Cachix, follow its instructions to enable the caches mentioned in `flake.nix`.
- In the project root run:

```shell
nix develop
# or explicitly:
nix develop .#
```

This drops you into a dev shell with Python and many build inputs. The flake's shellHook creates and activates a `.venv` and installs ultralytics automatically on first use.
After this command finishes you are good to go.
Option B — Use pip and a virtual environment (works anywhere) 🐍
- Create and activate a virtual environment:

```shell
python -m venv .venv
# macOS / Linux
source .venv/bin/activate
# Windows (PowerShell)
.\.venv\Scripts\Activate.ps1
```

- Upgrade pip and install the Python requirements:

```shell
pip install --upgrade pip
pip install -r requirements.txt  # installs CPU-only PyTorch and torchvision
```

- Install the CUDA drivers if they are not already installed.
- Install PyTorch and torchvision with CUDA. The exact command depends on your GPU. Always prefer the official instructions from https://pytorch.org/get-started/locally/.
- Example for CUDA 11.8 (replace with the correct CUDA version for your GPU):

```shell
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```

You're now ready to run the same example commands from Quick start.
- Entry point: `main.py`
- Arguments:
  - `--cuda` / `--no-cuda`: enable/disable CUDA (boolean flag)
  - `--action`: one of `train`, `image`, `image_crop`, `video`, `frames` (default: `video`)
  - `--source_video`: path to the input video or image (default: `./video/test.mp4`)
  - `--output_video`: path for the output video or output directory (default: `./video/test.mp4`)
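A minimal sketch of how such a CLI can be declared with `argparse` (illustrative only — the exact parser in `main.py` may differ; `BooleanOptionalAction` requires Python 3.9+):

```python
import argparse

# Illustrative parser matching the arguments listed above.
parser = argparse.ArgumentParser(description="VR headset segmentation")
# BooleanOptionalAction automatically provides both --cuda and --no-cuda.
parser.add_argument("--cuda", action=argparse.BooleanOptionalAction, default=True,
                    help="enable/disable CUDA")
parser.add_argument("--action", default="video",
                    choices=["train", "image", "image_crop", "video", "frames"])
parser.add_argument("--source_video", default="./video/test.mp4")
parser.add_argument("--output_video", default="./video/test.mp4")

args = parser.parse_args(["--no-cuda", "--action", "image"])
print(args.cuda, args.action)  # False image
```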
- `train` 🚀 — prints instructions and points to `experiments/train_yolo.py`. Use that script directly to train/finetune.
- `image` 🖼️ — runs the image segmentation/test pipeline using `experiments/model_test.py`.
- `image_crop` ✂️ — runs a cropped-image pathway via `experiments/model_test.py`.
- `video` (default) 🎬 — runs video-based processing using `experiments/model_test_video.py`.
- `frames` 📸 — extracts frames from the input video and writes them to an output folder. The called script prints where frames are saved.
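The action-to-script mapping above can be sketched as a small dispatch table. This is illustrative, not the actual wiring in `main.py`; mapping `frames` to `experiments/model_test_video.py` is an assumption based on the project-files list below:

```python
# Illustrative mapping of each --action to the script that handles it.
ACTION_SCRIPTS = {
    "train": "experiments/train_yolo.py",
    "image": "experiments/model_test.py",
    "image_crop": "experiments/model_test.py",
    "video": "experiments/model_test_video.py",
    "frames": "experiments/model_test_video.py",  # assumption: frame extraction lives here
}

def script_for(action: str) -> str:
    # Fall back to the default action, video, for unknown names.
    return ACTION_SCRIPTS.get(action, ACTION_SCRIPTS["video"])

print(script_for("image_crop"))  # experiments/model_test.py
```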
- `experiments/train_yolo.py` — training / finetuning instructions
- `experiments/model_test.py` — image-based inference and segmentation helper
- `experiments/model_test_video.py` — video inference and frame extraction
- `experiments/detect_objects.py`, `experiments/detect_objects_seg.py` — detection
- If you see CUDA-related errors, try running without `--cuda`, or ensure your PyTorch build and GPU drivers match.
- If `image_crop` fails, inspect `main.py` for the noted typo and correct the module reference.
- MIT License
Thank you to all the people at HITLab NZ!