RF-DETR 1.5.0: Custom Augmentations and Live Training Progress¶
RF-DETR is a real-time object detection model that combines the accuracy of transformer-based detectors with inference speeds suitable for production. Version 1.5.0 brings two headline additions:
- Custom training augmentations via Albumentations — a flexible system that lets you control exactly how images are transformed during training, with bounding boxes and segmentation masks kept in sync automatically. Four ready-made presets cover the most common scenarios out of the box.
- Live progress bars and structured epoch logs — per-epoch rich / tqdm progress so you can monitor batch-level metrics without parsing raw log output.
You will learn how to:
- Explore and compare the four built-in augmentation presets
- Visually inspect augmented samples before committing to a full training run
- Define a fully custom augmentation pipeline
- Train RF-DETR with your chosen augmentation config
- Run inference with the trained model
1. Install RF-DETR 1.5.0¶
rfdetr includes the augmentation system and progress-bar support introduced in
this release. supervision handles visualization.
!pip install -q rfdetr==1.5.0
2. Check GPU availability¶
RF-DETR trains on GPU when one is available and falls back to CPU otherwise. The cell below detects your device and prints VRAM size — a useful sanity check before choosing batch size.
import os
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
if device == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
# Scale data-loader workers with available CPUs so the GPU is kept fed.
num_workers = max(os.cpu_count() or 0, 2)
print(f"Data loader workers: {num_workers}")
3. Download COCO 2017¶
We use the official train/val splits — no manual splitting needed.
| Split | Images | Size |
|---|---|---|
| train2017 | ~118 000 | ~18 GB |
| val2017 | 5 000 | ~1 GB |
| annotations | — | ~241 MB |
!wget -q --show-progress http://images.cocodataset.org/zips/train2017.zip -O train2017.zip
!wget -q --show-progress http://images.cocodataset.org/zips/val2017.zip -O val2017.zip
!wget -q --show-progress http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O annotations.zip
!unzip -q train2017.zip
!unzip -q val2017.zip
!unzip -q annotations.zip
Set up the dataset directory structure¶
model.train() expects:
dataset/
  train/   _annotations.coco.json + images
  valid/   _annotations.coco.json + images
!mkdir -p coco_demo
!mv train2017 coco_demo/train
!mv val2017 coco_demo/valid
!cp annotations/instances_train2017.json coco_demo/train/_annotations.coco.json
!cp annotations/instances_val2017.json coco_demo/valid/_annotations.coco.json
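Before kicking off training, it can be worth confirming the layout programmatically. The sketch below is a minimal check against the structure shown above; `verify_coco_layout` is a hypothetical helper written for this notebook, not part of rfdetr.

```python
import json
from pathlib import Path

def verify_coco_layout(root: str) -> list[str]:
    """Return a list of problems with the dataset layout (empty list = OK)."""
    problems = []
    for split in ("train", "valid"):
        ann = Path(root) / split / "_annotations.coco.json"
        if not ann.is_file():
            problems.append(f"missing {ann}")
            continue
        coco = json.loads(ann.read_text())
        # A COCO detection file needs these three top-level keys.
        for key in ("images", "annotations", "categories"):
            if key not in coco:
                problems.append(f"{ann} lacks '{key}'")
    return problems

print(verify_coco_layout("coco_demo") or "layout looks good")
```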
4. Explore the built-in augmentation presets¶
RF-DETR ships four ready-made presets tuned for common use cases. Each preset
is a plain Python dict mapping Albumentations transform names to their constructor
kwargs (including p, the probability of applying the transform). You can
inspect, merge, or extend them like any other dict.
| Preset | When to use |
|---|---|
| AUG_CONSERVATIVE | Small datasets (< 500 images) — gentle transforms to avoid overfitting |
| AUG_AGGRESSIVE | Large datasets (2 000+ images) — stronger augmentations for better generalisation |
| AUG_AERIAL | Satellite and overhead imagery — rotation-invariant transforms |
| AUG_INDUSTRIAL | Manufacturing and inspection data — handles structured textures |
import rfdetr.datasets.aug_config as aug_config
from rfdetr.datasets.aug_config import AUG_AGGRESSIVE
for name in ("AUG_CONSERVATIVE", "AUG_AGGRESSIVE", "AUG_AERIAL", "AUG_INDUSTRIAL"):
    preset = getattr(aug_config, name)
    print(f"\n{name}:")
    for transform, params in preset.items():
        print(f"  {transform}: {params}")
5. Inspect augmented samples before training¶
Setting save_dataset_grids=True writes 3×3 JPEG grids of augmented training
and validation images to output_dir before any weight updates occur. Use
this to visually sanity-check your pipeline in seconds — catching problems like
flipped labels or extreme colour shifts before committing to a full training run.
Saved files:
output/
  train_batch0_grid.jpg   train_batch1_grid.jpg   train_batch2_grid.jpg
  val_batch0_grid.jpg     val_batch1_grid.jpg     val_batch2_grid.jpg
import os
from rfdetr import RFDETRNano # smallest & fastest model — ideal for demos
DATASET_DIR = "coco_demo"
OUTPUT_DIR = "output"
os.makedirs(OUTPUT_DIR, exist_ok=True)
model = RFDETRNano()
model.train(
    dataset_dir=DATASET_DIR,
    epochs=1,  # one epoch is enough to generate the grids
    batch_size=12,
    aug_config=AUG_AGGRESSIVE,
    save_dataset_grids=True,
    output_dir=OUTPUT_DIR,
    device=device,
    num_workers=num_workers,
    run_test=False,
)
Display the saved grids inline¶
from pathlib import Path
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
grids = sorted(Path(OUTPUT_DIR).glob("*_grid.jpg"))
if grids:
    fig, axes = plt.subplots(len(grids), 1, figsize=(6, 6 * len(grids)))
    if len(grids) == 1:
        axes = [axes]
    for ax, grid_path in zip(axes, grids):
        ax.imshow(mpimg.imread(grid_path))
        ax.set_title(grid_path.name)
        ax.axis("off")
    plt.tight_layout()
    plt.show()
6. Choose your augmentation config¶
Pick one of the four options below. Option A (a built-in preset) is the fastest
way to get started. Options B–D give increasing levels of control. The selected
AUG_CONFIG is passed straight to model.train() in the next section.
# --- Option A: built-in preset ---
AUG_CONFIG = AUG_AGGRESSIVE
# --- Option B: extend a preset ---
# AUG_CONFIG = {**AUG_AGGRESSIVE, "VerticalFlip": {"p": 0.3}}
# --- Option C: fully custom ---
# AUG_CONFIG = {
# "HorizontalFlip": {"p": 0.5},
# "Rotate": {"limit": 15, "p": 0.3},
# "RandomBrightnessContrast": {"brightness_limit": 0.2, "contrast_limit": 0.2, "p": 0.4},
# "GaussianBlur": {"blur_limit": 3, "p": 0.2},
# }
# --- Option D: no augmentations ---
# AUG_CONFIG = {}
print("Selected augmentation config:")
for transform, params in AUG_CONFIG.items():
    print(f"  {transform}: {params}")
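One subtlety when extending a preset as in Option B: a top-level `{**preset, ...}` merge replaces the entire per-transform kwargs dict, so tweaking a single parameter of an existing transform requires a nested merge. A minimal sketch with a stand-in dict (the real preset contents live in `rfdetr.datasets.aug_config`):

```python
# Stand-in preset dict; the real contents live in rfdetr.datasets.aug_config.
preset = {"HorizontalFlip": {"p": 0.5}, "Rotate": {"limit": 30, "p": 0.4}}

# Top-level merge: later keys win, and the WHOLE sub-dict is replaced.
coarse = {**preset, "Rotate": {"p": 0.2}}
print(coarse["Rotate"])  # {'p': 0.2} — the "limit" kwarg is gone

# Nested merge: keep the existing kwargs, override just one of them.
fine = {**preset, "Rotate": {**preset["Rotate"], "p": 0.2}}
print(fine["Rotate"])    # {'limit': 30, 'p': 0.2}
```

Neither merge mutates the original preset, so it stays intact for later cells.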
7. Train with augmentations¶
Run a full training pass with the augmentation config chosen above. The
progress_bar="rich" setting activates per-epoch live progress bars — a batch
counter, loss, and learning rate are updated in real time. Structured metric
tables are printed at the end of each epoch.
aug_config accepts any dict of Albumentations transforms, including the
AUG_CONFIG defined above. Passing {} disables all augmentations.
model = RFDETRNano()
model.train(
    dataset_dir=DATASET_DIR,
    epochs=2,
    batch_size=24,
    aug_config=AUG_CONFIG,
    output_dir=OUTPUT_DIR,
    device=device,
    num_workers=num_workers,
    progress_bar="rich",
    run_test=False,
)
8. Run inference¶
Load a sample image and run model.predict(). The method returns a
supervision.Detections object, which makes it straightforward to draw
bounding boxes and labels with the supervision annotators.
import requests
import supervision as sv
from PIL import Image
from rfdetr.util.coco_classes import COCO_CLASSES
image_url = "https://media.roboflow.com/dog.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)
detections = model.predict(image, threshold=0.5)
labels = [COCO_CLASSES[class_id] for class_id in detections.class_id]
annotated = sv.BoxAnnotator().annotate(image.copy(), detections)
annotated = sv.LabelAnnotator().annotate(annotated, detections, labels)
sv.plot_image(annotated)
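To make the `threshold=0.5` argument concrete: conceptually it keeps only detections whose confidence meets the cutoff. A minimal sketch with NumPy arrays standing in for the `confidence` and `class_id` fields that a `supervision.Detections` object exposes (the values below are made up for illustration):

```python
import numpy as np

# Stand-in arrays for four candidate detections.
confidence = np.array([0.92, 0.31, 0.74, 0.48])
class_id = np.array([16, 0, 2, 16])

# threshold=0.5 keeps only detections at or above the confidence cutoff.
keep = confidence >= 0.5
print(class_id[keep])    # the two confident detections remain
print(confidence[keep])
```

The same boolean-mask idea lets you filter a real `Detections` object further after prediction, e.g. by class id.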
9. Next steps¶
You have seen the complete 1.5.0 augmentation workflow — from exploring presets through live inspection to a full training run. From here:
- Augmentation docs — full transform reference and custom pipeline guide
- Advanced training options — EMA, gradient accumulation, learning rate schedules
- Logger integrations (ClearML, MLflow, W&B) — experiment tracking
- Export your model — ONNX, TensorRT, CoreML
- RF-DETR 1.6.0 notebook — PyTorch Lightning building blocks