RF-DETR 1.5.0: Custom Augmentations and Live Training Progress¶
RF-DETR is a real-time object detection model that combines the accuracy of transformer-based detectors with inference speeds suitable for production. Version 1.5.0 brings two headline additions:
- Custom training augmentations via Albumentations — a flexible system that lets you control exactly how images are transformed during training, with bounding boxes and segmentation masks kept in sync automatically. Four ready-made presets cover the most common scenarios out of the box.
- Live progress bars and structured epoch logs — per-epoch rich / tqdm progress so you can monitor batch-level metrics without parsing raw log output.
You will learn how to:
- Explore and compare the four built-in augmentation presets
- Visually inspect augmented samples before committing to a full training run
- Define a fully custom augmentation pipeline
- Train RF-DETR with your chosen augmentation config
- Run inference with the trained model
1. Install RF-DETR 1.5.0¶
rfdetr includes the augmentation system and progress-bar support introduced in
this release. supervision handles visualization.
!pip install -q rfdetr==1.5.0
2. Check GPU availability¶
RF-DETR trains on GPU when one is available and falls back to CPU otherwise. The cell below detects your device and prints VRAM size — a useful sanity check before choosing batch size.
import os
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
if device == "cuda":
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
# Scale data-loader workers with available CPUs so the GPU is kept fed.
num_workers = max(os.cpu_count() or 0, 2)
print(f"Data loader workers: {num_workers}")
3. Download COCO 2017¶
We use the official train/val splits — no manual splitting needed.
| Split | Images | Size |
|---|---|---|
| train2017 | ~118 000 | ~18 GB |
| val2017 | 5 000 | ~1 GB |
| annotations | — | ~241 MB |
!wget -q --show-progress http://images.cocodataset.org/zips/train2017.zip -O train2017.zip
!wget -q --show-progress http://images.cocodataset.org/zips/val2017.zip -O val2017.zip
!wget -q --show-progress http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O annotations.zip
!unzip -q train2017.zip
!unzip -q val2017.zip
!unzip -q annotations.zip
Set up the dataset directory structure¶
model.train() expects:
dataset/
  train/   _annotations.coco.json + images
  valid/   _annotations.coco.json + images
!mkdir -p coco_demo
!mv train2017 coco_demo/train
!mv val2017 coco_demo/valid
!cp annotations/instances_train2017.json coco_demo/train/_annotations.coco.json
!cp annotations/instances_val2017.json coco_demo/valid/_annotations.coco.json
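Before kicking off training, it can be worth confirming the layout programmatically. The sketch below is a minimal check against the structure shown above; `verify_coco_layout` is a hypothetical helper written for this notebook, not part of rfdetr.

```python
import json
from pathlib import Path

def verify_coco_layout(root: str) -> list[str]:
    """Return a list of problems with the dataset layout (empty list = OK)."""
    problems = []
    for split in ("train", "valid"):
        ann = Path(root) / split / "_annotations.coco.json"
        if not ann.is_file():
            problems.append(f"missing {ann}")
            continue
        coco = json.loads(ann.read_text())
        # A COCO detection file needs these three top-level keys.
        for key in ("images", "annotations", "categories"):
            if key not in coco:
                problems.append(f"{ann} lacks '{key}'")
    return problems

print(verify_coco_layout("coco_demo") or "layout looks good")
```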
4. Explore the built-in augmentation presets¶
RF-DETR ships four ready-made presets tuned for common use cases. Each preset
is a plain Python dict mapping Albumentations transform names to their constructor
kwargs (including p, the probability of applying the transform). You can
inspect, merge, or extend them like any other dict.
| Preset | When to use |
|---|---|
| AUG_CONSERVATIVE | Small datasets (< 500 images) — gentle transforms to avoid overfitting |
| AUG_AGGRESSIVE | Large datasets (2 000+ images) — stronger augmentations for better generalisation |
| AUG_AERIAL | Satellite and overhead imagery — rotation-invariant transforms |
| AUG_INDUSTRIAL | Manufacturing and inspection data — handles structured textures |
import rfdetr.datasets.aug_config as aug_config
from rfdetr.datasets.aug_config import AUG_AGGRESSIVE
for name in ("AUG_CONSERVATIVE", "AUG_AGGRESSIVE", "AUG_AERIAL", "AUG_INDUSTRIAL"):
    preset = getattr(aug_config, name)
    print(f"\n{name}:")
    for transform, params in preset.items():
        print(f"  {transform}: {params}")
5. Inspect augmented samples before training¶
Setting save_dataset_grids=True writes 3×3 JPEG grids of augmented training
and validation images to output_dir before any weight updates occur. Use
this to visually sanity-check your pipeline in seconds — catching problems like
flipped labels or extreme colour shifts before committing to a full training run.
Saved files:
output/
  train_batch0_grid.jpg   train_batch1_grid.jpg   train_batch2_grid.jpg
  val_batch0_grid.jpg     val_batch1_grid.jpg     val_batch2_grid.jpg
import os
from rfdetr import RFDETRNano # smallest & fastest model — ideal for demos
DATASET_DIR = "coco_demo"
OUTPUT_DIR = "output"
os.makedirs(OUTPUT_DIR, exist_ok=True)
model = RFDETRNano()
model.train(
    dataset_dir=DATASET_DIR,
    epochs=1,  # one epoch is enough to generate the grids
    batch_size=12,
    aug_config=AUG_AGGRESSIVE,
    save_dataset_grids=True,
    output_dir=OUTPUT_DIR,
    device=device,
    num_workers=num_workers,
    run_test=False,
)
Display the saved grids inline¶
from pathlib import Path
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
grids = sorted(Path(OUTPUT_DIR).glob("*_grid.jpg"))
if grids:
    fig, axes = plt.subplots(len(grids), 1, figsize=(6, 6 * len(grids)))
    if len(grids) == 1:
        axes = [axes]
    for ax, grid_path in zip(axes, grids):
        ax.imshow(mpimg.imread(grid_path))
        ax.set_title(grid_path.name)
        ax.axis("off")
    plt.tight_layout()
    plt.show()
6. Choose your augmentation config¶
Pick one of the four options below. Option A (a built-in preset) is the fastest
way to get started. Options B–D give increasing levels of control. The selected
AUG_CONFIG is passed straight to model.train() in the next section.
# --- Option A: built-in preset ---
AUG_CONFIG = AUG_AGGRESSIVE
# --- Option B: extend a preset ---
# AUG_CONFIG = {**AUG_AGGRESSIVE, "VerticalFlip": {"p": 0.3}}
# --- Option C: fully custom ---
# AUG_CONFIG = {
# "HorizontalFlip": {"p": 0.5},
# "Rotate": {"limit": 15, "p": 0.3},
# "RandomBrightnessContrast": {"brightness_limit": 0.2, "contrast_limit": 0.2, "p": 0.4},
# "GaussianBlur": {"blur_limit": 3, "p": 0.2},
# }
# --- Option D: no augmentations ---
# AUG_CONFIG = {}
print("Selected augmentation config:")
for transform, params in AUG_CONFIG.items():
    print(f"  {transform}: {params}")
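One subtlety when extending a preset as in Option B: a top-level `{**preset, ...}` merge replaces the entire per-transform kwargs dict, so tweaking a single parameter of an existing transform requires a nested merge. A minimal sketch with a stand-in dict (the real preset contents live in `rfdetr.datasets.aug_config`):

```python
# Stand-in preset dict; the real contents live in rfdetr.datasets.aug_config.
preset = {"HorizontalFlip": {"p": 0.5}, "Rotate": {"limit": 30, "p": 0.4}}

# Top-level merge: later keys win, and the WHOLE sub-dict is replaced.
coarse = {**preset, "Rotate": {"p": 0.2}}
print(coarse["Rotate"])  # {'p': 0.2} — the "limit" kwarg is gone

# Nested merge: keep the existing kwargs, override just one of them.
fine = {**preset, "Rotate": {**preset["Rotate"], "p": 0.2}}
print(fine["Rotate"])    # {'limit': 30, 'p': 0.2}
```

Neither merge mutates the original preset, so it stays intact for later cells.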
7. Train with augmentations¶
Run a full training pass with the augmentation config chosen above. The
progress_bar="rich" setting activates per-epoch live progress bars — a batch
counter, loss, and learning rate are updated in real time. Structured metric
tables are printed at the end of each epoch.
aug_config accepts any dict of Albumentations transforms, including the
AUG_CONFIG defined above. Passing {} disables all augmentations.
model = RFDETRNano()
model.train(
    dataset_dir=DATASET_DIR,
    epochs=2,
    batch_size=24,
    aug_config=AUG_CONFIG,
    output_dir=OUTPUT_DIR,
    device=device,
    num_workers=num_workers,
    progress_bar="rich",
    run_test=False,
)
8. Run inference¶
Load a sample image and run model.predict(). The method returns a
supervision.Detections object, which makes it straightforward to draw
bounding boxes and labels with the supervision annotators.
import requests
import supervision as sv
from PIL import Image
from rfdetr.util.coco_classes import COCO_CLASSES
image_url = "https://media.roboflow.com/dog.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)
detections = model.predict(image, threshold=0.5)
labels = [COCO_CLASSES[class_id] for class_id in detections.class_id]
annotated = sv.BoxAnnotator().annotate(image.copy(), detections)
annotated = sv.LabelAnnotator().annotate(annotated, detections, labels)
sv.plot_image(annotated)
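To make the `threshold=0.5` argument concrete: conceptually it keeps only detections whose confidence meets the cutoff. A minimal sketch with NumPy arrays standing in for the `confidence` and `class_id` fields that a `supervision.Detections` object exposes (the values below are made up for illustration):

```python
import numpy as np

# Stand-in arrays for four candidate detections.
confidence = np.array([0.92, 0.31, 0.74, 0.48])
class_id = np.array([16, 0, 2, 16])

# threshold=0.5 keeps only detections at or above the confidence cutoff.
keep = confidence >= 0.5
print(class_id[keep])    # the two confident detections remain
print(confidence[keep])
```

The same boolean-mask idea lets you filter a real `Detections` object further after prediction, e.g. by class id.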
9. Next steps¶
You have seen the complete 1.5.0 augmentation workflow — from exploring presets through live inspection to a full training run. From here:
- Augmentation docs — full transform reference and custom pipeline guide
- Advanced training options — EMA, gradient accumulation, learning rate schedules
- Logger integrations (ClearML, MLflow, W&B) — experiment tracking
- Export your model — ONNX, TensorRT, CoreML
- RF-DETR 1.6.0 notebook — PyTorch Lightning building blocks