
Training Parameters

This page provides a complete reference of all parameters available when training RF-DETR models.

Basic Example

```python
from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
)
```

Core Parameters

These are the essential parameters for training:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `dataset_dir` | `str` | Required | Path to your dataset directory. RF-DETR auto-detects whether it is in COCO or YOLO format. See Dataset Formats. |
| `output_dir` | `str` | `"output"` | Directory where training artifacts (checkpoints, logs) are saved. |
| `epochs` | `int` | `100` | Number of full passes over the training dataset. |
| `batch_size` | `int` | `4` | Number of samples processed per iteration. Higher values require more GPU memory. |
| `grad_accum_steps` | `int` | `4` | Accumulates gradients over multiple mini-batches. Combine with `batch_size` to reach the desired effective batch size. |
| `resume` | `str` | `None` | Path to a saved checkpoint to continue training. Restores model weights, optimizer state, and scheduler. |
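
For example, to continue an interrupted run from the most recent checkpoint (a minimal sketch; `model` is constructed as in the Basic Example above):

```python
# Resuming restores model weights, optimizer state, and scheduler,
# per the `resume` description in the table above.
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    resume="output/checkpoint.pth",
)
```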

Understanding Batch Size

The effective batch size is calculated as:

```
effective_batch_size = batch_size × grad_accum_steps × num_gpus
```

Recommended configurations for different GPUs (targeting effective batch size of 16):

| GPU | VRAM | `batch_size` | `grad_accum_steps` |
|-----|------|--------------|--------------------|
| A100 | 40-80 GB | 16 | 1 |
| RTX 4090 | 24 GB | 8 | 2 |
| RTX 3090 | 24 GB | 8 | 2 |
| T4 | 16 GB | 4 | 4 |
| RTX 3070 | 8 GB | 2 | 8 |
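
As an illustration, a 16 GB T4 can reach the recommended effective batch size of 16 on a single GPU like this (a sketch using only parameters from this page):

```python
model.train(
    dataset_dir="path/to/dataset",
    batch_size=4,        # fits comfortably in 16 GB of VRAM
    grad_accum_steps=4,  # 4 × 4 × 1 GPU = effective batch size of 16
)
```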

Learning Rate Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `lr` | `float` | `1e-4` | Learning rate for most parts of the model. |
| `lr_encoder` | `float` | `1.5e-4` | Learning rate specifically for the backbone encoder. Can be set lower than `lr` if you want to fine-tune the encoder more conservatively than the rest of the model. |

Learning rate tips

- Start with the default values for fine-tuning
- If the model doesn't converge, try reducing `lr` by half (see the sketch below)
- For training from scratch (not recommended), you may need higher learning rates
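
If the defaults don't converge, a halved configuration might look like this (illustrative values; scaling `lr_encoder` in proportion is a suggestion, not a documented rule):

```python
model.train(
    dataset_dir="path/to/dataset",
    lr=5e-5,            # half of the 1e-4 default
    lr_encoder=7.5e-5,  # half of the 1.5e-4 default, scaled proportionally
)
```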

Resolution Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `resolution` | `int` | Model-dependent | Input image resolution. Higher values can improve accuracy but require more memory. Must be divisible by 14. |

Common resolution values:

| Resolution | Memory Usage | Use Case |
|------------|--------------|----------|
| 560 | Low | Small objects, limited GPU memory |
| 672 | Medium | Balanced (default for many models) |
| 784 | High | High accuracy requirements |
| 896 | Very High | Maximum quality (requires large GPU) |
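
For example, to train at a higher resolution (a sketch; remember the value must be divisible by 14):

```python
model.train(
    dataset_dir="path/to/dataset",
    resolution=784,  # 784 = 14 × 56, satisfying the divisibility requirement
)
```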

Regularization Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `weight_decay` | `float` | `1e-4` | L2 regularization coefficient. Helps prevent overfitting by penalizing large weights. |

Hardware Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `device` | `str` | `"cuda"` | Device to run training on. Options: `"cuda"`, `"cpu"`, `"mps"` (Apple Silicon). |
| `gradient_checkpointing` | `bool` | `False` | Re-computes parts of the forward pass during backpropagation to reduce memory usage. Lowers memory needs but increases training time. |
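
On memory-constrained GPUs, you might combine gradient checkpointing with a smaller batch size (an illustrative sketch):

```python
model.train(
    dataset_dir="path/to/dataset",
    device="cuda",
    gradient_checkpointing=True,  # lower memory use at the cost of speed
    batch_size=2,
    grad_accum_steps=8,           # keep the effective batch size at 16
)
```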

EMA (Exponential Moving Average)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `use_ema` | `bool` | `True` | Enables Exponential Moving Average of weights. Produces a smoothed checkpoint that often improves final performance. |

What is EMA?

EMA maintains a moving average of the model weights throughout training. This smoothed version often generalizes better than the raw weights and is commonly used for the final model.
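
Conceptually, EMA blends a small fraction of the current weights into a running average after each update. The sketch below illustrates the idea in plain Python; it is not RF-DETR's internal implementation, and the decay value is hypothetical:

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """Blend the current weights into the running average (illustrative only)."""
    for name in ema_weights:
        ema_weights[name] = decay * ema_weights[name] + (1 - decay) * model_weights[name]

# Toy usage: the averaged value trails the raw weight, smoothing out noise.
ema = {"w": 0.0}
for step in range(1, 4):
    ema_update(ema, {"w": float(step)})
```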

Checkpoint Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `checkpoint_interval` | `int` | `10` | Frequency (in epochs) at which model checkpoints are saved. More frequent saves provide better coverage but consume more storage. |

Checkpoint Files

During training, multiple checkpoints are saved:

| File | Description |
|------|-------------|
| `checkpoint.pth` | Most recent checkpoint (for resuming) |
| `checkpoint_<N>.pth` | Periodic checkpoint at epoch N |
| `checkpoint_best_ema.pth` | Best validation performance (EMA weights) |
| `checkpoint_best_regular.pth` | Best validation performance (raw weights) |
| `checkpoint_best_total.pth` | Final best model for inference |
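
To reuse the best checkpoint for inference after training, you can point a new model at it. The `pretrain_weights` keyword below is an assumption based on recent rfdetr releases; verify it against the API of your installed version:

```python
from rfdetr import RFDETRMedium

# Assumed keyword argument; check your rfdetr version's model constructor.
model = RFDETRMedium(pretrain_weights="output/checkpoint_best_total.pth")
```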

Early Stopping Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `early_stopping` | `bool` | `False` | Enable early stopping based on validation mAP. |
| `early_stopping_patience` | `int` | `10` | Number of epochs without improvement before stopping. |
| `early_stopping_min_delta` | `float` | `0.001` | Minimum change in mAP to qualify as an improvement. |
| `early_stopping_use_ema` | `bool` | `False` | Whether to track improvements using EMA model metrics. |

Early Stopping Example

```python
model.train(
    dataset_dir="path/to/dataset",
    epochs=200,
    batch_size=4,
    early_stopping=True,
    early_stopping_patience=15,
    early_stopping_min_delta=0.005,
)
```

This configuration will:

- Train for up to 200 epochs
- Stop early if mAP doesn't improve by at least 0.005 for 15 consecutive epochs

Logging Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `tensorboard` | `bool` | `True` | Enable TensorBoard logging. Requires `pip install "rfdetr[loggers]"`. If the tensorboard package is not installed, training continues but emits a UserWarning and skips TensorBoard logging. |
| `wandb` | `bool` | `False` | Enable Weights & Biases logging. Requires `pip install "rfdetr[loggers]"`. |
| `project` | `str` | `None` | Project name for W&B logging. |
| `run` | `str` | `None` | Run name for W&B logging. If not specified, W&B assigns a random name. |

Logging Example

```python
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    tensorboard=True,
    wandb=True,
    project="my-detection-project",
    run="experiment-001",
)
```

Evaluation Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `eval_max_dets` | `int` | `500` | Maximum number of detections per image considered during COCO evaluation. Lower values speed up evaluation. |
| `eval_interval` | `int` | `1` | Run COCO evaluation every N epochs. Set to a higher value to reduce evaluation overhead during long training runs. |
| `log_per_class_metrics` | `bool` | `True` | Log per-class AP metrics to the console and loggers. Disable to reduce log verbosity when there are many classes. |
| `progress_bar` | `bool` | `False` | Enable tqdm progress bar during training. Set to `True` for interactive terminal or notebook use. |
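
For long runs on datasets with many classes, you might trade evaluation frequency for speed (an illustrative sketch):

```python
model.train(
    dataset_dir="path/to/dataset",
    eval_interval=5,              # evaluate every 5 epochs instead of every epoch
    eval_max_dets=300,            # consider fewer detections per image
    log_per_class_metrics=False,  # keep logs short when classes are numerous
    progress_bar=True,            # handy in notebooks and interactive shells
)
```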

Advanced Parameters

The parameters below are available for fine-grained control over training behavior. Most users can leave these at their defaults.

Scheduler and Regularization

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `lr_scheduler` | `str` | `"step"` | Learning rate scheduler type. Options: `"step"` (step decay at `lr_drop`) or `"cosine"` (cosine annealing). |
| `lr_min_factor` | `float` | `0.0` | Floor for the cosine scheduler, expressed as a fraction of the initial LR. Ignored when using `"step"`. |
| `warmup_epochs` | `float` | `0.0` | Number of epochs for linear learning rate warmup at the start of training. |
| `drop_path` | `float` | `0.0` | Stochastic depth drop-path rate applied to the backbone. Higher values add more regularization. |
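
For example, a cosine schedule with a brief warmup might look like this (the values are illustrative, not recommendations):

```python
model.train(
    dataset_dir="path/to/dataset",
    lr_scheduler="cosine",
    lr_min_factor=0.05,  # LR floor at 5% of the initial rate
    warmup_epochs=1.0,   # one epoch of linear warmup
)
```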

Runtime and Accelerator

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `accelerator` | `str` | `"auto"` | PyTorch Lightning accelerator selection. `"auto"` picks GPU if available, then MPS, then CPU. |
| `seed` | `int` | `None` | Global random seed for reproducibility. `None` means no fixed seed is set. |
| `fp16_eval` | `bool` | `False` | Run evaluation passes in FP16 precision. Reduces memory usage but may lower numerical precision. |
| `compute_val_loss` | `bool` | `True` | Compute and log the detection loss on the validation set each epoch. |
| `compute_test_loss` | `bool` | `True` | Compute and log the detection loss during the final test run. |
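
A sketch of a reproducible, memory-lean configuration using these options:

```python
model.train(
    dataset_dir="path/to/dataset",
    seed=42,         # fix the seed for reproducibility
    fp16_eval=True,  # evaluate in half precision to save memory
)
```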

DataLoader Tuning

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `pin_memory` | `bool` | `None` | Pin host memory in the DataLoader for faster GPU transfers. `None` defers to PyTorch Lightning's default. |
| `persistent_workers` | `bool` | `None` | Keep DataLoader worker processes alive between epochs. `None` defers to PyTorch Lightning's default. |
| `prefetch_factor` | `int` | `None` | Number of batches to prefetch per DataLoader worker. `None` uses PyTorch's built-in default. |
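
If data loading becomes a bottleneck, you might override the Lightning defaults explicitly (an illustrative sketch):

```python
model.train(
    dataset_dir="path/to/dataset",
    pin_memory=True,          # faster host-to-GPU transfers
    persistent_workers=True,  # avoid re-spawning workers each epoch
    prefetch_factor=2,        # batches prefetched per worker
)
```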

Complete Parameter Reference

Below is a summary table of all training parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `dataset_dir` | `str` | Required | Path to COCO or YOLO formatted dataset with train/valid/test splits. |
| `output_dir` | `str` | `"output"` | Directory for checkpoints, logs, and other training artifacts. |
| `epochs` | `int` | `100` | Number of full passes over the dataset. |
| `batch_size` | `int` | `4` | Samples per iteration. Balance with `grad_accum_steps`. |
| `grad_accum_steps` | `int` | `4` | Gradient accumulation steps for effective larger batch sizes. |
| `lr` | `float` | `1e-4` | Learning rate for the model (excluding encoder). |
| `lr_encoder` | `float` | `1.5e-4` | Learning rate for the backbone encoder. |
| `resolution` | `int` | Model-specific | Input image size (must be divisible by 14). |
| `weight_decay` | `float` | `1e-4` | L2 regularization coefficient. |
| `device` | `str` | `"cuda"` | Training device: `cuda`, `cpu`, or `mps`. |
| `use_ema` | `bool` | `True` | Enable Exponential Moving Average of weights. |
| `gradient_checkpointing` | `bool` | `False` | Trade compute for memory during backprop. |
| `checkpoint_interval` | `int` | `10` | Save a checkpoint every N epochs. |
| `resume` | `str` | `None` | Path to checkpoint for resuming training. |
| `tensorboard` | `bool` | `True` | Enable TensorBoard logging. |
| `wandb` | `bool` | `False` | Enable Weights & Biases logging. |
| `project` | `str` | `None` | W&B project name. |
| `run` | `str` | `None` | W&B run name. |
| `early_stopping` | `bool` | `False` | Enable early stopping. |
| `early_stopping_patience` | `int` | `10` | Epochs without improvement before stopping. |
| `early_stopping_min_delta` | `float` | `0.001` | Minimum mAP change to qualify as improvement. |
| `early_stopping_use_ema` | `bool` | `False` | Use EMA model for early stopping metrics. |
| `eval_max_dets` | `int` | `500` | Maximum detections per image considered during COCO evaluation. |
| `eval_interval` | `int` | `1` | Run COCO evaluation every N epochs. |
| `log_per_class_metrics` | `bool` | `True` | Log per-class AP metrics to the console and loggers. |
| `progress_bar` | `bool` | `False` | Enable tqdm progress bar during training. |
| `accelerator` | `str` | `"auto"` | PyTorch Lightning accelerator. `"auto"` selects GPU/MPS/CPU automatically. |
| `seed` | `int` | `None` | Random seed for reproducibility. `None` means no fixed seed. |
| `lr_scheduler` | `str` | `"step"` | Learning rate scheduler type: `"step"` or `"cosine"`. |
| `lr_min_factor` | `float` | `0.0` | Minimum LR as a fraction of the initial LR (cosine scheduler floor). |
| `warmup_epochs` | `float` | `0.0` | Number of linear warmup epochs at the start of training. |
| `drop_path` | `float` | `0.0` | Stochastic depth drop-path rate for the backbone. |
| `compute_val_loss` | `bool` | `True` | Compute and log loss during validation. |
| `compute_test_loss` | `bool` | `True` | Compute and log loss during the test run. |
| `fp16_eval` | `bool` | `False` | Run evaluation in FP16 precision to reduce memory usage. |
| `pin_memory` | `bool` | `None` | Pin DataLoader memory. `None` defers to PyTorch Lightning's default. |
| `persistent_workers` | `bool` | `None` | Keep DataLoader workers alive between epochs. `None` uses the PyTorch Lightning default. |
| `prefetch_factor` | `int` | `None` | Number of batches prefetched per worker. `None` uses the PyTorch default. |