# Training Parameters

This page provides a complete reference for all parameters available when training RF-DETR models.
## Basic Example

```python
from rfdetr import RFDETRMedium

model = RFDETRMedium()
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
)
```
## Core Parameters

These are the essential parameters for training:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset_dir` | `str` | Required | Path to your dataset directory. RF-DETR auto-detects whether it is in COCO or YOLO format. See Dataset Formats. |
| `output_dir` | `str` | `"output"` | Directory where training artifacts (checkpoints, logs) are saved. |
| `epochs` | `int` | `100` | Number of full passes over the training dataset. |
| `batch_size` | `int` | `4` | Number of samples processed per iteration. Higher values require more GPU memory. |
| `grad_accum_steps` | `int` | `4` | Accumulates gradients over multiple mini-batches. Combine with `batch_size` to reach the desired effective batch size. |
| `resume` | `str` | `None` | Path to a saved checkpoint to continue training. Restores model weights, optimizer state, and scheduler. |
## Understanding Batch Size

The effective batch size is calculated as:

`effective_batch_size = batch_size × grad_accum_steps × num_gpus`

Recommended configurations for different GPUs (targeting an effective batch size of 16):
| GPU | VRAM | `batch_size` | `grad_accum_steps` |
|---|---|---|---|
| A100 | 40–80 GB | 16 | 1 |
| RTX 4090 | 24 GB | 8 | 2 |
| RTX 3090 | 24 GB | 8 | 2 |
| T4 | 16 GB | 4 | 4 |
| RTX 3070 | 8 GB | 2 | 8 |
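The relationship above can be sketched as a small helper for picking `grad_accum_steps` given the per-GPU `batch_size` your hardware can fit (the function names are illustrative, not part of the RF-DETR API):

```python
def effective_batch_size(batch_size: int, grad_accum_steps: int, num_gpus: int = 1) -> int:
    """Effective batch size seen by the optimizer."""
    return batch_size * grad_accum_steps * num_gpus


def accum_steps_for_target(batch_size: int, target: int = 16, num_gpus: int = 1) -> int:
    """Pick grad_accum_steps so the effective batch size reaches the target."""
    per_step = batch_size * num_gpus
    return max(1, -(-target // per_step))  # ceiling division

# e.g. an RTX 3070 fitting batch_size=2 needs 8 accumulation steps to
# reach an effective batch size of 16, matching the table above.
```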
## Learning Rate Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `lr` | `float` | `1e-4` | Learning rate for most parts of the model. |
| `lr_encoder` | `float` | `1.5e-4` | Learning rate specifically for the backbone encoder. Can be set lower than `lr` to fine-tune the encoder more conservatively than the rest of the model. |
### Learning rate tips

- Start with the default values for fine-tuning
- If the model doesn't converge, try halving `lr`
- For training from scratch (not recommended), you may need higher learning rates
## Resolution Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `resolution` | `int` | Model-dependent | Input image resolution. Higher values can improve accuracy but require more memory. Must be divisible by 14. |
Common resolution values:

| Resolution | Memory Usage | Use Case |
|---|---|---|
| 560 | Low | Small objects, limited GPU memory |
| 672 | Medium | Balanced (the default for many models) |
| 784 | High | High accuracy requirements |
| 896 | Very High | Maximum quality (requires a large GPU) |
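Because `resolution` must be divisible by 14, a quick sketch of a validity check and a helper that snaps an arbitrary size to the nearest valid value may be useful (the helper names are hypothetical, not part of the RF-DETR API):

```python
def is_valid_resolution(res: int) -> bool:
    """RF-DETR input resolution must be a positive multiple of 14."""
    return res > 0 and res % 14 == 0


def snap_to_valid_resolution(res: int) -> int:
    """Round to the nearest multiple of 14 (at least 14)."""
    return max(14, round(res / 14) * 14)

# 640 is not valid (640 / 14 is not an integer); the nearest valid value is 644.
```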
## Regularization Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `weight_decay` | `float` | `1e-4` | L2 regularization coefficient. Helps prevent overfitting by penalizing large weights. |
## Hardware Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `device` | `str` | `"cuda"` | Device to run training on. Options: `"cuda"`, `"cpu"`, `"mps"` (Apple Silicon). |
| `gradient_checkpointing` | `bool` | `False` | Re-computes parts of the forward pass during backpropagation to reduce memory usage. Lowers memory needs but increases training time. |
## EMA (Exponential Moving Average)

| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_ema` | `bool` | `True` | Enables an Exponential Moving Average of the weights. Produces a smoothed checkpoint that often improves final performance. |

### What is EMA?

EMA maintains a moving average of the model weights throughout training. This smoothed version often generalizes better than the raw weights and is commonly used for the final model.
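Conceptually, the update applied after each training step looks like the following plain-Python sketch (the decay value of 0.999 is an assumption for illustration; RF-DETR's actual implementation operates on tensors and may use a different decay schedule):

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """One EMA step: blend the current weights into the running average."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, model_weights)]

# The EMA copy moves slowly toward the live weights, smoothing out
# step-to-step noise in the optimization trajectory.
```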
## Checkpoint Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `checkpoint_interval` | `int` | `10` | Frequency (in epochs) at which model checkpoints are saved. More frequent saves provide better coverage but consume more storage. |
### Checkpoint Files

During training, multiple checkpoints are saved:

| File | Description |
|---|---|
| `checkpoint.pth` | Most recent checkpoint (for resuming) |
| `checkpoint_<N>.pth` | Periodic checkpoint at epoch N |
| `checkpoint_best_ema.pth` | Best validation performance (EMA weights) |
| `checkpoint_best_regular.pth` | Best validation performance (raw weights) |
| `checkpoint_best_total.pth` | Final best model for inference |
## Early Stopping Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `early_stopping` | `bool` | `False` | Enable early stopping based on validation mAP. |
| `early_stopping_patience` | `int` | `10` | Number of epochs without improvement before stopping. |
| `early_stopping_min_delta` | `float` | `0.001` | Minimum change in mAP to qualify as an improvement. |
| `early_stopping_use_ema` | `bool` | `False` | Whether to track improvements using the EMA model's metrics. |
### Early Stopping Example

```python
model.train(
    dataset_dir="path/to/dataset",
    epochs=200,
    batch_size=4,
    early_stopping=True,
    early_stopping_patience=15,
    early_stopping_min_delta=0.005,
)
```
This configuration will:
- Train for up to 200 epochs
- Stop early if mAP doesn't improve by at least 0.005 for 15 consecutive epochs
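The patience/min-delta logic can be sketched as follows (illustrative only; in RF-DETR the real criterion runs on validation mAP inside the trainer):

```python
class EarlyStopper:
    """Stop when the metric fails to improve by min_delta for `patience` epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's metric; return True if training should stop."""
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```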
## Logging Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `tensorboard` | `bool` | `True` | Enable TensorBoard logging. Requires `pip install "rfdetr[loggers]"`. If the tensorboard package is not installed, training continues but emits a UserWarning and skips TensorBoard output. |
| `wandb` | `bool` | `False` | Enable Weights & Biases logging. Requires `pip install "rfdetr[loggers]"`. |
| `project` | `str` | `None` | Project name for W&B logging. |
| `run` | `str` | `None` | Run name for W&B logging. If not specified, W&B assigns a random name. |
### Logging Example

```python
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    tensorboard=True,
    wandb=True,
    project="my-detection-project",
    run="experiment-001",
)
```
## Evaluation Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `eval_max_dets` | `int` | `500` | Maximum number of detections per image considered during COCO evaluation. Lower values speed up evaluation. |
| `eval_interval` | `int` | `1` | Run COCO evaluation every N epochs. Set to a higher value to reduce evaluation overhead during long training runs. |
| `log_per_class_metrics` | `bool` | `True` | Log per-class AP metrics to the console and loggers. Disable to reduce log verbosity when there are many classes. |
| `progress_bar` | `bool` | `False` | Enable a tqdm progress bar during training. Set to `True` for interactive terminal or notebook use. |
## Advanced Parameters

The parameters below are available for fine-grained control over training behaviour. Most users can leave these at their defaults.
### Scheduler and Regularization

| Parameter | Type | Default | Description |
|---|---|---|---|
| `lr_scheduler` | `str` | `"step"` | Learning rate scheduler type. Options: `"step"` (step decay at `lr_drop`) or `"cosine"` (cosine annealing). |
| `lr_min_factor` | `float` | `0.0` | Floor for the cosine scheduler, expressed as a fraction of the initial LR. Ignored when using `"step"`. |
| `warmup_epochs` | `float` | `0.0` | Number of epochs of linear learning rate warmup at the start of training. |
| `drop_path` | `float` | `0.0` | Stochastic depth (drop-path) rate applied to the backbone. Higher values add more regularization. |
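The interaction of `warmup_epochs`, `lr_scheduler="cosine"`, and `lr_min_factor` can be sketched with a schematic formula (this is not RF-DETR's exact scheduler code, just the standard warmup-plus-cosine shape those parameters describe):

```python
import math


def lr_at_epoch(epoch: float, total_epochs: int, base_lr: float,
                warmup_epochs: float = 0.0, lr_min_factor: float = 0.0) -> float:
    """Linear warmup, then cosine annealing down to base_lr * lr_min_factor."""
    if warmup_epochs > 0 and epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs
    # progress through the cosine phase, in [0, 1]
    t = (epoch - warmup_epochs) / max(total_epochs - warmup_epochs, 1e-8)
    min_lr = base_lr * lr_min_factor
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))
```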
### Runtime and Accelerator

| Parameter | Type | Default | Description |
|---|---|---|---|
| `accelerator` | `str` | `"auto"` | PyTorch Lightning accelerator selection. `"auto"` picks GPU if available, then MPS, then CPU. |
| `seed` | `int` | `None` | Global random seed for reproducibility. `None` means no fixed seed is set. |
| `fp16_eval` | `bool` | `False` | Run evaluation passes in FP16 precision. Reduces memory usage but may lower numerical precision. |
| `compute_val_loss` | `bool` | `True` | Compute and log the detection loss on the validation set each epoch. |
| `compute_test_loss` | `bool` | `True` | Compute and log the detection loss during the final test run. |
### DataLoader Tuning

| Parameter | Type | Default | Description |
|---|---|---|---|
| `pin_memory` | `bool` | `None` | Pin host memory in the DataLoader for faster GPU transfers. `None` defers to PyTorch Lightning's default. |
| `persistent_workers` | `bool` | `None` | Keep DataLoader worker processes alive between epochs. `None` defers to PyTorch Lightning's default. |
| `prefetch_factor` | `int` | `None` | Number of batches to prefetch per DataLoader worker. `None` uses PyTorch's built-in default. |
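For example, a throughput-oriented configuration on a CUDA machine might look like the fragment below (a sketch only; the defaults are usually fine, and the values shown are assumptions rather than recommendations from the RF-DETR project):

```python
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    pin_memory=True,          # faster host-to-GPU transfers
    persistent_workers=True,  # avoid re-spawning workers every epoch
    prefetch_factor=2,        # batches queued per worker
)
```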
## Complete Parameter Reference

Below is a summary table of all training parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset_dir` | `str` | Required | Path to a COCO- or YOLO-formatted dataset with train/valid/test splits. |
| `output_dir` | `str` | `"output"` | Directory for checkpoints, logs, and other training artifacts. |
| `epochs` | `int` | `100` | Number of full passes over the dataset. |
| `batch_size` | `int` | `4` | Samples per iteration. Balance with `grad_accum_steps`. |
| `grad_accum_steps` | `int` | `4` | Gradient accumulation steps for effectively larger batch sizes. |
| `lr` | `float` | `1e-4` | Learning rate for the model (excluding the encoder). |
| `lr_encoder` | `float` | `1.5e-4` | Learning rate for the backbone encoder. |
| `resolution` | `int` | Model-specific | Input image size (must be divisible by 14). |
| `weight_decay` | `float` | `1e-4` | L2 regularization coefficient. |
| `device` | `str` | `"cuda"` | Training device: `cuda`, `cpu`, or `mps`. |
| `use_ema` | `bool` | `True` | Enable an Exponential Moving Average of the weights. |
| `gradient_checkpointing` | `bool` | `False` | Trade compute for memory during backpropagation. |
| `checkpoint_interval` | `int` | `10` | Save a checkpoint every N epochs. |
| `resume` | `str` | `None` | Path to a checkpoint for resuming training. |
| `tensorboard` | `bool` | `True` | Enable TensorBoard logging. |
| `wandb` | `bool` | `False` | Enable Weights & Biases logging. |
| `project` | `str` | `None` | W&B project name. |
| `run` | `str` | `None` | W&B run name. |
| `early_stopping` | `bool` | `False` | Enable early stopping. |
| `early_stopping_patience` | `int` | `10` | Epochs without improvement before stopping. |
| `early_stopping_min_delta` | `float` | `0.001` | Minimum mAP change to qualify as an improvement. |
| `early_stopping_use_ema` | `bool` | `False` | Use the EMA model for early stopping metrics. |
| `eval_max_dets` | `int` | `500` | Maximum detections per image considered during COCO evaluation. |
| `eval_interval` | `int` | `1` | Run COCO evaluation every N epochs. |
| `log_per_class_metrics` | `bool` | `True` | Log per-class AP metrics to the console and loggers. |
| `progress_bar` | `bool` | `False` | Enable a tqdm progress bar during training. |
| `accelerator` | `str` | `"auto"` | PyTorch Lightning accelerator. `"auto"` selects GPU/MPS/CPU automatically. |
| `seed` | `int` | `None` | Random seed for reproducibility. `None` means no fixed seed. |
| `lr_scheduler` | `str` | `"step"` | Learning rate scheduler type: `"step"` or `"cosine"`. |
| `lr_min_factor` | `float` | `0.0` | Minimum LR as a fraction of the initial LR (cosine scheduler floor). |
| `warmup_epochs` | `float` | `0.0` | Number of linear warmup epochs at the start of training. |
| `drop_path` | `float` | `0.0` | Stochastic depth (drop-path) rate for the backbone. |
| `compute_val_loss` | `bool` | `True` | Compute and log loss during validation. |
| `compute_test_loss` | `bool` | `True` | Compute and log loss during the test run. |
| `fp16_eval` | `bool` | `False` | Run evaluation in FP16 precision to reduce memory usage. |
| `pin_memory` | `bool` | `None` | Pin DataLoader memory. `None` defers to PyTorch Lightning's default. |
| `persistent_workers` | `bool` | `None` | Keep DataLoader workers alive between epochs. `None` uses the PyTorch Lightning default. |
| `prefetch_factor` | `int` | `None` | Number of batches prefetched per worker. `None` uses the PyTorch default. |