Training Loggers¶

RF-DETR supports integration with popular experiment tracking and visualization platforms. You can enable one or more loggers to monitor your training runs, compare experiments, and track metrics over time.

CSV (always active)¶

A CSVLogger is always active regardless of any flags. It requires no extra packages and writes all metrics to {output_dir}/metrics.csv on every validation step.
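Because the file is plain CSV, it can be inspected without any tracking UI. The sketch below shows one way to find the best value of a metric using only the standard library. `best_metric` is a hypothetical helper, not part of RF-DETR. It assumes the column is named after the metric key (for example `val/mAP_50_95`) and that a cell is left empty on rows where that metric was not logged; check the header of your own `metrics.csv` before relying on either assumption.

```python
import csv
from pathlib import Path
from typing import Optional, Tuple


def best_metric(csv_path: str, key: str) -> Optional[Tuple[int, float]]:
    """Return (row_index, value) for the highest value of `key`, or None.

    Hypothetical helper: assumes the CSV header has a column named `key`
    and that cells are empty on rows where the metric was not logged.
    """
    best: Optional[Tuple[int, float]] = None
    with Path(csv_path).open(newline="") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            cell = (row.get(key) or "").strip()
            if not cell:
                continue  # metric absent or not logged on this row
            value = float(cell)
            if best is None or value > best[1]:
                best = (index, value)
    return best
```

For example, `best_metric("output/metrics.csv", "val/mAP_50_95")` returns the zero-based data-row index and the value of the best validation score, or `None` if the column is missing or empty.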

TensorBoard¶

TensorBoard is a visualization toolkit for tracking training metrics over time.

TensorBoard logging is enabled by default. Pass tensorboard=False to disable it.

Missing package behaviour

If the tensorboard package is not installed, training continues without error: a UserWarning is emitted and TensorBoard logging is skipped. Install rfdetr[loggers] to avoid this.
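Since the warning is easy to miss in a long training log, it can be useful to check for the package before starting a run. The following is a minimal sketch using only the standard library. `missing_packages` is a hypothetical helper, and it assumes the import name matches the package name (`tensorboard`).

```python
import importlib.util
from typing import List


def missing_packages(*names: str) -> List[str]:
    """Return the top-level package names that cannot be imported.

    Hypothetical helper: uses importlib to look each package up
    without actually importing it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: fail fast instead of discovering missing logs after training.
# missing = missing_packages("tensorboard")
# if missing:
#     raise SystemExit(f"Install rfdetr[loggers]; missing: {missing}")
```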

Setup¶

Install the required packages:

pip install "rfdetr[loggers]"

Usage¶

TensorBoard is active unless you explicitly disable it:

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
    # tensorboard=True is the default; pass tensorboard=False to disable
)

Viewing Logs¶

Local environment:

tensorboard --logdir output

Then open http://localhost:6006/ in your browser.

Google Colab:

%load_ext tensorboard
%tensorboard --logdir output

Logged Metrics¶

All logged metric keys are listed in the Logged Metrics Reference.

Weights and Biases¶

Weights and Biases (W&B) is a cloud-based platform for experiment tracking and visualization.

Setup¶

Install the required packages:

pip install "rfdetr[loggers]"

Log in to W&B:

wandb login

You can retrieve your API key at wandb.ai/authorize.

Usage¶

Enable W&B logging in your training:

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
    wandb=True,
    project="my-detection-project",
    run="experiment-001",
)

Configuration¶

Parameter	Description
`project`	Groups related experiments together
`run`	Identifies individual training sessions

If you don't specify a run name, W&B assigns a random one automatically.
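If you prefer descriptive names over random ones, the run name can be derived from the hyperparameters. The sketch below is one possible convention, not something RF-DETR or W&B prescribes. `make_run_name` and its naming scheme are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def make_run_name(
    prefix: str,
    lr: float,
    batch_size: int,
    now: Optional[datetime] = None,
) -> str:
    """Build a run name such as 'exp-lr0.0001-bs4-20240102-030405'.

    Hypothetical helper: `now` defaults to the current UTC time and can be
    passed explicitly to make the result reproducible.
    """
    now = now or datetime.now(timezone.utc)
    return f"{prefix}-lr{lr:g}-bs{batch_size}-{now:%Y%m%d-%H%M%S}"
```

The result can be passed as the `run` argument, for example `run=make_run_name("exp", lr=1e-4, batch_size=4)`.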

Features¶

Access your runs at wandb.ai. W&B provides:

Real-time metric visualization
Experiment comparison
Hyperparameter tracking
System metrics (GPU usage, memory)
Training config logging

Logged Metrics¶

All logged metric keys are listed in the Logged Metrics Reference.

ClearML¶

ClearML is an open-source platform for managing, tracking, and automating machine learning experiments.

ClearML is not yet integrated as a native PyTorch Lightning (PTL) logger. Passing clearml=True to model.train() emits a UserWarning and has no other effect: metrics are not logged to ClearML.

Workaround: ClearML SDK auto-binding¶

ClearML's SDK captures PyTorch Lightning metrics automatically when a Task is initialised before training begins:

from clearml import Task
from rfdetr import RFDETRMedium

# Initialise before model.train() — ClearML auto-binds to PTL logging
task = Task.init(project_name="my-detection-project", task_name="experiment-001")

model = RFDETRMedium()
model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
    # Do NOT pass clearml=True — it does nothing
)

Alternatively, attach a ClearML callback directly using the Custom Training API.

MLflow¶

MLflow is an open-source platform for the machine learning lifecycle that helps track experiments, package code into reproducible runs, and share and deploy models.

Setup¶

Install the required packages:

pip install "rfdetr[loggers]"

Usage¶

Enable MLflow logging in your training:

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    batch_size=4,
    grad_accum_steps=4,
    lr=1e-4,
    output_dir="output",
    mlflow=True,
    project="my-detection-project",
    run="experiment-001",
)

Configuration¶

Parameter	Description
`project`	Sets the experiment name in MLflow
`run`	Sets the run name (auto-generated if not specified)

Custom Tracking Server¶

To use a custom MLflow tracking server, set environment variables:

import os

from rfdetr import RFDETRMedium

# Set MLflow tracking URI
os.environ["MLFLOW_TRACKING_URI"] = "https://your-mlflow-server.com"

# For authentication with tracking servers that require it
os.environ["MLFLOW_TRACKING_TOKEN"] = "your-auth-token"

# Then initialize and train your model
model = RFDETRMedium()
model.train(..., mlflow=True)

For teams using a hosted MLflow service (like Databricks), you'll typically need to set:

MLFLOW_TRACKING_URI: The URL of your MLflow tracking server
MLFLOW_TRACKING_TOKEN: Authentication token for your MLflow server
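Variables written to `os.environ` stay set for the rest of the process, which can be surprising when several runs with different servers share one script or notebook. A minimal sketch of a scoped alternative follows. `temporary_env` is a hypothetical helper, and the sketch assumes the variables are read while training runs inside the `with` block.

```python
import os
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_env(**variables: str) -> Iterator[None]:
    """Set environment variables inside a `with` block, then restore them."""
    # Remember the previous values (None means the variable was unset).
    previous = {name: os.environ.get(name) for name in variables}
    os.environ.update(variables)
    try:
        yield
    finally:
        for name, old_value in previous.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value


# Usage sketch:
# with temporary_env(MLFLOW_TRACKING_URI="https://your-mlflow-server.com"):
#     model.train(..., mlflow=True)
```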

Viewing Logs¶

Start the MLflow UI:

mlflow ui --backend-store-uri <OUTPUT_PATH>

Then open http://localhost:5000 in your browser to access the MLflow dashboard.

Logged Metrics¶

All logged metric keys are listed in the Logged Metrics Reference.

Using Multiple Loggers¶

You can enable multiple logging systems simultaneously:

from rfdetr import RFDETRMedium

model = RFDETRMedium()

model.train(
    dataset_dir="path/to/dataset",
    epochs=100,
    tensorboard=True,
    wandb=True,
    mlflow=True,
    project="my-project",
    run="experiment-001",
)

This allows you to leverage the strengths of different platforms:

TensorBoard: Local visualization and debugging
W&B: Cloud-based collaboration and experiment comparison
MLflow: Model registry and deployment tracking

Note: clearml=True is accepted but only emits a UserWarning in the current version; the flag does not attach a ClearML logger. Use the ClearML SDK workaround described in the ClearML section instead.
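When the set of loggers varies between runs, it can help to build the flags in one place. The sketch below is a hypothetical helper (`logger_flags` is not part of RF-DETR) that turns a list of logger names into the boolean keyword arguments described on this page. It sets `tensorboard` explicitly because that flag defaults to on, and it rejects `clearml` because the flag has no logging effect.

```python
from typing import Dict, Iterable

# Flags documented on this page that actually attach a logger.
SUPPORTED_LOGGERS = ("tensorboard", "wandb", "mlflow")


def logger_flags(enabled: Iterable[str]) -> Dict[str, bool]:
    """Map logger names to explicit True/False training flags."""
    enabled_set = set(enabled)
    unknown = enabled_set - set(SUPPORTED_LOGGERS)
    if unknown:
        raise ValueError(f"Unsupported logger(s): {sorted(unknown)}")
    return {name: name in enabled_set for name in SUPPORTED_LOGGERS}


# Usage sketch: W&B only, with TensorBoard explicitly disabled.
# model.train(dataset_dir="path/to/dataset", **logger_flags(["wandb"]))
```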

Attaching loggers via the Custom Training API¶

build_trainer automatically creates loggers from TrainConfig flags. To attach a logger not listed above (for example Neptune, Comet, or a fully custom logger), build it separately and append it to trainer.loggers before calling trainer.fit:

from rfdetr.config import RFDETRMediumConfig, TrainConfig
from rfdetr.training import RFDETRModule, RFDETRDataModule, build_trainer

model_config = RFDETRMediumConfig(num_classes=10)
train_config = TrainConfig(
    dataset_dir="path/to/dataset",
    epochs=100,
    output_dir="output",
    tensorboard=True,  # built-in loggers still work
)

module = RFDETRModule(model_config, train_config)
datamodule = RFDETRDataModule(model_config, train_config)
trainer = build_trainer(train_config, model_config)

# Attach any additional PTL-compatible logger
from pytorch_lightning.loggers import CSVLogger  # example — use any PTL logger

trainer.loggers.append(CSVLogger(save_dir="output", name="extra"))

trainer.fit(module, datamodule)

The built-in CSVLogger is always active (it requires no extra packages), so the extra CSVLogger above only illustrates the pattern. All logged metric keys, including train/loss, val/mAP_50_95, val/F1, val/ema_mAP_50_95, and val/AP/<class>, are written to every logger in the list.
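The slash-separated key names make it straightforward to group metrics after the fact. As an illustration, the sketch below pulls the per-class entries out of a metrics dictionary. `per_class_ap` is a hypothetical helper, and the `val/AP/` prefix is taken from the key pattern above.

```python
from typing import Dict, Mapping


def per_class_ap(
    metrics: Mapping[str, float],
    prefix: str = "val/AP/",
) -> Dict[str, float]:
    """Return {class_name: value} for every key that starts with `prefix`."""
    return {
        key[len(prefix):]: value
        for key, value in metrics.items()
        if key.startswith(prefix)
    }
```

Given a dictionary of logged values, for instance one row read back from `metrics.csv`, the helper returns only the per-class scores keyed by class name.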

→ Full list of logged metrics