Changelog¶
RF-DETR release notes are maintained on GitHub Releases. Use the release feed to review versioned package changes, migration notes, and model updates.
- Install the latest PyPI package
- Migration guide — upgrade steps between major versions
- Cookbooks — runnable notebooks for training, fine-tuning, export, and deployment
All notable changes to RF-DETR are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]¶
[1.7.0] — 2026-04-29¶
Added¶
- `augmentation_backend` field on `TrainConfig` (`"cpu"` / `"auto"` / `"gpu"`): opt-in GPU-side augmentation via Kornia applied in `RFDETRDataModule.on_after_batch_transfer` after the batch is resident on the GPU. CPU path is unchanged and remains the default. Install with `pip install 'rfdetr[kornia]'`. Supports detection and segmentation (see below). (#1003)
- Kornia GPU augmentation now supports instance segmentation: images, boxes, and per-instance masks are augmented in sync on the GPU. New public helper `collate_masks` packs `[N_i, H, W]` boolean masks into a `[B, N_max, H, W]` float32 tensor for Kornia; `build_kornia_pipeline` gains a `with_masks: bool = False` parameter; `unpack_boxes` gains an optional `masks_aug` tensor that re-binarises and filters masks in sync with boxes. Previously `augmentation_backend="gpu"` / `"auto"` was silently ignored for segmentation models; now it works identically to detection. Note: the mask buffer is `[B, N_max, H, W]` float32, approximately 500 MB at `B=8, N_max=50, H=W=560`; use `augmentation_backend="cpu"` on cards with limited VRAM. (#1003, closes #997)
- `BuilderArgs`: a `@runtime_checkable` `typing.Protocol` documenting the minimum attribute set consumed by `build_model()`, `build_backbone()`, `build_transformer()`, and `build_criterion_and_postprocessors()`. Enables static type-checker support for custom builder integrations. Exported from `rfdetr.models`. (#841)
- `build_model_from_config(model_config, train_config=None, defaults=MODEL_DEFAULTS)`: config-native alternative to `build_model(build_namespace(mc, tc))`; accepts Pydantic config objects directly and constructs the internal namespace automatically. Exported from `rfdetr.models`. (#845)
- `build_criterion_from_config(model_config, train_config, defaults=MODEL_DEFAULTS)`: config-native alternative to `build_criterion_and_postprocessors(build_namespace(mc, tc))`; returns a `(SetCriterion, PostProcess)` tuple. Exported from `rfdetr.models`. (#845)
- `ModelDefaults` dataclass: exposes the 35 hardcoded architectural constants previously buried inside `build_namespace()`. Pass a `dataclasses.replace(MODEL_DEFAULTS, ...)` override to the new config-native builders to customise individual constants. Note: fields may be promoted to `ModelConfig` / `TrainConfig` in future phases. Exported from `rfdetr.models`. (#845)
- `MODEL_DEFAULTS`: the canonical `ModelDefaults` singleton with production defaults. Exported from `rfdetr.models`. (#845)
- `RFDETR.predict(include_source_image=...)`: opt-out flag (default `True`) to skip storing the source image in `detections.metadata["source_image"]`; set to `False` to reduce memory use when the image is not needed for annotation. (#912)
- `model_name` is now stored in checkpoint files during training so that `RFDETR.from_checkpoint()` can resolve the correct model class directly from the checkpoint, without requiring the caller to know or pass a class hint. `strip_checkpoint()` preserves this key. Backward-compatible: checkpoints without `model_name` continue to resolve via `pretrain_weights` filename matching. (#895)
- `rfdetr_version` is now stored in checkpoint files during training for provenance tracking and compatibility hints. `strip_checkpoint()` preserves this key. The key is omitted gracefully when the package version cannot be resolved (e.g. editable install without metadata). Backward-compatible: checkpoints without `rfdetr_version` continue to load normally. (#918)
- `notes` parameter on `RFDETR.train()` and `RFDETR.export()`: embed arbitrary JSON-serialisable provenance metadata (labeller, date, class names, etc.) into best-model `.pth` checkpoints (under `checkpoint["args"]["notes"]`) and ONNX files (under the `"rfdetr_notes"` metadata property). String values are stored verbatim; all other types are JSON-encoded. (#1025, closes #1021)
- `RF_HOME` environment variable controls where pretrained model weights are cached (default: `~/.roboflow/models`). Bare filenames passed as `pretrain_weights` (e.g. `"rf-detr-base.pth"`) are now resolved relative to this directory; paths with a directory component are used as-is, with parent directories created automatically. (#130)
- Grayscale and multispectral imagery support: RF-DETR models now accept inputs with any number of channels (not just 3). The pretrained DINOv2 patch-embedding weights are automatically adapted to the specified channel count at model construction time; no additional dependencies are required. (#180, closes #75)
- Training configuration is now saved to `training_config.json` in the output directory after training completes. The file captures the full `TrainConfig`, `ModelConfig`, effective training parameters, class names, and number of classes, which is useful for reproducibility and for debugging predictions from older checkpoints. (#194)
- `dinov2_registers_windowed_small` backbone is now available as a config option in `ModelConfig.encoder`. (#236)
- `rfdetr.from_checkpoint(path)`: new top-level convenience function that loads a checkpoint and infers the correct model subclass automatically, without requiring the caller to specify a class. Equivalent to `RFDETR.from_checkpoint(path)` but importable directly from the `rfdetr` package. (#664)
- ONNX export filenames now include the model variant name (e.g. `rfdetr-medium.onnx`) instead of the generic `inference_model.onnx`. Exporting multiple variants to the same directory no longer overwrites previous exports. (#910)
- Background images (images without a matching label file) are now included in YOLO detection datasets as empty-detection samples instead of being silently dropped. Both detection and segmentation paths now use `_LazyYoloDetectionDataset` for consistent behaviour. (#915)
- TFLite export via `model.export(format="tflite")`. Converts through ONNX using `onnx2tf`. FP32 and FP16 outputs are always produced; INT8 quantization is available with a calibration image directory: `model.export(format="tflite", quantization="int8", calibration_data="path/to/images/")`. Requires `pip install 'rfdetr[onnx,tflite]'`. (#920)
- PyTorch Lightning `.ckpt` files are now accepted as `pretrain_weights`. Keys are automatically normalized from PTL format (`state_dict` with `model.`-prefixed keys, `hyper_parameters` → `args`) so that `load_pretrain_weights`, class-name extraction, and compatibility checks work without manual conversion. (#951)
- `skip_best_epochs` parameter for `RFDETR.train()` and `TrainConfig`: the first N epochs are excluded from best-checkpoint selection and early-stopping comparison, preventing strong pretrained weights or resumed checkpoints from locking in a suboptimal early score. (#1000, closes #789)
- TFLite inference now decodes segmentation mask outputs into `sv.Detections.mask`. Mask logits are upsampled to the source image size using Pillow bilinear resampling and thresholded at zero, matching the behaviour of `PostProcess.forward`. The mask tensor is detected by output name (`"masks"` substring) with a rank-4 shape fallback. (#1053)
- `PretrainWeightsCompatibilityWarning`: new warning class emitted when a `ModelConfig` override (e.g. custom `encoder`, `num_queries`, or `num_feature_levels`) risks breaking pretrained weight loading. Importable as `from rfdetr.config import PretrainWeightsCompatibilityWarning` for targeted filtering. (#1017)
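The VRAM note on the Kornia mask buffer can be checked from the tensor shape alone. The following is a minimal sketch of that arithmetic, assuming 4 bytes per float32 element; `mask_buffer_bytes` is a hypothetical helper written for illustration and is not part of `rfdetr`.

```python
def mask_buffer_bytes(batch_size: int, n_max: int, height: int, width: int,
                      bytes_per_element: int = 4) -> int:
    """Size in bytes of a dense [B, N_max, H, W] buffer (float32 uses 4 bytes per element)."""
    return batch_size * n_max * height * width * bytes_per_element


# Values quoted in the changelog entry: B=8, N_max=50, H=W=560.
size = mask_buffer_bytes(batch_size=8, n_max=50, height=560, width=560)
print(f"{size / 1e6:.0f} MB")  # 502 MB
```

Because the buffer grows linearly in `B` and `N_max` and quadratically in the image side, doubling the resolution roughly quadruples the footprint.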
Changed¶
- `peft` is no longer installed as part of the default `rfdetr` package. It has moved to the `[lora]` and `[train]` optional extras. If you use LoRA fine-tuning, install with `pip install 'rfdetr[lora]'`. (#838)
- Native RLE annotation support in the COCO segmentation pipeline: `convert_coco_poly_to_mask` now explicitly detects and decodes both compressed (string counts) and uncompressed (int-list counts) RLE formats alongside existing polygon support. Malformed annotations now raise instead of being silently swallowed. (#897)
- Pinned PyTorch Lightning to exclude known-compromised versions. (#1020)
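For readers unfamiliar with the uncompressed (int-list counts) RLE format mentioned above, here is a standard-library sketch of how such counts expand into a mask. It follows the COCO convention that runs alternate between background and foreground, starting with background, and that pixels are laid out column by column. `decode_uncompressed_rle` is a hypothetical name; this is not the code inside `convert_coco_poly_to_mask`, and compressed (string) counts are typically decoded with `pycocotools` instead.

```python
def decode_uncompressed_rle(counts: list[int], height: int, width: int) -> list[list[int]]:
    """Expand COCO-style uncompressed RLE counts into a height x width 0/1 mask."""
    if any(c < 0 for c in counts) or sum(counts) != height * width:
        # Mirror the "malformed annotations raise" behaviour described above.
        raise ValueError("RLE counts do not cover exactly height * width pixels")

    flat: list[int] = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with background
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value

    # COCO RLE is column-major: flat index = col * height + row.
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


mask = decode_uncompressed_rle([1, 2, 3], height=2, width=3)
print(mask)  # [[0, 1, 0], [1, 0, 0]]
```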
Deprecated¶
- `build_namespace(model_config, train_config)`: no longer used internally and deprecated in this release; use `build_model_from_config`, `build_criterion_from_config`, or `_namespace_from_configs` directly. It will be removed in v1.9 and currently emits a `DeprecationWarning` on use. (#845)
- `load_pretrain_weights(nn_model, model_config, train_config)`: the `train_config` positional argument is deprecated and will be removed in v1.9; it is no longer used internally. Omit it: `load_pretrain_weights(nn_model, model_config)`. Passing a non-`None` value emits a `DeprecationWarning`. (#845)
- `TrainConfig.group_detr` (architecture decision → `ModelConfig`), `TrainConfig.ia_bce_loss` (loss type tied to architecture family → `ModelConfig`), `TrainConfig.segmentation_head` (architecture flag → `ModelConfig`), `TrainConfig.num_select` (postprocessor count → `ModelConfig`; `SegmentationTrainConfig` users: remove the `num_select` override, because the model config value is always used), `ModelConfig.cls_loss_coef` (training hyperparameter → `TrainConfig`): each now emits `DeprecationWarning` when set on the wrong config object and will be removed in v1.9. (#841)
- `RFDETRBase`: use `RFDETRNano`, `RFDETRSmall`, `RFDETRMedium`, or `RFDETRLarge` instead. Emits `FutureWarning` on instantiation; scheduled for removal in v2.0. (#900)
- `RFDETRSegPreview`: use `RFDETRSegNano`, `RFDETRSegSmall`, `RFDETRSegMedium`, or `RFDETRSegLarge` instead. Emits `FutureWarning` on instantiation; scheduled for removal in v2.0. (#900)
- `rfdetr.util` and `rfdetr.deploy` sub-modules are deprecated and will be removed in v1.9. A `__getattr__` hook on the `rfdetr` package now emits a clear `ImportError` with migration guidance when these legacy paths are accessed. (#839)
Fixed¶
- Fixed TFLite export (`format="tflite"`) producing detection scores that collapse to ~0.02 (vs ~0.62 from ONNX) on real inputs. Root cause is a long-standing onnx2tf bug (PINTO0309/onnx2tf#274) where the `GridSample` lowering diverges numerically from ONNX while onnx2tf's own validator silently passes. RF-DETR's deformable cross-attention uses `F.grid_sample` once per decoder layer; drift compounds and is amplified by the attention softmax. The converter now detects onnx2tf's `GridSample` → pseudo-`GridSample` replacement kwarg at runtime (introspecting `convert()` via `inspect.signature`) and passes it as `True`; a warning is logged when the kwarg is absent. (#1041)
- `WindowedDinov2WithRegistersEmbeddings.forward()` now raises `ValueError` (instead of silently failing under `-O`) when input spatial dimensions are not divisible by `patch_size * num_windows`, with a clear message identifying the divisor and actual shape. (#167)
- Fixed `_namespace.py`: `num_select` in the builder namespace now always reads from `ModelConfig`, eliminating a regression where `TrainConfig.num_select` (default 300) silently overrode model-specific values of 100–200 for segmentation variants (`RFDETRSegNano`, `RFDETRSegSmall`, `RFDETRSegMedium`, `RFDETRSegLarge`, `RFDETRSegPreview`). Post-processing now uses the correct top-k count for each model. (#841)
- Fixed `models/weights.py`: `load_pretrain_weights` now correctly auto-aligns the model head when the checkpoint has fewer classes than the configured default, preventing a silent mismatch when `num_classes` was not explicitly set by the caller. (#845)
- Fixed `models/weights.py`: `load_pretrain_weights` now slices `refpoint_embed.weight` and `query_feat.weight` per-group when reshaping checkpoint queries, instead of taking a flat `tensor[: num_queries * group_detr]` slice. The flat slice silently scrambled groups 1+ when `num_queries` decreased and `group_detr > 1`; inference (which only reads group 0) was unaffected, but training-resume corrupted query embeddings for groups 1 onward. (#1019)
- Fixed YOLO segmentation training on large datasets hitting OS out-of-memory: `supervision.DetectionDataset.from_yolo(force_masks=True)` was eager-rasterising H×W boolean masks for every image at dataset construction time (≈1 GB/1 000 images at 1024 px). A new `_LazyYoloDetectionDataset` stores polygon coordinates only and defers dense mask rasterisation to `__getitem__`, keeping RAM proportional to annotation count rather than (N × H × W). (#851)
- Fixed ONNX/TRT dynamic batch inference: `gen_encoder_output_proposals` and `Transformer.forward` extracted the batch size as a Python int and passed it to `torch.full`, `.view(N_, ...)`, `.expand(N_, ...)`, and `.repeat(bs, ...)`, causing the ONNX tracer to bake the training batch size (e.g. 8) as a compile-time constant. TRT engines built with `--minShapes` smaller than the trace batch would fail at inference with `Reshape: reshaping failed`. All six call sites are now replaced with ONNX-symbolic equivalents (`zeros_like`, `-1` reshapes, `expand(memory.shape[0], ...)`), keeping the batch dimension fully dynamic. (#950, closes #949)
- Fixed training failure when `square_resize_div_64=False`: the non-square resize pipeline (`SmallestMaxSize` + `LongestMaxSize`) did not guarantee output dimensions divisible by `patch_size * num_windows`, causing `WindowedDinov2WithRegistersEmbeddings.forward` to raise `ValueError`. A `PadIfNeeded` step (with `pad_height_divisor` and `pad_width_divisor` set to `patch_size * num_windows`) is now appended after the resize pair in both the train and val/test pipelines. (#991, closes #983)
- Fixed non-square batch padding correctness: batch-level `block_size` rounding is now applied in the DataLoader collator (`nested_tensor_from_tensor_list` via `make_collate_fn`) in addition to the transform-level `PadIfNeeded`, ensuring divisibility by `patch_size * num_windows` survives any `Compose` reordering and applies uniformly to custom evaluation harnesses. (#992)
- Fixed `RFDETRModelModule.on_load_checkpoint` crashing with `RuntimeError` when resuming training from a checkpoint saved at a different image resolution: DINOv2 positional embeddings in the checkpoint are now bicubic-interpolated to match `model_config.positional_encoding_size` before PyTorch Lightning applies the state dict. (#1002, closes #998)
- Fixed `RFDETRLarge` initialization showing two conflicting `ValueError`s (for `patch_size=14` and `patch_size=16`) when the deprecated-config fallback retry also fails. The fallback now re-raises the original error without chained context, so users see a single deterministic message. (#975)
- Fixed `RFDETRModelModule.__init__` crashing with `RuntimeError: size mismatch for backbone.0.encoder.encoder.embeddings.position_embeddings` when training segmentation models at a custom resolution (e.g. `RFDETRSegLarge(resolution=1008).train(...)`). The training entry path now delegates to the canonical `load_pretrain_weights` helper, which bicubic-interpolates the DINOv2 positional embeddings before `load_state_dict`. (#1040, closes #1038, #1023)
- Fixed TFLite detection scores collapsing for all queries (scores ~0.02 vs ~0.62 from ONNX) when `GridSample` was used as an onnx2tf pseudo-operator. The `GridSample` ONNX node is now rewritten to `Gather`-based integer-index arithmetic before conversion, eliminating all numerical drift from attention position sampling. This supersedes the pseudo-`GridSample` runtime-kwarg approach added in #1041. (#1054)
- Fixed `class_name` lookup for pretrained COCO models: COCO category IDs are sparse (1–90 with gaps for 80 classes), so flat 0-based indexing returned the wrong name (e.g. `class_id=18` ("dog") incorrectly returned `class_names[18]` instead of `class_names[16]`). Detection now uses a `coco_id → class_name` mapping built from the canonical `COCO_CLASSES` list so every COCO category resolves to its correct label. Fine-tuned models continue to use direct 0-based indexing unchanged. (#1051)
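The per-group query slicing fix (#1019) is easier to see on a toy example. The sketch below uses plain Python lists in place of tensors and assumes the checkpoint rows are laid out group by group (all queries of group 0, then all of group 1, and so on); the helper names are hypothetical and the real logic lives in `load_pretrain_weights`.

```python
def flat_slice(rows: list, num_queries: int, group_detr: int) -> list:
    """Old behaviour: keep the first num_queries * group_detr rows."""
    return rows[: num_queries * group_detr]


def per_group_slice(rows: list, old_num_queries: int, num_queries: int, group_detr: int) -> list:
    """Fixed behaviour: keep the first num_queries rows of every group."""
    kept = []
    for group in range(group_detr):
        start = group * old_num_queries
        kept.extend(rows[start : start + num_queries])
    return kept


# Toy checkpoint: 2 groups x 4 queries, each row labelled (group, query).
old_q, new_q, groups = 4, 2, 2
rows = [(g, q) for g in range(groups) for q in range(old_q)]

flat = flat_slice(rows, new_q, groups)
grouped = per_group_slice(rows, old_q, new_q, groups)

print(flat)     # [(0, 0), (0, 1), (0, 2), (0, 3)]
print(grouped)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

In the flat result the rows that the smaller model treats as group 1 are actually leftover group 0 queries, while the first `new_q` rows (group 0) are identical in both results. That matches the changelog's note that inference was unaffected but training resume was not.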
[1.6.5] — 2026-04-22¶
Breaking Changes¶
- `predict()` now stores the source image in `detections.metadata["source_image"]` instead of `detections.data["source_image"]`. `supervision` indexes every value in `data` by the detection mask; `source_image` is per-image, not per-detection, so boolean/integer indexing raised `IndexError`. Moving it to `metadata` (passed through unchanged) fixes the issue. Update any code that reads `detections.data["source_image"]`. (#972, #968)
Fixed¶
- Fixed segmentation training crash on T4 and P100 GPUs: cuDNN engine selection fails for depthwise convolution backward on some CUDA stacks (Kaggle, Colab). A custom `autograd.Function` now disables cuDNN in both forward and backward passes. (#967)
- Fixed `ema_segm_mAP_50_95` and `ema_segm_mAP_50` being computed from the base (non-EMA) metric accumulator instead of the EMA accumulator, producing misleading validation scores for segmentation models. (#980)
- Fixed `BestModelCallback` losing the best EMA score on training resume because `_best_ema` was not persisted in `state_dict()`. (#973)
- Fixed `positional_encoding_size` not updating when `resolution` is set at construction time (e.g. `RFDETRLarge(resolution=640)`), causing shape mismatches during forward. A model validator now auto-syncs PE size. (#956)
- Fixed pretrained weight loading crash with custom resolution: DINOv2 positional embeddings are now bicubic-interpolated to match the target grid before `load_state_dict`. (#964)
- Fixed `validate_checkpoint_compatibility` producing a cryptic `RuntimeError` on `patch_size` mismatch when the checkpoint lacks explicit `args.patch_size`. The function now infers `patch_size` from the DINOv2 projection weight shape and raises a descriptive `ValueError`. (#971)
- Fixed `predict()` storing `detections.data["source_shape"]` as a Python `tuple`, which caused `TypeError` whenever `sv.Detections` was iterated. The value is now an `np.ndarray` of shape `(N, 2)` and dtype `int64`. (#966, #963)
- Fixed `predict()` emitting a misleading "class_id out of range" warning for the background/no-object class (class index `num_classes`). Background-class detections now map `data["class_name"]` to `"__background__"` without any warning. (#970)
[1.6.4] — 2026-04-10¶
Changed¶
- `predict()` now includes `class_name` in `detections.data`, mapping each detection's 0-indexed class ID to its human-readable name. (#914)
Fixed¶
- Fixed segmentation multi-GPU DDP training crash: `build_trainer()` now wraps `strategy="ddp"` with `DDPStrategy(find_unused_parameters=True)` when `segmentation_head=True`. The segmentation head's `sparse_forward()` leaves parameters unused on some forward steps; plain `"ddp"` raised `RuntimeError: It looks like your LightningModule has parameters that were not used in producing the loss`. Non-segmentation DDP and other strategies are unchanged. (#942, #947)
- Fixed fused AdamW crash under FP32 multi-GPU training: `configure_optimizers()` and `clip_gradients()` now gate fused AdamW on the trainer's actual precision (requiring a BF16 variant) rather than GPU capability alone. On Ampere+ hardware `torch.cuda.is_bf16_supported()` is always `True`, so the old code enabled fused AdamW even with `precision="32-true"`, causing `RuntimeError: params, grads, exp_avgs, and exp_avg_sqs must have same dtype, device, and layout` from DDP gradient bucket view stride mismatches. (#942, #947)
- Fixed multi-GPU DDP training crashing in Jupyter notebooks and Kaggle: replaced fork-based `ddp_notebook` strategy with a spawn-based DDP strategy that avoids OpenMP thread pool corruption after `fork()`. (#928)
- Fixed `RFDETR.train(resolution=...)` being silently ignored: the kwarg is now applied to `model_config` before training begins, with validation that the value is divisible by `patch_size * num_windows`. (#933)
- Fixed `save_dataset_grids` being silently a no-op: `DatasetGridSaver` is now wired into the training loop, saving sample grids to `{output_dir}/dataset_grids/` when enabled. Grid save failures are caught without interrupting training. (#946)
- Fixed partial gradient-accumulation windows at the tail of training epochs: the training dataset is now padded to an exact multiple of `effective_batch_size * world_size`, ensuring every optimizer step uses a full gradient window. Workaround for pytorch-lightning#19987. (#937)
- Fixed `torch.export.export` failing on the transformer decoder by threading `spatial_shapes_hw` through all decoder layers. (#936)
- `download_pretrain_weights()` no longer overwrites fine-tuned checkpoints that share a filename with a registry model (e.g. `rf-detr-nano.pth`). Previously, an MD5 mismatch would fall through to `_download_file()` and silently replace the user's weights with the original COCO checkpoint. The function now returns early whenever the file exists and `redownload=False`, regardless of MD5 status; a warning is emitted when the hash differs. Pass `redownload=True` to force a fresh download. (#935)
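The dataset padding described in the gradient-accumulation fix (#937) comes down to rounding the sample count up to the next multiple of `effective_batch_size * world_size`. A minimal sketch of that calculation follows; `padding_needed` is a hypothetical helper, and how the library chooses which samples to repeat is not stated in this changelog.

```python
def padding_needed(num_samples: int, effective_batch_size: int, world_size: int) -> int:
    """Number of extra samples required to reach an exact multiple of the global window."""
    multiple = effective_batch_size * world_size
    if multiple <= 0:
        raise ValueError("effective_batch_size and world_size must be positive")
    return (-num_samples) % multiple


# 1000 samples, effective batch 16, 2 processes: window of 32 samples per optimizer step.
extra = padding_needed(1000, effective_batch_size=16, world_size=2)
print(extra, 1000 + extra)  # 24 1024
```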
[1.6.3] — 2026-04-02¶
Changed¶
- `predict()` now stores the original image and its shape on returned `sv.Detections` objects: `detections.data["source_image"]` (NumPy array) and `detections.data["source_shape"]` (NumPy array of shape `(N, 2)` where each row is `[height, width]`) let you annotate results without loading the image separately. (#892)
- `RFDETR.train()` auto-detects `num_classes` from the dataset directory when not explicitly set, reinitializing the detection head to the correct class count automatically. A warning is emitted when the configured value differs from the dataset count. (#893)
- `optimize_for_inference()` now accepts dtype as a string name (e.g. `"float16"`) in addition to a `torch.dtype` object; invalid dtype inputs uniformly raise `TypeError`. (#899)
Fixed¶
- Fixed `models/lwdetr.py`: `reinitialize_detection_head` now replaces `nn.Linear` modules instead of mutating `.data` tensors in-place, ensuring `out_features` metadata stays consistent with the actual weight shape. This prevents ONNX export and `torch.jit.trace` from emitting stale (pre-fine-tuning) class counts for fine-tuned models. (#904)
- Fixed `RFDETR.optimize_for_inference()` leaking a CUDA context on multi-GPU setups: the deep-copy, export, and JIT-trace steps now run inside `torch.cuda.device(device)` to pin the context to the correct device. (#899)
- Fixed `optimize_for_inference()` leaving inconsistent state on failure: prior optimized state is now reset and flags are committed only after a successful build/trace; temp download files use unique per-process paths to avoid parallel worker collisions.
- Fixed `deploy_to_roboflow` failing with `FileNotFoundError` after PyTorch Lightning migration: `class_names.txt` is now written to the upload directory and `args.class_names` is populated before saving the checkpoint. (#890)
[1.6.2] — 2026-03-27¶
Added¶
- `RFDETR.predict(shape=...)`: optional `(height, width)` tuple overrides the default square inference resolution; useful when matching a non-square ONNX export. Both dimensions must be positive integers divisible by `patch_size × num_windows` as determined by the model configuration. (#866)
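The divisibility rule for `shape` can be sketched as a small standalone check. This is an illustration only: `validate_shape` is a hypothetical function, the `patch_size=14` and `num_windows=4` values are example numbers rather than the defaults of any particular model, and raising `ValueError` is an assumption about how a violation would surface.

```python
def validate_shape(shape: tuple, patch_size: int, num_windows: int) -> tuple:
    """Check that (height, width) are positive ints divisible by patch_size * num_windows."""
    if len(shape) != 2:
        raise ValueError(f"shape must be (height, width), got {shape!r}")
    block = patch_size * num_windows
    for name, dim in zip(("height", "width"), shape):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(dim, bool) or not isinstance(dim, int) or dim <= 0:
            raise ValueError(f"{name} must be a positive integer, got {dim!r}")
        if dim % block != 0:
            raise ValueError(f"{name}={dim} is not divisible by {block}")
    return (shape[0], shape[1])


print(validate_shape((560, 784), patch_size=14, num_windows=4))  # (560, 784)
```

With these example values the block size is 56, so `(560, 784)` passes while `(500, 560)` would be rejected because 500 leaves a remainder of 52.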
Changed¶
- `ModelConfig.device` and `RFDETR.train(device=...)` now accept `torch.device` objects and indexed device strings such as `"cuda:0"`. Values are normalized to canonical torch-style strings. `RFDETR.train()` warns when an unmapped device type is passed to PyTorch Lightning auto-detection. (#872)
Fixed¶
- Fixed ONNX export ignoring an explicit `patch_size` argument: `export()` and `predict()` now resolve `patch_size` from `model_config` by default, validate it strictly (positive integer, not bool), and enforce that `(H, W)` dimensions are divisible by `patch_size × num_windows`. (#876)
- Fixed ONNX export for models with dynamic batch dimensions: replaced `H_.expand(N_)` with `torch.full` for Python-int spatial dims to eliminate tracer failures. (#871)
[1.6.1] — 2026-03-25¶
Deprecated¶
- `RFDETR.export(..., simplify=..., force=...)`: both arguments are now no-ops and emit a `DeprecationWarning`. RF-DETR no longer runs ONNX simplification automatically; remove these arguments from your calls. They will be removed in v1.8. (#861)
Fixed¶
- Fixed `RFDETR.train()`: a missing `rfdetr[train]` install (e.g. plain `pip install rfdetr` in Colab) now raises an `ImportError` with an actionable message (`pip install "rfdetr[train,loggers]"`) instead of a raw `ModuleNotFoundError` with no install hint. (#858)
- Fixed `AUG_AGGRESSIVE` preset: `translate_percent` was `(0.1, 0.1)`, a degenerate range that forced Albumentations `Affine` to always translate right/down by exactly 10%. Corrected to `(-0.1, 0.1)` for symmetric bidirectional translation. (#863)
- Fixed PTL training path: `latest.ckpt` and per-interval checkpoints (`checkpoint_interval_N.ckpt`) are now properly written and restored on resume. (#847)
- Fixed `BestModelCallback` and checkpoint monitor raising `MisconfigurationException` on non-eval epochs when `eval_interval > 1`; monitor key absence is now handled gracefully. (#848)
- Fixed `protobuf` version constraint in the `loggers` extra to guard against TensorBoard descriptor crash (`TypeError: Descriptors cannot be created directly`) with protobuf ≥ 4. (#846)
- Fixed duplicate `ModelCheckpoint` state keys when `checkpoint_interval=1`; `last.ckpt` is omitted in that configuration to avoid collision. (#859)
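The `translate_percent` fix (#863) is a reminder that a `(low, high)` range with equal endpoints is not a range at all. The sketch below uses `random.uniform` from the standard library as a stand-in for range sampling; it assumes the augmentation draws uniformly between the two endpoints and does not call Albumentations itself.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def sample_translations(low: float, high: float, n: int = 1000) -> list[float]:
    """Draw n translation fractions uniformly from [low, high]."""
    return [rng.uniform(low, high) for _ in range(n)]


degenerate = sample_translations(0.1, 0.1)   # old preset value
symmetric = sample_translations(-0.1, 0.1)   # corrected preset value

print(set(degenerate))                        # {0.1}
print(min(symmetric) < 0 < max(symmetric))    # True
```

Every draw from the old range is exactly 0.1, so the image always shifts by the same amount in the same direction, while the corrected range produces shifts in both directions.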
[1.6.0] — 2026-03-20¶
Added¶
- PyTorch Lightning training building blocks: `RFDETRModelModule`, `RFDETRDataModule`, `build_trainer()`, and individual callbacks (`RFDETREMACallback`, `COCOEvalCallback`, `BestModelCallback`, `DropPathCallback`, `MetricsPlotCallback`). All are standard PTL components; swap, subclass, or extend any piece. Level 3: `rfdetr fit --config` CLI with zero Python required. (#757, #794)
- Multi-GPU DDP via `model.train()`: `strategy`, `devices`, and `num_nodes` added to `TrainConfig`; single-GPU behaviour unchanged when omitted. (#808)
- `batch_size='auto'`: CUDA memory probe finds the largest safe micro-batch size, then recommends `grad_accum_steps` to reach a configurable effective batch target (default 16 via `auto_batch_target_effective`). (#814)
- `ModelContext` promoted from `_ModelContext` to a public, exported API: inspect `class_names`, `num_classes`, and related metadata via `model.context` after training. (#835)
- `backbone_lora` and `freeze_encoder` added as first-class fields in `ModelConfig`. (#829)
- `generate_coco_dataset(with_segmentation=True)` produces COCO polygon annotations alongside bounding boxes for segmentation fine-tuning with synthetic data. (#781)
- `set_attn_implementation("eager" | "sdpa")` on the DINOv2 backbone: switch attention implementation at runtime. (#760)
- `eval_max_dets`, `eval_interval`, and `log_per_class_metrics` added to `TrainConfig`.
- `python -m rfdetr` entry point alongside the `rfdetr` console script.
- `py.typed` marker: RF-DETR is now PEP 561 compliant.
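To illustrate the `batch_size='auto'` entry: once a safe micro-batch size is known, the accumulation step count follows from the effective batch target. The sketch below assumes the recommendation rounds up so the target is met or slightly exceeded; that rounding rule and the helper name `recommend_grad_accum_steps` are assumptions, not a description of the library's probe.

```python
import math


def recommend_grad_accum_steps(micro_batch_size: int, target_effective: int = 16) -> int:
    """Smallest number of accumulation steps whose effective batch reaches the target."""
    if micro_batch_size <= 0:
        raise ValueError("micro_batch_size must be positive")
    return max(1, math.ceil(target_effective / micro_batch_size))


for micro in (4, 6, 32):
    steps = recommend_grad_accum_steps(micro)
    print(micro, steps, micro * steps)
# 4 4 16
# 6 3 18
# 32 1 32
```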
Changed¶
- Breaking: Minimum `transformers` version bumped to `>=5.1.0,<6.0.0`. The DINOv2 windowed-attention backbone now uses the transformers v5 API (`BackboneMixin._init_transformers_backbone()`, removed `head_mask` plumbing). Projects still on transformers v4 must pin `rfdetr<1.6.0`. (#760)
- Breaking: PyPI install extras renamed: `rfdetr[metrics]` → `rfdetr[loggers]`, `rfdetr[onnxexport]` → `rfdetr[onnx]`.
- `draw_synthetic_shape` now returns `Tuple[np.ndarray, List[float]]` instead of `np.ndarray`. The second element is a flat COCO-style polygon list `[x1, y1, x2, y2, …]`. Any caller that previously did `img = draw_synthetic_shape(...)` must be updated to `img, polygon = draw_synthetic_shape(...)`. (#781)
- Albumentations version constraint broadened to `>=1.4.24,<3.0.0`; `RandomSizedCrop` configs using `height` / `width` kwargs are automatically adapted to the 2.x `size=(height, width)` API. (#786)
- Current learning rate is now shown in the training progress bar alongside loss. (#809)
- `supervision`, `pytorch_lightning`, and other heavy dependencies are now imported lazily (on first use) rather than at module load, reducing cold-import time in inference-only environments. (#801)
Deprecated¶
- `rfdetr.deploy.*`: redirects to `rfdetr.export.*` with a `DeprecationWarning`. Migrate before v1.7.
- `rfdetr.util.*`: redirects to `rfdetr.utilities.*` with a `DeprecationWarning`. Migrate before v1.7.
Fixed¶
- Raised a descriptive `ValueError` instead of a cryptic `RuntimeError` / tensor-size mismatch when a checkpoint is incompatible with the current model architecture; covers `segmentation_head` mismatch and `patch_size` mismatch. (#810)
- Fixed `class_names` not reflecting dataset labels on `model.predict()` after training: class names are now synced from the dataset so inference always uses the correct label list. (#816)
- Fixed detection head reinitialization overwriting fine-tuned weights when loading a checkpoint with fewer classes than the model default. The second `reinitialize_detection_head` call now fires only in the backbone-pretrain scenario. (#815, #509)
- Fixed `grid_sample` and bicubic interpolation silently falling back to CPU on MPS (Apple Silicon); both now run natively on the MPS device. (#821)
- Fixed `early_stopping=False` in `TrainConfig` being silently ignored; the setting now propagates correctly. (#835)
- Fixed `AttributeError` crash in `update_drop_path` when the DINOv2 backbone layer structure does not match any known pattern.
- Added warning when `drop_path_rate > 0.0` is configured with a non-windowed DINOv2 backbone, where drop-path is silently ignored.
- Fixed `ValueError: matrix entries are not finite` in `HungarianMatcher` when the cost matrix contains NaN or Inf: non-finite entries are replaced with a finite sentinel before `linear_sum_assignment`, with a warning emitted at most once per matcher instance. (#787)
- Fixed YOLO dataset validation rejecting `data.yml`; both `.yaml` and `.yml` are now accepted. (#777)
- Silently dropped degenerate bounding boxes (zero width or height) before Albumentations validation instead of raising `ValueError`. (#825)
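The `HungarianMatcher` fix (#787) replaces non-finite cost entries before the assignment step. A standard-library sketch of that sanitising pass is shown below, using nested lists in place of a tensor. The sentinel value `1e6` and the name `sanitize_cost_matrix` are assumptions made for illustration; the changelog does not state the value the library uses.

```python
import math


def sanitize_cost_matrix(cost: list[list[float]], sentinel: float = 1e6):
    """Return a copy of cost with NaN/Inf replaced by sentinel, plus the replacement count."""
    replaced = 0
    cleaned = []
    for row in cost:
        new_row = []
        for value in row:
            if math.isfinite(value):
                new_row.append(value)
            else:
                new_row.append(sentinel)  # keeps the matrix usable by an assignment solver
                replaced += 1
        cleaned.append(new_row)
    return cleaned, replaced


cost = [[0.2, float("nan")], [float("inf"), 0.7]]
cleaned, replaced = sanitize_cost_matrix(cost)
print(cleaned, replaced)  # [[0.2, 1000000.0], [1000000.0, 0.7]] 2
```

A large positive sentinel makes the affected pairings unattractive to a cost-minimising solver without making the problem infeasible.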
[1.5.2] — 2026-03-04¶
Added¶
- Added peak GPU memory (`max_mem` in MB) to training and evaluation progress bars on CUDA; omitted on CPU and MPS. (#773)
Fixed¶
- Fixed `aug_config` being silently ignored when training on YOLO-format datasets: `build_roboflow_from_yolo` never forwarded the value, so transforms always fell back to the default. (#774)
- Fixed segmentation evaluation metrics not being written to `results_mask.json` during validation and test runs. (#772)
- Fixed `AttributeError` crash in `update_drop_path` when the DINOv2 backbone layer structure does not match any known pattern; `_get_backbone_encoder_layers` now returns `None` for unrecognised architectures. (#762)
- Fixed `drop_path_rate` not being forwarded to the DINOv2 model configuration; stochastic depth was never applied even when explicitly set. Added a warning when `drop_path_rate > 0.0` is used with a non-windowed backbone. (#762)
- Fixed incorrect COCO hierarchy filtering that excluded parent categories from the class list. (#759)
- Fixed evaluation metric corruption on 1-indexed Roboflow datasets caused by a flawed contiguity check in `_should_use_raw_category_ids`. (#755)
[1.5.1] — 2026-02-27¶
Added¶
- Added support for nested Albumentations containers (`OneOf`, `Sequential`) inside `aug_config`. (#752)
Changed¶
- Migrated dataset transform pipeline to torchvision-native `Compose`, `ToImage`, and `ToDtype`; `Normalize` now defaults to ImageNet mean/std. (#745)
Fixed¶
- Fixed `RFDETRMedium` missing from the public API: `__all__` contained a duplicate `RFDETRSmall` entry. (#748)
- Fixed `AR50_90` reporting an incorrect value in `MetricsMLFlowSink` due to a wrong COCO evaluation index. (#735)
- Fixed supercategory filtering in `_load_classes` for COCO datasets with flat or mixed supercategory structures. (#744)
- Fixed crash in geometric transforms when a sample contained zero-area or empty masks. (#727)
- Fixed segmentation training on Colab: `DepthwiseConvBlock` now disables cuDNN for depthwise separable convolutions. (#728)
- Pinned `onnxsim<0.6.0` to prevent `pip install` from hanging indefinitely. (#749)
[1.5.0] — 2026-02-23¶
Added¶
- Added custom training augmentations via `aug_config` in `model.train()`: accepts a dict of Albumentations transforms, a built-in preset (`AUG_CONSERVATIVE`, `AUG_AGGRESSIVE`, `AUG_AERIAL`, `AUG_INDUSTRIAL`), or `{}` to disable. Bounding boxes and segmentation masks are transformed automatically. (#263, #702)
- Added `save_dataset_grids=True` in `TrainConfig` to write 3×3 JPEG grids of augmented samples to `output_dir` before training begins. (#153)
- Added ClearML logger: set `clearml=True` in `TrainConfig` to stream per-epoch metrics to ClearML. (#520)
- Added MLflow logger: set `mlflow=True` in `TrainConfig` to log runs and metrics to MLflow with custom tracking URI support. (#109)
- Added live progress bar for training and validation with structured per-epoch logs. (#204)
- Added `device` field to `TrainConfig` for explicit device selection. (#687)
- `ModelConfig` now raises an error on unknown parameters, preventing silent misconfiguration. (#196)
Changed¶
- Deprecated `OPEN_SOURCE_MODELS` constant in favour of `ModelWeights` enum. (#696)
- Added MD5 checksum validation for pretrained weight downloads. (#679)
Fixed¶
- Fixed Albumentations bool-mask crash during segmentation training. (#706)
- Fixed `UnboundLocalError` when resuming training from a completed checkpoint. (#707)
- Prevented corruption of `checkpoint_best_total.pth` via atomic checkpoint stripping. (#708)
- Fixed PyTorch 2.9+ compatibility issue with CUDA capability detection. (#686)
- Fixed dtype mismatch error when `use_position_supervised_loss=True`. (#447)
- Fixed inconsistent return values from `build_model`. (#519)
- Fixed `positional_encoding_size` type annotation (`bool` → `int`). (#524)
- Fixed ONNX export `output_names` to include masks when exporting segmentation models. (#402)
- Fixed `num_select` not being updated correctly during segmentation model fine-tuning. (#399)
- Fixed `np.argwhere` → `np.argmax` misuse. (#536)
- Fixed COCO sparse category ID remapping for non-contiguous or offset category IDs. (#712)
- Fixed segmentation mask filtering when using aggressive augmentations. (#717)
[1.4.3] — 2026-02-16¶
Changed¶
- Pretrained weight downloads now validate against an MD5 checksum to detect corrupted files. (#679)
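For context, an MD5 check of a downloaded file can be done with the standard library alone. The sketch below is a generic illustration, not the code path inside `rfdetr`; `md5_of_file` and `is_intact` are hypothetical names, and the temporary file stands in for a downloaded checkpoint.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 digest of a file, read in chunks so large checkpoints fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, expected_md5: str) -> bool:
    """True when the file's digest matches the expected (case-insensitive) hex string."""
    return md5_of_file(path) == expected_md5.lower()


# Demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name

ok = is_intact(tmp_path, "5D41402ABC4B2A76B9719D911017C592")
bad = is_intact(tmp_path, "0" * 32)
os.remove(tmp_path)
print(ok, bad)  # True False
```

MD5 is adequate for detecting truncated or corrupted downloads; it is not a defence against deliberate tampering.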
Fixed¶
- Fixed `deploy_to_roboflow` failing for segmentation model exports. (#578)
- Fixed missing `info` key in COCO export format. (#681)
[1.4.2] — 2026-02-12¶
Added¶
- Added `generate_coco_dataset()` utility for generating synthetic COCO-format datasets with configurable class counts, split ratios, and bounding box annotations. (#617)
- Added `run_test=False` to `TrainConfig`: skip test-split evaluation when your dataset has no test set. (#628)
Changed¶
- `model.predict()` now accepts image URLs directly; there is no need to download images before inference. (#629)
- Plus models (`RFDETRXLarge`, `RFDETR2XLarge`) are now distributed as a separate `rfdetr_plus` package under the Roboflow Model License. (#645)
Fixed¶
- Fixed segmentation ONNX export failure. (#626)
[1.4.1] — 2026-01-30¶
Added¶
- Added native YOLO dataset format support alongside COCO. (#74)
- Added `--print-freq` CLI argument to control training log frequency. (#603)
Changed¶
- Pinned `transformers` to `<5.0.0` to prevent incompatibility with the transformers v5 API. (#599)
Fixed¶
- Fixed class count mismatch in `train_from_config` for Roboflow-uploaded datasets. (#588)
- Improved `num_classes` mismatch warning messages to be actionable rather than misleading. (#261)
- Fixed CLI crash when specifying the `device` argument. (#246)
[1.4.0] — 2026-01-22¶
Headline release introducing new pre-trained model sizes — L, XL, and 2XL for object detection, and the full N/S/M/L/XL/2XL range for instance segmentation. Also added YOLO format training support, simplified the dependency footprint by removing several heavy packages (cython, fairscale, timm, einops, and others), and fixed per-class precision/recall/F1 computation. Drops Python 3.9 support.