VLM3D Challenge

Vision-language modelling in 3D medical imaging

Welcome to the VLM3D Challenge!

Challenge Finals and Presentations → MICCAI 2025
Workshop → ICCV 2025
Submit your paper to our ICCV workshop!

Challenge Tasks

Task 1: Radiology Report Generation

Participants build vision-language models that translate a complete 3D chest CT scan into a free-text radiology report covering findings and impression. Performance is assessed with standard NLG scores (BLEU, METEOR, ROUGE-L) plus the clinically aware CRG metric. The task uses the CT-RATE dataset, split into public training and validation sets and hidden internal and external test sets.
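As a rough illustration of one of the NLG scores named above, the sketch below computes a sentence-level ROUGE-L F1 from the longest common subsequence (LCS) of tokens. The function names, the lowercasing, the whitespace tokenization, and the plain F1 weighting are all assumptions made for illustration; the challenge's official scorer may tokenize and weight differently.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between a reference report and a generated report.

    Assumption: lowercased whitespace tokens and equal weighting of
    LCS-based precision and recall.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical report snippets, for illustration only.
reference = "No pleural effusion. Small nodule in the right upper lobe."
generated = "Small nodule in the right upper lobe. No effusion."
print(round(rouge_l_f1(reference, generated), 3))
```

Because the LCS respects word order, a generated report that lists the same findings in a different order scores lower than its content alone would suggest, which is one reason a clinically aware metric is reported alongside the NLG scores.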


Task 2: Multi-Abnormality Classification

Given a volumetric chest CT, algorithms must output a binary vector of length 18 indicating the presence of common thoracic conditions (e.g., pleural effusion, lung nodule). Evaluation on hidden test cohorts combines AUROC, F1, precision, recall, and accuracy, and teams are ranked by points earned from permutation-test wins.
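To make the classification metrics concrete, here is a minimal sketch that scores multi-label predictions one label at a time and then macro-averages across labels. The function names, the macro-averaging choice, and the toy data (3 labels instead of 18) are assumptions for illustration; the text does not say how the challenge aggregates per-label scores.

```python
def confusion_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for one label (0/1 lists)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def auroc(y_true, scores):
    """AUROC as the chance that a positive case outscores a negative one.

    Ties count as half a win. Returns None when a class is missing,
    because the AUROC is undefined in that situation.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_average(labels_true, labels_pred):
    """Average each metric over labels; rows are cases, columns are labels."""
    per_label = [confusion_metrics(col_t, col_p)
                 for col_t, col_p in zip(zip(*labels_true), zip(*labels_pred))]
    return {name: sum(m[name] for m in per_label) / len(per_label)
            for name in per_label[0]}


# Toy data: 4 scans, 3 labels (the task itself uses 18 labels per scan).
truth = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]]
preds = [[1, 0, 1], [0, 1, 1], [0, 1, 0], [0, 1, 1]]
print(macro_average(truth, preds))
print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

AUROC needs continuous scores rather than thresholded outputs, so a submission in this sketch would carry both a probability and a binary decision for each label.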


Task 3: Self-Supervised Multi-Abnormality Localization

Without voxel-level labels during training, systems must localize five key pathologies: pericardial effusion, pleural effusion, consolidation, ground-glass opacity, and lung nodule. They produce 3D heat maps that are scored on Dice, IoU, Hausdorff-95, and sensitivity against expert masks for 2,000 hidden test scans.
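The overlap metrics can be sketched as follows for a single pathology, assuming the heat map is binarized at a fixed cutoff before comparison with the expert mask. The 0.5 cutoff, the flattened-list representation of a volume, the function names, and the handling of empty masks are assumptions for illustration; Hausdorff-95 is omitted because it needs voxel coordinates and spacing.

```python
def threshold(heatmap, cutoff=0.5):
    """Binarize a flattened heat map; the 0.5 cutoff is an assumed value."""
    return [1 if value >= cutoff else 0 for value in heatmap]


def overlap_scores(pred_mask, true_mask):
    """Dice, IoU and sensitivity between two flattened binary masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred_mask, true_mask))
    pred_total = sum(pred_mask)
    true_total = sum(true_mask)
    union = pred_total + true_total - intersection
    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return {"dice": 1.0, "iou": 1.0, "sensitivity": 1.0}
    return {
        "dice": 2 * intersection / (pred_total + true_total),
        "iou": intersection / union,
        # No true voxels but some predicted: scored 0.0 (assumed convention).
        "sensitivity": intersection / true_total if true_total else 0.0,
    }


# Toy 6-voxel "volume", for illustration only.
heatmap = [0.9, 0.2, 0.7, 0.1, 0.6, 0.0]
expert_mask = [1, 0, 1, 1, 0, 0]
print(overlap_scores(threshold(heatmap), expert_mask))
```

Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank predictions identically on a single case; they can still differ once scores are averaged over many scans.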


Task 4: Text-Conditional CT Generation

Participants synthesize anatomically plausible 3D chest CT volumes directly from radiology text prompts, aiming for high visual fidelity, realistic Hounsfield unit distributions, and tight semantic alignment with the prompt. Success is measured with CT-adapted generative metrics (FVD (I3D), FVD (CT-Net), CT-CLIP, FID) and ranked via the same permutation-based point system.
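The permutation-based point system is not specified in detail here, so the following is only one plausible reading: every pair of teams is compared with a paired sign-flip permutation test on per-case scores, and a team earns a point for each rival it beats significantly. The function names, the one-sided test, the 0.05 threshold, the higher-is-better convention, and the toy scores are all assumptions. Set-level metrics such as FID would need a different resampling scheme.

```python
import random


def paired_permutation_p_value(scores_a, scores_b, n_permutations=10000, seed=0):
    """One-sided p-value for 'A scores higher than B' on paired cases.

    Under the null hypothesis the sign of each per-case difference is
    arbitrary, so the test flips signs at random and counts how often the
    permuted mean difference reaches the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = sum(diffs) / len(diffs)
    hits = 0
    for _ in range(n_permutations):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if permuted >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def award_points(team_scores, alpha=0.05, n_permutations=10000, seed=0):
    """Give each team one point per rival it beats at significance level alpha."""
    points = {team: 0 for team in team_scores}
    for team_a, scores_a in team_scores.items():
        for team_b, scores_b in team_scores.items():
            if team_a == team_b:
                continue
            p = paired_permutation_p_value(scores_a, scores_b, n_permutations, seed)
            if p < alpha:
                points[team_a] += 1
    return points


# Hypothetical per-case scores (higher is better) for two teams on 20 cases.
team_scores = {
    "team_a": [0.80 + 0.005 * i for i in range(20)],
    "team_b": [0.70 + 0.005 * i for i in range(20)],
}
print(award_points(team_scores, n_permutations=2000))
```

Ranking by significant wins rather than by raw metric averages means that two teams whose scores differ only by noise end up with the same number of points from their head-to-head comparison.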
