Pre-training scale · videos
Largest endoscopy video dataset in the literature
Dermyer et al. 2025
Wang et al. 2023
Yao et al. 2023
Chaitanya et al. 2024
Byrne et al. 2025
Foundation models for endoscopy
Episode 39 — Interview with Matt Schwartz
Virgo & Rajpurkar Lab partner on next-gen endoscopy AI foundation model
Latest conversation with Matt Schwartz
Foundation Model Platforms & APIs Market Map
Building foundation models for endoscopy
AI in Endoscopy — Market Report
Inside Virgo's research roadmap
Virgo launches EndoML at DDW 2025
AI in Endoscopy — turning lost video into insight
Virgo launches AI-powered EndoML platform
Countdown to AI super-intelligence in GI
Meet EndoDINO — a SOTA foundation model for endoscopy
Foundation Model Series — Advancing Endoscopy
Virgo launches EndoML powered by EndoDINO
AGA Innovation conversation
EndoDINO — paper
Founding Virgo
The future of endoscopy data
EndoDINO
Pre-training scale · images
EndoDINO (Virgo) · Dermyer et al. 2025
ArgesFM (J&J) · Chaitanya et al. 2024
GastroNet-5M · Jong et al. 2026
Etro (Roche) · Yao et al. 2023
Validation
HyperKvasir · 3-class Mayo Endoscopic Scoring
Schwartz et al., 2023
Wang et al., MICCAI 2023
Huang et al., CVPR 2017
Oquab et al., Meta AI 2024
Virgo, 2025
Macro F1, linear probe on frozen backbone. Comparator values from each model's original publication; see chart for citations.
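To make the evaluation protocol concrete, here is a minimal sketch of a linear probe scored with macro F1. It assumes the backbone has already produced fixed embedding vectors, so only a softmax-regression layer is trained. The function names, the toy hyperparameters, and the use of plain gradient descent are illustrative assumptions, not Virgo's actual training code.

```python
import math

def train_linear_probe(embeddings, labels, num_classes, epochs=200, lr=0.5):
    """Fit softmax regression on fixed embeddings (the backbone is never updated)."""
    dim = len(embeddings[0])
    n = len(embeddings)
    W = [[0.0] * dim for _ in range(num_classes)]
    b = [0.0] * num_classes
    for _ in range(epochs):
        grad_W = [[0.0] * dim for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(embeddings, labels):
            logits = [sum(w * v for w, v in zip(W[c], x)) + b[c]
                      for c in range(num_classes)]
            top = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(z - top) for z in logits]
            total = sum(exps)
            for c in range(num_classes):
                err = exps[c] / total - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for j in range(dim):
                    grad_W[c][j] += err * x[j]
        for c in range(num_classes):
            b[c] -= lr * grad_b[c] / n
            for j in range(dim):
                W[c][j] -= lr * grad_W[c][j] / n
    return W, b

def predict(W, b, x):
    """Return the class with the highest linear score."""
    scores = [sum(w * v for w, v in zip(W[c], x)) + b[c] for c in range(len(W))]
    return max(range(len(scores)), key=scores.__getitem__)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)
```

Because macro F1 weights every class equally, a rare severe grade counts as much as a common normal grade, which is why it is a common choice for imbalanced scoring tasks.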
UNIFI Phase 3 · Ustekinumab in UC
Placebo arm
Treatment arm
AUROC for 8-week endoscopic healing (MES ≤ 1).
AUROC, 5-fold CV. EndoDINO video embeddings vs. 21 standard UC clinical covariates. Data presented at UEGW 2025.
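As a rough illustration of what "AUROC, 5-fold CV" involves, the sketch below computes AUROC as the fraction of positive/negative pairs ranked correctly and averages it over five held-out folds. The `fit_and_score` callback is a hypothetical placeholder for whatever model is trained on each fold; the interleaved fold assignment is also an assumption, since the source does not say how folds were drawn (stratified splits are common in practice).

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k interleaved (train, test) partitions."""
    folds = []
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        folds.append((train, test))
    return folds

def cross_validated_auroc(labels, fit_and_score, k=5):
    """Average held-out AUROC; fit_and_score(train_idx, test_idx) returns test scores."""
    per_fold = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        test_scores = fit_and_score(train_idx, test_idx)
        per_fold.append(auroc([labels[i] for i in test_idx], test_scores))
    return sum(per_fold) / len(per_fold)
```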
Demographic diversity · procedure-weighted
169 centers · multi-continental
6 centers · Europe + Africa
1 center · Norway
1 center · China
Virgo: 1,053,880 US procedures across 148 centers (Sept 2025 audit). Public dataset demographics from each dataset's published documentation. Diversity index = Shannon entropy across White / Black / Asian / Hispanic / Other; higher is more balanced.
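The diversity index can be sketched as follows. The source does not state the logarithm base or whether the entropy is normalized, so this version assumes natural logarithms and optional division by the maximum possible entropy (ln of the number of groups), which maps the index to the range 0 to 1. It is not claimed to reproduce the published figures.

```python
import math

def shannon_diversity(counts, normalize=True):
    """Shannon entropy of group counts; 0 for a single group, maximal when balanced."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Empty groups contribute nothing, since p * log(p) tends to 0 as p tends to 0.
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    if normalize and len(counts) > 1:
        entropy /= math.log(len(counts))  # assumption: scale by the maximum entropy
    return entropy
```

With five groups, a perfectly balanced cohort scores 1.0 under this normalization and a cohort drawn entirely from one group scores 0.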
Out-of-distribution validation · external datasets
HyperKvasir · Colonoscopy · Mayo Endoscopic Scoring · Bærum Hospital, Norway
Kvasir-Capsule · Capsule endoscopy · Lesion / anatomy classification · Bærum Hospital, Norway
CholecT50 · Laparoscopy · Surgical action triplets · IHU Strasbourg, France
SUN · Colonoscopy · Polyp detection · Showa University, Japan
UNIFI Phase 3 · Colonoscopy · Endoscopic healing prediction · Janssen multi-site trial
YODA / external UC cohort · Colonoscopy · MES (QWK 0.83) · Independent academic centers
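The QWK figure reported for the external UC cohort is a quadratic weighted kappa, an agreement statistic that penalizes disagreements by the squared distance between ordinal grades. A minimal sketch of the standard formula, assuming integer grades 0 to `num_classes - 1` (the function name and signature are illustrative):

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1."""
    n = len(rater_a)
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were statistically independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n
    if weighted_expected == 0:
        return 1.0  # both raters gave the same single grade to every case
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and confusing Mayo 0 with Mayo 3 costs nine times as much as confusing adjacent grades.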
Competitors (e.g. DovaVision UC, Iterative Health) typically train and evaluate on a single internal data source. EndoDINO is pre-trained on Virgo's corpus and validated on independent public benchmarks and external clinical trial cohorts.
148
US medical centers
vs. 1–6 in public datasets
1.05M
US procedures with demographics
procedure-weighted, not patient-weighted
46.6%
non-White representation
12.9% Black · 15.1% Asian · 13.9% Hispanic
0.713
Shannon diversity index
HyperKvasir 0.28 · LDPolypVideo 0.10
HyperKvasir · 4-class Mayo Endoscopic Scoring
EndoDINO ViT-g/14 delivers leading performance on Mayo endoscopic scoring with a frozen backbone.
Mayo 0
Normal or inactive disease
Conf 0.97
Mayo 1
Mild disease
Conf 0.92
Mayo 2
Moderate disease
Conf 0.90
Mayo 3
Severe disease
Conf 0.91
Frame-level predictions aggregated per procedure.
Scored by EndoDINO • Inference latency 14.8 ms.
Macro F1 0.748 · Linear probe on frozen backbone
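One simple way to aggregate frame-level predictions into a per-procedure score is to average the class probabilities across frames and take the most likely class. The page does not say which aggregation rule EndoDINO uses, so the mean-probability rule and the function below are assumptions for illustration only.

```python
def aggregate_procedure(frame_probs):
    """Average per-frame class probabilities and return (predicted_class, confidence)."""
    if not frame_probs:
        raise ValueError("at least one frame is required")
    num_classes = len(frame_probs[0])
    mean_probs = [
        sum(frame[c] for frame in frame_probs) / len(frame_probs)
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=mean_probs.__getitem__)
    return best, mean_probs[best]
```

Other rules, such as taking the worst grade seen in any segment, are equally plausible for Mayo scoring; the right choice depends on how the clinical score is defined.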
How EndoDINO learns
One model. Every downstream task: scoring, detection, prediction, biomarker discovery.
01
Raw endoscopy video from the procedure stream.
02
Frames organized, deduplicated, temporally aligned.
03
Self-supervised learning at population scale.
04
A reusable embedding for any downstream task.
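Step 02 in the pipeline above can be illustrated with a small deduplication sketch. It assumes each frame arrives as a `(timestamp_seconds, pixels)` pair, with `pixels` a flat list of grayscale values, and drops any frame that is nearly identical to the last frame kept. The threshold, the data layout, and the difference measure are all hypothetical; the source does not describe how Virgo deduplicates frames.

```python
def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def deduplicate_frames(frames, threshold=2.0):
    """Keep frames that differ from the last kept frame by at least `threshold`.

    Timestamps are preserved so the surviving frames stay temporally aligned.
    """
    kept = []
    last_pixels = None
    for timestamp, pixels in frames:
        if last_pixels is None or mean_abs_diff(pixels, last_pixels) >= threshold:
            kept.append((timestamp, pixels))
            last_pixels = pixels
    return kept
```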
Capabilities
Proof point · UNIFI, Phase 3 UC
Saved
Avoided
Validated on Stelara Phase 3 UC trial data. UNIFI could have reached the same readout faster and at lower cost using EndoDINO as a covariate.
Covariate models that reduce trial size and accelerate enrollment.
Precision enrichment: identify likely responders before randomization.
UC and CD efficacy assessment beyond Mayo and SES-CD categories.
Real-world evidence priors from EndoDINO at population scale.
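The claim that covariate models reduce trial size rests on a standard result: adjusting for a baseline covariate that correlates with the outcome at level r shrinks the residual variance by a factor of (1 - r²), so the required sample size shrinks by roughly the same factor. The sketch below applies that approximation. It holds for continuous outcomes analyzed with ANCOVA; binary endpoints such as endoscopic healing behave differently, and the numbers in the usage line are invented for illustration, not taken from UNIFI.

```python
import math

def adjusted_sample_size(n_unadjusted, r):
    """Approximate sample size after adjusting for a covariate with correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.ceil(n_unadjusted * (1.0 - r * r))

# Hypothetical example: a 400-patient design and a covariate with r = 0.5.
print(adjusted_sample_size(400, 0.5))
```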
The data moat
Capture is the foundation of everything downstream. Real-world endoscopy video (at population scale, longitudinal, and continuously growing) is what makes a foundation model for GI possible. Models built on smaller datasets plateau. Ours compound.
The platform
Foundation model
01
Virgo's foundation model for endoscopy. One base model for scoring, prediction, detection, and biomarker work, trained on the full procedure, not just the frame.
Build environment
02
The environment for building on top of EndoDINO. A GI-specific model layer for clinical and research workflows.
Request access
Manuscript, UEGW 2025 poster, benchmark results, and partnership models. Sent directly to qualified researchers and partners.