layumi/University1652-Baseline

Teaching drones to read campus maps from three angles

A PyTorch benchmark for matching drone, satellite, and street views of 1,652 university buildings so UAVs can figure out where they are without GPS.

653 stars · Python · Computer Vision · Data Tooling
Velocity (7d): +0.3 ★/day · Trend: steady

What it does

University-1652 is a dataset and training baseline for cross-view geo-localization. It covers 1,652 buildings across 72 universities with 50K+ images captured from drone, satellite, and street-level perspectives. The code trains a model to match views across these perspectives: for example, retrieving the satellite image that corresponds to a drone photo, or steering a drone back to a spot using a satellite reference image.
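To make "matching views" concrete, here is a minimal sketch of the retrieval step, assuming every image has already been turned into a feature vector by some trained model. The helper names `cosine` and `rank_gallery`, the building ids, and the toy vectors are all hypothetical and not taken from the repository; the repo's own evaluation code may work differently.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_gallery(query, gallery):
    """Return gallery ids sorted from most to least similar to the query."""
    scores = {gid: cosine(query, feat) for gid, feat in gallery.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (made up): one drone-view query, three satellite-view gallery features.
drone_query = [0.9, 0.1, 0.0]
satellite_gallery = {
    "building_a": [1.0, 0.0, 0.0],
    "building_b": [0.0, 1.0, 0.0],
    "building_c": [0.5, 0.5, 0.0],
}

ranking = rank_gallery(drone_query, satellite_gallery)
print(ranking[0])  # the top-ranked satellite candidate for this drone photo
```

Swapping the roles (a satellite query against a drone gallery) gives the navigation direction with the same ranking logic.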

The interesting bit

The dataset is deliberately split by university: 33 schools for training, 39 held-out schools for testing. That forces models to generalize to unseen campuses rather than memorizing specific buildings. The authors also publish flight-path KML files and building coordinates, so you can replay drone trajectories in Google Earth Pro.
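The idea behind a university-level split can be sketched in a few lines. This is only an illustration under assumed data: the `(university, building_id)` record layout, the function name `split_by_university`, and the placeholder university names are hypothetical, and the dataset ships with its own fixed split.

```python
def split_by_university(records, train_universities):
    """Partition (university, building_id) records so that no university
    contributes buildings to both the training and the test side."""
    train_set = set(train_universities)
    train = [r for r in records if r[0] in train_set]
    test = [r for r in records if r[0] not in train_set]
    return train, test

# Toy records (made up): three universities, five buildings.
records = [
    ("uni_a", "0001"), ("uni_a", "0002"),
    ("uni_b", "0003"),
    ("uni_c", "0004"), ("uni_c", "0005"),
]

train, test = split_by_university(records, train_universities=["uni_a", "uni_b"])

# No campus appears on both sides of the split.
train_unis = {uni for uni, _ in train}
test_unis = {uni for uni, _ in test}
assert train_unis.isdisjoint(test_unis)
```

Splitting at the building level instead would let the same campus style leak into both sides, which is exactly what the held-out schools are meant to prevent.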

Key highlights

  • 50,218 training images across drone, street, satellite, and noisy Google street views
  • Two tasks: drone-to-satellite target localization and satellite-to-drone navigation
  • Supports fp16/bf16, re-ranking, GeM pooling, and multiple-query evaluation
  • Pre-trained models and evaluation scripts included; works with ResNet or VGG-16 backbones
  • Active workshop series (UAVM at ACM MM) with ongoing challenges through 2026
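For readers who haven't met GeM (generalized-mean) pooling, here is a plain-Python sketch of the standard formula applied to a single channel of activations. It is not the repository's implementation, which would operate on PyTorch tensors; the function name `gem_pool` is hypothetical, and `p=3.0` is an illustrative default rather than a value confirmed from the repo.

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized-mean pooling over one channel: (mean(x^p))^(1/p).

    Values are clamped to eps first so the power is well defined.
    p = 1 reduces to average pooling; large p approaches max pooling.
    """
    clamped = [max(a, eps) for a in activations]
    return (sum(a ** p for a in clamped) / len(clamped)) ** (1.0 / p)

# Toy channel (made up): four spatial activations.
channel = [0.0, 1.0, 2.0, 3.0]

avg_like = gem_pool(channel, p=1.0)     # behaves like average pooling
gem_default = gem_pool(channel, p=3.0)  # sits between average and max
max_like = gem_pool(channel, p=50.0)    # drifts toward the max activation
```

The exponent acts as a dial between average and max pooling, which is why GeM is a common choice for retrieval-style features.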

Caveats

  • Dataset requires a manual request via GitHub issue (author claims ~5 min response time)
  • README is a bit of a kitchen sink: workshop announcements, unrelated special issues, and deprecated torchvision warnings all pile up
  • GPU memory floor is 8 GB; no CPU fallback mentioned

Verdict

Worth a look if you’re building visual localization for UAVs or working on cross-view retrieval. Skip it if you need a drop-in, no-registration dataset or if your work stays firmly in the single-view, single-modality lane.
