facebookincubator/nimble
Meta's columnar file format designed for large ML training datasets, featuring SIMD/GPU-friendly encodings and extensible stream compression.

Nimble is a columnar file format created by Meta as a replacement for Parquet and ORC. It is specifically optimized for wide tables with thousands of columns commonly found in feature engineering and ML training workloads. The format decouples stream encoding from physical layout, supports cascading encodings, and aims to be SIMD and GPU-friendly for parallel hardware. Nimble is distributed as a unified library to prevent environmental fragmentation across implementations.