mostly-ai/mostlyai

A Python SDK that generates privacy-safe synthetic datasets by training generative models on tabular or language data.

780 stars · Python · Data Tooling
mostlyai
Velocity (7d): +0.9 ★ / day
Trend: steady

The SDK lets users train synthetic data generators on tabular or language datasets, either locally or through a remote endpoint. It provides primitives for creating generators, generating synthetic datasets at scale, and connecting to organizational data sources. Differential privacy techniques are used to limit how much information about the original training data the synthetic output can reveal.
