guhhhhaa/4675-scifi
A Chinese NLP corpus containing approximately 4,675 science fiction novels formatted as a training dataset.

This repository is a Chinese natural language processing (NLP) corpus of science fiction novels, compiled by a former moderator of a Baidu Tieba sci-fi forum. It is explicitly intended as training data for NLP models that work with Chinese science fiction text. The dataset includes works sourced from the forum as well as from additional sci-fi novel archives.