The Chinese Content Pipeline Your Agent Can Actually Read
A set of ingest skills that turn Chinese social feeds and podcasts into structured Markdown, index them in a vault, and expose the archive to Claude Code or Codex through an MCP server.

What it does
Chubby Skills is a content ingestion pipeline built for Chinese platforms. It grabs posts, videos, and articles from Bilibili, Douyin, Xiaohongshu, WeChat, X, and podcasts, converts them into Markdown with schema v1 frontmatter, and drops them into an Obsidian-compatible vault. A companion MCP server lets Claude Code, Codex, and other agents search, read, and reason over the accumulated notes.
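To make the "Markdown with frontmatter" step concrete, here is a minimal sketch of rendering one ingested item as a note. The field names (schema_version, platform, source_url, ingested_at) and the render_note helper are assumptions for illustration; the project's actual schema v1 fields are not described here and may differ.

```python
def render_note(title: str, source_url: str, platform: str, body: str,
                ingested_at: str) -> str:
    """Render a Markdown note with a YAML frontmatter header.

    All field names below are hypothetical, not the project's schema v1.
    """
    frontmatter = {
        "schema_version": 1,
        "title": title,
        "platform": platform,
        "source_url": source_url,
        "ingested_at": ingested_at,
    }
    lines = ["---"]
    for key, value in frontmatter.items():
        if isinstance(value, str):
            # Quote strings so colons in titles or URLs stay valid YAML.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{key}: "{escaped}"')
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    lines.append("")
    lines.append(body.strip())
    return "\n".join(lines) + "\n"
```

A file written this way opens in Obsidian as a regular note, with the frontmatter block shown as note properties.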
The interesting bit
The project treats platform fragility as a first-class concern. It uses subtitle-first transcription to avoid GPU-heavy audio extraction, defines explicit fallback chains for every platform when cookies expire or links rot, and ships a full QA layer—smoke tests, golden outputs, and schema validators—to keep the scrapers honest.
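A fallback chain of this kind can be sketched as an ordered list of fetchers that are tried in turn until one returns text. This is an illustration under stated assumptions, not the project's code: run_fallback_chain, fetch_subtitles, and transcribe_audio are hypothetical names, and the two fetchers are stubs standing in for real platform calls.

```python
from typing import Callable, Optional, Sequence

Fetcher = Callable[[str], Optional[str]]


def run_fallback_chain(url: str,
                       fetchers: Sequence[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each fetcher in order; return (name, text) from the first success."""
    errors = []
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception as exc:  # e.g. an expired cookie or a dead link
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all fetchers failed: " + "; ".join(errors))


# Hypothetical stubs standing in for real platform fetchers.
def fetch_subtitles(url: str) -> Optional[str]:
    return None  # pretend the video has no subtitle track


def transcribe_audio(url: str) -> Optional[str]:
    return "transcript from local audio"
```

Listing the subtitle fetcher first means the expensive audio path only runs when the cheap one yields nothing, which is the ordering that subtitle-first transcription implies.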
Key highlights
- Covers Chinese platforms often ignored by English-centric tools: Bilibili, Douyin, Xiaohongshu, WeChat, and local podcast apps.
- Subtitle-first transcription falls back to local audio extraction only when necessary, keeping the default install lightweight.
- Tiered dependency model: light mode handles text and images; heavy mode adds ffmpeg, yt-dlp, and faster-whisper for video/audio.
- Built-in vault curation auto-archives processed notes and generates knowledge cards with source attribution.
- MCP server exposes search_vault, semantic_search_vault, and read_kb_note to Claude Code, Codex, OpenClaw, and Hermes.
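To give a feel for what vault tools like these might do behind the MCP layer, here is a rough sketch of keyword search and note reading over a folder of Markdown files. The function names mirror the tools listed above, but the signatures, return shape, and ranking are assumptions, and the MCP wiring itself is left out.

```python
from pathlib import Path


def search_vault(vault_dir: str, query: str, limit: int = 5) -> list[dict]:
    """Return notes containing the query, most matches first.

    Hypothetical signature; the real tool's parameters may differ.
    """
    needle = query.casefold()
    if not needle:
        return []
    hits = []
    for path in sorted(Path(vault_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        count = text.casefold().count(needle)
        if count:
            hits.append({"path": str(path), "matches": count})
    # Stable sort keeps alphabetical order among notes with equal counts.
    hits.sort(key=lambda hit: hit["matches"], reverse=True)
    return hits[:limit]


def read_kb_note(path: str) -> str:
    """Return the full text of one note (hypothetical signature)."""
    return Path(path).read_text(encoding="utf-8")
```

Substring matching is used rather than word splitting because Chinese text has no spaces between words, so a whitespace tokenizer would miss most queries.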
Caveats
- Video and podcast transcription pulls in heavy dependencies like torch and ffmpeg; the light install skips them entirely.
- Live platform tests are opt-in because scraping is brittle: cookies expire, regions get blocked, and links rot quickly.
- Semantic search defaults to a zero-dependency semantic-lite mode; real vector search requires wiring up OpenAI or sentence-transformers.
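How semantic-lite ranks notes is not described here. As one illustration of what a dependency-free similarity search can look like, the sketch below scores notes by cosine similarity over character bigram counts, using only the standard library. The technique and the names char_bigrams, cosine, and rank_notes are assumptions, not the project's implementation.

```python
import math
from collections import Counter


def char_bigrams(text: str) -> Counter:
    """Count overlapping character bigrams, ignoring whitespace."""
    chars = [c for c in text.casefold() if not c.isspace()]
    return Counter("".join(pair) for pair in zip(chars, chars[1:]))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_notes(query: str, notes: dict[str, str]) -> list[tuple[str, float]]:
    """Return (note name, score) pairs, best match first."""
    q = char_bigrams(query)
    scored = [(name, cosine(q, char_bigrams(text))) for name, text in notes.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Character bigrams work on Chinese without a word segmenter, but they only capture surface overlap. Matching paraphrases or synonyms is where an embedding backend would be needed.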
Verdict
Worth a look if you curate Chinese-language content and want your AI agent to actually reference it. Skip it if you only consume English feeds—there are simpler tools for that.
Frequently asked
- What is chubbyguan/chubbyskills?
- A set of ingest skills that turn Chinese social feeds and podcasts into structured Markdown, index them in a vault, and expose the archive to Claude Code or Codex through an MCP server.
- Is chubbyskills open source?
- Yes — chubbyguan/chubbyskills is open source, released under the MIT license.
- What language is chubbyskills written in?
- chubbyguan/chubbyskills is primarily written in Python.
- How popular is chubbyskills?
- chubbyguan/chubbyskills has 501 stars on GitHub.
- Where can I find chubbyskills?
- chubbyguan/chubbyskills is on GitHub at https://github.com/chubbyguan/chubbyskills.