jimmc414/onefilellm
CLI tool and Python library that scrapes and combines content from GitHub repos, papers, web pages, and YouTube into structured XML for LLM consumption.

OneFileLLM aggregates content from diverse sources, including local files, GitHub repositories and pull requests, arXiv/Sci-Hub papers, YouTube transcripts, and web documentation, into a single structured XML output. The aggregated result is automatically copied to the clipboard for use as LLM context. It supports multiple output formats (text, markdown, JSON, HTML, YAML) and uses tiktoken for token counting. The tool can be run from the command line or imported as a Python library.
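
The aggregation idea can be illustrated with a minimal sketch. This is not OneFileLLM's own code or API: the function names `aggregate` and `count_tokens`, the root tag `onefilellm_output`, and the `source` element with a `name` attribute are all assumptions made for illustration. Only the standard library's `xml.etree.ElementTree` and tiktoken's `get_encoding` and `encode` calls are real, and the choice of the `cl100k_base` encoding is also an assumption.

```python
import xml.etree.ElementTree as ET


def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when it is usable, else approximate."""
    try:
        import tiktoken  # third-party package; optional in this sketch

        # "cl100k_base" is an assumed encoding choice for illustration.
        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # Fallback: whitespace-separated word count, a rough estimate only.
        return len(text.split())


def aggregate(sources: dict[str, str]) -> str:
    """Wrap each (label, content) pair in one XML document."""
    root = ET.Element("onefilellm_output")  # hypothetical root tag
    for label, content in sources.items():
        node = ET.SubElement(root, "source", {"name": label})  # hypothetical
        node.text = content  # ElementTree escapes <, > and & on output
    return ET.tostring(root, encoding="unicode")


combined = aggregate(
    {
        "notes.txt": "Local notes about the project.",
        "README.md": "# Title\nUse <b>carefully</b> & test.",
    }
)
print(combined)
print("tokens:", count_tokens(combined))
```

Wrapping each source in its own element keeps the boundaries between inputs visible to the model, and counting tokens on the combined string gives a quick check of whether the result fits a given context window.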