It takes a village of agents to buy a stock—analysts, debaters, risk managers, and a portfolio manager who actually says no.
Language Models
heavyweights · velocity + momentum
A dependency-free C/C++ inference engine that squeezes large language models onto laptops, phones, and browsers through aggressive quantization and hand-rolled kernels.
A PyTorch implementation of "Attention Is All You Need" that scales from 13M-parameter to multi-billion-parameter models.
AirLLM slices giant transformers into layer shards so they fit in consumer VRAM without quantization or distillation.
CodeWhale wraps DeepSeek V4 in a formal hierarchy of rules so the model knows which instruction wins when everything conflicts.
OpenAI's Whisper replaces the usual Rube Goldberg pipeline of speech-processing tools with a single Transformer trained to do it all.
Twenty-five bite-sized projects showing how to wire up LLMs, RAG, and agents into things that actually do work.
An autoregressive foundation model that quantizes market data into discrete tokens and predicts the next "words" in a financial time series.
A desktop app that turns Andrej Karpathy's LLM wiki pattern into a persistent, self-organizing knowledge base with graph analysis and a two-step ingest pipeline.
LiteLLM is the adapter layer that stops your codebase from fracturing across a dozen provider SDKs.
Heretic automatically strips safety alignment from transformer models without retraining, using optimization to find the least destructive way to make them stop refusing.
Ollama wraps llama.cpp in a one-line installer and a model registry so you can run open weights without reading a dozen READMEs.
A trilingual learning roadmap that splits learners into "CLI power users" and "agent builders," then walks both from token math to multi-agent orchestration.
AgentScope 2.0 bets that modern LLMs need less hand-holding, not more orchestration.
A deliberately narrow inference engine that treats your SSD as first-class KV cache real estate.
A Stanford-backed framework for running personal AI agents locally by default, falling back to APIs only when necessary.
Eagle is less a single model than NVIDIA's internal R&D pipeline for multimodal AI, now open-sourced with three generations of VLMs and a grounding specialist.
A research framework that uses a multimodal LLM to plan video edits semantically, then hands off to a diffusion transformer to actually draw the frames.
Curated tutorials, tool reviews, and monetization playbooks for coding with AI—written by one prolific developer and open to all.
A self-hosted proxy that turns ChatGPT's web-only image generation into an OpenAI-compatible API with account rotation and a web UI.


