Image · Video · Audio

underdogs · picking up speed

+69% /wk +205 ★/day ↗ accelerating

It bundles the entire AI drama workflow—script, storyboard, voice, and final cut—into a single pipeline you can host yourself.

★ 2.1k TypeScript Image · Video · Audio · explained Feature

jatinkrmalik/vocalinux

+15% /wk +15 ★/day ↗ accelerating

Vocalinux is a fully offline, GPLv3 voice dictation app that pipes transcribed text into any Linux application on X11 or Wayland.

★ 668 Python Image · Video · Audio · explained

basketikun/infinite-canvas

+20% /wk +109 ★/day ↗ accelerating

A self-hostable infinite canvas that wires AI image generation, reference editing, and chat into one collaborative workspace.

★ 3.8k TypeScript Creative · Design · explained

moonshine-ai/moonshine

+17% /wk +252 ★/day ↗ accelerating

An on-device voice toolkit that ditches the 30-second window and redundant re-processing that make Whisper feel sluggish for live speech.

★ 10.4k C++ Image · Video · Audio · explained

PurpleDoubleD/locally-uncensored

+11% /wk +15 ★/day ↗ accelerating

A Tauri desktop app that auto-detects a dozen local AI backends so you don't have to wrestle with Docker or API keys.

★ 961 TypeScript Chat Assistants · explained

AutoArk/GPA

+21% /wk +52 ★/day ↗ accelerating

GPA aims to unify speech recognition, text-to-speech, and voice conversion in one compact autoregressive model so you can stop juggling separate audio pipelines.

★ 1.7k Python Image · Video · Audio · explained

lidge-jun/ima2-gen

+12% /wk +10 ★/day ↗ accelerating

It exists because cloud image generators deserve a local memory layer, a branching canvas, and a UI outside the chat thread.

★ 603 TypeScript Image · Video · Audio · explained

palmier-io/palmier-pro

+12% /wk +208 ★/day ↗ accelerating

Palmier Pro exists to turn a Swift-native video editor into a shared workspace where AI agents can read and write the timeline via MCP.

★ 12.2k Swift Agents · explained Feature

netease-youdao/Confucius4-TTS

+5.9% /wk +6.0 ★/day ↗ accelerating

Most zero-shot TTS tools still demand reference transcripts or stumble across languages; this one claims to do neither.

★ 715 Python Image · Video · Audio · explained

yuanzhongqiao/printfilm

+5.0% /wk +21 ★/day ↗ accelerating

PrintFilm corrals AI text-to-video chaos into a four-phase production pipeline for motion comics and short dramas.

★ 2.9k TypeScript Image · Video · Audio · explained

wildminder/awesome-ltx2

+5.9% /wk +4.7 ★/day ↗ accelerating

Because finding the right LTX-2 checkpoint, quantization, or LoRA across Hugging Face and ComfyUI nodes is a part-time job.

★ 555 Image · Video · Audio · explained

lightningpixel/modly

+5.8% /wk +38 ★/day ↗ accelerating

It wraps open-source image-to-3D models in a desktop app so your snapshots never leave your GPU.

★ 4.6k TypeScript Image · Video · Audio · explained

CookSleep/gpt_image_playground

+5.6% /wk +25 ★/day ↗ accelerating

A polished front-end for image generation APIs that exists because your prompt history shouldn’t live in someone else’s database.

★ 3.2k TypeScript Image · Video · Audio · explained

jingyaogong/minimind-o

+4.6% /wk +14 ★/day ↗ accelerating

MiniMind-O packs listen-see-speak intelligence into a 0.1B-parameter model you can retrain from the first line of code on a single desktop GPU.

★ 2.2k Python Language Models · explained

xuanyustudio/LocalMiniDrama

+11% /wk +15 ★/day ↗ accelerating

LocalMiniDrama wires your API keys into a Vue+Electron pipeline that turns story outlines into short-form AI video without shipping data to anyone else's cloud.

★ 971 JavaScript Image · Video · Audio · explained

StarTrail-org/PixelRAG

+6.4% /wk +66 ★/day ↗ accelerating

PixelRAG renders documents into screenshot tiles and retrieves them visually, preserving tables and layout that HTML parsers strip away.

★ 7.2k Python RAG · Search · explained

modelscope/FunClip

+2.9% /wk +25 ★/day ↗ accelerating

FunClip exists so you can edit video by copy-pasting text instead of scrubbing timelines.

★ 6.1k Python Domain Apps · explained

wuyoscar/GPT-Image2-Skill

+5.5% /wk +31 ★/day ↗ accelerating

It curates GPT Image 2 prompts and packages them as copy-paste examples, an agent skill, and a lightweight CLI.

★ 4k Python Coding Assistants · explained

freestylefly/awesome-gpt-image-2

+3.4% /wk +42 ★/day ↗ accelerating

A community knowledge base that reverse-engineers hundreds of GPT Image 2 examples into structured, agent-ready prompt protocols.

★ 8.7k JavaScript Image · Video · Audio · explained

Blaizzy/mlx-vlm

+2.2% /wk +16 ★/day ↗ accelerating

MLX-VLM crams speculative decoding, continuous batching, and KV cache quantization into a Mac-native toolkit for running multimodal models locally.

★ 5.3k Python Image · Video · Audio · explained
