Image · Video · Audio

underdogs · picking up speed

+70%/wk · +211 ★/day · ↗ accelerating

It bundles the entire AI drama workflow—script, storyboard, voice, and final cut—into a single pipeline you can host yourself.

★ 2.1k TypeScript Image · Video · Audio · explained Feature

basketikun/infinite-canvas

+22%/wk · +121 ★/day · ↗ accelerating

A self-hostable infinite canvas that wires AI image generation, reference editing, and chat into one collaborative workspace.

★ 3.9k TypeScript Creative · Design · explained

jatinkrmalik/vocalinux

+12%/wk · +12 ★/day · ↗ accelerating

Vocalinux is a fully offline, GPLv3 voice dictation app that pipes transcribed text into any Linux application on X11 or Wayland.

★ 674 Python Image · Video · Audio · explained

moonshine-ai/moonshine

+17%/wk · +256 ★/day · ↗ accelerating

An on-device voice toolkit that ditches the 30-second window and the redundant re-processing that make Whisper feel sluggish for live speech.

★ 10.5k C++ Image · Video · Audio · explained

PurpleDoubleD/locally-uncensored

+11%/wk · +16 ★/day · ↗ accelerating

A Tauri desktop app that auto-detects a dozen local AI backends so you don't have to wrestle with Docker or API keys.

★ 966 TypeScript Chat Assistants · explained

palmier-io/palmier-pro

+13%/wk · +240 ★/day · ↗ accelerating

Palmier Pro turns a Swift-native video editor into a shared workspace where AI agents can read and write the timeline via MCP.

★ 12.5k Swift Agents · explained Feature

lightningpixel/modly

+7.0%/wk · +46 ★/day · ↗ accelerating

It wraps open-source image-to-3D models in a desktop app so your snapshots never leave your GPU.

★ 4.6k TypeScript Image · Video · Audio · explained

netease-youdao/Confucius4-TTS

+5.9%/wk · +6.0 ★/day · ↗ accelerating

Most zero-shot TTS tools still demand reference transcripts or stumble across languages; this one claims to do neither.

★ 715 Python Image · Video · Audio · explained

lidge-jun/ima2-gen

+11%/wk · +9.9 ★/day · ↗ accelerating

It exists because cloud image generators deserve a local memory layer, a branching canvas, and a UI outside the chat thread.

★ 614 TypeScript Image · Video · Audio · explained

matthartman/ghost-pepper

+5.8%/wk · +25 ★/day · ↗ accelerating

Ghost Pepper runs speech-to-text and meeting transcription entirely on your Mac, then publishes its own AI-reviewed privacy audit so you don't have to trust the README.

★ 3k Swift Image · Video · Audio · explained Feature

StarTrail-org/PixelRAG

+7.5%/wk · +78 ★/day · ↗ accelerating

PixelRAG renders documents into screenshot tiles and retrieves them visually, preserving tables and layout that HTML parsers strip away.

★ 7.3k Python RAG · Search · explained

yuanzhongqiao/printfilm

+5.2%/wk · +22 ★/day · ↗ accelerating

PrintFilm corrals AI text-to-video chaos into a four-phase production pipeline for motion comics and short dramas.

★ 2.9k TypeScript Image · Video · Audio · explained

AutoArk/GPA

+18%/wk · +45 ★/day · ↗ accelerating

GPA aims to unify speech recognition, text-to-speech, and voice conversion in one compact autoregressive model so you can stop juggling separate audio pipelines.

★ 1.7k Python Image · Video · Audio · explained

HITsz-TMG/VideoClaw

+2.6%/wk · +6.1 ★/day · ↗ accelerating

VideoClaw turns a one-sentence prompt into a full production pipeline with editable checkpoints, not just a black-box video dump.

★ 1.6k Python Agents · explained

Blaizzy/mlx-vlm

+2.2%/wk · +17 ★/day · ↗ accelerating

MLX-VLM crams speculative decoding, continuous batching, and KV cache quantization into a Mac-native toolkit for running multimodal models locally.

★ 5.3k Python Image · Video · Audio · explained

debpalash/OmniVoice-Studio

+5.4%/wk · +71 ★/day · ↗ accelerating

OmniVoice Studio bundles voice cloning, dubbing, dictation, and TTS into a desktop app that keeps all audio processing off the internet and away from API keys.

★ 9.1k Python Image · Video · Audio · explained

wildminder/awesome-ltx2

+4.5%/wk · +3.6 ★/day · ↗ accelerating

Because finding the right LTX-2 checkpoint, quantization, or LoRA across Hugging Face and ComfyUI nodes is a part-time job.

★ 558 Image · Video · Audio · explained

qualcomm/GenieX

+1.8%/wk · +22 ★/day · ↗ accelerating

NexaSDK is a local inference engine that squeezes frontier LLMs and vision models onto Qualcomm silicon through NPU, GPU, and CPU backends.

★ 8.3k Rust Inference · Serving · explained

HBAI-Ltd/Toonflow-app

+8.0%/wk · +145 ★/day · ↗ accelerating

Toonflow turns a manuscript into an animated short drama without juggling five different browser tabs.

★ 12.7k TypeScript Image · Video · Audio · explained

EvoLinkAI/awesome-gpt-image-2-API-and-Prompts

+1.2%/wk · +30 ★/day · ↗ accelerating

Because 'make it pretty' is not a prompt, and this repo treats image generation like a production pipeline, not a toy.

★ 16.9k Python Image · Video · Audio · explained
