
ByteDance-Seed/m3-agent

A multimodal agent framework from ByteDance that processes visual and auditory inputs, builds episodic and semantic long-term memory, and performs autonomous multi-turn reasoning.

1.4k stars · Python · Agents · LLMOps · Eval
Velocity (7d): +4.4 ★ / day · Trend: steady

M3-Agent is a multimodal agent system that processes real-time visual and auditory inputs to build and update long-term memory. It combines episodic memory for experience-based learning with semantic memory for world knowledge accumulation. The agent organizes memory in an entity-centric multimodal format and performs iterative reasoning with retrieved information. The repository includes M3-Bench, a benchmark with 1000 real-world and web videos for evaluating memory effectiveness and multimodal reasoning in agents.
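The entity-centric memory and iterative retrieval described above can be illustrated with a small sketch. This is not M3-Agent's actual code or API: the names `MemoryEntry`, `EntityMemory`, and `gather_evidence` are hypothetical, plain text stands in for multimodal content, and simple keyword overlap stands in for whatever retrieval the real system uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    """One memory item attached to an entity (hypothetical structure)."""
    kind: str         # "episodic" (an observed event) or "semantic" (general knowledge)
    text: str         # text stand-in for multimodal content
    timestamp: float  # when the entry was written


class EntityMemory:
    """Toy long-term memory keyed by entity name."""

    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, entity, kind, text, timestamp=0.0):
        if kind not in ("episodic", "semantic"):
            raise ValueError(f"unknown memory kind: {kind}")
        self._by_entity[entity].append(MemoryEntry(kind, text, timestamp))

    def search(self, query, top_k=3):
        """Rank entries by keyword overlap with the query, newest first on ties."""
        query_words = set(query.lower().split())
        scored = []
        for entity, entries in self._by_entity.items():
            for entry in entries:
                words = set(entry.text.lower().split()) | {entity.lower()}
                score = len(query_words & words)
                if score:
                    scored.append((score, entry.timestamp, entity, entry))
        scored.sort(key=lambda item: (-item[0], -item[1]))
        return [(entity, entry) for _, _, entity, entry in scored[:top_k]]


def gather_evidence(memory, question, max_turns=3):
    """Multi-turn retrieval: search, keep new hits, widen the query with found entities."""
    evidence = []
    query = question
    for _ in range(max_turns):
        hits = [hit for hit in memory.search(query) if hit not in evidence]
        if not hits:
            break  # nothing new retrieved, stop iterating
        evidence.extend(hits)
        query = question + " " + " ".join(entity for entity, _ in hits)
    return evidence


memory = EntityMemory()
memory.add("alice", "episodic", "alice poured coffee in the kitchen", timestamp=1.0)
memory.add("alice", "semantic", "prefers coffee over tea", timestamp=2.0)
memory.add("bob", "episodic", "bob watered the plants", timestamp=3.0)

for entity, entry in gather_evidence(memory, "what does alice drink"):
    print(entity, entry.kind, entry.text)
```

Keying entries by entity keeps an event ("poured coffee") and a generalization ("prefers coffee") about the same person together, so one lookup can surface both kinds of memory. A real system would presumably replace the keyword match with learned retrieval and let a model decide when to stop searching.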
