
kohjingyu/gill

A multimodal LLM that processes interleaved image-and-text inputs to generate text, retrieve images, and synthesize images.

Velocity (7d): +0.4 ★/day · Trend: steady

GILL (Generating Images with Large Language Models) is a NeurIPS 2023 research project that extends an LLM with vision capabilities. The model accepts arbitrarily interleaved image-and-text inputs and can produce three kinds of output: text responses, images retrieved from a large collection, and newly generated images. Learned projection layers connect the LLM to the image retrieval and image generation components.
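The role of a projection layer can be illustrated with a minimal, dependency-free sketch. This is not the repository's code: the function names (`make_projection`, `project`, `retrieve`), the toy dimensions, and the random weights are all assumptions made for illustration, and the LLM itself is skipped entirely. The sketch only shows the general idea of mapping a vector from one embedding space into another with a linear layer, then retrieving the nearest candidate by cosine similarity.

```python
import math
import random


def make_projection(in_dim, out_dim, seed=0):
    """Return an out_dim x in_dim weight matrix.

    Weights are random here; in a trained model they would be learned.
    """
    rng = random.Random(seed)
    scale = in_dim ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(weights, vector):
    """Apply the linear map: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, candidates):
    """Return the index of the candidate most similar to the query."""
    return max(range(len(candidates)), key=lambda i: cosine(query, candidates[i]))


# Toy sizes, chosen only for illustration (hypothetical, not the model's real sizes).
VISUAL_DIM, LLM_DIM, RETRIEVAL_DIM = 8, 16, 4

to_llm = make_projection(VISUAL_DIM, LLM_DIM, seed=1)           # image features -> LLM input space
to_retrieval = make_projection(LLM_DIM, RETRIEVAL_DIM, seed=2)  # LLM state -> retrieval space

image_feature = [0.1 * i for i in range(VISUAL_DIM)]  # stand-in for a visual encoder output
llm_input = project(to_llm, image_feature)

# The LLM forward pass is omitted; llm_input stands in for an LLM hidden state.
query = project(to_retrieval, llm_input)
```

In this sketch the two matrices play separate roles: one maps visual features into the space the language model reads, the other maps a language-model state into a space where it can be compared against candidate image embeddings. Which components are trained or kept fixed in the real project is not stated here and is left open.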
