Is DeekSeek-OCR---Dockerized-API open source?

Yes — Bogdanovich77/DeekSeek-OCR---Dockerized-API is an open-source project tracked on heatdrop.

What language is DeekSeek-OCR---Dockerized-API written in?

Bogdanovich77/DeekSeek-OCR---Dockerized-API is primarily written in Python.

How popular is DeekSeek-OCR---Dockerized-API?

Bogdanovich77/DeekSeek-OCR---Dockerized-API has 1.1k stars on GitHub.

Where can I find DeekSeek-OCR---Dockerized-API?

Bogdanovich77/DeekSeek-OCR---Dockerized-API is on GitHub at https://github.com/Bogdanovich77/DeekSeek-OCR---Dockerized-API.

Bogdanovich77/DeekSeek-OCR---Dockerized-API

DeepSeek-OCR wrapped in Docker so you don't have to wrestle it

A practical packaging job around DeepSeek's vision model, turning PDF chaos into structured Markdown with a FastAPI shim and batch scripts.

★ 1.1k stars · Python · Computer Vision · Data Tooling

What it does

This repo wraps DeepSeek-OCR in a Dockerized FastAPI service and provides five Python scripts that scan a data/ folder, feed PDFs to the model, and spit out Markdown (or raw OCR text). You get a REST API for live requests, or drop files and run batch scripts locally. The scripts differ mainly in post-processing: basic conversion, image extraction, custom prompts from a YAML file, or plain text extraction.
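
The batch flow described above can be pictured with a small sketch. This is not the repo's code: the helper names `find_pdfs` and `output_path` are hypothetical, the mode labels are invented for illustration, and writing each result next to its source PDF is an assumption. Only the three suffixes (`-MD.md`, `-OCR.md`, `-CUSTOM.md`) come from the project's documented naming convention.

```python
from pathlib import Path

# Suffixes documented by the project; the mode labels on the left are assumed.
SUFFIXES = {"markdown": "-MD.md", "ocr": "-OCR.md", "custom": "-CUSTOM.md"}


def find_pdfs(data_dir: Path) -> list[Path]:
    """Return the PDFs directly inside data_dir, sorted for a stable order."""
    return sorted(
        p for p in data_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def output_path(pdf: Path, mode: str) -> Path:
    """Map a source PDF to its output file, e.g. report.pdf -> report-MD.md.

    Assumption: the output is written alongside the input file.
    """
    return pdf.with_name(pdf.stem + SUFFIXES[mode])
```

Because each mode gets its own suffix, running two scripts over the same PDF produces two files that can sit in the same folder without overwriting each other.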

The interesting bit

The real work here isn’t the model—it’s the duct tape. The README explicitly notes this project patches a bug in the upstream DeepSeek-OCR library where tokenize_with_images() fails on startup because the prompt parameter is missing. The Docker build transparently swaps in fixed files. That’s a service to anyone who’s tried to run the original and hit an opaque initialization crash.

Key highlights

FastAPI backend with /ocr/image, /ocr/pdf, and /ocr/batch endpoints
Five processor scripts with a clear suffix convention (-MD.md, -OCR.md, -CUSTOM.md) so you can compare outputs side-by-side
Enhanced variants extract images to data/images/ and clean up special tokens
Custom prompt support via custom_prompt.yaml for experimenting with extraction instructions
Includes copy-paste Python and JavaScript client examples
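
For a sense of what calling the API might look like, here is a minimal standard-library sketch that uploads a PDF to the `/ocr/pdf` endpoint. Several details are assumptions rather than facts from the repo: the multipart field name `file`, the base URL passed in by the caller, and the timeout. The response is returned as raw text because its shape is not described here. The repo's own client examples are the authoritative reference.

```python
import uuid
import urllib.request


def build_pdf_request(base_url: str, filename: str, pdf_bytes: bytes,
                      field: str = "file") -> urllib.request.Request:
    """Build a multipart/form-data POST for /ocr/pdf without sending it.

    Assumption: the server reads the upload from a form field named `field`.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr/pdf",
        data=head + pdf_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def post_pdf(base_url: str, path: str, timeout: float = 600.0) -> str:
    """Send one PDF and return the raw response body as text."""
    with open(path, "rb") as fh:
        req = build_pdf_request(base_url, path.rsplit("/", 1)[-1], fh.read())
    # OCR on a GPU can take a while, hence the generous (assumed) timeout.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Splitting request construction from sending keeps the upload format easy to inspect before any network traffic happens.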

Caveats

Hardware appetite is serious: 12GB VRAM minimum, 32GB RAM (64GB+ recommended), 50GB storage
Requires NVIDIA GPU with CUDA 11.8+, Docker with GPU support, and the NVIDIA Container Toolkit—this won’t run on your laptop’s integrated graphics
The README is truncated mid-sentence in the “Custom Files” section, so the full list of patches isn’t visible
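
Before pulling a large image, it can help to check a machine against the minimums listed above. The sketch below is an illustrative preflight helper, not part of the repo. It uses the figures quoted in the caveats (12 GB VRAM, 32 GB RAM, 50 GB storage) and reads GPU memory from `nvidia-smi`. The function names and the 5% VRAM tolerance are assumptions; the tolerance exists because cards often report slightly less than their nominal size.

```python
import shutil
import subprocess

# Minimums quoted in the caveats above.
MIN_VRAM_GB, MIN_RAM_GB, MIN_DISK_GB = 12, 32, 50
VRAM_TOLERANCE = 0.95  # assumed slack for cards reporting just under nominal size


def parse_vram_gb(nvidia_smi_output: str) -> float:
    """Largest GPU memory in GiB, given one MiB value per line."""
    values = [int(line) for line in nvidia_smi_output.split()]
    return max(values) / 1024 if values else 0.0


def query_vram_gb() -> float:
    """Ask nvidia-smi for total memory per GPU; 0.0 if the tool is missing."""
    if shutil.which("nvidia-smi") is None:
        return 0.0
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_vram_gb(out)


def shortfalls(vram_gb: float, ram_gb: float, disk_gb: float) -> list[str]:
    """Return a human-readable message for each minimum that is not met."""
    problems = []
    if vram_gb < MIN_VRAM_GB * VRAM_TOLERANCE:
        problems.append(f"VRAM {vram_gb:.1f} GB is below {MIN_VRAM_GB} GB")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM {ram_gb:.1f} GB is below {MIN_RAM_GB} GB")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Free disk {disk_gb:.1f} GB is below {MIN_DISK_GB} GB")
    return problems
```

Free disk space can be obtained with `shutil.disk_usage(".").free / 1e9`. System RAM has no portable standard-library query, so it is passed in by the caller.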

Verdict

Worth a look if you need DeepSeek-OCR running reliably without debugging its Python internals. Skip it if you were hoping for a lightweight or CPU-friendly solution—this is a workstation-grade deployment.

Frequently asked

What is Bogdanovich77/DeekSeek-OCR---Dockerized-API?

A practical packaging job around DeepSeek's vision model, turning PDF chaos into structured Markdown with a FastAPI shim and batch scripts.