superlinked/sie

One inference engine for your agent’s entire retrieval stack

SIE replaces the usual tangle of separate embedding, reranking, and extraction servers with a single open-source container that scales from a laptop to Kubernetes.

★ 2.3k stars | Python | RAG · Search | Inference · Serving | LLMOps · Eval

Velocity (7d): +27 ★ / day. Trend: accelerating. [Chart: star history]

What it does

SIE is an open-source inference server that serves embeddings, reranking, and entity extraction through a single HTTP API. It bundles 85+ pre-configured models across dense, sparse, multi-vector, vision, and cross-encoder architectures, and can keep several of them warm at once using on-demand loading and LRU eviction. The project does not stop at the container: it ships a full production stack—load-balancing gateway, KEDA autoscaling, Grafana dashboards, and Terraform for GKE or EKS—all Apache 2.0.
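On-demand loading with LRU eviction can be pictured as a small cache. The block below is a minimal Python sketch of the general technique, not SIE's actual implementation: `ModelCache`, `loader`, and `capacity` are hypothetical names, and a real server would more likely evict based on memory pressure than on a simple count.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class ModelCache:
    """Keep at most `capacity` models loaded; evict the least recently used."""

    def __init__(self, loader: Callable[[str], Any], capacity: int = 2) -> None:
        self._loader = loader        # hypothetical: loads a model given its ID
        self._capacity = capacity
        self._models = OrderedDict() # oldest entry first, newest last

    def get(self, model_id: str) -> Any:
        if model_id in self._models:
            # Cache hit: mark this model as the most recently used.
            self._models.move_to_end(model_id)
            return self._models[model_id]
        # Cache miss: load on demand, then evict the oldest entry if over capacity.
        model = self._loader(model_id)
        self._models[model_id] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)
        return model

    def loaded(self) -> List[str]:
        return list(self._models)


# Stand-in loader that returns a string instead of real weights.
cache = ModelCache(loader=lambda model_id: f"<{model_id}>", capacity=2)
for name in ["dense", "sparse", "dense", "reranker"]:
    cache.get(name)
print(cache.loaded())  # ['dense', 'reranker']: "sparse" was least recently used
```

The second request for `dense` refreshes its position, so `sparse` is the one dropped when `reranker` arrives.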

The interesting bit

Rather than running separate microservices for embedding, reranking, and extraction, SIE collapses everything into three SDK functions—encode, score, and extract—and exposes an OpenAI-compatible /v1/embeddings endpoint for drop-in migration. It treats the retrieval stack as a single concern, infrastructure included.
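Because the endpoint follows the OpenAI embeddings format, a client needs little more than a JSON POST. The sketch below uses only the Python standard library and builds the request without sending it. The base URL, port, and model ID are placeholders assumed for illustration, not values taken from SIE's documentation, and the SDK's own encode, score, and extract functions are not shown because their signatures are not given here.

```python
import json
import urllib.request
from typing import List

# Assumptions: placeholder host, port, and model ID; substitute your own.
BASE_URL = "http://localhost:8080"
MODEL_ID = "your-org/your-embedding-model"


def build_embeddings_request(texts: List[str], model: str = MODEL_ID,
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings_response(payload: dict) -> List[List[float]]:
    """Extract vectors from an OpenAI-style response, ordered by `index`."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embeddings_request(["hello world"])
# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     vectors = parse_embeddings_response(json.load(response))
```

Existing code that already targets the OpenAI embeddings API should, in principle, only need its base URL and model name changed.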

Key highlights

85+ models pre-configured and quality-verified against MTEB in CI; pass a Hugging Face model ID and go.
Hot-swappable models with on-demand loading and LRU eviction, so dense, sparse, and vision encoders can run side by side.
Full production stack included: Helm charts, KEDA autoscaling to zero, Grafana dashboards, and Terraform for GKE/EKS.
OpenAI-compatible /v1/embeddings endpoint; Python and TypeScript SDKs.
Integrates with LangChain, LlamaIndex, Haystack, DSPy, CrewAI, Chroma, Qdrant, and Weaviate.

Caveats

Anonymous telemetry is enabled by default, though it can be disabled with SIE_TELEMETRY_DISABLED=1 or DO_NOT_TRACK=1.
First call to a model downloads weights from Hugging Face, so cold-start latency depends on your network and the model size.
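One way to make the telemetry opt-out explicit is to set the variable in the environment handed to the server process. This is a minimal sketch: `env_without_telemetry` is a hypothetical helper, and the command that actually starts SIE depends on your setup, so it is left out.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_telemetry(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the telemetry opt-out set."""
    env = dict(os.environ if base is None else base)
    # Variable name and value as given in the caveat above.
    env["SIE_TELEMETRY_DISABLED"] = "1"
    return env


env = env_without_telemetry()
# Pass `env=env` to subprocess.run(...) together with whatever command
# starts SIE in your deployment.
```

The same effect can be had by exporting the variable in a shell or container spec; the helper only keeps the opt-out next to the launch code.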

Verdict

Teams building retrieval-heavy agents who are tired of stitching together separate inference services should try this; if you only ever need one static embedding model, it is probably more machinery than you need.

Frequently asked

What is superlinked/sie? SIE replaces the usual tangle of separate embedding, reranking, and extraction servers with a single open-source container that scales from a laptop to Kubernetes.
Is sie open source? Yes, superlinked/sie is open source, released under the Apache-2.0 license.
What language is sie written in? superlinked/sie is primarily written in Python.
How popular is sie? superlinked/sie has 2.3k stars on GitHub, and its star growth is accelerating (+27 stars per day over the last 7 days).
Where can I find sie? superlinked/sie is on GitHub at https://github.com/superlinked/sie.