HamaWhiteGG/langchain-java

LangChain for the JVM crowd: LLM plumbing without the Python

A Java-native port of LangChain that wires LLMs into existing Big Data stacks like Spark and Flink.

Velocity (7d): +0.5 ★ / day · Trend: steady

What it does

langchain-java re-implements the core LangChain abstractions (LLMs, chat models, prompt templates, chains, and agents) for Java 17+. It wraps OpenAI, Azure OpenAI, ChatGLM2, and Ollama, and integrates with the Pinecone and Milvus vector stores. The selling point is Big Data integration: dedicated modules let an LLM generate and run Spark SQL or Flink SQL through agent toolkits, so you can ask natural-language questions of your data pipelines.

The interesting bit

Most Java LLM libraries stop at REST wrappers. This one goes further by porting the orchestration layer (chains, agents with tool use, RAG flows), so Java shops don’t need a Python sidecar just to do LLM reasoning. The SQL chain, which introspects a database schema and generates queries against it, is the most concrete payoff.
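
To make the SQL-chain idea concrete, here is a minimal sketch of the pattern in plain Java. It is not langchain-java's API: the class SqlChainSketch, its methods, and the stubbed model are all hypothetical, and the schema is passed in as a map rather than read from a live database.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical illustration only; none of these names come from langchain-java. */
final class SqlChainSketch {

    /** Renders one "table(col1, col2)" line per table, in map iteration order. */
    static String describeSchema(Map<String, List<String>> schema) {
        return schema.entrySet().stream()
                .map(e -> e.getKey() + "(" + String.join(", ", e.getValue()) + ")")
                .collect(Collectors.joining("\n"));
    }

    /** Combines the schema description and the user's question into one prompt. */
    static String buildPrompt(Map<String, List<String>> schema, String question) {
        return "Given these tables:\n" + describeSchema(schema)
                + "\nWrite one SQL query that answers: " + question
                + "\nSQL:";
    }

    /** Sends the prompt to the model (any String -> String function) and trims the reply. */
    static String generateSql(Function<String, String> llm,
                              Map<String, List<String>> schema,
                              String question) {
        return llm.apply(buildPrompt(schema, question)).trim();
    }

    public static void main(String[] args) {
        Map<String, List<String>> schema = new LinkedHashMap<>();
        schema.put("orders", List.of("id", "customer_id", "total"));
        schema.put("customers", List.of("id", "name"));

        // Stub standing in for a real model call (assumption for the demo).
        Function<String, String> stubLlm = prompt -> " SELECT COUNT(*) FROM orders; ";

        System.out.println(generateSql(stubLlm, schema, "How many orders are there?"));
    }
}
```

In a real setup the schema map would likely be filled from JDBC metadata (for example java.sql.DatabaseMetaData.getColumns), and the generated SQL would need validation before it is executed.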

Key highlights

  • Native Java 17 implementation of LLMChain, chat models, and ReAct agents
  • Big Data modules: Spark SQL Agent and Flink SQL Agent for natural-language analytics
  • Supports OpenAI (with streaming), Azure OpenAI, ChatGLM2, Ollama; vector stores Pinecone and Milvus
  • Published to Maven Central (io.github.hamawhitegg:langchain-core:0.2.1)
  • API docs hosted at https://hamawhitegg.github.io/langchain-java
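
For readers unfamiliar with ReAct agents, the loop behind the first two highlights can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the library's implementation: the ReactLoopSketch class, the "Action: tool | input" reply format, and the tool map are all invented for the example.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical ReAct-style loop; names and reply format are assumptions. */
final class ReactLoopSketch {

    private static final String FINAL_PREFIX = "Final Answer:";
    private static final String ACTION_PREFIX = "Action:";

    /**
     * Repeatedly asks the model what to do next. A reply is either
     * "Final Answer: ..." or "Action: toolName | tool input".
     * Tool output is appended to the scratchpad as an observation.
     */
    static String run(Function<String, String> llm,
                      Map<String, Function<String, String>> tools,
                      String question,
                      int maxSteps) {
        StringBuilder scratchpad = new StringBuilder("Question: " + question + "\n");
        for (int step = 0; step < maxSteps; step++) {
            String reply = llm.apply(scratchpad.toString()).trim();
            if (reply.startsWith(FINAL_PREFIX)) {
                return reply.substring(FINAL_PREFIX.length()).trim();
            }
            if (!reply.startsWith(ACTION_PREFIX)) {
                throw new IllegalStateException("Unparseable model reply: " + reply);
            }
            // Split "toolName | tool input" into at most two parts.
            String[] parts = reply.substring(ACTION_PREFIX.length()).split("\\|", 2);
            String toolName = parts[0].trim();
            String toolInput = parts.length > 1 ? parts[1].trim() : "";

            Function<String, String> tool = tools.get(toolName);
            String observation = (tool == null)
                    ? "Unknown tool: " + toolName
                    : tool.apply(toolInput);

            scratchpad.append(reply)
                      .append("\nObservation: ")
                      .append(observation)
                      .append("\n");
        }
        throw new IllegalStateException("No final answer within " + maxSteps + " steps");
    }
}
```

A Spark SQL or Flink SQL agent would, in this picture, register a tool that executes the query string against the engine and returns the result as text; the step cap guards against a model that never produces a final answer.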

Caveats

  • Requires Java 17+ and a Unix-like build environment; no Windows support mentioned
  • 567 stars and version 0.2.1 suggest an early-stage project; feature parity with Python LangChain is unclear
  • Big Data modules appear to be agent wrappers around SQL toolkits rather than deep engine integration: essentially glue code, useful glue but still glue

Verdict

Worth a look if you’re running JVM-based data infrastructure and want to keep LLM orchestration in-language. Skip if you’re already happy with Python microservices or need production-hardened observability and error handling; the README demonstrates neither.
