
erre-quadro/spikex

SpikeX is a collection of spaCy pipeline components for knowledge extraction tasks like entity linking, abbreviation detection, phrase extraction, and text clustering.

403 stars · Python · ML Frameworks · RAG · Search
Velocity (7d): +0.2 ★/day · Trend: steady

SpikeX extends the spaCy NLP library with ready-to-use pipeline components for structured knowledge extraction. It provides components for linking Wikipedia pages to text chunks, clustering noun phrases using a radial Ball Mapper algorithm, detecting and resolving abbreviations and acronyms, extracting noun and verb phrases, and pattern-based labeling with overlap resolution. The library includes a WikiGraph module that uses sparse adjacency matrices for efficient Wikipedia graph traversal and bidirectional dictionaries to optimize memory usage.
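To illustrate the two data structures mentioned for the WikiGraph module, here is a minimal pure-Python sketch of a sparse adjacency structure in compressed sparse row (CSR) layout paired with a bidirectional title/id mapping. This is not SpikeX's actual API: `BiDict`, `build_csr`, `neighbors`, and `bfs_distances` are hypothetical names chosen for this example, and the page titles and links are made-up sample data.

```python
from collections import deque


class BiDict:
    """Hypothetical two-way mapping between page titles and integer ids."""

    def __init__(self):
        self._id_of = {}      # title -> id
        self._title_of = []   # id -> title (the list index is the id)

    def add(self, title):
        # Assign the next free id the first time a title is seen.
        if title not in self._id_of:
            self._id_of[title] = len(self._title_of)
            self._title_of.append(title)
        return self._id_of[title]

    def id_of(self, title):
        return self._id_of[title]

    def title_of(self, idx):
        return self._title_of[idx]

    def __len__(self):
        return len(self._title_of)


def build_csr(n_nodes, edges):
    """Build CSR arrays (indptr, indices) from (src, dst) id pairs."""
    indptr = [0] * (n_nodes + 1)
    for src, _ in edges:
        indptr[src + 1] += 1
    for i in range(n_nodes):          # prefix sum gives the row offsets
        indptr[i + 1] += indptr[i]
    indices = [0] * len(edges)
    cursor = indptr[:-1].copy()       # next free slot in each row
    for src, dst in edges:
        indices[cursor[src]] = dst
        cursor[src] += 1
    return indptr, indices


def neighbors(indptr, indices, node):
    """Outgoing links of `node`: one contiguous slice of `indices`."""
    return indices[indptr[node]:indptr[node + 1]]


def bfs_distances(indptr, indices, start):
    """Breadth-first traversal returning the hop count to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(indptr, indices, node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Sample data (invented for illustration only).
pages = BiDict()
links = [
    ("Python", "Programming language"),
    ("Python", "Guido van Rossum"),
    ("Guido van Rossum", "Netherlands"),
]
edges = [(pages.add(src), pages.add(dst)) for src, dst in links]
indptr, indices = build_csr(len(pages), edges)

start = pages.id_of("Python")
hops = {pages.title_of(i): d for i, d in bfs_distances(indptr, indices, start).items()}
```

Storing only integer ids in the adjacency arrays and keeping each title once in the mapping is one plausible way such a design saves memory on a graph the size of Wikipedia; a production version would more likely use a sparse-matrix library than plain lists.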
