SKTBrain/KoBERT
KoBERT is a Korean pre-trained BERT model developed by SKTBrain, trained on 5M sentences from Korean Wikipedia.

KoBERT is a BERT model pre-trained specifically for Korean on Korean Wikipedia (5M sentences, 54M words), built with PyTorch and Hugging Face Transformers. It uses the base-sized, cased BERT architecture (12 layers, 768 hidden units, 12 attention heads) and ships with a SentencePiece tokenizer with a vocabulary of 8,002 tokens. The model supports NLP tasks such as sentiment analysis, named entity recognition, and sentence embeddings, and can be used via PyTorch, ONNX, or MXNet-Gluon.
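
To give a rough sense of model size, the architecture figures above can be turned into an approximate parameter count. The sketch below is plain Python arithmetic, not KoBERT code. It takes the vocabulary size, layer count, hidden size, and head count from the description, and assumes the standard BERT-base values for everything the description does not state: 512 position embeddings, 2 segment types, and a 3,072-unit feed-forward layer. The function name `count_bert_parameters` is hypothetical.

```python
def count_bert_parameters(
    vocab_size: int = 8002,         # from the description
    hidden_size: int = 768,         # from the description
    num_layers: int = 12,           # from the description
    num_heads: int = 12,            # from the description
    max_positions: int = 512,       # ASSUMPTION: standard BERT-base value
    type_vocab_size: int = 2,       # ASSUMPTION: standard BERT-base value
    intermediate_size: int = 3072,  # ASSUMPTION: standard BERT-base value
) -> dict:
    """Estimate parameter counts for a BERT encoder with a pooler head."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")

    # Each LayerNorm has a scale and a bias vector.
    layer_norm = 2 * hidden_size

    # Token, position, and segment embedding tables, plus one LayerNorm.
    embeddings = (
        (vocab_size + max_positions + type_vocab_size) * hidden_size + layer_norm
    )

    # Query, key, value, and output projections (weights + biases), plus LayerNorm.
    attention = 4 * (hidden_size * hidden_size + hidden_size) + layer_norm

    # Two dense layers of the feed-forward block, plus LayerNorm.
    feed_forward = (
        (hidden_size * intermediate_size + intermediate_size)
        + (intermediate_size * hidden_size + hidden_size)
        + layer_norm
    )

    encoder = num_layers * (attention + feed_forward)

    # Dense layer applied to the first token's hidden state.
    pooler = hidden_size * hidden_size + hidden_size

    return {
        "head_dim": hidden_size // num_heads,
        "embeddings": embeddings,
        "encoder": encoder,
        "pooler": pooler,
        "total": embeddings + encoder + pooler,
    }


counts = count_bert_parameters()
print(f"per-head dimension: {counts['head_dim']}")
print(f"total parameters:   {counts['total']:,}")
```

Under these assumptions the estimate comes to about 92M parameters. The number of attention heads does not change the total; it only determines how the 768 hidden units are split (64 per head).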