KoNLPy: Korean NLP in Python


KoNLPy is a Python package for natural language processing (NLP) of the Korean language. For installation directions, see the installation page of this documentation.

>>> from konlpy.tag import Kkma
>>> from konlpy.utils import pprint
>>> kkma = Kkma()
>>> pprint(kkma.sentences(u'저는 대학생이구요. 소프트웨어 관련학과 입니다.'))
[저는 대학생이구요.,
 소프트웨어 관련학과 입니다.]
>>> pprint(kkma.nouns(u'대학에서 DB, 통계학, 이산수학 등을 배웠지만...'))
[대학,
 통계학,
 이산,
 이산수학,
 수학,
 등]
>>> pprint(kkma.pos(u'자주 사용을 안하다보니 모두 까먹은 상태입니다.'))
[(자주, MAG),
 (사용, NNG),
 (을, JKO),
 (안하, VV),
 (다, ECS),
 (보, VXV),
 (니, ECD),
 (모두, MAG),
 (까먹, VV),
 (은, ETD),
 (상태, NNG),
 (이, VCP),
 (ㅂ니다, EFN),
 (., SF)]
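
The tagged output above can be post-processed with plain Python. The following is a minimal sketch, assuming `kkma.pos` returns a list of (morpheme, tag) tuples as the printed result suggests. The tagged pairs are copied from the example as a literal list, so the snippet runs without KoNLPy installed. The helper `select_by_tag` is a hypothetical name used for illustration and is not part of the KoNLPy API.

```python
from collections import Counter

# Tagged pairs copied from the kkma.pos(...) output shown above.
# Using a literal list is an assumption made so the sketch runs on its own.
tagged = [
    ("자주", "MAG"), ("사용", "NNG"), ("을", "JKO"), ("안하", "VV"),
    ("다", "ECS"), ("보", "VXV"), ("니", "ECD"), ("모두", "MAG"),
    ("까먹", "VV"), ("은", "ETD"), ("상태", "NNG"), ("이", "VCP"),
    ("ㅂ니다", "EFN"), (".", "SF"),
]


def select_by_tag(pairs, prefixes):
    """Return morphemes whose tag starts with any of the given prefixes.

    Hypothetical helper for illustration; not a KoNLPy function.
    """
    prefixes = tuple(prefixes)
    return [morph for morph, tag in pairs if tag.startswith(prefixes)]


# In this output, the only tag beginning with "NN" is NNG.
nouns = select_by_tag(tagged, ["NN"])

# "VV" matches the VV tag but not VXV or VCP.
verbs = select_by_tag(tagged, ["VV"])

# Count how often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in tagged)

print(nouns)
print(verbs)
print(tag_counts.most_common(3))
```

Matching on a tag prefix rather than an exact tag keeps the filter usable if the tagger emits several related tags that share the same leading letters.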

For more on how to use KoNLPy, see the API documentation.

Standing on the shoulders of giants

Korean, the 13th most widely spoken language in the world, is a beautiful yet complex language. Numerous researchers have built Korean NLP engines to computationally extract meaningful features from its labyrinthine text.

KoNLPy does not aim to be yet another engine, but to unify the existing ones, build upon them, and see one step further. It is written in Python, not only because of the language’s simplicity and elegance, but also because of its powerful string processing modules and its applicability to a variety of tasks, including crawling, Web programming, and data analysis.

The three main philosophies of this project are:

Please report when you think any have gone stale.

Contribute

KoNLPy isn’t perfect, but it will continuously evolve and you are invited to participate!

Found a bug? Have a good idea for improving KoNLPy? Visit the KoNLPy GitHub page and suggest an idea or make a pull request.

[1] With clear and brief documents.
[2] No, I’m not extremely fond of this either. However, some important dependencies, such as Hannanum, Kkma, and MeCab-ko, are GPL licensed, and we want to honor their licenses. (It is also an inevitable choice. We hope things may change in the future.)