tag Package

Note

The first call to each class method may take some time (under 1 minute) while dictionaries load. Subsequent calls should be faster.
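One way to observe the loading cost is to time successive calls to the same method. The helper below is a minimal sketch: `time_calls` is a hypothetical name, not part of KoNLPy, and it accepts any callable so that it can be tried without a tagger installed.

```python
from timeit import default_timer


def time_calls(func, arg, repeats=2):
    """Call func(arg) `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = default_timer()
        func(arg)
        durations.append(default_timer() - start)
    return durations


# Works with any callable. A bound tagger method could be passed the same way,
# e.g. time_calls(Hannanum().pos, u'...'), assuming KoNLPy is installed.
print(time_calls(len, u'웃으면 더 행복합니다!'))
```

With a real tagger, the first duration would be expected to include the dictionary load, while later durations would not.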

Hannanum Class

class konlpy.tag._hannanum.Hannanum(jvmpath=None)

Wrapper for JHannanum.

JHannanum is a morphological analyzer and POS tagger written in Java. It has been developed by the Semantic Web Research Center (SWRC) at KAIST since 1999.

from konlpy.tag import Hannanum

hannanum = Hannanum()
print(hannanum.analyze(u'롯데마트의 흑마늘 양념 치킨이 논란이 되고 있다.'))
print(hannanum.nouns(u'다람쥐 헌 쳇바퀴에 타고파'))
print(hannanum.pos(u'웃으면 더 행복합니다!'))
print(hannanum.morphs(u'웃으면 더 행복합니다!'))

Parameters: jvmpath – The path of the JVM passed to init_jvm().

analyze(phrase)

Phrase analyzer.

This analyzer returns various morphological candidates for each token. It consists of two parts: 1) Dictionary search (chart), 2) Unclassified term segmentation.
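Because the analyzer returns several candidates per token, a common first step is to see how ambiguous each token is. The sketch below assumes the result is a nested list (tokens, then candidates, then `(morpheme, tag)` pairs); that shape is an assumption here, and the sample data uses placeholder strings rather than real analyzer output. `candidate_counts` is a hypothetical helper name.

```python
def candidate_counts(analysis):
    """Return the number of candidate analyses for each token."""
    return [len(candidates) for candidates in analysis]


# Placeholder data in the assumed shape: token -> candidates -> (morpheme, tag) pairs.
sample = [
    [[("morph_a", "TAG1"), ("morph_b", "TAG2")], [("morph_ab", "TAG1")]],  # two candidates
    [[("morph_c", "TAG3")]],                                               # one candidate
]
print(candidate_counts(sample))  # [2, 1]
```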

morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.
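Noun extraction is often followed by a frequency count. The sketch below assumes `nouns()` returns a list of strings. `noun_frequencies` and `StubTagger` are hypothetical names used only for illustration, and the stub stands in for a real tagger so the example runs without a JVM.

```python
from collections import Counter


def noun_frequencies(tagger, text, top=10):
    """Count the nouns returned by tagger.nouns(text); return the `top` most common."""
    return Counter(tagger.nouns(text)).most_common(top)


class StubTagger(object):
    """Stand-in for a tagger: treats every whitespace-separated token as a noun."""

    def nouns(self, phrase):
        return phrase.split()


# Replace StubTagger() with Hannanum() when KoNLPy is installed.
print(noun_frequencies(StubTagger(), u'치킨 마늘 치킨'))
```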

pos(phrase, ntags=9)

POS tagger.

This tagger is HMM based, and calculates the probability of tags.

Parameters:ntags – The number of tags. It can be either 9 or 22.
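To give a sense of what an HMM-based tagger computes, the sketch below runs a textbook Viterbi decoder over a toy two-tag model. It is not JHannanum's implementation: the function name, tag names, words, and probabilities are all invented for illustration.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return (probability, tag sequence) of the most likely tagging of `words`.

    `words` must be non-empty. Unknown words get emission probability 0.0.
    """
    # best[t] holds (probability, path) of the best tag path ending in tag t.
    best = {t: (start_p[t] * emit_p[t].get(words[0], 0.0), [t]) for t in tags}
    for word in words[1:]:
        step = {}
        for t in tags:
            # Pick the previous tag that maximizes the path probability into t.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][t], best[prev][1]) for prev in tags),
                key=lambda item: item[0],
            )
            step[t] = (prob * emit_p[t].get(word, 0.0), path + [t])
        best = step
    return max(best.values(), key=lambda item: item[0])


# Toy model with invented probabilities.
tags = ['N', 'V']
start_p = {'N': 0.8, 'V': 0.2}
trans_p = {'N': {'N': 0.3, 'V': 0.7}, 'V': {'N': 0.6, 'V': 0.4}}
emit_p = {'N': {'dogs': 0.9, 'bark': 0.1}, 'V': {'dogs': 0.1, 'bark': 0.9}}

probability, path = viterbi(['dogs', 'bark'], tags, start_p, trans_p, emit_p)
print(path)  # ['N', 'V']
```

A production tagger would typically work with log probabilities to avoid underflow on long inputs; plain products are used here to keep the sketch short.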

Kkma Class

class konlpy.tag._kkma.Kkma(jvmpath=None)

Wrapper for Kkma.

Kkma is a morphological analyzer and natural language processing system written in Java, developed by the Intelligent Data Systems (IDS) Laboratory at SNU.

from konlpy.tag import Kkma

kkma = Kkma()
print(kkma.sentences(u'저는 대학생이구요. 소프트웨어 관련학과 입니다.'))
print(kkma.nouns(u'대학에서 DB, 통계학, 이산수학 등을 배웠지만...'))
print(kkma.morphs(u'자주 사용을 안하다보니 모두 까먹은 상태입니다.'))
print(kkma.pos(u'어쩌면 좋죠?'))

Parameters: jvmpath – The path of the JVM passed to init_jvm().

morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.

pos(phrase)

POS tagger.

sentences(phrase)

Sentence detection.
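Sentence detection and POS tagging can be combined to tag a longer text one sentence at a time. The sketch below uses only the `sentences()` and `pos()` methods listed above. `tag_by_sentence` and `StubTagger` are hypothetical names, and the stub's splitting and tagging rules are invented so the example runs without a JVM.

```python
def tag_by_sentence(tagger, text):
    """Split text into sentences, then POS-tag each sentence separately."""
    return [(sentence, tagger.pos(sentence)) for sentence in tagger.sentences(text)]


class StubTagger(object):
    """Stand-in for a tagger with invented, deliberately naive rules."""

    def sentences(self, phrase):
        # Split on periods and drop empty pieces.
        return [part.strip() for part in phrase.split('.') if part.strip()]

    def pos(self, phrase):
        # Label every whitespace-separated token with a dummy tag.
        return [(token, 'X') for token in phrase.split()]


# Replace StubTagger() with Kkma() when KoNLPy is installed.
for sentence, tagged in tag_by_sentence(StubTagger(), u'first one. second one.'):
    print(sentence, tagged)
```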

Mecab Class

Warning

Mecab is not supported on Python 3 or Windows 7.

class konlpy.tag._mecab.Mecab(dicpath='/usr/local/lib/mecab/dic/mecab-ko-dic')

Wrapper for MeCab-ko morphological analyzer.

MeCab was originally a Japanese morphological analyzer and POS tagger developed by the Graduate School of Informatics at Kyoto University. The Eunjeon Project adapted it to the Korean language as MeCab-ko.

In order to use MeCab-ko within KoNLPy, follow the directions in optional-installations.

from konlpy.tag import Mecab
# MeCab installation needed

mecab = Mecab()
print(mecab.pos(u'자연주의 쇼핑몰은 어떤 곳인가?'))
print(mecab.morphs(u'영등포구청역에 있는 맛집 좀 알려주세요.'))
print(mecab.nouns(u'우리나라에는 무릎 치료를 잘하는 정형외과가 없는가!'))

Parameters: dicpath – The path of the MeCab-ko dictionary.

morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.

pos(phrase)

POS tagger.
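Since `Mecab` depends on a dictionary at `dicpath`, it can help to check that the directory exists before constructing the class. The sketch below is a plain filesystem check, not a KoNLPy feature. `dictionary_available` and `DEFAULT_DICPATH` are hypothetical names, and the default path is copied from the class signature above.

```python
import os

# Default taken from the Mecab class signature; adjust for your installation.
DEFAULT_DICPATH = '/usr/local/lib/mecab/dic/mecab-ko-dic'


def dictionary_available(dicpath=DEFAULT_DICPATH):
    """Return True if dicpath points to an existing directory."""
    return os.path.isdir(dicpath)


if dictionary_available():
    print('MeCab-ko dictionary directory found.')
else:
    print('MeCab-ko dictionary directory not found; see optional-installations.')
```

The check only confirms that the directory exists; it does not verify that the dictionary inside it is complete or valid.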

See also

Korean POS tags comparison chart

Compare POS tags between several Korean analytic projects. (In Korean)