References

Note

Please edit this document if anything is incorrect or missing. Last updated on September 14, 2014.

Korean morpheme analyzer tools

C/C++

  • KTS (1995) GPL v2
    • By 이상호, 서정연, 오영환 (KAIST & Sogang University)
    • code
  • MACH (2002) custom
    • By Prof. Kwangseob Shim (Sungshin Women's University)
  • MeCab-ko (2013) GPL, LGPL, BSD
    • By Yong-woon Lee and Youngho Yoo

Java

  • Arirang (2009) Apache v2
    • By SooMyung Lee
    • code
  • Hannanum (1999) GPL v3
    • By Prof. Key-Sun Choi’s research team (KAIST)
    • code, docs
  • KKMA (2010) GPL v2
    • By Prof. Sang-goo Lee’s research team (Seoul National University)
  • KOMORAN (2013) custom
    • By shineware

Python

  • KoNLPy (2014) GPL v3
    • By Lucy Park (Seoul National University)
  • UMorpheme (2014) MIT
    • By Kyunghoon Kim (UNIST)

R

  • KoNLP (2011) GPL v3
    • By Heewon Jeon

Others

Other NLP tools

Language parser

  • KoreanParser - By DongHyun Choi, Jungyeul Park, Key-Sun Choi (KAIST)

Corpora

  • HANTEC 2.0, KISTI & Chungnam National University, 1998-2003.
    • 120,000 test documents (237 MB)
    • 50 TREC-type questions for QA (48 KB)
  • HKIB-40075, KISTI & Hankook Ilbo, 2002.
    • 40,075 test documents for text categorization (88 MB)
  • KAIST corpus, KAIST, 1997-2005.
  • Sejong corpus, National Institute of the Korean Language, 1998-2007.