References

Note

Please let us know if you are aware of any NLP engines or corpora that are not included in this list. Last updated on August 24, 2014.

Korean NLP engines

C/C++

  • MACH, Sungshin Women’s University (license: custom)
  • MeCab-ko, Yong-woon Lee and Youngho Yoo (licenses: GPL, LGPL, BSD)

Java

Python

Others

Corpora

  • HANTEC 2.0, KISTI and CNU, 1998-2003.
    • 120,000 test documents (237 MB)
    • 50 TREC-type questions for QA (48 KB)
  • HKIB-40075, 2002.
    • 40,075 test documents for text categorization (88 MB)
  • KAIST corpus, KAIST, 1997-2005.

  • Sejong corpus, National Institute of the Korean Language, 1998-2007.
