Resources, including corpora and software, for processing the Hungarian language.

Language resources

  • The Hunglish Corpus is a sentence-aligned Hungarian-English parallel corpus published under the Creative Commons Attribution license.
  • The Hungarian Webcorpus is a gigaword corpus of Hungarian gathered from the web.
  • The Hunglish dictionary is a machine-readable English-Hungarian bilingual lexicon.
  • morphdb.hu is a Hungarian morphological database for use with the hunmorph morphological analyzer.

Software

  • hunpos is an HMM-based open-source part-of-speech tagger.
  • hunmorph is an open-source tool and programming library for spell-checking, stemming, and morphological analysis of agglutinative, German, and other languages.
  • hunalign is a language-independent sentence-level aligner for building parallel corpora.

These fields are compatible with DCAT, an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.

Field            Value
Publisher
Modified Date    2013-11-20
Release Date     2013-11-12
Identifier       4588d1c4-2e63-4072-8846-63fc7928b4c2
License          Creative Commons Attribution
Author           MOKK - Budapest University of Technology and Economics

Additional Info:
Source           http://mokk.bme.hu/resources/