Unsupervised spelling correction for Slovak

Authors

Hládek, Daniel
Staš, Ján
Juhár, Jozef

Publisher

Vysoká škola báňská - Technická univerzita Ostrava

Abstract

This paper introduces a method for automatically proposing and selecting a correction for an incorrectly written word in a large text corpus written in Slovak. The task can be described as finding the sequence of correct words that best matches the list of incorrectly spelled words found in the input. The knowledge base of the classification system, which consists of statistics about sequences of correctly typed words and about possible corrections for incorrectly typed words, can be described mathematically as a hidden Markov model. The best-matching sequence of correct words is found using the Viterbi algorithm. The system will be evaluated on a manually corrected test set.
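
The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that hidden states are candidate correct words, that observations are the words as written, and that all probabilities come from small hand-made tables. The names `viterbi_correct`, `FLOOR`, `candidates`, `start_p`, `trans_p`, and `emit_p`, as well as every number in the toy tables, are hypothetical and chosen only for illustration.

```python
import math

FLOOR = 1e-9  # hypothetical floor probability for events missing from a table


def logp(table, key):
    """Log-probability lookup with a floor for unseen events."""
    return math.log(table.get(key, FLOOR))


def viterbi_correct(observed, candidates, start_p, trans_p, emit_p):
    """Return the most probable sequence of correct words for `observed`.

    candidates[token]      -> list of candidate correct words for a written token
    start_p[word]          -> P(word starts the sequence)
    trans_p[(prev, word)]  -> P(word | prev), i.e. word-sequence statistics
    emit_p[(token, word)]  -> P(token is written | word was intended)
    """
    first = observed[0]
    # Each column maps a candidate word to (best log score, back-pointer).
    columns = [{w: (logp(start_p, w) + logp(emit_p, (first, w)), None)
                for w in candidates[first]}]
    for token in observed[1:]:
        column = {}
        for w in candidates[token]:
            # Pick the predecessor that maximizes the path score into w.
            prev, score = max(
                ((p, s + logp(trans_p, (p, w)))
                 for p, (s, _) in columns[-1].items()),
                key=lambda item: item[1],
            )
            column[w] = (score + logp(emit_p, (token, w)), prev)
        columns.append(column)
    # Backtrack from the best final state along the stored back-pointers.
    word = max(columns[-1], key=lambda w: columns[-1][w][0])
    path = [word]
    for column in reversed(columns[1:]):
        word = column[word][1]
        path.append(word)
    return path[::-1]


# Toy data (all values invented): Slovak text typed without diacritics.
observed = ["mam", "rad", "kavu"]
candidates = {"mam": ["mám", "mam"], "rad": ["rád", "rad"], "kavu": ["kávu"]}
start_p = {"mám": 0.6, "mam": 0.05}
trans_p = {("mám", "rád"): 0.3, ("mám", "rad"): 0.02,
           ("mam", "rád"): 0.01, ("mam", "rad"): 0.01,
           ("rád", "kávu"): 0.2, ("rad", "kávu"): 0.001}
emit_p = {("mam", "mám"): 0.3, ("mam", "mam"): 0.9,
          ("rad", "rád"): 0.3, ("rad", "rad"): 0.9,
          ("kavu", "kávu"): 0.4}

result = viterbi_correct(observed, candidates, start_p, trans_p, emit_p)
print(" ".join(result))
```

With these toy tables, the word-sequence statistics outweigh the higher chance of a word being typed unchanged, so the decoder restores the diacritics. Working in log space avoids numeric underflow on long sequences; how the paper itself estimates and smooths the probabilities is not stated in the abstract.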

Subject(s)

automatic spelling correction, hidden Markov model, natural language processing

Citation

Advances in Electrical and Electronic Engineering. 2013, vol. 11, no. 5, p. 392-397 : ill.