Unsupervised spelling correction for Slovak
Authors
Hládek, Daniel
Staš, Ján
Juhár, Jozef
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Abstract
This paper introduces a method that automatically proposes and selects a correction for each incorrectly written word in a large text corpus written in Slovak. The task can be described as finding the sequence of correct words that best matches the list of incorrectly spelled words found in the input. The knowledge base of the classification system, which consists of statistics about sequences of correctly typed words and possible corrections for incorrectly typed words, can be described mathematically as a hidden Markov model. The best-matching sequence of correct words is found using the Viterbi algorithm. The system will be evaluated on a manually corrected test set.
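To make the decoding step in the abstract concrete, the following is a minimal Python sketch of Viterbi decoding over candidate corrections. It is an illustration under stated assumptions, not the authors' implementation: the function name `viterbi`, the table names, the probability floor, and all probabilities are hypothetical toy values. Hidden states are candidate correct words, observations are the words as typed, transition scores stand in for word-sequence statistics, and emission scores stand in for how likely a correct word is to be typed as the observed form.

```python
import math

# Hypothetical toy model (all numbers invented for illustration only).
# Candidate corrections proposed for each observed (typed) word.
CANDIDATES = {"dobry": ["dobrý", "dobrí"], "den": ["deň", "sen"]}
# P(correct word at sentence start)
START = {"dobrý": 0.6, "dobrí": 0.4}
# P(current correct word | previous correct word)
TRANSITION = {("dobrý", "deň"): 0.7, ("dobrý", "sen"): 0.3,
              ("dobrí", "deň"): 0.1, ("dobrí", "sen"): 0.9}
# P(typed form | intended correct word)
EMISSION = {("dobrý", "dobry"): 0.5, ("dobrí", "dobry"): 0.3,
            ("deň", "den"): 0.5, ("sen", "den"): 0.2}

FLOOR = 1e-12  # assumed floor so that unseen events do not produce log(0)


def logp(table, key):
    """Log-probability lookup with a small floor for missing entries."""
    return math.log(table.get(key, FLOOR))


def viterbi(observed, candidates, start, transition, emission):
    """Return the most probable sequence of correct words for `observed`."""
    if not observed:
        return []
    # best[i][c] = (log-score of the best path ending in c, previous word)
    best = [{c: (logp(start, c) + logp(emission, (c, observed[0])), None)
             for c in candidates[observed[0]]}]
    for i in range(1, len(observed)):
        column = {}
        for c in candidates[observed[i]]:
            # Pick the predecessor that maximises the path score into c.
            score, prev = max(
                (best[i - 1][p][0] + logp(transition, (p, c)), p)
                for p in best[i - 1]
            )
            column[c] = (score + logp(emission, (c, observed[i])), prev)
        best.append(column)
    # Backtrack from the best final state.
    word = max(best[-1], key=lambda c: best[-1][c][0])
    path = [word]
    for i in range(len(observed) - 1, 0, -1):
        word = best[i][word][1]
        path.append(word)
    return path[::-1]


print(viterbi(["dobry", "den"], CANDIDATES, START, TRANSITION, EMISSION))
```

With these toy numbers, the context favours the sequence "dobrý deň" over the alternatives, because the product of start, transition, and emission probabilities is highest along that path. Working in log space avoids numerical underflow on long sentences; a real system would estimate the tables from a corpus and generate candidates from a dictionary.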
Subject(s)
automatic spelling correction, hidden Markov model, natural language processing
Citation
Advances in electrical and electronic engineering. 2013, vol. 11, no. 5, p. 392-397 : ill.