I find spell checking in almost every application I've tried to be spectacularly bad for anything other than trivial typos. Spell correction for general writing is somewhat different in its requirements and optimization vectors. Google is quite good, but it leverages data that is not always available to other applications, and it is a special case optimized for search.

Spelling errors that can be classified in terms of edit distance are far better characterized as "typos," not spelling errors. A typo is a transcription error, whereas an incorrect spelling is a representation error. Both can be present in a particular word, but they are different phenomena. People make incorrect spelling choices when they intentionally choose the wrong letters to represent phonemes, or apply spelling rules incorrectly.

A better approach in all respects is what's often referred to as a grapheme to phoneme transformation, which is basically to `compile` a word into a smaller set of characters representing the phonemes of the language. With the reduced set of symbols, which in and of itself eliminates trivial typos, statistical models can perform better, and faster. Further, unknown words can often be corrected via the corrected spelling of the phonemes. Why most spelling checkers fail to do grapheme to phoneme transformation, when it is easy, reduces the symbol set for statistical analysis, and is demonstrably more accurate (with the possible exception of Indian languages), is beyond me.

Do not write a spell checker like those! I will hunt you down and beat you with a dictionary.
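To make the idea concrete, here is a minimal sketch of a grapheme to phoneme "compile" step. The rewrite rules and phoneme labels below are a toy subset I invented for illustration; a real system would use a full rule set or a trained model. The point is that a representation error like "fone" lands on the same phoneme string as "phone", so it stops being an error at all in the compiled space.

```python
# Toy ordered rewrite rules (longest grapheme first), applied greedily
# left to right. These rules and phoneme names are illustrative only.
RULES = [
    ("ph", "F"),   # "ph" and "f" compile to the same phoneme
    ("ck", "K"),
    ("f", "F"),
    ("c", "K"),    # crude: treats every "c" as hard
    ("k", "K"),
    ("o", "OW"),
    ("n", "N"),
    ("e", ""),     # crude toy rule: drop "e" as silent
]

def to_phonemes(word: str) -> str:
    """Greedily rewrite a word into a space-separated phoneme string."""
    word = word.lower()
    out, i = [], 0
    while i < len(word):
        for grapheme, phoneme in RULES:
            if word.startswith(grapheme, i):
                if phoneme:
                    out.append(phoneme)
                i += len(grapheme)
                break
        else:
            out.append(word[i].upper())  # unknown letter: pass it through
            i += 1
    return " ".join(out)

print(to_phonemes("phone"), to_phonemes("fone"))  # both: F OW N
print(to_phonemes("cat"), to_phonemes("kat"))     # both: K A T
```

Note how the alternate spellings collapse onto one symbol sequence, which is exactly the reduced symbol set the statistical model then operates on.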