A dictionary of common English misspellings and their corrections, sourced from the Wikipedia Lists of Common Misspellings, the Birkbeck Spelling Error Corpus, GNU Aspell, and Holbrook (1964). Each entry maps a misspelled form to its correct spelling, categorized by the type of error (omission, insertion, transposition, vowel substitution, etc.). Useful for text normalization, spell-checking pipelines, and OCR error correction.

Format

A data frame with 40299 rows and 4 variables.

Source

Wikipedia: Lists of Common Misspellings (CC-BY-SA 4.0); Birkbeck Spelling Error Corpus (Oxford Text Archive); GNU Aspell; Holbrook (1964), English for the Rejected.

Variables

  • form: the misspelled word

  • replacement: the correct spelling

  • category: the type of spelling error (vowel_substitution, consonant_substitution, transposition, omission, insertion, substitution)

  • source: the data source attribution (Wikipedia, Birkbeck Spelling Error Corpus, GNU Aspell, Holbrook 1964)
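
As a rough illustration of how the four variables might be used for text normalization, the following Python sketch builds a lookup from form to replacement and applies it word by word. It is only a sketch under stated assumptions: the three rows are made-up examples in the documented column layout, not rows copied from the dataset, and the helper names `build_lookup` and `normalize` are hypothetical.

```python
import re

# Illustrative rows only: hypothetical examples that follow the documented
# column layout (form, replacement, category, source). They are NOT taken
# from the dataset itself.
rows = [
    {"form": "recieve", "replacement": "receive",
     "category": "transposition", "source": "(illustrative)"},
    {"form": "goverment", "replacement": "government",
     "category": "omission", "source": "(illustrative)"},
    {"form": "untill", "replacement": "until",
     "category": "insertion", "source": "(illustrative)"},
]


def build_lookup(rows, categories=None):
    """Map each misspelled form to its replacement.

    If `categories` is given, only rows whose category is in that set are
    kept. If a form appears more than once, the last row wins.
    """
    return {
        row["form"].lower(): row["replacement"]
        for row in rows
        if categories is None or row["category"] in categories
    }


def normalize(text, lookup):
    """Replace whole words found in `lookup`; other words pass through.

    Matching is case-insensitive, and the replacement is inserted exactly
    as stored, so the original capitalization of a corrected word is lost.
    """
    def fix(match):
        word = match.group(0)
        return lookup.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", fix, text)


lookup = build_lookup(rows)
print(normalize("I did not recieve the goverment form untill Friday.", lookup))
```

Passing a set such as `{"omission"}` as `categories` restricts the lookup to one error type, which may be useful when only some kinds of correction are wanted (for example, in an OCR pipeline). Because a single misspelling can plausibly map to more than one correction, a one-to-one dictionary like this is a simplification.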