A dictionary of common English misspellings and their corrections, sourced from the codespell project, the Birkbeck Spelling Error Corpus, Wikipedia Lists of Common Misspellings, GNU Aspell, and Holbrook (1964). Each entry maps a misspelled form to its correct spelling, categorized by the type of error (omission, insertion, transposition, vowel substitution, etc.). Useful for text normalization, spell-checking pipelines, and OCR error correction.

Format

A data frame with 93488 rows and 4 variables.

Source

codespell project; Wikipedia: Lists of Common Misspellings (CC-BY-SA 4.0); Birkbeck Spelling Error Corpus (Oxford Text Archive); GNU Aspell; Holbrook (1964), English for the Rejected

Variables

  • form: the misspelled word

  • replacement: the correct spelling

  • category: type of spelling error (vowel_substitution, consonant_substitution, transposition, omission, insertion, substitution)

  • source: data source attribution (codespell, Birkbeck Spelling Error Corpus, Wikipedia, GNU Aspell, Holbrook 1964)
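
Examples

A minimal sketch of how the form and replacement columns could drive token-level text normalization. The object name `misspellings`, the helper `correct_tokens()`, and the three rows are illustrative assumptions, not taken from the dataset; the toy data frame only mirrors the four documented columns so that the example runs on its own.

```r
# Toy stand-in for the dataset: same four columns, three made-up rows.
# (Assumption: the row values and source labels are for illustration only.)
misspellings <- data.frame(
  form        = c("recieve", "teh", "occured"),
  replacement = c("receive", "the", "occurred"),
  category    = c("transposition", "transposition", "omission"),
  source      = c("Wikipedia", "codespell", "Wikipedia"),
  stringsAsFactors = FALSE
)

# Named character vector: names are misspelled forms, values are corrections.
lookup <- setNames(misspellings$replacement, misspellings$form)

# Hypothetical helper: replace tokens found in the lookup, keep all others.
correct_tokens <- function(tokens, lookup) {
  hit <- tokens %in% names(lookup)
  tokens[hit] <- unname(lookup[tokens[hit]])
  tokens
}

tokens <- c("i", "did", "not", "recieve", "teh", "letter")
corrected <- correct_tokens(tokens, lookup)
print(corrected)
```

With the real data, the same lookup could be built from the full form and replacement columns. If a misspelled form appears in more than one row, name-based indexing returns the first match, so deduplicating on form beforehand may be worth considering.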