A comprehensive dataset of abbreviations, acronyms, initialisms, honorifics, military ranks, titles, months, measurements, place abbreviations, publication abbreviations, organizational abbreviations, legal terms, Latin abbreviations, scientific terms, stage directions, fictional character names, time abbreviations, slang, firearm caliber designations, and political entity codes. Combines the former english_acronyms, english_honorifics, and english_political_abbreviations datasets (now absorbed) with additional abbreviation categories from the textnorm/ECHNAE project.
Details
Political abbreviations include ISO 3166-1 country codes (alpha-2 and alpha-3), US states and territories (USPS codes and AP style), and Canadian provinces and territories (Canada Post codes and traditional abbreviations).
Note: Some abbreviations appear in more than one row because the same form has different expansions in different categories (e.g., "MD" can mean "Maryland" or "medical doctor"). The combination of form + category uniquely identifies each entry.
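Because a form alone can be ambiguous, lookups are safest when keyed on both form and category. The following is a minimal Python sketch of that idea, assuming the rows have been loaded as dictionaries. The three rows are a hand-written toy sample, the category given to "medical doctor" is a guess made for illustration, and the helper names `expand` and `expansions` are hypothetical.

```python
from collections import Counter

# Toy rows for illustration only; they are not copied from the dataset,
# and the category on the second row is an assumption.
rows = [
    {"form": "MD", "full_form": "Maryland", "category": "us_state"},
    {"form": "MD", "full_form": "medical doctor", "category": "title"},
    {"form": "NASA",
     "full_form": "National Aeronautics and Space Administration",
     "category": "acronym"},
]

# Index by the composite key (form, category) rather than by form alone.
index = {(r["form"], r["category"]): r["full_form"] for r in rows}


def expand(form, category):
    """Return the expansion for one (form, category) pair, or None."""
    return index.get((form, category))


def expansions(form):
    """Return every expansion of a form, keyed by category."""
    return {r["category"]: r["full_form"] for r in rows if r["form"] == form}


# Check the uniqueness claim: no (form, category) pair may occur twice.
key_counts = Counter((r["form"], r["category"]) for r in rows)
duplicates = [key for key, n in key_counts.items() if n > 1]
```

Calling `expansions("MD")` returns both readings at once, which is useful when the category is not known in advance, while `duplicates` should stay empty for any table that honors the form + category rule.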
Variables
form. the abbreviation, acronym, or initialism
full_form. full expanded form (e.g., "Doctor" for "dr.", "F.Y.I." for "FYI", "National Aeronautics and Space Administration" for "NASA")
category. type of abbreviation: honorific, military, title, month, measurement, measurement_time, publication, place, organization, versus, abbreviation, initialism, acronym, academic, economic, education, fictional, finance, firearm, latin, legal, medical, misc, scientific, slang, stage_direction, technology, time, country_alpha2, country_alpha3, us_state, us_district, us_territory, ca_province, ca_territory
description. brief description of the entry type or context (e.g., "honorific or title prefix", "initialism", "government agency", "unit of measurement", "sovereign state", "USPS state code"). For initialisms, gives a classification such as "government agency", "company", "organization", "technology", or "medical condition".
source. data source attribution
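
The five variables and the closed list of category values lend themselves to a simple row check. The sketch below is one possible way to do that in Python, assuming each row is a dictionary keyed by the variable names above. The function name `validate_row` and the sample rows are hypothetical, and the "source" value is a placeholder, not a real attribution.

```python
# Column names as listed under Variables.
COLUMNS = ("form", "full_form", "category", "description", "source")

# Category values as listed for the category variable.
CATEGORIES = frozenset("""
    honorific military title month measurement measurement_time publication
    place organization versus abbreviation initialism acronym academic
    economic education fictional finance firearm latin legal medical misc
    scientific slang stage_direction technology time country_alpha2
    country_alpha3 us_state us_district us_territory ca_province ca_territory
""".split())


def validate_row(row):
    """Return a list of problems found in one row (empty if none)."""
    problems = [f"missing column: {c}" for c in COLUMNS if c not in row]
    category = row.get("category")
    if category is not None and category not in CATEGORIES:
        problems.append(f"unknown category: {category}")
    return problems


# Hypothetical rows; the "source" value is a placeholder.
good = {"form": "dr.", "full_form": "Doctor", "category": "honorific",
        "description": "honorific or title prefix", "source": "example"}
bad = dict(good, category="honourific")
```

Here `validate_row(good)` returns an empty list, while `validate_row(bad)` reports the misspelled category.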
