
A comprehensive dictionary of English occupation titles from O*NET 30.2, including canonical titles, alternate titles, and sample reported titles organized by SOC major groups. Useful for text normalization and occupation classification in NLP pipelines.

Format

A data frame with 58556 rows and 6 variables.

Source

O*NET 30.2 (CC BY 4.0, USDOL/ETA)

Variables

  • form. occupation title as commonly found in text (lowercase)

  • canonical. official O*NET-SOC occupation title

  • code. O*NET-SOC classification code (e.g., "33-1011")

  • category. human-readable SOC major group name (e.g., "management", "healthcare_practitioner_technical")

  • category_code. SOC major group code (e.g., "11-0000", "29-0000")

  • source. data source attribution
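
Example

The variables above suggest a simple lookup: lowercase the input text, match it against form, and read off canonical and code. The sketch below illustrates that idea in Python on three made-up rows that only mimic the documented columns. The rows, the source value, and the helper names (build_index, normalize_title, major_group_code) are assumptions for illustration and are not part of the dataset or of any package API. It also assumes that category_code follows the standard SOC convention of the first two digits of code followed by "-0000", which is consistent with the examples listed above.

```python
import re

# Illustrative rows only: they mimic the six documented columns but are
# NOT copied from the dataset itself.
SAMPLE_ROWS = [
    {"form": "chief executive officer", "canonical": "Chief Executives",
     "code": "11-1011", "category": "management",
     "category_code": "11-0000", "source": "O*NET 30.2"},
    {"form": "registered nurse", "canonical": "Registered Nurses",
     "code": "29-1141", "category": "healthcare_practitioner_technical",
     "category_code": "29-0000", "source": "O*NET 30.2"},
    {"form": "staff nurse", "canonical": "Registered Nurses",
     "code": "29-1141", "category": "healthcare_practitioner_technical",
     "category_code": "29-0000", "source": "O*NET 30.2"},
]


def major_group_code(code):
    """Return the SOC major group code: first two digits plus '-0000'."""
    return code[:2] + "-0000"


def build_index(rows):
    """Map each lowercase surface form to its full row."""
    return {row["form"]: row for row in rows}


def normalize_title(text, index):
    """Collapse whitespace, lowercase, and look the title up by form.

    Returns the matching row, or None when the form is not in the index.
    """
    key = re.sub(r"\s+", " ", text).strip().lower()
    return index.get(key)


index = build_index(SAMPLE_ROWS)
match = normalize_title("  Staff   NURSE ", index)
if match is not None:
    print(match["canonical"], match["code"], match["category_code"])
```

Exact matching on form is the simplest possible strategy. Real text usually needs tokenization or fuzzy matching on top of it, which this sketch deliberately leaves out.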