A dataset containing typicality ratings, production frequency, and response time data for English words within semantic categories. Covers both traditional Battig-style categories (concrete and abstract) and object categories from the THINGS database.

Format

A data frame with 3033 rows and 9 variables.

Source

Banks, B., & Connell, L. (2022). Category production norms for 117 concrete and abstract categories. Behavior Research Methods, 55, 1292-1313. doi:10.3758/s13428-021-01787-z

Stoinski, L. M., Perkuhn, J., & Hebart, M. N. (2023). THINGSplus: New norms and metadata for the THINGS database. Behavior Research Methods, 55, 2884-2901. doi:10.3758/s13428-023-02110-8

Details

Typicality indicates how representative a word is of its category (e.g., "robin" is a typical bird, "penguin" is atypical). Higher values indicate greater typicality within the category.

This dictionary combines two sources:

  • Banks & Connell (2022): 678 entries across 117 categories (67 concrete, 50 abstract) with rated typicality on a 1-5 Likert scale, production frequency, mean rank, and response latency. Includes abstract categories not typically found in object-only norms (e.g., "a crime", "a virtue").

  • THINGSplus (Stoinski et al., 2023): 2,355 entries across 53 object categories with typicality scores on a 0-1 scale and response times. Licensed under CC-BY 4.0.

Note: typicality scales differ between sources. Banks & Connell uses a 1-5 Likert scale; THINGSplus uses a 0-1 normalized score. These are not directly comparable without rescaling.
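
One possible way to put the two scales on a common footing is sketched below. The data frame `toy` and its values are invented for illustration and are not taken from the dataset; only the column names `term`, `typicality`, and `source` come from this page. The linear mapping assumes the Likert endpoints 1 and 5 correspond to 0 and 1, which is a modelling choice rather than something either source prescribes.

```r
# Toy rows standing in for the real dataset; all values are made up.
toy <- data.frame(
  term       = c("robin", "penguin", "apple", "olive"),
  typicality = c(4.8, 2.1, 0.92, 0.35),
  source     = c("banks_connell_2022", "banks_connell_2022",
                 "thingsplus_2023", "thingsplus_2023"),
  stringsAsFactors = FALSE
)

# Option 1 (assumption): map the 1-5 Likert ratings linearly onto 0-1
# and leave the THINGSplus scores unchanged.
is_bc <- toy$source == "banks_connell_2022"
toy$typicality_01 <- ifelse(is_bc, (toy$typicality - 1) / 4, toy$typicality)

# Option 2: standardise within each source, which avoids assuming that
# the two scales share endpoints.
toy$typicality_z <- ave(toy$typicality, toy$source,
                        FUN = function(x) (x - mean(x)) / sd(x))
```

Within-source standardisation only supports statements such as "typical relative to other items from the same source"; neither option makes the raw ratings interchangeable.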

Variables

  • term. the category exemplar word (e.g., "robin", "penguin")

  • category. the superordinate semantic category (e.g., "bird", "fruit")

  • domain. category domain: "concrete", "abstract", or "object"

  • typicality. typicality rating (1-5 Likert for Banks & Connell; 0-1 for THINGSplus)

  • production_frequency. number of participants who produced this exemplar (Banks & Connell only; NA for THINGSplus)

  • rank. mean ordinal position in production (Banks & Connell only; NA for THINGSplus)

  • first_rank_freq. frequency of being produced first (Banks & Connell only; NA for THINGSplus)

  • response_time. mean response latency; units are source-dependent (seconds for Banks & Connell, milliseconds for THINGSplus)

  • source. data source attribution ("banks_connell_2022" or "thingsplus_2023")
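
Because response_time units depend on the source, it may help to convert to a single unit before pooling rows. The following is a minimal sketch, assuming the units are as listed above (seconds for Banks & Connell, milliseconds for THINGSplus). The data frame `toy_rt`, its values, and the new column `response_time_ms` are hypothetical and not part of the dataset.

```r
# Toy rows with invented values; only the column names come from this page.
toy_rt <- data.frame(
  term          = c("robin", "apple"),
  response_time = c(1.5, 850),
  source        = c("banks_connell_2022", "thingsplus_2023"),
  stringsAsFactors = FALSE
)

# Convert Banks & Connell latencies from seconds to milliseconds;
# THINGSplus latencies are assumed to be in milliseconds already.
toy_rt$response_time_ms <- ifelse(
  toy_rt$source == "banks_connell_2022",
  toy_rt$response_time * 1000,
  toy_rt$response_time
)
```

Even after unit conversion, the latencies come from different tasks, so comparisons across sources should be made with caution.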