A dataset containing predicted familiarity, age of acquisition, concreteness, and imagery scores for 85,942 words, obtained from Paetzold and Specia (2016).

bootstrap_mrc

Format

A data frame with 85,942 rows and 5 variables.

Source

https://doi.org/10.18653/v1/N16-1050

Variables

  • term. unique word

  • familiarity. predicted score indicating how commonly a term is seen, heard, or used daily.

  • acquisition_age. predicted score indicating the age at which a term is believed to be learned.

  • concreteness. score indicating how "palpable" the object the word refers to is.

  • imagery. score indicating the intensity with which a term arouses images.
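
The variables above lend themselves to a simple lookup: join a list of words against the term column and summarise the scores. The following is a minimal sketch only. It does not load the real data; toy_mrc is a hypothetical stand-in with the same five columns, its scores are invented placeholders rather than values from the dataset, and tokens is a made-up word list.

```r
# Toy stand-in for bootstrap_mrc: same five columns, but every score below
# is an invented placeholder, NOT a value from the real dataset.
toy_mrc <- data.frame(
  term            = c("apple", "justice", "river"),
  familiarity     = c(600, 500, 550),
  acquisition_age = c(200, 500, 300),
  concreteness    = c(620, 300, 600),
  imagery         = c(630, 350, 610),
  stringsAsFactors = FALSE
)

# Words from a hypothetical tokenized text.
tokens <- data.frame(
  term = c("river", "apple", "zebra"),
  stringsAsFactors = FALSE
)

# Left join: keep every token and attach scores where the term is present.
scored <- merge(tokens, toy_mrc, by = "term", all.x = TRUE)

# Tokens absent from the lookup table end up with NA scores.
missing_terms <- scored$term[is.na(scored$concreteness)]

# Mean concreteness over the tokens that were matched.
mean_concreteness <- mean(scored$concreteness, na.rm = TRUE)
```

With the real data frame in place of toy_mrc, the same join would apply unchanged. Words not covered by the dataset come back as NA, so summaries should pass na.rm = TRUE or filter those rows first.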

References

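Paetzold, Gustavo Henrique and Specia, Lucia (2016). Inferring Psycholinguistic Properties of Words. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).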