A dataset of 24,000 short stories. Each story has five components: (1) norm, (2) situation, (3) intention, (4) normative/deviant action, and (5) normative/deviant consequence. Columns can be combined to form full-text stories. See Details.
Usage
data("corpus_moral_stories")
Details
Each row contains one short story, with one sentence per column for each of the basic components. A row describes either a moral action with its moral consequence or an immoral action with its immoral consequence. Each moral story has a nearly identical immoral counterpart. For example:
It is good to earn income to support your family. Phil was trying to find ways to help his family finances. Phil wants to help the bottom line.
In the deviant scenario:
Phil decides he and his family need to spend less money. Phil manages to cut the water bill in half before his family complains about the shower time limits.
In the normative scenario:
Phil decides that he and his family need to earn more money. Phil signs up for Mturk tasks and starts working on HITs all day to earn money.
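The components of the example above can be pasted into one text per row. The following is a minimal sketch in base R, not the collators' own procedure. It uses a toy data frame in place of the loaded data, the doc_id values are hypothetical, and it assumes that the action and consequence columns for the condition a row does not describe are NA.

```r
# Toy stand-in for two rows of corpus_moral_stories, using the example above.
# Assumption: columns for the condition a row does not describe are NA.
stories <- data.frame(
  doc_id = c("toy_1", "toy_2"),  # hypothetical identifiers
  norm = rep("It is good to earn income to support your family.", 2),
  situation = rep("Phil was trying to find ways to help his family finances.", 2),
  intention = rep("Phil wants to help the bottom line.", 2),
  moral_action = c(
    "Phil decides that he and his family need to earn more money.", NA),
  moral_consequence = c(
    "Phil signs up for Mturk tasks and starts working on HITs all day to earn money.", NA),
  immoral_action = c(
    NA, "Phil decides he and his family need to spend less money."),
  immoral_consequence = c(
    NA, "Phil manages to cut the water bill in half before his family complains about the shower time limits."),
  label = c(1, 0),
  stringsAsFactors = FALSE
)

# Pick the action and consequence that match each row's label.
is_moral <- stories$label == 1
action <- ifelse(is_moral, stories$moral_action, stories$immoral_action)
consequence <- ifelse(is_moral, stories$moral_consequence, stories$immoral_consequence)

# Join the five components into one full-text story per row.
stories$text <- paste(
  stories$norm, stories$situation, stories$intention, action, consequence
)
```

Which components to include in the text is a modeling choice; dropping the norm from the paste() call, for instance, yields stories that do not state the rule being tested.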
The data split follows the collators' "norm distance" strategy. Stories were clustered by their norms into 1,000 clusters, which were ordered by their isolation from other norms. Stories with norms from the most isolated clusters were assigned to the testing and validation sets, with the rest forming the training set.
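The predefined partitions can be recovered from the split column. The following is a minimal sketch in base R that uses a small toy data frame with hypothetical doc_id values in place of the loaded data; the split values (train, test, valid) and the label coding are those documented for the dataset.

```r
# Toy stand-in with the documented doc_id, label, and split columns.
stories <- data.frame(
  doc_id = paste0("toy_", 1:6),  # hypothetical identifiers
  label = c(1, 0, 1, 0, 1, 0),   # 1 = moral, 0 = immoral
  split = c("train", "train", "train", "train", "test", "valid"),
  stringsAsFactors = FALSE
)

# Subset rows by their assigned partition.
train <- stories[stories$split == "train", ]
test <- stories[stories$split == "test", ]
valid <- stories[stories$split == "valid", ]

# Cross-tabulate partition against label to check class balance.
split_by_label <- table(stories$split, stories$label)
```

Because the test and validation stories come from the most isolated norm clusters, re-shuffling rows at random would likely discard that property, so the stored split is the safer default.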
References
Emelin, Denis, Ronan Le Bras, Jena D. Hwang, Maxwell Forbes, and Yejin Choi. (2020). "Moral stories: Situated reasoning about norms, intents, actions, and their consequences." https://arxiv.org/abs/2012.15738
Format
doc_id. Unique identifier for each story and condition
norm. The norm to be followed
situation. The situation in which the norm is relevant
intention. The actor's intention regarding the norm
moral_action. The moral action the actor takes
moral_consequence. The moral consequence of the moral action
immoral_action. The immoral action the actor takes
immoral_consequence. The immoral consequence of the immoral action
label. Whether the action is moral (1) or immoral (0)
split. Whether the row is part of the training (train), testing (test), or validation (valid) set