A dataset of 30,965 emails drawn from the Enron Email Dataset. It includes only emails that were stored in the "inbox" folders and were internal (sent both to and from Enron email addresses). The original Enron Email Dataset was collected by the CALO Project (A Cognitive Assistant that Learns and Organizes) and contains about 500,000 emails.
data("corpus_enron")
A data frame with 30,965 rows and 7 variables.
https://www.cs.cmu.edu/~enron/
doc_id. Unique identifier for each email
folder. Employee account (mailbox) in which the email was stored
from. Sender of the email
to. Recipients of the email (may contain multiple email addresses)
date_time. Time and date the email was sent
subject. Subject of the email
text. Main text of the email
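
The column layout above can be explored with base R alone. The sketch below is only an illustration: it builds a small toy data frame with the same seven column names rather than loading corpus_enron itself, and every value in it is invented. The POSIXct type for date_time and the comma separator inside the to field are assumptions, not documented properties of the dataset, so check the real columns before reusing this.

```r
# Toy stand-in with the same seven columns as corpus_enron.
# All values are invented for illustration only.
toy <- data.frame(
  doc_id    = c("e1", "e2", "e3"),
  folder    = c("user-a", "user-a", "user-b"),
  from      = c("a@enron.com", "b@enron.com", "a@enron.com"),
  to        = c("b@enron.com", "a@enron.com, c@enron.com", "c@enron.com"),
  # Assumption: date_time is stored as POSIXct
  date_time = as.POSIXct(c("2001-05-01 09:00:00", "2001-05-01 10:30:00",
                           "2001-05-02 14:15:00"), tz = "UTC"),
  subject   = c("Meeting", "Re: Meeting", "Report"),
  text      = c("See you at 10.", "Confirmed.", "Draft attached."),
  stringsAsFactors = FALSE
)

# Number of emails per employee account
emails_per_folder <- table(toy$folder)

# Split the recipient field into individual addresses.
# Assumption: addresses are separated by commas.
recipients   <- strsplit(toy$to, ",\\s*")
n_recipients <- lengths(recipients)

print(emails_per_folder)
print(n_recipients)
```

Once the real data is loaded with data("corpus_enron"), the same calls should apply after replacing toy with corpus_enron, provided the recipient separator matches.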