Baselight

DailyDialog: Multi-Turn Dialog+Intention+Emotion

Human-written dialogues with communication intention and emotion labels

@kaggle.thedevastator_dailydialog_multi_turn_dialog_with_intention_and


About this Dataset

By daily_dialog (From Huggingface) [source]


The DailyDialog dataset is a meticulously curated collection of multi-turn dialogues that aims to accurately represent the way we communicate in our daily lives. It covers a wide range of topics that are relevant to our everyday experiences. What sets this dataset apart is that it includes human-written conversations, which means the language used is more natural and realistic, resulting in less noise and higher quality data.

Each dialogue in the dataset consists of two or more participants engaging in a conversation. The conversations are provided in textual form, allowing for easy analysis and processing. Alongside the dialogues, there are also corresponding labels for communication intention and emotion attached to each utterance.

The communication intention labels categorize each utterance based on its intended purpose or goal within the conversation. These categories provide valuable insights into how different participants express their intentions through speech.

In addition to the communication intention labels, there are also emotion labels assigned to each utterance in the dialogues. These emotion labels capture the emotional state or sentiment expressed by participants during various points in the conversation.

To facilitate model evaluation and testing, DailyDialog provides three separate files: validation.csv, train.csv, and test.csv. The validation set (validation.csv) contains dialogues with their respective communication intention and emotion labels for assessing model performance during development stages. The train set (train.csv) includes dialogues paired with corresponding communication intention and emotion labels for training purposes. Lastly, test.csv serves as an independent test set that enables evaluating models' proficiency by providing unseen dialogues along with their associated communication intention and emotion labels.

Overall, DailyDialog pairs realistic daily-life conversations with utterance-level labels for both communication intention and emotion. This makes it a useful resource for developing dialogue systems that need to identify the intentions behind speech acts and the emotional states expressed in everyday exchanges.

How to use the dataset

Welcome to the DailyDialog dataset! This high-quality multi-turn dialog dataset has been curated to reflect our daily communication style and covers a wide range of topics related to our everyday lives. The dataset consists of human-written conversations, making it less noisy and more realistic. Each conversation in the dataset has been manually labeled with communication intention and emotion information, providing valuable insights into the dialogues.

To make the most of this dataset, here is a step-by-step guide on how you can use it effectively:

  • Understanding the columns:
    • dialog: This column contains the actual conversation between two or more participants. It is in text format.
    • act: The act column represents the communication intention labels for each utterance in the dialogue. These labels categorize each utterance based on its intention.
    • emotion: The emotion column contains emotion labels for each utterance in the dialogue. These labels represent the emotions expressed during that particular utterance.
  • Familiarize yourself with validation.csv:
    • The validation.csv file serves as a validation set for evaluating your model's performance. It contains pre-labeled conversations along with their corresponding communication intentions and emotion labels.
  • Explore train.csv for training purposes:
    • The train.csv file is meant for training purposes and provides conversations along with their communication intentions and emotion labels.
  • Test your model using test.csv:
    • The test.csv file contains conversations along with their communication intention and emotion labels. Use it to evaluate your model on unseen dialogues once training is complete.
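The steps above can be sketched with the Python standard library. This is a minimal sketch, not the dataset authors' loader: it assumes the act and emotion cells hold a bracketed sequence of integers (for example "[2 1]" or "[2, 1]"), which this page does not specify. The helper names parse_labels and load_split and the sample row are hypothetical. The dialog cell is left as raw text because its serialization is also unspecified.

```python
import csv
import io
import re

# Matches each run of digits, so both "[2 1]" and "[2, 1]" parse the same way.
INT_PATTERN = re.compile(r"\d+")


def parse_labels(cell):
    """Extract integer labels from a serialized cell (assumed format)."""
    return [int(token) for token in INT_PATTERN.findall(cell)]


def load_split(file_obj):
    """Read one split and return a list of dicts with parsed label lists."""
    rows = []
    for row in csv.DictReader(file_obj):
        rows.append({
            "dialog": row["dialog"],  # kept as raw text
            "act": parse_labels(row["act"]),
            "emotion": parse_labels(row["emotion"]),
        })
    return rows


# Hypothetical two-utterance row, for illustration only.
sample_csv = (
    "dialog,act,emotion\n"
    "\"['Hi , how are you ?' 'Fine , thanks .']\",[2 1],[0 4]\n"
)
rows = load_split(io.StringIO(sample_csv))
print(rows[0]["act"], rows[0]["emotion"])
```

For a real file, pass an open handle instead of the in-memory sample, for example open("train.csv", newline="", encoding="utf-8"). Since every utterance is described as carrying one act and one emotion label, checking that the two lists have equal length is a cheap sanity test after loading.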

Finally, remember that this DailyDialog dataset offers an excellent opportunity to develop models capable of understanding multi-turn dialogues in a wide range of everyday scenarios. By utilizing both communication intention and emotion information provided, you can gain valuable insights into analyzing human conversations.

So dive into this rich resource, experiment with different techniques such as natural language processing and machine learning, and discover new ways of understanding and modeling human dialogues!

Research Ideas

  • Natural Language Processing: This dataset can be used for training NLP models to understand and generate more realistic and human-like dialogues. The communication intention labels can help in identifying the purpose or goal of each utterance, while the emotion labels can add emotional context to the conversations.
  • Sentiment Analysis: With the emotion labels, this dataset can be used for sentiment analysis tasks to classify the overall sentiment of a conversation or individual utterances. It can be valuable for understanding customer feedback, social media discussions, and other text-based conversations.
  • Dialogue Generation: Using this dataset, one can train dialogue generation models that create realistic and engaging conversations on various daily life topics. The communication intention labels can guide the model in generating appropriate responses based on the different intents expressed in the dialogues.
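As an illustration of the sentiment analysis idea, utterance-level emotion labels can be aggregated into a single conversation-level label. The sketch below uses a simple majority vote over non-neutral labels. This is one possible heuristic, not a method prescribed by the dataset, and it assumes that label 0 means "no emotion", which this page does not state. The function name dominant_emotion is hypothetical.

```python
from collections import Counter


def dominant_emotion(emotion_ids, neutral_id=0):
    """Return the most frequent non-neutral label, or neutral_id if none occurs.

    neutral_id=0 is an assumption about the label encoding.
    """
    counts = Counter(e for e in emotion_ids if e != neutral_id)
    if not counts:
        return neutral_id
    # Ties are resolved in favor of the label that appeared first.
    return counts.most_common(1)[0][0]


print(dominant_emotion([0, 4, 4, 0, 5]))  # prints 4
print(dominant_emotion([0, 0]))           # prints 0
```

Ignoring the neutral label keeps conversations from collapsing to "no emotion" when only a few utterances carry an emotion. Whether that is desirable depends on the task.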

Acknowledgements

If you use this dataset in your research, please credit the original authors.
Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

  • dialog (Text): The conversation between two or more participants, in text format.
  • act (Categorical): The communication intention label for each utterance in the dialogue. These labels categorize the purpose behind each participant's speech, such as asking a question, making a statement, or making a request.
  • emotion (Categorical): The emotion label for each utterance, representing the emotion expressed by the participant, such as anger, happiness, or sadness.

File: train.csv

  • dialog (Text): The conversation between two or more participants, in text format.
  • act (Categorical): The communication intention label for each utterance in the dialogue. These labels categorize the purpose behind each participant's speech, such as asking a question, making a statement, or making a request.
  • emotion (Categorical): The emotion label for each utterance, representing the emotion expressed by the participant, such as anger, happiness, or sadness.

File: test.csv

  • dialog (Text): The conversation between two or more participants, in text format.
  • act (Categorical): The communication intention label for each utterance in the dialogue. These labels categorize the purpose behind each participant's speech, such as asking a question, making a statement, or making a request.
  • emotion (Categorical): The emotion label for each utterance, representing the emotion expressed by the participant, such as anger, happiness, or sadness.
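If the act and emotion columns are stored as integer ids, they need a mapping to readable names. The sketch below shows one way to decode them. The id-to-name tables are an assumption taken from the Hugging Face daily_dialog dataset card, not from this page, so verify them against your copy of the data. The names ACT_NAMES, EMOTION_NAMES, and decode are hypothetical.

```python
# Assumed mappings (from the Hugging Face daily_dialog dataset card).
# They are not stated on this page; check them before relying on them.
ACT_NAMES = {
    0: "__dummy__",
    1: "inform",
    2: "question",
    3: "directive",
    4: "commissive",
}
EMOTION_NAMES = {
    0: "no emotion",
    1: "anger",
    2: "disgust",
    3: "fear",
    4: "happiness",
    5: "sadness",
    6: "surprise",
}


def decode(ids, names):
    """Map integer ids to names, flagging ids missing from the mapping."""
    return [names.get(i, f"unknown({i})") for i in ids]


print(decode([2, 1, 3], ACT_NAMES))
print(decode([0, 4, 9], EMOTION_NAMES))
```

Unknown ids are flagged rather than raising an error, so a mismatch between the assumed mapping and the actual files is visible in the output.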

Acknowledgements

If you use this dataset in your research, please credit daily_dialog (From Huggingface).

Tables

Test

@kaggle.thedevastator_dailydialog_multi_turn_dialog_with_intention_and.test
  • 318.26 KB
  • 1000 rows
  • 3 columns

CREATE TABLE test (
  "dialog" VARCHAR,
  "act" VARCHAR,
  "emotion" VARCHAR
);

Train

@kaggle.thedevastator_dailydialog_multi_turn_dialog_with_intention_and.train
  • 3.31 MB
  • 11118 rows
  • 3 columns

CREATE TABLE train (
  "dialog" VARCHAR,
  "act" VARCHAR,
  "emotion" VARCHAR
);

Validation

@kaggle.thedevastator_dailydialog_multi_turn_dialog_with_intention_and.validation
  • 321.97 KB
  • 1000 rows
  • 3 columns

CREATE TABLE validation (
  "dialog" VARCHAR,
  "act" VARCHAR,
  "emotion" VARCHAR
);
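All three tables declare every column as VARCHAR, so label sequences come back from a query as text and must be parsed before use. The sketch below reproduces the train schema in an in-memory SQLite database to show this. SQLite stands in for the hosting platform's own SQL engine, whose dialect is not described here, and the inserted row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same column definitions as the train table above.
conn.execute(
    'CREATE TABLE train ("dialog" VARCHAR, "act" VARCHAR, "emotion" VARCHAR)'
)

# Hypothetical row; the serialized formats are assumptions.
conn.execute(
    "INSERT INTO train VALUES (?, ?, ?)",
    ("['Hi .' 'Hello .']", "[1 1]", "[0 4]"),
)

(row_count,) = conn.execute("SELECT COUNT(*) FROM train").fetchone()
(act_text,) = conn.execute('SELECT "act" FROM train').fetchone()

# The act value is returned as a string, not as a list of integers.
print(row_count, type(act_text).__name__, act_text)
conn.close()
```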
