Name: Helpful-Harmless Assistant Dataset (For RLHF)
Creator: Kaggle
License: https://creativecommons.org/publicdomain/zero/1.0/

About this Dataset

Helpful-Harmless Assistant Dataset (For RLHF)

17k Train, 9000 Test

By Huggingface Hub [source]

This dataset collects the helpful and harmless training and testing data from the Anthropic paper Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. The source description reports 87,096 samples in the helpful set, with 43,722 for training and 23,346 for testing, and 65,298 samples in the harmless set, with 42,394 for training and 2,304 for testing.

The description names six columns: context, the conversation between user and assistant; chosen, the response the user chose; rejected, the response the user rejected; policy_sim, the policy similarity; rewards, the rewards given to the assistant's actions after completion; and values, the confidence values assigned to each candidate recommendation shown to the user. The train.csv and test.csv files listed under Columns expose only context, chosen, and rejected. Together these columns give a basis for designing text-based assistants and for using human feedback to reduce the risks of automated decisions.

How to use the dataset

Research Ideas

Using the dataset to identify which types of conversation context are more likely to lead to helpful/harmless behavior from a conversational AI assistant.

Employing different machine learning algorithms on the dataset, such as clustering, neural networks or reinforcement learning, in order to determine what contexts and interactions yield the best results with a Helpful and Harmless Assistant.

Analyzing the data for patterns that can inform the design of computer agents that train on and interact with users through natural language conversations.
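
As a starting point for the pattern analysis suggested above, here is a minimal sketch that compares the length of chosen and rejected responses. The two rows and the helper name length_summary are hypothetical and only mimic the context/chosen/rejected layout; this is one possible first check, not a method taken from the paper.

```python
from statistics import mean

# Hypothetical rows shaped like the dataset's context/chosen/rejected columns.
pairs = [
    {"context": "Human: Suggest a book.",
     "chosen": "Try a short story collection to start.",
     "rejected": "No."},
    {"context": "Human: What is 2 + 2?",
     "chosen": "4",
     "rejected": "It is probably 5, I think."},
]


def length_summary(rows):
    """Summarize character lengths of chosen vs. rejected responses."""
    chosen_lengths = [len(r["chosen"]) for r in rows]
    rejected_lengths = [len(r["rejected"]) for r in rows]
    # Count the pairs in which the chosen response is strictly longer.
    longer = sum(c > r for c, r in zip(chosen_lengths, rejected_lengths))
    return {
        "mean_chosen_chars": mean(chosen_lengths),
        "mean_rejected_chars": mean(rejected_lengths),
        "share_chosen_longer": longer / len(rows),
    }


summary = length_summary(pairs)
print(summary)
```

On real data, a share far from 0.5 would suggest that response length correlates with preference, which is worth knowing before training on the pairs.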

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

Columns

File: train.csv

Column name	Description
context	The context of the conversation. (String)
chosen	The response the user chose. (String)
rejected	The response the user rejected. (String)

File: test.csv

Column name	Description
context	The context of the conversation. (String)
chosen	The response the user chose. (String)
rejected	The response the user rejected. (String)
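
The column layout above can be checked when reading either file. The sketch below is a minimal example that uses Python's standard csv module on an in-memory sample. The sample row and the helper name load_pairs are hypothetical, and it assumes the real files carry exactly these three header names with no extra index column.

```python
import csv
import io

EXPECTED_COLUMNS = ["context", "chosen", "rejected"]

# Hypothetical stand-in for train.csv or test.csv; the row is invented.
SAMPLE_CSV = (
    "context,chosen,rejected\n"
    '"Human: How do I boil an egg?",'
    '"Place it in boiling water for about ten minutes.",'
    '"I do not know."\n'
)


def load_pairs(handle):
    """Read preference rows and check the header against the documented columns."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


rows = load_pairs(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["chosen"])
```

To read a downloaded file instead, pass open("train.csv", newline="", encoding="utf-8") to load_pairs; the path is an assumption about where the file is saved.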

Tables

Test

@kaggle.thedevastator_dh_rlhf_helpful_harmless_assistant_dataset.test

10 MB
18,592 rows
3 columns

CREATE TABLE test (
  "context" VARCHAR,
  "chosen" VARCHAR,
  "rejected" VARCHAR
);

Train

@kaggle.thedevastator_dh_rlhf_helpful_harmless_assistant_dataset.train

181.59 MB
344,317 rows
3 columns

CREATE TABLE train (
  "context" VARCHAR,
  "chosen" VARCHAR,
  "rejected" VARCHAR
);
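
The table definitions above can be tried locally. The following sketch loads the train schema into an in-memory SQLite database through Python's sqlite3 module and runs a simple sanity query. SQLite is an assumption made for illustration, since the listing does not say which engine hosts the tables, and the inserted rows are hypothetical.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Same columns as the train table defined above.
conn.execute(
    'CREATE TABLE train ("context" VARCHAR, "chosen" VARCHAR, "rejected" VARCHAR)'
)

# Hypothetical rows; the second has identical chosen and rejected text.
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?)",
    [
        ("Human: Hi", "Hello! How can I help?", "Go away."),
        ("Human: Thanks", "You're welcome.", "You're welcome."),
    ],
)

# Count all rows and the rows where the two responses do not differ.
total, identical = conn.execute(
    'SELECT COUNT(*), SUM("chosen" = "rejected") FROM train'
).fetchone()
print(total, identical)
conn.close()
```

Rows whose chosen and rejected text are identical carry no preference signal, so a count like this is a reasonable first check before using the pairs for training.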