Helpful-Harmless Assistant Dataset (For RLHF)
17k Train, 9000 Test
By Huggingface Hub [source]
About this dataset
This dataset is a collection of helpful and harmless training and testing data from the Anthropic paper "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback". The helpful set is listed with 87,096 samples, including 43,722 in the training set and 23,346 in the testing set. The harmless set is listed with 65,298 samples, including 42,394 for training and 2,304 for testing.
Each sample is described as having 6 columns:
- context: the conversation between the user and the assistant.
- chosen: the response the user chose.
- rejected: the response the user rejected.
- policy_sim: the policy similarity.
- rewards: the rewards given to the assistant's actions after completion.
- values: the confidence values assigned to each recommendation presented to the user in the candidate list.
Together, these columns provide a basis for designing text-based assistants and for reducing the risks of machine decision-making by integrating human feedback.
How to use the dataset
Research Ideas
- Using the dataset to identify which types of conversation context are more likely to lead to helpful/harmless behavior from a conversational AI assistant.
- Employing different machine learning algorithms on the dataset, such as clustering, neural networks or reinforcement learning, in order to determine what contexts and interactions yield the best results with a Helpful and Harmless Assistant.
- Analyzing the data for patterns that can inform strategies for designing computer agents that train on and interact with users through natural language conversations.
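As a small illustration of the pattern analysis suggested above, the sketch below measures how often the chosen response is longer than the rejected one. The function name `longer_chosen_rate` and the toy rows are hypothetical and invented for this example; only the `chosen` and `rejected` column names come from the column listing. Word count is just one possible signal, picked here because it needs no extra libraries.

```python
from typing import Dict, Iterable


def longer_chosen_rate(pairs: Iterable[Dict[str, str]]) -> float:
    """Return the fraction of pairs whose chosen response has more words
    than the rejected response."""
    total = 0
    longer = 0
    for pair in pairs:
        total += 1
        # Compare whitespace-separated word counts of the two responses.
        if len(pair["chosen"].split()) > len(pair["rejected"].split()):
            longer += 1
    if total == 0:
        raise ValueError("no pairs supplied")
    return longer / total


# Toy rows, invented for illustration; they are not taken from the dataset.
toy_pairs = [
    {"chosen": "Sure, here is a short recipe.", "rejected": "No."},
    {"chosen": "I can't help with that.",
     "rejected": "Here is how you would do it step by step."},
]

print(longer_chosen_rate(toy_pairs))
```

A rate far from one half on the real data would suggest that response length correlates with preference, which may be worth controlling for in later modeling.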
Acknowledgements
If you use this dataset in your research, please credit the original authors.
Data Source
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: train.csv
Column name | Description
context | The context of the conversation. (String)
chosen | The chosen response from the user. (String)
rejected | The rejected response from the user. (String)
File: test.csv
Column name | Description
context | The context of the conversation. (String)
chosen | The chosen response from the user. (String)
rejected | The rejected response from the user. (String)
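The files listed above can be read with Python's standard `csv` module. The following is a minimal sketch, assuming `train.csv` and `test.csv` are comma-separated, UTF-8 encoded, and have a header row with the three column names shown; the helper name `load_preference_pairs` is hypothetical and not part of the dataset.

```python
import csv
from typing import Dict, List

# Column names taken from the column listing for train.csv and test.csv.
EXPECTED_COLUMNS = ("context", "chosen", "rejected")


def load_preference_pairs(path: str) -> List[Dict[str, str]]:
    """Read a CSV file and return one dict per row, keeping only the
    documented columns. Raises ValueError if a column is missing."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in EXPECTED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        return [{name: row[name] for name in EXPECTED_COLUMNS} for row in reader]


# Example usage (assumes train.csv is in the working directory):
# train_pairs = load_preference_pairs("train.csv")
# print(len(train_pairs), train_pairs[0]["context"][:80])
```

Checking the header before reading rows makes a mismatch between the file and the column listing visible immediately instead of failing later with a `KeyError`.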
Acknowledgements
If you use this dataset in your research, please credit Huggingface Hub.