
Helpful-Harmless Assistant Dataset (For RLHF)

17k Train, 9000 Test

@kaggle.thedevastator_dh_rlhf_helpful_harmless_assistant_dataset


By Huggingface Hub [source]


About this dataset

This dataset is a collection of helpful and harmless training and testing data from the Anthropic paper Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. The helpful set comprises 43,722 training samples and 23,346 testing samples; the harmless set comprises 42,394 training samples and 2,304 testing samples. Each sample has three columns: context, the conversation between the user and the assistant up to the point where a response is required; chosen, the candidate response the human rater preferred; and rejected, the candidate response the human rater turned down. Together, these columns provide a comprehensive basis for designing text-based assistants whose behavior is shaped by human feedback, reducing the risks of unsupervised machine decision-making.
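
Because every row pairs a preferred response against a dispreferred one in the same context, the data fits the pairwise ranking objective commonly used to train RLHF reward models. The snippet below is a minimal sketch of that objective, not code from the paper; the score function standing in for a trained reward model is hypothetical.

import math

def pairwise_loss(score, context, chosen, rejected):
    # score is a hypothetical callable that maps a full transcript
    # (context plus candidate response) to a scalar reward.
    margin = score(context + chosen) - score(context + rejected)
    # Logistic (Bradley-Terry) ranking loss, i.e. -log(sigmoid(margin)):
    # the loss shrinks as the chosen response out-scores the rejected one.
    return math.log1p(math.exp(-margin))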


How to use the dataset
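
Both splits ship as plain CSV files with the three columns documented below, so they load directly with pandas. A minimal sketch, assuming the files have been downloaded locally as train.csv and test.csv:

import pandas as pd

# Each row holds one human preference judgement.
train = pd.read_csv("train.csv")  # 344,317 rows, 3 columns
test = pd.read_csv("test.csv")    # 18,592 rows, 3 columns

print(train.columns.tolist())     # ['context', 'chosen', 'rejected']
print(train.loc[0, "context"])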

Research Ideas

  • Using the dataset to identify which types of conversation context are most likely to elicit helpful or harmless behavior from a conversational AI assistant (see the sketch after this list).
  • Applying machine learning methods such as clustering, neural networks, or reinforcement learning to determine which contexts and interactions yield the best results from a helpful and harmless assistant.
  • Analyzing the data for patterns that can inform design strategies for computer agents that train on and interact with users through natural language conversation.
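
As a first pass on the first idea above, even crude surface statistics can expose systematic differences between preferred and rejected responses. A sketch, again assuming a locally downloaded train.csv; the keyword is an arbitrary illustration:

import pandas as pd

train = pd.read_csv("train.csv")

# Compare response lengths between preferred and rejected replies.
print("mean chosen length:  ", train["chosen"].str.len().mean())
print("mean rejected length:", train["rejected"].str.len().mean())

# Count contexts touching a potentially sensitive keyword.
sensitive = train["context"].str.contains("weapon", case=False, na=False)
print("contexts mentioning 'weapon':", sensitive.sum())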


License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

Columns

File: train.csv

Column name   Description
context       The context of the conversation. (String)
chosen        The response chosen by the human rater. (String)
rejected      The response rejected by the human rater. (String)

File: test.csv

Column name   Description
context       The context of the conversation. (String)
chosen        The response chosen by the human rater. (String)
rejected      The response rejected by the human rater. (String)

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Tables

Test

@kaggle.thedevastator_dh_rlhf_helpful_harmless_assistant_dataset.test
  • 9.54 MB
  • 18592 rows
  • 3 columns

CREATE TABLE test (
  "context" VARCHAR,
  "chosen" VARCHAR,
  "rejected" VARCHAR
);

Train

@kaggle.thedevastator_dh_rlhf_helpful_harmless_assistant_dataset.train
  • 173.17 MB
  • 344317 rows
  • 3 columns

CREATE TABLE train (
  "context" VARCHAR,
  "chosen" VARCHAR,
  "rejected" VARCHAR
);
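
The tables above can also be queried in place over the downloaded CSV files, for example with DuckDB; the file name is again an assumption:

import duckdb

# DuckDB queries the CSV directly; no separate load step is needed.
rows = duckdb.sql("""
    SELECT context, chosen, rejected
    FROM 'test.csv'
    LIMIT 5
""").fetchall()

for context, chosen, rejected in rows:
    print(context[:80].replace("\n", " "))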
