All GPT-4 Conversations
All chat datasets generated by GPT-4 from Huggingface in the same format
@kaggle.thedevastator_all_gpt_4_synthetic_chat_datasets
This dataset includes all GPT-4-generated chat conversations hosted in open Hugging Face datasets.
Everything is converted to the same format, so the datasets can be easily merged and used for large-scale training of LLMs.
It is a collection of several individual chat datasets.
If you use this dataset in your research, please credit the original authors of the constituent datasets.
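Because every table shares the same four columns, merging can be sketched as a plain `UNION ALL`. The snippet below is a minimal illustration using Python's built-in `sqlite3` module, not the dataset's own tooling: the rows, the `message_type` values (`input`, `output`), and the added `source` column are assumptions made for the example. Tagging each row with its source table is one way to keep conversations apart in case `conversation_id` values repeat across tables.

```python
import sqlite3

# Two table names taken from the schema; the rows inserted below are made up.
TABLES = ["alpaca_data_cleaned", "goat"]

SCHEMA = (
    '"message" VARCHAR, "message_type" VARCHAR, '
    '"message_id" BIGINT, "conversation_id" BIGINT'
)


def build_demo_db():
    """Create an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    for name in TABLES:
        conn.execute(f'CREATE TABLE "{name}" ({SCHEMA})')
    # Hypothetical message_type values; the real labels may differ.
    conn.executemany(
        'INSERT INTO "alpaca_data_cleaned" VALUES (?, ?, ?, ?)',
        [("placeholder prompt", "input", 0, 0), ("placeholder reply", "output", 1, 0)],
    )
    conn.executemany(
        'INSERT INTO "goat" VALUES (?, ?, ?, ?)',
        [("another prompt", "input", 0, 0), ("another reply", "output", 1, 0)],
    )
    return conn


def merge_tables(conn, tables):
    """Stack all tables into one result set, tagging each row with its source table."""
    selects = [
        f"SELECT '{name}' AS source, message, message_type, message_id, conversation_id "
        f'FROM "{name}"'
        for name in tables
    ]
    query = " UNION ALL ".join(selects) + " ORDER BY source, conversation_id, message_id"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in merge_tables(build_demo_db(), TABLES):
        print(row)
```

The table names are interpolated into the query because SQL parameters cannot stand in for identifiers, so they should come from a trusted list such as the schema itself.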
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
CREATE TABLE alpaca_data_cleaned (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE code_alpaca_data (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE conala_mined (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE conala_paired_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE conala_paired_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE glaive_function_calling (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE goat (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE gorilla_16k (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE gsm8k_main_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE gsm8k_main_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE gsm8k_socratic_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE gsm8k_socratic_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE lima_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE lima_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE med_alpaca_data (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE puffin (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE riddle_sense_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE riddle_sense_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE riddle_sense_validation (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE science_qa_txt_only_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE science_qa_txt_only_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE science_qa_txt_only_validation (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE sciq_test (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE sciq_train (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);

CREATE TABLE sciq_validation (
"message" VARCHAR,
"message_type" VARCHAR,
"message_id" BIGINT,
"conversation_id" BIGINT
);
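
The schema stores one message per row, so a training pipeline would typically regroup rows into conversations first. The sketch below shows one way to do that in Python. It assumes that `conversation_id` identifies a conversation within a single table and that `message_id` gives the order of messages inside it; neither is stated in the description above. The sample rows and `message_type` labels in the usage section are hypothetical.

```python
from collections import defaultdict


def group_conversations(rows):
    """Group (message, message_type, message_id, conversation_id) rows.

    Returns a dict mapping conversation_id to a list of
    (message_type, message) pairs sorted by message_id.
    """
    buckets = defaultdict(list)
    for message, message_type, message_id, conversation_id in rows:
        buckets[conversation_id].append((message_id, message_type, message))

    conversations = {}
    for conversation_id, messages in buckets.items():
        # Sort on message_id only, so ties never fall back to comparing text.
        ordered = sorted(messages, key=lambda item: item[0])
        conversations[conversation_id] = [(mtype, text) for _, mtype, text in ordered]
    return conversations


if __name__ == "__main__":
    # Made-up rows, deliberately out of order.
    sample = [
        ("placeholder reply", "output", 1, 7),
        ("placeholder prompt", "input", 0, 7),
        ("another prompt", "input", 0, 8),
    ]
    for conversation_id, turns in group_conversations(sample).items():
        print(conversation_id, turns)
```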