Dataset
Provider
Updated at
Tables
Size
Stars
Chemistry Problem-Solution
Chemistry Problem-Solution Dataset: 20K pairs across 25 topics and subtopics
Other
Openerotica/basilisk-v0.2 Conversations Dataset
Annotated Conversations from openerotica and freedom-rp
Other
GPT Roleplay Realm: Enhanced Character
Character Cards and Dialogues for immersive role-playing experiences
Other
Smithsonian Butterfly Dataset
Butterfly images and information from the Smithsonian Institution
Other
General Language Understanding Evaluation (GLUE)
The famous General Language Understanding Evaluation benchmark
Other
DailyDialog (Multi-turn Dialog)
Dialogues that reflect everyday communication and cover various topics
Other
Conversations On Coding, Debugging, Storytelling
Conversations on Coding, Debugging, Storytelling & Science
Other
ProsocialDialog - Problematic Content Dialogue
Teach conversation agents to respond to problematic topics
Other
OpenBookQA (Multi-step Reasoning)
Multi-step Reasoning, Commonsense Knowledge, and Rich Text Comprehension
Other
All GPT-4 Conversations
All chat datasets generated by GPT-4 from Hugging Face in the same format
Other
McDonalds Ice Cream Machines Breaking - Timeseries
Is the McDonald’s ice cream machine broken? [locations & times]
Other
English Monarchs & Marriages
Names, ages, and marriages of English royals from 850 to the present
Other
Tornado Tracks
Tornado tracks in the U.S., Puerto Rico, and the U.S. Virgin Islands from 1950 to 2013
Other
Room Occupancy Estimation
Estimate the precise number of occupants in a room using multiple environmental sensors
Other
National Poll On Healthy Aging (NPHA)
A subset of the NPHA dataset filtered down to develop and validate ML algorithms
Other