Dataset
Provider
Updated at
Tables
Size
Stars
BoolQ - Question-Answer-Passage Consistency
BoolQ Dataset: Question-Answer-Passage Consistency
Other
Wikipedia Biographies Text Generation Dataset
Wikipedia Biographies: Infobox and First Paragraphs Texts
Other
HellaSwag: Commonsense NLI
ACL2019 Dataset for Testing Machine's Sentence Completion Abilities
Other
File Validation And Training Statistics
Validation, Training, and Testing Statistics for tasksource/leandojo Files
Other
TruthfulQA: Benchmark For Evaluating Language
Evaluating truthfulness in language models' answers
Other
Symbolic Correlation Dataset For LLMs
Exploring the Relationship between Knowledge and Language
Other
WikiSQL (Questions And SQL Queries)
80,654 hand-annotated questions and SQL queries on 24,241 Wikipedia tables
Other
ASLG-PC12 (English-ASL Gloss Parallel Corpus 2012)
Interactions between Corpus and Lexicon LREC
Other
Cmrc2018 - Chinese Machine Reading Comprehension
Chinese MRC Dataset with Language Diversities
Other
English-Darija Bilingual Text (Moroccan Arabic)
English-Darija Bilingual Corpus for Machine Translation
Other
TokenBender: Alpaca Code Generation Instructions
Generating Alpaca-style code from natural language instructions
Other
Knowledge Symbolic Correlation With LLMs
Building a Bridge Between Prompts and Knowledge for Large Language Models
Other
CommonsenseQA (Multiple-Choice Q&A)
12,102 questions with one correct answer and four distractor answers
Other
SciTail (Multiple-choice Science Exams)
27,026 Multiple-choice science exams and web sentences
Other
Large-Scale Preference Dataset
Training Powerful Reward & Critic Models with Aligned Language Models
Other