Data Catalog
Explore, analyze, and share quality data.
Discovering Answers with Expertise
3 tables
26.61 MB
Multi-Genre Natural Language Inference (MultiNLI)
3 tables
205.18 MB
Chinese MRC Dataset with Language Diversities
3 tables
5.22 MB
English-Darija Bilingual Corpus for Machine Translation
1 table
22.2 MB
Enhanced erotica dataset with longer context samples
1 table
95.36 MB
Synthetic training data for LLM development
1 table
122.33 MB
kubectl commands and descriptions for Kubernetes
1 table
3.48 MB
Performance Validation for Cricket Commentary Model
3 tables
6.62 MB
Text classification dataset for question answering
3 tables
12.7 MB
Accurate Medical Translation Dataset
1 table
2.34 MB
Textual Entailment Dataset with Labelled Text Pairs
3 tables
49.44 MB
Gender-biased coreference dataset focused on occupation stereotypes in WinoBias
8 tables
265.21 KB
Multilingual named entity recognition for LLM training
528 tables
130.87 MB
Multilingual Question-Answering Dataset
116 tables
247.54 MB
Portuguese NER Corpus with 10 Classes
3 tables
432.18 KB
Text Classification Dataset with 14 Classes
2 tables
110.63 MB
Language-guided Generalist Agents for Web Tasks
1 table
776.76 MB
Biology Problem-Solution Pairs for Synthetic Biology
1 table
20.85 MB
A curated dataset for math instruction tuning models
1 table
93.14 MB
Generating Alpaca-style code from natural language instructions
1 table
67.48 MB
Building a Bridge Between Prompts and Knowledge for Large Language Models
1 table
127.1 KB
Instruct dataset generated from starcoder
4 tables
10.33 MB
Predicting Binary Preferences with SFT, PPO and DPO
6 tables
614.3 MB
Conversation, Prompts, and Tags
3 tables
7.13 MB