Baselight
1 option selected: Other
1 option selected: Kaggle
1292 results

SQL Create Context

Uncovering Implications and Insights

Other
11 months ago
1
6.39 MB
0

OpenAI Summarization Corpus

Training and Validation Data from TL;DR, CNN, and Daily Mail

Other
11 months ago
4
68.93 MB
0

Anthropic Helpfulness-Harmlessness Preference

Iterative Human-in-the-Loop Solutions

Other
11 months ago
2
181.66 MB
0

Android Games

Games released for the Android OS

Other
11 months ago
1
16.74 kB
0

Alpaca Cleaned

Improving Pretrained Language Model Understanding

Other
11 months ago
1
23.83 MB
0

Humans Interaction Choice Rejection

Investigating Responses Through Selection and Rejection

Other
11 months ago
2
134.38 MB
0

European Alps Snow Depth Observations

Spatial and Long-term Trends 1971-2019

Other
11 months ago
20
66.96 MB
0

South Park Scripts Dataset

All the Words, All the Time

Other
11 months ago
20
6.39 MB
0

Miss America Titleholders

Miss America over the years

Other
11 months ago
2
29.16 kB
0

Games By Ubisoft

Games released by Ubisoft

Other
11 months ago
1
8.82 kB
0

Rocket Launch Sites

Sites used for rocket launches

Other
11 months ago
16
142.57 kB
0

Mammal Species & Taxonomic Changes

Taxonomic Changes & Type Specimen Metadata

Other
11 months ago
4
2.17 MB
0

Evol Codealpaca V1

An Innovative Augmentation Strategy for NLP

Other
11 months ago
1
135.42 MB
0

Web-Harvested Image And Caption Dataset

Web-Harvested Image and Caption Dataset

Other
11 months ago
2
361.36 MB
0

NER Tagged Text Dataset

NER Tagged Text Dataset

Other
11 months ago
3
174.25 MB
0

Short Jokes Dataset

Humorous Short Jokes

Other
11 months ago
1
15.95 MB
0

WebGL Model-based QA

WebGL Model-based Questions and Answering

Other
11 months ago
3
70.35 MB
0

Korean Translation Dataset For NLP Models

Translated Instructions and Input-Output Pairs in Korean

Other
11 months ago
1
125.95 MB
0

BSARD: French Belgian Law Dataset For IR

Retrieving Relevant Statutes for Legal Questions

Other
11 months ago
3
4.01 MB
0

LongAlpaca 12K

LongAlpaca - Generating instruct datasets from language models (longform)

Other
11 months ago
1
265.85 MB
0

Open-Platypus Logical Reasoning

Keyword Search and Sentence Transformation for Training

Other
11 months ago
1
15.33 MB
0

Atari 2600 Games

A Comprehensive Collection of Game Data

Other
11 months ago
3
43.98 kB
0

Covers Of Michael Jackson

A Comprehensive Collection of the King of Pop's Best and Most Memorable Covers

Other
11 months ago
1
16.8 kB
0

Human Judgments On Model Conversations

Human Judgments on Conversational Models

Other
11 months ago
2
1.41 MB
0

CIFAR-10: Color Images, 10 Classes

CIFAR-10: Color Images, 10 Classes

Other
11 months ago
2
272.69 MB
0

DistillChat V1: Mixture Of Conversations

Conversational Dataset with Diverse Sources

Other
11 months ago
1
224.37 MB
0

Fictional Worlds

Immersive insights into diverse fictional realms

Other
11 months ago
1
16.68 MB
0

Multilingual NER Dataset

Multilingual NER Dataset for Named Entity Recognition

Other
11 months ago
27
113.07 MB
0

WIDER FACE: Face Detection Benchmark

Face Detection Dataset with Image IDs and Number of Faces Detected

Other
11 months ago
3
4.21 MB
0

Open Assistant

Over 10,000 Annotated Trees in 35 Languages

Other
11 months ago
2
48.77 MB
0

Sigfox And LoRaWAN Localization Tool

Evaluating Fingerprinting Localization Algorithms in Large Outdoor Areas

Other
11 months ago
4
6.13 MB
0

Soil Texture Classes (USDA) By Depth, 250m

A Refined Global Mapping for 1950-2017

Other
11 months ago
1
3.54 kB
0

Mental Illness Disparities In Vets

Comparative Rates of Diagnoses Among Vulnerable Veteran Groups

Other
11 months ago
1
32.77 kB
0

Global Hotspots Of Sharks And Longline Fishing

Machine-Learning-Assisted Spatial Distribution of At-Risk Species

Other
11 months ago
12
17.61 MB
0

Allegro Articles Summarization Dataset

Allegro Articles Summarization Source-Target Dataset

Other
11 months ago
3
199.86 MB
0

Rag Instruct Benchmark Tester

200 Samples for Enterprise Core Q&A Tasks

Other
11 months ago
1
46.2 kB
0

Alpaca GPT-4

High-Performance NLP for Instruction-Following Reasoning

Other
11 months ago
1
47.83 MB
0

Databricks Dolly (15K)

Over 15,000 Language Models and Dialogues for Interactive Chat Applications

Other
11 months ago
1
7.68 MB
0

MetXBioDB Metabolite Biotransformations

Enzyme-Catalyzed Metabolism Insights

Other
11 months ago
1
446.83 kB
0

Wikipedia Molecules Properties Dataset

Molecular Properties Dataset from Wikipedia

Other
11 months ago
1
1.83 MB
0

LAMBADA Word Prediction

Evaluating text understanding through word prediction

Other
11 months ago
3
552.45 MB
0

Question-Answering Training And Testing Data

A dataset for training and testing question-answering models

Other
11 months ago
2
83.38 MB
0

Orca DPO Dialogue Pairs

Orca style for preference training (Intel's DPO dataset)

Other
11 months ago
1
18.88 MB
0

PAWS (Paraphrase Word Scrambling)

A dataset for modeling structure, context, and word order information

Other
11 months ago
6
124.19 MB
0

TinyShakespeare (Shakespeare's Plays)

40,000 lines of Shakespeare from a variety of Shakespeare's plays

Other
11 months ago
2
75.81 kB
0

220k-GPT4Vision Image Captions

220k-GPT4Vision Image Captions

Other
11 months ago
1
44.1 MB
0

RSICD Image Caption Dataset

RSICD Image Caption Dataset

Other
11 months ago
3
1.04 GB
0

Glaive Function Calling V2

A Knowledge Base for Trainable Natural Language Processing

Other
11 months ago
1
97.04 MB
0

Alpaca

Alpaca - Training LLMs to follow instructions

Other
11 months ago
1
70.75 MB
0

Tulu V2 Dataset

Assisting Assistive Tasks with Language Data Mixtures

Other
11 months ago
1
561.5 MB
0
