Baselight

Data Catalog

Explore, analyze, and share quality data.

Filter selected: Kaggle
Showing 8660 Datasets

Germeval18 - Text Classification Dataset

Text Classification Dataset with Binary and Multi-class Labels

Technology and IT
8 months ago
2
851.69 kB
0

Allegro Articles Summarization Dataset

Allegro Articles Summarization Source-Target Dataset

Other
8 months ago
3
199.86 MB
0

Rag Instruct Benchmark Tester

200 Samples for Enterprise Core Q&A Tasks

Other
8 months ago
1
46.2 kB
0

PubMed Article Summarization Dataset

PubMed Summarization Dataset

Academic Research
8 months ago
3
1.16 GB
0

Alpaca GPT-4

High-Performance NLP for Instruction-Following Reasoning

Other
8 months ago
1
47.83 MB
0

High-Quality Multilingual Translation Data

13 Languages for Machine Learning

Technology and IT
8 months ago
62
208.17 MB
0

Databricks Dolly (15K)

Over 15,000 Language Models and Dialogues for Interactive Chat Applications

Other
8 months ago
1
7.68 MB
0

MetXBioDB Metabolite Biotransformations

Enzyme-Catalyzed Metabolism Insights

Other
8 months ago
1
446.83 kB
0

Occupational Skills And Tasks

Understanding the Role of Skills in Online Job Ads

Demographics and Population Studies
8 months ago
1
26.3 kB
0

Electronic Card Transactions From 2017-2020

Exploring Retail Spending Trends

Finance and Economics
8 months ago
72
6.62 MB
0

Relato Business Graph Database

Visualizing Company Relationships & Market Trends

Finance and Economics
8 months ago
2
10.95 MB
0

Tomato Gene Expression Data

Non-Organic Imprints

Healthcare
8 months ago
1
37.5 MB
0

HTTP Header Fields Dataset

How information is encoded and sent/received on the internet

Technology and IT
8 months ago
5
47.15 kB
0

Wikipedia Molecules Properties Dataset

Molecular Properties Dataset from Wikipedia

Other
8 months ago
1
1.83 MB
0

LAMBADA Word Prediction

Evaluating text understanding through word prediction

Other
8 months ago
3
552.45 MB
0

Question-Answering Training And Testing Data

A dataset for training and testing question-answering models

Other
8 months ago
2
83.38 MB
0

LLM Feedback Collection

Induce fine-grained evaluation capabilities into language models

Technology and IT
8 months ago
1
459.52 MB
0

UltraChat 200K

200K Dialogues of Diverse Topics for NLG Research

Academic Research
8 months ago
4
1.63 GB
0

Orca DPO Dialogue Pairs

Orca style for preference training (Intel's DPO dataset)

Other
8 months ago
1
18.88 MB
0

OpenHermes

GPT-4 AI Dataset - 242K Entries

Technology and IT
8 months ago
1
141.81 MB
0

QSAR Molecular Descriptor Predictions

Analyzing Activation Energy in Chemical Compounds

Environmental and Climate Sciences
8 months ago
3
260.56 kB
0

PAWS (Paraphrase Word Scrambling)

A dataset for modeling structure, context, and word order information

Other
8 months ago
6
124.19 MB
0

TinyShakespeare (Shakespeare's Plays)

40,000 lines of Shakespeare from a variety of Shakespeare's plays

Other
8 months ago
2
75.81 kB
0

Reddit: /r/EatCheapAndHealthy

Cost-Effective Nutritional Solutions from the Community

Finance and Economics
8 months ago
1
526.31 kB
0
