Baselight

Data Catalog

Explore, analyze, and share quality data.

Selected filter: Kaggle
Showing 8660 Datasets

LongAlpaca-Yukang ML Instructional Outputs

Unlocking the Power of AI

Technology and IT
8 months ago
1
265.85 MB
0

Objaverse-XL: 10M+ 3D Objects, Zero123-XL

For Training AI-Powered 3D Rendering

Technology and IT
8 months ago
1
1.38 GB
0

Synthia-v1.3

Orca-style dataset for following directions and conducting in-depth discussions

Other
8 months ago
1
128.27 MB
0

Air Pollution And Mental Health

Identifying Short-Term Human Impacts of Air Pollution

Healthcare
8 months ago
1
646.57 kB
0

Regional Water Temperatures Over Time

Historical Records of Berlin, Brandenburg and Altmark Lakes

Environmental and Climate Sciences
8 months ago
1
5.29 kB
0

Predicting Portuguese Bank Term Deposit

Identifying Likely Customers for Conversion Optimization

Finance and Economics
8 months ago
2
423.05 kB
0

Smithsonian Butterfly Dataset

Butterfly images and information from the Smithsonian Institution

Other
8 months ago
1
483.38 MB
0

GSM8K - Grade School Math 8K Q&A

A Linguistically Diverse Dataset for Multi-Step Reasoning Question Answering

Demographics and Population Studies
8 months ago
4
5.81 MB
0

MetaMath QA

Mathematical Questions for Large Language Models

Other
8 months ago
1
138.79 MB
0

HelpSteer: AI Alignment Dataset

Real-World Helpfulness Annotated for AI Alignment

Technology and IT
8 months ago
2
30.85 MB
0

Women's Crimes In India

Characteristics, Frequency, and Motives

Demographics and Population Studies
8 months ago
76
5.17 MB
0

Mental Health Chatbot Pairs

AI-based Tailored Support for Mental Health Conversation

Healthcare
8 months ago
1
103.88 kB
0

General Language Understanding Evaluation (GLUE)

The Famous General Language Understanding Evaluation benchmark

Other
8 months ago
34
151.72 MB
0

India Air Quality Trend

Comparing 2 Years of Air Quality Data from 2018 - 2020

Environmental and Climate Sciences
8 months ago
1
959.45 kB
0

Pokemon Gen 9 Stats

Understanding the Impact of Each Stat on Pokemon Performance

Media and Entertainment
8 months ago
1
18.35 kB
0

Job Postings In Europe

Exploring Salaries, Job Types and Locations

Finance and Economics
8 months ago
1
37.08 MB
0

Opera Performances

Opera performances and associated data (composers, year written, etc.)

Other
8 months ago
1
618.08 kB
0

GoodReads Best Books

Ratings, Genres, Awards, and More

Media and Entertainment
8 months ago
1
42.19 MB
0

Evol-Instruct-Code-80k-v1

Instructional code snippets with corresponding outputs

Other
8 months ago
1
53.72 MB
0

DailyDialog (Multi-turn Dialog)

Dialogues that reflect the way we communicate daily and cover various topics

Other
8 months ago
3
4.13 MB
0

Online Influencer Marketing

Influencer Engagement and Performance

Ecommerce and Consumer Trends
8 months ago
1
62.88 kB
0

Belgian Statutory Article Retrieval Dataset

Legal Q&A Dataset for Law Information Retrieval

Other
8 months ago
3
4.01 MB
0

ViGGO: Video Game Chatbot Dataset

Conversational data-to-text for video game chatbots

Technology and IT
8 months ago
8
1.2 MB
0

Medical Conversation Corpus (100k+)

Generative Language Modeling for Medical Applications

Healthcare
8 months ago
2
75.7 MB
0
