Baselight

Datasets

Total public datasets added

8,801

Rows

Total rows contributed

5,589,826,419

Popularity

Total times datasets used in queries

307

Stars

Total stars received

37

MedMCQA: Medical MCQ Dataset

Deep Learning & AI for Improving Healthcare

Healthcare
1 year ago
3
82.59 MB
0

Sciphi Textbooks Are All You Need

650,000 Unique Samples from K-12 to Grad School

Demographics and Population Studies
1 year ago
1
1.26 GB
0

AI Research Instructions And Outputs

Driving Innovation in Machine Learning and AI Exploration

Academic Research
1 year ago
1
53.72 MB
0

Airoboros LLMs Math Dataset

Mastering Complex Mathematical Operations in Machine Learning

Technology and IT
1 year ago
1
57.55 MB
0

Laion-Pop Image Classification Dataset

Accurately Predicting and Classifying Images with Alt Texts and NSFW Predictions

Technology and IT
1 year ago
1
298.51 MB
0

Open Assistant

Over 10,000 Annotated Trees in 35 Languages

Other
1 year ago
2
48.77 MB
0

Sigfox And LoRaWAN Localization Tool

Evaluating Fingerprinting Localization Algorithms in Large Outdoor Areas

Other
1 year ago
4
6.13 MB
0

Soil Texture Classes (USDA) By Depth, 250m

A Refined Global Mapping for 1950-2017

Other
1 year ago
1
3.54 kB
0

Hippocampal Gene Expression For Long-Term Memory

Understanding Transcription and Synaptic Regulation

Healthcare
1 year ago
9
145.24 kB
0

Global Health Outcomes Data

Impact on Mortality Rates and Malnutrition in Countries Around the World

Healthcare
1 year ago
1
28.41 kB
0

Mental Illness Disparities In Vets

Comparative Rates of Diagnoses Among Vulnerable Veteran Groups

Other
1 year ago
1
32.77 kB
0

Impact Of Living Standards On Dry Forest

Tribal and Marginalized Households in Central Indian Highlands

Environmental and Climate Sciences
1 year ago
1
90.55 kB
0

Global Hotspots Of Sharks And Longline Fishing

Machine-Learning-Assisted Spatial Distribution of At-Risk Species

Other
1 year ago
12
17.61 MB
0

Hunt Prices For North American Mammals

Investigating Costly Signaling Theory

Finance and Economics
1 year ago
1
15.5 kB
0

Popular Products From NewChic.com E-Commerce

Product, Brand, and User Interaction Analytics

Finance and Economics
1 year ago
9
20.63 MB
0

Women's Football (European Leagues)

Team and Player Performance Statistics

Sports
1 year ago
7
728.82 kB
0

Most Popular GitHub Projects

Popularity Factors and Growth Patterns

Technology and IT
1 year ago
1
491.4 kB
0

California Residents' ZEV Attitudes

Drivers' Preferences, Experiences, and Environmental Concerns

Environmental and Climate Sciences
1 year ago
9
3.62 MB
0

NYC Subway Entrance And Exit

Entrance & Exit locations of the NYC subway

Transportation and Logistics
1 year ago
1
122.18 kB
0

Pokemon Images And Text Descriptions

Pokemon Llava: Images and Text Descriptions

Media and Entertainment
1 year ago
1
692.48 MB
0

Germeval18 - Text Classification Dataset

Text Classification Dataset with Binary and Multi-class Labels

Technology and IT
1 year ago
2
851.69 kB
0

Allegro Articles Summarization Dataset

Allegro Articles Summarization Source-Target Dataset

Other
1 year ago
3
199.86 MB
0

Rag Instruct Benchmark Tester

200 Samples for Enterprise Core Q&A Tasks

Other
1 year ago
1
46.2 kB
0

PubMed Article Summarization Dataset

PubMed Summarization Dataset

Academic Research
1 year ago
3
1.16 GB
0

Alpaca GPT-4

High-Performance NLP for Instruction-Following Reasoning

Other
1 year ago
1
47.83 MB
0

High-Quality Multilingual Translation Data

13 Languages for Machine Learning

Technology and IT
1 year ago
62
208.17 MB
0

Databricks Dolly (15K)

Over 15,000 Language Models and Dialogues for Interactive Chat Applications

Other
1 year ago
1
7.68 MB
0

MetXBioDB Metabolite Biotransformations

Enzyme-Catalyzed Metabolism Insights

Other
1 year ago
1
446.83 kB
0

Occupational Skills And Tasks

Understanding the Role of Skills in Online Job Ads

Demographics and Population Studies
1 year ago
1
26.3 kB
0

Electronic Card Transactions From 2017-2020

Exploring Retail Spending Trends

Finance and Economics
1 year ago
72
6.62 MB
0

Relato Business Graph Database

Visualizing Company Relationships & Market Trends

Finance and Economics
1 year ago
2
10.95 MB
0

Tomato Gene Expression Data

Non-Organic Imprints

Healthcare
1 year ago
1
37.5 MB
0

HTTP Header Fields Dataset

How information is encoded and sent/received on the internet

Technology and IT
1 year ago
5
47.15 kB
0

Wikipedia Molecules Properties Dataset

Molecular Properties Dataset from Wikipedia

Other
1 year ago
1
1.83 MB
0

LAMBADA Word Prediction

Evaluating text understanding through word prediction

Other
1 year ago
3
552.45 MB
0

Question-Answering Training And Testing Data

A dataset for training and testing question-answering models

Other
1 year ago
2
83.38 MB
0

LLM Feedback Collection

Induce fine-grained evaluation capabilities into language models

Technology and IT
1 year ago
1
459.52 MB
0

UltraChat 200K

200K Dialogues of Diverse Topics for NLG Research

Academic Research
1 year ago
4
1.63 GB
0

Orca DPO Dialogue Pairs

Orca style for preference training (Intel's DPO dataset)

Other
1 year ago
1
18.88 MB
0

OpenHermes

GPT-4 AI Dataset - 242K Entries

Technology and IT
1 year ago
1
141.81 MB
0

QSAR Molecular Descriptor Predictions

Analyzing Activation Energy in Chemical Compounds

Environmental and Climate Sciences
1 year ago
3
260.56 kB
0

PAWS (Paraphrase Word Scrambling)

A dataset for modeling structure, context, and word order information

Other
1 year ago
6
124.19 MB
0

TinyShakespeare (Shakespeare's Plays)

40,000 lines of Shakespeare from a variety of Shakespeare's plays

Other
1 year ago
2
75.81 kB
0

Reddit: /r/EatCheapAndHealthy

Cost-Effective Nutritional Solutions from the Community

Finance and Economics
1 year ago
1
526.31 kB
0

Lovoo V3 Dating App User Profiles And Statistics

Revealing popular user traits and behavior

Media and Entertainment
1 year ago
3
846.63 kB
0

Crypto, Web3 And Blockchain Jobs

Scraped active crypto jobs listed on cryptojobslist.com

Crypto and Blockchain
1 year ago
110
1.11 MB
0

NFT Top Collections (Timeseries)

Historical data of the top NFT collections

Crypto and Blockchain
1 year ago
2
201.03 kB
0

220k-GPT4Vision Image Captions

220k-GPT4Vision Image Captions

Other
1 year ago
1
44.1 MB
0

RSICD Image Caption Dataset

RSICD Image Caption Dataset

Other
1 year ago
3
1.04 GB
0

Psychedelic Drug Database

Psychotropic and psychedelic drugs database with molecular descriptors

Healthcare
1 year ago
1
237.57 kB
0
