Baselight
1 option selected: Technology and IT
1 option selected: Kaggle
397 results

Aeslc (Email Subject Generation Task)

A collection of email messages of employees in the Enron Corporation.

Technology and IT
11 months ago
3
7.82 MB
0

Camera Trap Performance For Nocturnal Mammals

Insights into Flying Squirrel Populations in Urban Environments

Technology and IT
11 months ago
2
18.06 kB
0

Reddit: /r/technology (Submissions & Comments)

Title, Score, ID, URL, Comment Number, and Timestamp

Technology and IT
11 months ago
1
280.78 kB
0

MIDAS Hand-Annotated News

A Corpus of Physician-Defined Topics for Data Science and Machine Learning

Technology and IT
11 months ago
20
244.19 kB
0

Acquiring Pragmalinguistic Competences Through

Investigating Language Acquisition Through Computer-Mediated Communication

Technology and IT
11 months ago
2
189.79 kB
0

CoEdIT Text Editing

A curated dataset for training text editing models

Technology and IT
11 months ago
2
10.44 MB
0

AI-Based Job Site Matching

Leveraging 400k+ Hours of Resource & Performance Data

Technology and IT
11 months ago
3
2.35 MB
0

Enterprises Electric Load Profiles (Germany)

Exploring the Benefits of Battery Storage Technology

Technology and IT
11 months ago
2
3.41 MB
0

AI-Shift Ameba FAQ Search

Queries and difficulty levels for AI-based FAQ search

Technology and IT
11 months ago
3
111.43 kB
0

US Jobs On Dice.com

22,000 technology job listings

Technology and IT
11 months ago
1
30.81 MB
0

Synthia-v1.3

Synthetic training data for LLM development

Technology and IT
11 months ago
1
128.27 MB
0

Text Classification For QA Dataset

Text classification dataset for question answering

Technology and IT
11 months ago
3
13.32 MB
0

WikiANN

Multilingual named entity recognition for LLM training

Technology and IT
11 months ago
528
137.22 MB
0

DBpedia Ontology

Text Classification Dataset with 14 Classes

Technology and IT
11 months ago
2
116 MB
0

CAMEL AI: Biology Problems / Solutions

Biology Problem-Solution Pairs for Synthetic Biology

Technology and IT
11 months ago
1
21.86 MB
0

MathInstruct Dataset: Hybrid Math Instruction

A curated dataset for math instruction tuning models

Technology and IT
11 months ago
1
97.66 MB
0

Newsgroups (Text Classification)

Comprehensive Collection of Text Classification Datasets

Technology and IT
11 months ago
77
64.61 MB
0

Urban Ecology Over Time

Exploring Camera Traps, Scans and Surveys

Technology and IT
11 months ago
6
105.16 kB
0

CoEdIT

Enhancing AI Text Editing Through 69,000 Instances

Technology and IT
11 months ago
2
10.44 MB
0

Cricket Commentary Analysis

Text Classification and Natural Language Processing for Commentary Insights

Technology and IT
11 months ago
3
6.95 MB
0

Airoboros LLMs Math Dataset

Mastering Complex Mathematical Operations in Machine Learning

Technology and IT
11 months ago
1
57.55 MB
0

Laion-Pop Image Classification Dataset

Accurately Predicting and Classifying Images with Alt Texts and NSFW Predictions

Technology and IT
11 months ago
1
298.51 MB
0

Most Popular GitHub Projects

Popularity Factors and Growth Patterns

Technology and IT
11 months ago
1
491.4 kB
0

Germeval18 - Text Classification Dataset

Text Classification Dataset with Binary and Multi-class Labels

Technology and IT
11 months ago
2
851.69 kB
0

High-Quality Multilingual Translation Data

13 Languages for Machine Learning

Technology and IT
11 months ago
62
208.17 MB
0

HTTP Header Fields Dataset

How information is encoded and sent/received on the internet

Technology and IT
11 months ago
5
47.15 kB
0

LLM Feedback Collection

Induce fine-grained evaluation capabilities into language models

Technology and IT
11 months ago
1
459.52 MB
0

OpenHermes

GPT-4 AI Dataset - 242K Entries

Technology and IT
11 months ago
1
141.81 MB
0

Logical Reasoning Improvement Dataset

Enhancing LLM Logical Reasoning Skills with Platypus2 Models

Technology and IT
11 months ago
1
15.33 MB
0

LongAlpaca-Yukang ML Instructional Outputs

Unlocking the Power of AI

Technology and IT
11 months ago
1
265.85 MB
0

Objaverse-XL: 10M+ 3D Objects, Zero123-XL

For Training AI-Powered 3D Rendering

Technology and IT
11 months ago
1
1.38 GB
0

HelpSteer: AI Alignment Dataset

Real-World Helpfulness Annotated for AI Alignment

Technology and IT
11 months ago
2
30.85 MB
0

ViGGO: Video Game Chatbot Dataset

Conversational data-to-text for video game chatbots

Technology and IT
11 months ago
8
1.2 MB
0

AG News (News Articles)

News Articles Text Classification

Technology and IT
11 months ago
2
19.35 MB
0

OpenAI HumanEval Code Gen

Handcrafted Python Programming Problems for Accurate Model Evaluation

Technology and IT
11 months ago
1
85.24 kB
0

Cybersecurity Risk (2022 CISA Vulnerability)

Severity, CVSS Score, and National Security Vulnerability Types

Technology and IT
11 months ago
5
532.91 kB
0

Hate Speech And Offensive Language Detection

Hate Speech and Offensive Language Detection on Twitter

Technology and IT
11 months ago
1
1.55 MB
0

Malware Attacks

Synthetic Dataset About Malware Attacks

Technology and IT
11 months ago
1
71.87 MB
0

Spotify Charts

A dataset of all daily hit charts curated by Spotify

Technology and IT
11 months ago
1
304.66 MB
0

Dota2 Games Results

Dota 2 is a popular computer game with two teams of 5 players

Technology and IT
11 months ago
2
1.94 MB
0

PhiUSIIL Phishing URLs

134,850 legitimate and 100,945 phishing URLs

Technology and IT
11 months ago
1
24.91 MB
0

TUNADROMD Malware Detection

4465 instances and 241 attributes. Classify Malware vs Goodware

Technology and IT
11 months ago
1
230.24 kB
0

Cars Yallamotors

This data was scraped from the YallaMotors website with Python and Requests-html.

Technology and IT
11 months ago
1
112.15 kB
0

Chatbots In Education

The rapid development of artificial intelligence (AI).

Technology and IT
11 months ago
1
49.62 kB
0

Newly-funded

AI Index Report tracks, collates, distills, and visualizes data related.

Technology and IT
11 months ago
1
3.26 kB
0

Bioinformatics Simulated

This synthetic dataset was created to explore and develop machine learning.

Technology and IT
11 months ago
1
3.73 MB
0

Automating Talent

Automating Talent Acquisition: Leveraging Machine Learning for Resume Screening.

Technology and IT
11 months ago
1
264.46 kB
0

Internet Usage

History of Internet and evolution of Broadband

Technology and IT
11 months ago
4
229.13 kB
0

Speech Offensive

An annotated dataset for hate speech and offensive language detection on tweets

Technology and IT
11 months ago
1
1.55 MB
0

Melody Metrics: Popularity

Dive deep into the rhythm of data with "Melody Metrics", a curated dataset desig

Technology and IT
11 months ago
3
397.79 kB
0
