Name: LLM: 7 Prompt Training Dataset
Creator: Kaggle
License: https://cdla.dev/sharing-1-0/

(for use in the LLM - Detect AI Generated Text competition)

Version 4: Adds the data from "LLM-generated essay using PaLM from Google Gen-AI", kindly generated by Kingki19 / Muhammad Rizqi.
File: train_essays_RDizzl3_seven_v2.csv
Human texts: 14,247
LLM texts: 3,004
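
To check the human/LLM split after downloading the file, a minimal sketch such as the following may help. It assumes the CSV has a column that marks each row as human or LLM-generated; the column name `label` is an assumption, not something stated in this description, so check the file's header first.

```python
import csv
from collections import Counter


def count_labels(lines, label_col="label"):
    """Count rows per distinct value of `label_col`.

    `lines` is any iterable of CSV lines, e.g. an open file object.
    `label_col` is a hypothetical column name; check the file's header.
    """
    reader = csv.DictReader(lines)
    return Counter(row[label_col] for row in reader)


# Example use against the Version 4 file (path is wherever you saved it):
# with open("train_essays_RDizzl3_seven_v2.csv", newline="", encoding="utf-8") as f:
#     print(count_labels(f))
```

If the column is named differently, pass its name as `label_col`. The totals should add up to the counts listed above.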

See also: a new dataset of an additional 4,900 LLM-generated texts: "LLM: Mistral-7B Instruct texts"

Version 3: "The RDizzl3 Seven"
File: train_essays_RDizzl3_seven_v1.csv
"Car-free cities"
"Does the electoral college work?"
"Exploring Venus"
"The Face on Mars"
"Facial action coding system"
"A Cowboy Who Rode the Waves"
"Driverless cars"

How this dataset was made: see the notebook "LLM: Make 7 prompt train dataset"

Version 2: (train_essays_7_prompts_v2.csv) This dataset is composed of 13,712 human texts and 1,638 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.

Namely: the seven prompts listed under Version 3 above.

This dataset is a derivative of the datasets

Version 1: This dataset is composed of 13,712 human texts and 1,165 LLM-generated texts originating from 7 of the PERSUADE 2.0 corpus prompts.

Related Datasets