Baselight

South Park Scripts Dataset

All the Words, All the Time

@kaggle.thedevastator_south_park_scripts_dataset

About this dataset

This dataset contains every word spoken by a character in the first 16 seasons of the TV show South Park. That's over 1 million words in all! Whether or not you're a fan of South Park, this is an interesting dataset for exploring natural language processing and seeing what insights can be gleaned from such a large corpus of text.

How to use the dataset

This dataset contains all of the words spoken by characters in the South Park TV show. It is divided into seasons, with each season containing a number of episodes. For each episode, there is a transcript of what was said by each character.

This dataset can be used to study the language used in the South Park TV show, as well as how the dialogue changes over time.
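
As a rough illustration of the season / episode / transcript layout described above, the following sketch groups lines by episode using only the Python standard library. It assumes the CSV header uses the column names listed under Columns (Season, Episode, Character, Line); the sample rows and the helper name group_by_episode are made up for this example and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Illustrative stand-in for All-seasons.csv; these rows are invented for the sketch.
SAMPLE_CSV = """Season,Episode,Character,Line
1,1,Stan,"Hello, everyone."
1,1,Kyle,Hi Stan.
1,2,Stan,See you later.
2,1,Kyle,Back again.
"""

def group_by_episode(csv_file):
    """Return {(season, episode): [(character, line), ...]} from a CSV file object."""
    episodes = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumes Season and Episode hold clean integers in every row.
        key = (int(row["Season"]), int(row["Episode"]))
        episodes[key].append((row["Character"], row["Line"].strip()))
    return dict(episodes)

transcripts = group_by_episode(io.StringIO(SAMPLE_CSV))
for (season, episode), lines in sorted(transcripts.items()):
    print(f"Season {season}, episode {episode}: {len(lines)} lines")
```

With a downloaded copy of the file, the same function should accept a handle opened with open("All-seasons.csv", newline="", encoding="utf-8"), provided the header matches the assumed column names.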

Research Ideas

  • Sentiment analysis of the South Park scripts
  • Word clouds for each character
  • Finding the most common words used in each season
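
One possible starting point for the last idea is sketched below: a per-season word count built with collections.Counter. The rows are invented placeholders, the tokenizer is a deliberately simple regular expression, and no stop-word filtering is applied, so treat it as a sketch rather than a finished analysis. The function name top_words_by_season is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Made-up (season, character, line) rows standing in for the real data.
ROWS = [
    (1, "Stan", "Hello hello, everyone."),
    (1, "Kyle", "Hello Stan."),
    (2, "Kyle", "Back again, back to school."),
]

# Very simple tokenizer: runs of lowercase letters and apostrophes.
WORD_RE = re.compile(r"[a-z']+")

def top_words_by_season(rows, n=3):
    """Return {season: [(word, count), ...]} with the n most frequent words."""
    counts = defaultdict(Counter)
    for season, _character, line in rows:
        counts[season].update(WORD_RE.findall(line.lower()))
    return {season: counter.most_common(n) for season, counter in counts.items()}

top = top_words_by_season(ROWS, n=1)
print(top)
```

Keying the counters on the character instead of the season would give the per-character frequencies that a word cloud needs.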

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: All-seasons.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-1.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-10.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-11.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-12.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-13.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-14.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-15.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

File: Season-16.csv

Column name Description
Season The season the episode is from. (Numeric)
Episode The episode number. (Numeric)
Character The character who spoke the line. (String)
Line The line spoken by the character. (String)

Tables

All Seasons

@kaggle.thedevastator_south_park_scripts_dataset.all_seasons
  • 3.05 MB
  • 70,896 rows
  • 4 columns
CREATE TABLE all_seasons (
  "season" VARCHAR,
  "episode" VARCHAR,
  "character" VARCHAR,
  "line" VARCHAR
);
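
Note that this table types season and episode as VARCHAR, while the per-season tables use BIGINT. The page does not say why; one possibility (an assumption, not something the source confirms) is that some values in the combined file are not numeric. A defensive way to convert these columns, dropping any row that fails to parse, might look like the sketch below. The helper names and sample rows are hypothetical.

```python
def to_int_or_none(value):
    """Parse a string as an int, returning None if it is not numeric."""
    try:
        return int(value.strip())
    except (ValueError, AttributeError):
        return None

def clean_rows(rows):
    """Keep only rows whose season and episode both parse as integers."""
    cleaned = []
    for season, episode, character, line in rows:
        s, e = to_int_or_none(season), to_int_or_none(episode)
        if s is None or e is None:
            continue  # skip rows with non-numeric season or episode
        cleaned.append((s, e, character, line))
    return cleaned

# Invented rows; the second one mimics a hypothetical stray header row.
RAW = [
    ("10", "1", "Stan", "Hello."),
    ("Season", "Episode", "Character", "Line"),
    (" 2 ", "3", "Kyle", "Hi."),
]
cleaned = clean_rows(RAW)
print(cleaned)
```

Counting how many rows are dropped is a quick way to check whether the assumption about non-numeric values holds for the real table.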

Season 1

@kaggle.thedevastator_south_park_scripts_dataset.season_1
  • 162.4 kB
  • 4,170 rows
  • 4 columns
CREATE TABLE season_1 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 10

@kaggle.thedevastator_south_park_scripts_dataset.season_10
  • 157.32 kB
  • 3,471 rows
  • 4 columns
CREATE TABLE season_10 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 11

@kaggle.thedevastator_south_park_scripts_dataset.season_11
  • 156.18 kB
  • 3,478 rows
  • 4 columns
CREATE TABLE season_11 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 12

@kaggle.thedevastator_south_park_scripts_dataset.season_12
  • 150.24 kB
  • 3,307 rows
  • 4 columns
CREATE TABLE season_12 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 13

@kaggle.thedevastator_south_park_scripts_dataset.season_13
  • 168.62 kB
  • 3,257 rows
  • 4 columns
CREATE TABLE season_13 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 14

@kaggle.thedevastator_south_park_scripts_dataset.season_14
  • 166.6 kB
  • 3,346 rows
  • 4 columns
CREATE TABLE season_14 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 15

@kaggle.thedevastator_south_park_scripts_dataset.season_15
  • 166.04 kB
  • 3,101 rows
  • 4 columns
CREATE TABLE season_15 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 16

@kaggle.thedevastator_south_park_scripts_dataset.season_16
  • 163.14 kB
  • 3,120 rows
  • 4 columns
CREATE TABLE season_16 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 17

@kaggle.thedevastator_south_park_scripts_dataset.season_17
  • 124.95 kB
  • 2,305 rows
  • 4 columns
CREATE TABLE season_17 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 18

@kaggle.thedevastator_south_park_scripts_dataset.season_18
  • 126.08 kB
  • 2,522 rows
  • 4 columns
CREATE TABLE season_18 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 19

@kaggle.thedevastator_south_park_scripts_dataset.season_19
  • 119.64 kB
  • 2,260 rows
  • 4 columns
CREATE TABLE season_19 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 2

@kaggle.thedevastator_south_park_scripts_dataset.season_2
  • 255.55 kB
  • 6,416 rows
  • 4 columns
CREATE TABLE season_2 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 3

@kaggle.thedevastator_south_park_scripts_dataset.season_3
  • 230.6 kB
  • 5,798 rows
  • 4 columns
CREATE TABLE season_3 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 4

@kaggle.thedevastator_south_park_scripts_dataset.season_4
  • 242.6 kB
  • 5,680 rows
  • 4 columns
CREATE TABLE season_4 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 5

@kaggle.thedevastator_south_park_scripts_dataset.season_5
  • 195.33 kB
  • 4,414 rows
  • 4 columns
CREATE TABLE season_5 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 6

@kaggle.thedevastator_south_park_scripts_dataset.season_6
  • 229.05 kB
  • 5,131 rows
  • 4 columns
CREATE TABLE season_6 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 7

@kaggle.thedevastator_south_park_scripts_dataset.season_7
  • 189.35 kB
  • 4,236 rows
  • 4 columns
CREATE TABLE season_7 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 8

@kaggle.thedevastator_south_park_scripts_dataset.season_8
  • 166.59 kB
  • 3,601 rows
  • 4 columns
CREATE TABLE season_8 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 9

@kaggle.thedevastator_south_park_scripts_dataset.season_9
  • 160.97 kB
  • 3,526 rows
  • 4 columns
CREATE TABLE season_9 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);
