Baselight

South Park Scripts Dataset

All the Words, All the Time

@kaggle.thedevastator_south_park_scripts_dataset


About this Dataset

This dataset contains every word spoken by a character in the first 16 seasons of the TV show South Park, over 1 million words in all (the tables listed on this page also include Seasons 17 through 19). Whether or not you're a fan of South Park, it is a useful corpus for exploring natural language processing and seeing what insights can be gleaned from a large body of dialogue.

How to use the dataset

Each row records one line of dialogue along with the season, the episode, and the character who spoke it. The data is provided as a combined file, All-seasons.csv, and as separate per-season files. Together, the rows for an episode form a transcript of what each character said.

This dataset can be used to study the language of South Park and to track how the dialogue changes from season to season.
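As a minimal sketch of working with this layout, the snippet below groups rows into per-episode transcripts using only the Python standard library. It assumes the CSV header matches the documented column names (Season, Episode, Character, Line). The sample rows and the `load_transcripts` helper are hypothetical, and the dialogue text is invented for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows in the documented four-column layout; the dialogue text is
# invented for illustration and is not taken from the dataset.
SAMPLE_CSV = """Season,Episode,Character,Line
1,1,Speaker A,"Hello, how are you?"
1,1,Speaker B,Fine thanks.
1,2,Speaker A,See you later.
"""


def load_transcripts(csv_file):
    """Group (character, line) pairs by (season, episode), keeping row order."""
    transcripts = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (int(row["Season"]), int(row["Episode"]))
        transcripts[key].append((row["Character"], row["Line"].strip()))
    return dict(transcripts)


transcripts = load_transcripts(io.StringIO(SAMPLE_CSV))
for (season, episode), lines in sorted(transcripts.items()):
    print(f"Season {season}, episode {episode}: {len(lines)} line(s)")
```

To run this against a downloaded file, an open file handle can be passed in place of the in-memory sample, for example `open("Season-1.csv", newline="", encoding="utf-8")`. The encoding is an assumption, and `int()` raises `ValueError` if a season or episode value is not numeric.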

Research Ideas

  • Sentiment analysis of the South Park scripts
  • Word clouds for each character
  • Finding the most common words used in each season
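
The word-frequency ideas above (per character and per season) can be sketched with a single counting helper. This is an illustrative outline, not a prescribed method: the `top_words` function, the simple regular-expression tokenizer, and the sample rows are all hypothetical, and the dialogue text is invented rather than taken from the dataset.

```python
import re
from collections import Counter, defaultdict

# Lowercase words, optionally with one internal apostrophe (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def top_words(rows, group_by, n=3):
    """Return the n most frequent lowercase words per value of row[group_by]."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group_by]].update(WORD_RE.findall(row["Line"].lower()))
    return {group: counter.most_common(n) for group, counter in counts.items()}


# Placeholder rows; the text is invented for illustration only.
rows = [
    {"Season": 1, "Character": "Speaker A", "Line": "The bus is late. The bus!"},
    {"Season": 1, "Character": "Speaker B", "Line": "Where is the bus?"},
    {"Season": 2, "Character": "Speaker A", "Line": "School is closed, school is closed."},
]

by_season = top_words(rows, "Season", n=2)
by_character = top_words(rows, "Character", n=2)
print(by_season)
print(by_character)
```

The per-character counts are the kind of input a word cloud tool would typically consume. On real data, common function words such as "the" and "is" will dominate, so filtering them with a stop-word list is a likely next step.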

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

Files: All-seasons.csv, Season-1.csv, Season-10.csv, Season-11.csv, Season-12.csv, Season-13.csv, Season-14.csv, Season-15.csv, Season-16.csv

Each file listed above has the same four columns:

  • Season: The season the episode is from. (Numeric)
  • Episode: The episode number. (Numeric)
  • Character: The character who spoke the line. (String)
  • Line: The line spoken by the character. (String)

Tables

All Seasons

@kaggle.thedevastator_south_park_scripts_dataset.all_seasons
  • 2.91 MB
  • 70896 rows
  • 4 columns

CREATE TABLE all_seasons (
  "season" VARCHAR,
  "episode" VARCHAR,
  "character" VARCHAR,
  "line" VARCHAR
);
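
Unlike the per-season tables, this table types "season" and "episode" as VARCHAR, so the values arrive as text. Sorting them as text puts "10" before "2", and it is possible (an assumption; the source does not say why the columns are text) that some rows hold non-numeric values. The sketch below shows one cautious way to convert them in Python; the `to_int` and `numeric_rows` helpers and the sample rows are hypothetical.

```python
def to_int(value):
    """Return int(value), or None when the text is not a plain integer."""
    try:
        return int(str(value).strip())
    except ValueError:
        return None


def numeric_rows(rows):
    """Yield copies of rows with integer season/episode, skipping the rest."""
    for row in rows:
        season, episode = to_int(row["season"]), to_int(row["episode"])
        if season is None or episode is None:
            continue  # e.g. a stray header row; whether these occur is an assumption
        yield {**row, "season": season, "episode": episode}


# Placeholder rows; the second one imitates a repeated header row.
rows = [
    {"season": "10", "episode": "1", "character": "Speaker A", "line": "Hello."},
    {"season": "Season", "episode": "Episode", "character": "Character", "line": "Line"},
    {"season": "2", "episode": "3", "character": "Speaker B", "line": "Hi."},
]

clean = sorted(numeric_rows(rows), key=lambda r: (r["season"], r["episode"]))
print([(r["season"], r["episode"]) for r in clean])
```

After conversion the rows sort numerically, so season 2 comes before season 10. Skipped rows are silently dropped here; counting or logging them would make it easier to see how many values failed to parse.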

Season 1

@kaggle.thedevastator_south_park_scripts_dataset.season_1
  • 158.59 KB
  • 4170 rows
  • 4 columns

CREATE TABLE season_1 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 10

@kaggle.thedevastator_south_park_scripts_dataset.season_10
  • 153.64 KB
  • 3471 rows
  • 4 columns

CREATE TABLE season_10 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 11

@kaggle.thedevastator_south_park_scripts_dataset.season_11
  • 152.52 KB
  • 3478 rows
  • 4 columns

CREATE TABLE season_11 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 12

@kaggle.thedevastator_south_park_scripts_dataset.season_12
  • 146.71 KB
  • 3307 rows
  • 4 columns

CREATE TABLE season_12 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 13

@kaggle.thedevastator_south_park_scripts_dataset.season_13
  • 164.67 KB
  • 3257 rows
  • 4 columns

CREATE TABLE season_13 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 14

@kaggle.thedevastator_south_park_scripts_dataset.season_14
  • 162.7 KB
  • 3346 rows
  • 4 columns

CREATE TABLE season_14 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 15

@kaggle.thedevastator_south_park_scripts_dataset.season_15
  • 162.15 KB
  • 3101 rows
  • 4 columns

CREATE TABLE season_15 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 16

@kaggle.thedevastator_south_park_scripts_dataset.season_16
  • 159.31 KB
  • 3120 rows
  • 4 columns

CREATE TABLE season_16 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 17

@kaggle.thedevastator_south_park_scripts_dataset.season_17
  • 122.02 KB
  • 2305 rows
  • 4 columns

CREATE TABLE season_17 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 18

@kaggle.thedevastator_south_park_scripts_dataset.season_18
  • 123.13 KB
  • 2522 rows
  • 4 columns

CREATE TABLE season_18 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 19

@kaggle.thedevastator_south_park_scripts_dataset.season_19
  • 116.84 KB
  • 2260 rows
  • 4 columns

CREATE TABLE season_19 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 2

@kaggle.thedevastator_south_park_scripts_dataset.season_2
  • 249.56 KB
  • 6416 rows
  • 4 columns

CREATE TABLE season_2 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 3

@kaggle.thedevastator_south_park_scripts_dataset.season_3
  • 225.19 KB
  • 5798 rows
  • 4 columns

CREATE TABLE season_3 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 4

@kaggle.thedevastator_south_park_scripts_dataset.season_4
  • 236.92 KB
  • 5680 rows
  • 4 columns

CREATE TABLE season_4 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 5

@kaggle.thedevastator_south_park_scripts_dataset.season_5
  • 190.75 KB
  • 4414 rows
  • 4 columns

CREATE TABLE season_5 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 6

@kaggle.thedevastator_south_park_scripts_dataset.season_6
  • 223.68 KB
  • 5131 rows
  • 4 columns

CREATE TABLE season_6 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 7

@kaggle.thedevastator_south_park_scripts_dataset.season_7
  • 184.91 KB
  • 4236 rows
  • 4 columns

CREATE TABLE season_7 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 8

@kaggle.thedevastator_south_park_scripts_dataset.season_8
  • 162.68 KB
  • 3601 rows
  • 4 columns

CREATE TABLE season_8 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);

Season 9

@kaggle.thedevastator_south_park_scripts_dataset.season_9
  • 157.2 KB
  • 3526 rows
  • 4 columns

CREATE TABLE season_9 (
  "season" BIGINT,
  "episode" BIGINT,
  "character" VARCHAR,
  "line" VARCHAR
);
