South Park Scripts Dataset

@kaggle.thedevastator_south_park_scripts_dataset

All the Words, All the Time


By [source]


About this dataset

This dataset contains every word spoken by a character in the first 16 seasons of the TV show South Park, over 1 million words in all. Whether or not you're a fan of South Park, it is an interesting corpus for exploring natural language processing and seeing what insights can be gleaned from such a large body of text.

How to use the dataset

This dataset contains all of the words spoken by characters in the South Park TV show. It is divided into seasons, with each season containing a number of episodes. For each episode, there is a transcript of what was said by each character.

This dataset can be used to study the language used in the South Park TV show, and how the dialogue changes over time.
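As a rough illustration of studying dialogue over time, the sketch below counts words per season using only the Python standard library. The in-memory sample rows are hypothetical placeholders, not real lines from the dataset, and the function name `words_per_season` is made up for this example. It assumes every row has an integer in the Season column.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in rows with the documented column names.
# These are placeholders, not real dialogue from the dataset.
SAMPLE_CSV = """Season,Episode,Character,Line
1,1,Speaker A,placeholder line one
1,1,Speaker B,placeholder line two here
2,1,Speaker A,another placeholder
"""


def words_per_season(csv_file):
    """Return {season: number of whitespace-separated words spoken}."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Assumption: Season is always a clean integer string.
        totals[int(row["Season"])] += len(row["Line"].split())
    return dict(totals)


season_totals = words_per_season(io.StringIO(SAMPLE_CSV))
print(season_totals)
```

To run this against the real data, one would pass an open file handle instead of the sample, for example `open("All-seasons.csv", newline="", encoding="utf-8")`. The encoding is an assumption and may need adjusting.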

Research Ideas

  • Sentiment analysis of the South Park scripts
  • Word clouds for each character
  • Finding the most common words used in each season
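The third idea, finding the most common words in each season, could be approached along these lines. This is a minimal sketch: the tokenizer (lowercase runs of letters and apostrophes) is a simplifying assumption, the function name `top_words_by_season` is hypothetical, and the sample rows are invented placeholders rather than real dialogue.

```python
import re
from collections import Counter, defaultdict


def top_words_by_season(rows, n=3):
    """rows: iterable of (season, line) pairs.

    Returns {season: [(word, count), ...]} with the n most frequent words.
    """
    counts = defaultdict(Counter)
    for season, line in rows:
        # Naive tokenizer (an assumption): lowercase letters and apostrophes.
        counts[season].update(re.findall(r"[a-z']+", line.lower()))
    return {season: counter.most_common(n) for season, counter in counts.items()}


# Invented placeholder rows, not taken from the dataset.
rows = [
    (1, "Hello there, hello again"),
    (1, "Hello world"),
    (2, "Goodbye world, goodbye"),
]
top = top_words_by_season(rows, n=1)
print(top)
```

Keying the counters on the Character column instead of the season would give the per-character frequencies that a word cloud needs. In practice, filtering out very common stop words first is likely to make the results more informative.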

Acknowledgements

If you use this dataset in your research, please credit the original authors.

Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

Files: All-seasons.csv, Season-1.csv, Season-10.csv, Season-11.csv, Season-12.csv, Season-13.csv, Season-14.csv, Season-15.csv, Season-16.csv

Each file listed here has the same four columns:

Column name   Description
Season        The season the episode is from. (Numeric)
Episode       The episode number. (Numeric)
Character     The character who spoke the line. (String)
Line          The line spoken by the character. (String)
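Based on the column descriptions above, a loader might check the header and convert Season and Episode to integers. The sketch below is one possible approach, not a documented API: `ScriptLine` and `parse_rows` are hypothetical names, the sample text is a placeholder, and skipping rows whose numeric fields fail to parse is an assumption about how one might want to handle dirty rows.

```python
import csv
import io
from typing import NamedTuple

EXPECTED_COLUMNS = ["Season", "Episode", "Character", "Line"]


class ScriptLine(NamedTuple):
    season: int
    episode: int
    character: str
    line: str


def parse_rows(csv_file):
    """Return (parsed_rows, skipped_count) for a CSV with the documented columns."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    parsed, skipped = [], 0
    for row in reader:
        try:
            season, episode = int(row["Season"]), int(row["Episode"])
        except (TypeError, ValueError):
            # Assumption: rows with non-numeric Season/Episode are dropped.
            skipped += 1
            continue
        text = (row["Line"] or "").strip()
        parsed.append(ScriptLine(season, episode, row["Character"], text))
    return parsed, skipped


# Placeholder sample: one clean row and one row with an unparseable season.
sample = (
    "Season,Episode,Character,Line\n"
    "1,2,Speaker A, placeholder text \n"
    "n/a,3,Speaker B,unparseable season\n"
)
parsed, skipped = parse_rows(io.StringIO(sample))
print(parsed, skipped)
```

Returning the skipped count alongside the parsed rows makes it easy to notice when a file contains more malformed rows than expected.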
