Baselight

The Office (US) - Complete Dialogue/Transcript

45,000+ lines of dialogue from 9 seasons of The Office

@kaggle.nasirkhalid24_the_office_us_complete_dialoguetranscript


About this Dataset

Data mined from transcripts of the show.

Content

Season: Season Number
Episode: Episode Number
Title: Name of the Episode
Scene: Scene number (running value from start of dataset)
Speaker: Character name
Line: Dialogue of character

Version 3 Updates - based on feedback from @saradata

  • Added the missing lines from S9E4 and S7E17
  • Fixed the issue with the running scene number
  • Removed some censoring and special characters (ex: *, Ä, etc.)
  • Cleaned up some lines that had scene context artifacts (ex: [on phone])

Tables

The Office Lines V4

@kaggle.nasirkhalid24_the_office_us_complete_dialoguetranscript.the_office_lines_v4
  • 2.09 MB
  • 54626 rows
  • 7 columns

CREATE TABLE the_office_lines_v4 (
  "season" BIGINT,
  "episode" BIGINT,
  "title" VARCHAR,
  "scene" BIGINT,
  "speaker" VARCHAR,
  "line" VARCHAR,
  "unnamed_6" VARCHAR
);
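As a rough illustration of how this schema might be queried, the sketch below recreates the table in an in-memory SQLite database and runs two aggregate queries: lines per speaker and distinct scenes per episode. The sample rows are made-up placeholders, not actual dataset contents, and the helper names `lines_per_speaker` and `scenes_per_episode` are hypothetical. The purpose of the `unnamed_6` column is not documented on this page, so it is left as NULL here. On the hosted platform the table would presumably be addressed by its fully qualified name rather than the short local name used below.

```python
import sqlite3

# In-memory database using the schema from this page. SQLite accepts the
# BIGINT/VARCHAR type names through its type-affinity rules.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE the_office_lines_v4 (
      "season" BIGINT,
      "episode" BIGINT,
      "title" VARCHAR,
      "scene" BIGINT,
      "speaker" VARCHAR,
      "line" VARCHAR,
      "unnamed_6" VARCHAR
    )
""")

# Placeholder rows for illustration only (assumption: not real dataset rows).
sample_rows = [
    (1, 1, "Episode A", 1, "Speaker X", "placeholder line one", None),
    (1, 1, "Episode A", 1, "Speaker Y", "placeholder line two", None),
    (1, 1, "Episode A", 2, "Speaker X", "placeholder line three", None),
    (1, 2, "Episode B", 3, "Speaker X", "placeholder line four", None),
]
conn.executemany(
    "INSERT INTO the_office_lines_v4 VALUES (?, ?, ?, ?, ?, ?, ?)",
    sample_rows,
)


def lines_per_speaker(connection):
    """Return (speaker, line_count) pairs, most lines first."""
    query = """
        SELECT "speaker", COUNT(*) AS line_count
        FROM the_office_lines_v4
        GROUP BY "speaker"
        ORDER BY line_count DESC, "speaker" ASC
    """
    return connection.execute(query).fetchall()


def scenes_per_episode(connection):
    """Return (season, episode, scene_count) triples in broadcast order."""
    query = """
        SELECT "season", "episode", COUNT(DISTINCT "scene") AS scene_count
        FROM the_office_lines_v4
        GROUP BY "season", "episode"
        ORDER BY "season", "episode"
    """
    return connection.execute(query).fetchall()


speaker_counts = lines_per_speaker(conn)
episode_scenes = scenes_per_episode(conn)
print(speaker_counts)
print(episode_scenes)
```

Because `scene` is a running value across the whole dataset, counting distinct scene numbers within each season and episode group gives the number of scenes in that episode without any renumbering.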
