The data in this dataset was manually collected as part of my Master's thesis, which updated similar studies done in the past. As a result, it is lacking in some areas; for example, it codes for only two genders (female, male). The focus of the study was the creation of gender identities through language (verbal and non-verbal). I therefore looked at the relationship between television characters’ demographics, behaviors, problems, and speaking time, to create a better picture of how gender was being portrayed.
The dataset is divided into two parts: the first comes from primetime television, and the second from the streaming service Netflix. Both samples had to meet the same general criteria. The sample was limited to scripted, fictional, live-action stories aimed at adults and young adults. Reality shows, talk shows, cartoons, live-action shows aimed at children, sports broadcasts, movies, documentaries, and news programs were excluded. The selected shows were divided into two categories: drama and situational comedy.
A sample of primetime television shows was selected from the top five broadcast networks: ABC, CBS, The CW, FOX, and NBC. From each network, the ten most popular shows, according to Nielsen ratings, were selected. Only shows with more than five episodes released before the midseason break (December 2016) were eligible. This allowed the ratings to stabilize after the extensive promotion that accompanies a show’s premiere, and after curious viewers had decided whether to keep watching the series. From each show, one episode was chosen at random. In series that share a universe, “crossover” episodes were omitted. This resulted in a total of 50 primetime television shows being coded for this study.
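The selection steps above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for the thesis, where selection was done by hand. The `Show` and `Episode` classes, the `rating` field, and the `sample_episodes` function are all hypothetical names, and the sketch assumes the episode-count filter is applied before ranking shows by rating, which the description does not specify.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Episode:
    title: str
    is_crossover: bool = False  # hypothetical flag for shared-universe crossovers


@dataclass
class Show:
    title: str
    network: str
    rating: float               # hypothetical stand-in for a Nielsen rating
    episodes_before_break: int  # episodes released before the midseason break
    episodes: list = field(default_factory=list)


def sample_episodes(shows, per_network=10, min_episodes=5, seed=None):
    """Pick one random non-crossover episode from the top shows of each network."""
    rng = random.Random(seed)

    # Keep only shows with MORE than `min_episodes` episodes, grouped by network.
    # Assumption: this filter runs before the popularity ranking.
    by_network = {}
    for show in shows:
        if show.episodes_before_break > min_episodes:
            by_network.setdefault(show.network, []).append(show)

    sample = {}
    for candidates in by_network.values():
        # Rank by rating, highest first, and keep the top `per_network` shows.
        top = sorted(candidates, key=lambda s: s.rating, reverse=True)[:per_network]
        for show in top:
            # Crossover episodes are excluded before the random draw.
            eligible = [ep for ep in show.episodes if not ep.is_crossover]
            if eligible:
                sample[show.title] = rng.choice(eligible)
    return sample
```

With five networks and ten shows per network, a procedure like this yields at most 50 sampled episodes, which matches the 50 primetime shows coded for the study.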
For several reasons, the Netflix sample was selected under more specific restrictions. Only shows originally produced for and by Netflix (whether on its own or in conjunction with other production companies), that have Netflix as their original network, and that premiered before December 31, 2016, were selected. One episode per series was selected. As with the broadcast television sample, in series that share a fictional universe, “crossover” episodes were omitted. A total of 31 Netflix shows were coded.
NOTE: Characters' sexual orientation was not assumed and was only coded when alluded to through behaviors or language. Thus, a character's coded sexuality might not be completely accurate, as it does not take into account their whole history or representation, merely what happened in the selected episode.