ETHOS Hate Speech Dataset
ETHOS: multi-labEl haTe speecH detectiOn dataSet
@kaggle.konradb_ethos_hate_speech_dataset
From the project repo: https://github.com/intelligence-csd-auth-gr/Ethos-Hate-Speech-Dataset
This repository contains Ethos, a dataset for hate speech detection on social media platforms. The dataset comes in two variations:
Ethos_Dataset_Binary.csv contains 998 comments, each with a label indicating the presence or absence of hate speech. 565 of them do not contain hate speech, while the remaining 433 do.
Ethos_Dataset_Multi_Label.csv contains 8 labels for the 433 comments with hate speech content. These labels are violence (whether the comment incites violence (1) or not (0)), directed_vs_generalized (whether it is directed at a person (1) or a group (0)), and 6 labels for the category of hate speech: gender, race, national_origin, disability, religion and sexual_orientation.
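As a minimal sketch of how the two files might be read, the snippet below parses a few made-up rows shaped like the two CSVs and turns the label scores into 0/1 flags. The placeholder comments, the `;` delimiter, the fractional scores, the 0.5 threshold, and the helper names `binarize` and `read_rows` are all assumptions for illustration; check the actual files before relying on any of them.

```python
import csv
import io

# Made-up rows shaped like Ethos_Dataset_Binary.csv (placeholder text, not real data).
# The ';' delimiter and fractional scores are assumptions about the file format.
BINARY_CSV = """comment;ishate
placeholder comment a;0.0
placeholder comment b;0.25
placeholder comment c;0.75
placeholder comment d;1.0
"""

# Made-up rows shaped like Ethos_Dataset_Multi_Label.csv.
MULTI_CSV = """comment;violence;directed_vs_generalized;gender;race;national_origin;disability;religion;sexual_orientation
placeholder comment c;0.0;1.0;0.0;0.8;0.2;0.0;0.0;0.0
placeholder comment d;1.0;0.0;0.0;0.0;0.0;0.0;1.0;0.0
"""

THRESHOLD = 0.5  # assumed cut-off; not stated in the dataset description


def binarize(score, threshold=THRESHOLD):
    """Map a score string to 1 (label present) or 0 (label absent)."""
    return int(float(score) >= threshold)


def read_rows(text):
    """Parse delimiter-separated text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


# Binary file: count comments on each side of the threshold.
binary_rows = read_rows(BINARY_CSV)
hate_flags = [binarize(row["ishate"]) for row in binary_rows]
n_hate = sum(hate_flags)
n_not_hate = len(hate_flags) - n_hate

# Multi-label file: one 0/1 flag per label column for each comment.
multi_rows = read_rows(MULTI_CSV)
label_names = [name for name in multi_rows[0] if name != "comment"]
multi_labels = {
    row["comment"]: {name: binarize(row[name]) for name in label_names}
    for row in multi_rows
}

print("not hate:", n_not_hate, "hate:", n_hate)
print(multi_labels["placeholder comment c"])
```

If the assumed threshold matches the labelling used by the authors, running the same counting step on the full binary file should reproduce the 565 and 433 figures quoted above.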
CREATE TABLE en_dataset_with_stop_words (
"hitid" BIGINT,
"tweet" VARCHAR,
"sentiment" VARCHAR,
"directness" VARCHAR,
"annotator_sentiment" VARCHAR,
"target" VARCHAR,
"group" VARCHAR
);

CREATE TABLE ethos_dataset_binary (
"comment" VARCHAR,
"ishate" DOUBLE
);

CREATE TABLE ethos_dataset_multi_label (
"comment" VARCHAR,
"violence" DOUBLE,
"directed_vs_generalized" DOUBLE,
"gender" DOUBLE,
"race" DOUBLE,
"national_origin" DOUBLE,
"disability" DOUBLE,
"religion" DOUBLE,
"sexual_orientation" DOUBLE
);

CREATE TABLE hate_speech_and_offensive_language (
"unnamed_0" BIGINT, -- Unnamed: 0
"count" BIGINT,
"hate_speech" BIGINT,
"offensive_language" BIGINT,
"neither" BIGINT,
"class" BIGINT,
"tweet" VARCHAR
);
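
The two ETHOS tables share a comment column, so one plausible way to combine them is a join on the comment text. The sketch below uses Python's built-in sqlite3 module as a stand-in engine (the listing does not say which engine it targets), loads a few placeholder rows that are not taken from the dataset, and assumes that identical comment text identifies the same comment in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same column definitions as the two ETHOS tables in the schema above.
conn.executescript("""
CREATE TABLE ethos_dataset_binary (
  "comment" VARCHAR,
  "ishate" DOUBLE
);
CREATE TABLE ethos_dataset_multi_label (
  "comment" VARCHAR,
  "violence" DOUBLE,
  "directed_vs_generalized" DOUBLE,
  "gender" DOUBLE,
  "race" DOUBLE,
  "national_origin" DOUBLE,
  "disability" DOUBLE,
  "religion" DOUBLE,
  "sexual_orientation" DOUBLE
);
""")

# Placeholder rows for illustration only; not taken from the dataset.
conn.executemany(
    "INSERT INTO ethos_dataset_binary VALUES (?, ?)",
    [("placeholder a", 0.0), ("placeholder b", 1.0), ("placeholder c", 1.0)],
)
conn.executemany(
    "INSERT INTO ethos_dataset_multi_label VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("placeholder b", 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0),
        ("placeholder c", 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0),
    ],
)

# LEFT JOIN keeps comments without a multi-label row; their label columns are NULL.
rows = conn.execute("""
SELECT b.comment, b.ishate, m.violence, m.race, m.religion
FROM ethos_dataset_binary AS b
LEFT JOIN ethos_dataset_multi_label AS m ON m.comment = b.comment
ORDER BY b.comment
""").fetchall()

for row in rows:
    print(row)

conn.close()
```

A LEFT JOIN is used here so that comments present only in the binary table still appear in the result; an INNER JOIN would restrict the output to comments that have category labels.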