Baselight

Emoji Sentiment

Are people that use emoji happier?

@kaggle.harriken_emoji_sentiment

About this Dataset

How to cite: "Berengueres, J., & Castro, D. (2017, December). Differences in emoji sentiment perception between readers and writers. In 2017 IEEE International Conference on Big Data (Big Data) (pp. 4321-4328). IEEE."

Paper: https://arxiv.org/abs/1710.00888

At ASONAM2017, PydataDubai vol 1.0 @ AWOK, and PyDataBCN2017 @ EASDE we presented the paper Happiness inside a job?... Many people in the audiences asked why we avoid using emoji to predict and profile employees. The answer is that we prefer links of likes, because they are more authentic than words or emoji, in the same way that Google PageRank is more effective when it looks at links between pages rather than at the content inside them. ... Still, people keep asking about it. There is, however, one thing emoji are good at estimating: author sentiment. That estimate is possible only thanks to the unique characteristics of the dataset at hand.

Previous research has traditionally analyzed emoji sentiment from the point of view of the reader of the content, not the author. Here, we analyze emoji sentiment from the author's point of view and present a benchmark built from an employee happiness dataset in which emoji happen to be annotated with the daily happiness of the comment's author. We also found that people who use emoji are happier, much happier... But the question is, what did we miss?

Content

The main table contains columns named after emoji hex codes; a 1 means the emoji appears once in the comment (row). This dataset is an expanded version of an earlier one, but it has different formats, different columns, and one different table. That is why we decided to release it as a separate dataset: the scripts are not compatible.
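As a rough illustration of the column naming, the sketch below turns a column name back into the emoji it stands for. It assumes the pattern visible in the schema of the frequency matrix table, an `x` followed by the hex code point (for example `x1f622`). The helper names and the sample row are hypothetical, not part of the dataset or of the authors' scripts.

```python
def column_to_emoji(column_name):
    """Return the character for a name like 'x1f622', or None if the name
    does not look like 'x' followed by a hex code point (assumed pattern)."""
    if not column_name.startswith("x"):
        return None
    try:
        return chr(int(column_name[1:], 16))
    except (ValueError, OverflowError):
        return None


def emoji_in_row(row):
    """List the emoji whose column holds a non-zero count in this row."""
    found = []
    for name, value in row.items():
        emoji = column_to_emoji(name)
        if emoji is not None and value:
            found.append(emoji)
    return found


# Toy row, made up for illustration; not taken from the dataset.
row = {"commentid": "c1", "emoji_count": 2, "x1f622": 1, "x1f389": 1, "x1f354": 0}
print(emoji_in_row(row))
```

Columns such as `commentid` or `emoji_count` are skipped because they do not start with `x`; a real table with other `x`-prefixed metadata columns would need a stricter filter.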

Other stuff

The R script, written on macOS, does not work on the Kaggle platform (numbers become factors, among other small differences in how the code is interpreted). The full working script (tested in RStudio on macOS) can be found at https://github.com/orioli/emoji-writer-sentiment

Thank you to Lewis Michel

Tables

Commentinteractions Cleaned

@kaggle.harriken_emoji_sentiment.commentinteractions_cleaned
  • 4.6 MB
  • 580582 rows
  • 7 columns

CREATE TABLE commentinteractions_cleaned (
  "unnamed_0" BIGINT,
  "employee" BIGINT,
  "companyalias" VARCHAR,
  "liked" BOOLEAN,
  "disliked" BOOLEAN,
  "commentid" VARCHAR,
  "likes" BIGINT
);

Comments2emoji Frequency Matrix Cleaned

@kaggle.harriken_emoji_sentiment.comments2emoji_frequency_matrix_cleaned
  • 6.69 MB
  • 63699 rows
  • 358 columns

CREATE TABLE comments2emoji_frequency_matrix_cleaned (
  "unnamed_0" BIGINT,
  "id" BIGINT,
  "coa" VARCHAR,
  "commentid" VARCHAR,
  "txt" VARCHAR,
  "like" BIGINT,
  "nolike" BIGINT,
  "date" TIMESTAMP,
  "uid" VARCHAR,
  "nchar" BIGINT,
  "emoji_count" BIGINT,
  "emoji_negative_sum" BIGINT,
  "emoji_positive_sum" BIGINT,
  "x1f53a" BIGINT,
  "x1f356" BIGINT,
  "x1f357" BIGINT,
  "x1f354" BIGINT,
  "x1f355" BIGINT,
  "x1f352" BIGINT,
  "x1f353" BIGINT,
  "x1f351" BIGINT,
  "x1f451" BIGINT,
  "x1f1ea" BIGINT,
  "x1f1eb" BIGINT,
  "x1f455" BIGINT,
  "x1f1ee" BIGINT,
  "x1f1ef" BIGINT,
  "x1f924" BIGINT,
  "x1f927" BIGINT,
  "x1f926" BIGINT,
  "x1f921" BIGINT,
  "x1f923" BIGINT,
  "x1f922" BIGINT,
  "x1f534" BIGINT,
  "x1f35f" BIGINT,
  "x1f35e" BIGINT,
  "x1f1e6" BIGINT,
  "x1f3de" BIGINT,
  "x1f3dd" BIGINT,
  "x1f3c3" BIGINT,
  "x1f3c4" BIGINT,
  "x1f3c5" BIGINT,
  "x1f3c6" BIGINT,
  "x1f3c7" BIGINT,
  "x1f38a" BIGINT,
  "x1f38f" BIGINT,
  "x1f69c" BIGINT,
  "x1f69a" BIGINT,
  "x1f628" BIGINT,
  "x1f629" BIGINT,
  "x1f626" BIGINT,
  "x1f627" BIGINT,
  "x1f624" BIGINT,
  "x1f625" BIGINT,
  "x1f622" BIGINT,
  "x1f623" BIGINT,
  "x1f621" BIGINT,
  "x1f381" BIGINT,
  "x1f459" BIGINT,
  "x1f383" BIGINT,
  "x1f382" BIGINT,
  "x1f385" BIGINT,
  "x1f384" BIGINT,
  "x1f386" BIGINT,
  "x1f389" BIGINT,
  "x1f388" BIGINT,
  "x1f3ca" BIGINT,
  "x1f3cb" BIGINT,
  "x1f3d6" BIGINT,
  "x1f3cd" BIGINT,
  "x1f3ce" BIGINT,
  "x1f5e3" BIGINT,
  "x1f692" BIGINT,
  "x1f691" BIGINT,
  "x1f58d" BIGINT,
  "x1f697" BIGINT,
  "x1f62f" BIGINT,
  "x1f62d" BIGINT,
  "x1f62e" BIGINT,
  "x1f62b" BIGINT,
  "x1f62c" BIGINT,
  "x1f62a" BIGINT,
  "x1f499" BIGINT,
  "x1f498" BIGINT,
  "x1f495" BIGINT,
  "x1f494" BIGINT,
  "x1f496" BIGINT,
  "x1f491" BIGINT,
  "x1f493" BIGINT,
  "x1f319" BIGINT,
  "x1f315" BIGINT,
  "x1f49e" BIGINT,
  "x1f49d" BIGINT,
  "x1f49a" BIGINT,
  "x1f619" BIGINT,
  "x1f49c" BIGINT,
  "x1f49b" BIGINT,
  "x1f593" BIGINT,
  "x1f31b" BIGINT,
  "x1f31c" BIGINT
);

Ijstable

@kaggle.harriken_emoji_sentiment.ijstable
  • 39.05 KB
  • 752 rows
  • 12 columns

CREATE TABLE ijstable (
  "char" VARCHAR,
  "unnamed_1" VARCHAR,
  "unicode" VARCHAR,
  "occurrences" VARCHAR,
  "position" VARCHAR,
  "neg" VARCHAR,
  "neut" VARCHAR,
  "pos" VARCHAR,
  "sentiment_score" VARCHAR,
  "sentiment_bar" VARCHAR,
  "unicode_name" VARCHAR,
  "unicode_block" VARCHAR
);
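Every column in this table is typed VARCHAR, including ones whose names suggest numbers (`occurrences`, `neg`, `neut`, `pos`, `sentiment_score`). The sketch below shows one tolerant way to read them, assuming those cells hold numeric strings and that some rows may hold non-numeric text. The helper names and the sample rows are hypothetical.

```python
# Columns assumed to be numeric, judging only by their names.
NUMERIC_COLUMNS = ("occurrences", "position", "neg", "neut", "pos", "sentiment_score")


def to_float(text):
    """Parse a VARCHAR cell as a float; return None when it is not numeric."""
    if text is None:
        return None
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_row(row):
    """Copy a row, replacing the assumed-numeric columns with floats or None."""
    parsed = dict(row)
    for name in NUMERIC_COLUMNS:
        parsed[name] = to_float(row.get(name))
    return parsed


# Toy rows, made up for illustration; not taken from the dataset.
rows = [
    {"unicode": "0x0", "occurrences": "10", "sentiment_score": "0.25"},
    {"unicode": "0x1", "occurrences": "not a number", "sentiment_score": ""},
]
for row in rows:
    print(parse_row(row))
```

Returning None instead of raising keeps any stray header-like or empty rows visible, so they can be counted and dropped deliberately.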

N, Readervswriter

@kaggle.harriken_emoji_sentiment.n__readervswriter
  • 8.42 KB
  • 37 rows
  • 9 columns

CREATE TABLE n__readervswriter (
  "unnamed_0" VARCHAR,
  "emoji" VARCHAR,
  "s_writer" DOUBLE,
  "s_reader" DOUBLE,
  "sd" DOUBLE,
  "count" BIGINT,
  "description" VARCHAR,
  "unnamed_7" VARCHAR,
  "diff" DOUBLE
);

Votes Cleaned

@kaggle.harriken_emoji_sentiment.votes_cleaned
  • 5 MB
  • 398238 rows
  • 8 columns

CREATE TABLE votes_cleaned (
  "unnamed_0" BIGINT,
  "employee" BIGINT,
  "companyalias" VARCHAR,
  "votedate" TIMESTAMP,
  "vote" DOUBLE,
  "uid" VARCHAR,
  "vote_original" BIGINT,
  "uiddate" VARCHAR
);
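The description says comments are annotated with the daily happiness of their author. One way to reproduce that pairing from the schemas above is sketched below; it is not the authors' R script. It assumes that `uid` identifies the same employee in the comments and votes tables, that `vote` is the happiness score, and that a comment should be matched to the vote cast on the same calendar day. The function name and all sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_vote_by_emoji_use(comments, votes):
    """Average same-day vote for comments with and without emoji."""
    # Index votes by (uid, calendar day); assumes at most one vote per day.
    vote_on = {(v["uid"], v["votedate"].date()): v["vote"] for v in votes}
    groups = defaultdict(list)
    for c in comments:
        vote = vote_on.get((c["uid"], c["date"].date()))
        if vote is None:
            continue  # no vote from this author on that day
        key = "with_emoji" if c["emoji_count"] > 0 else "without_emoji"
        groups[key].append(vote)
    return {key: mean(values) for key, values in groups.items()}


# Toy data, made up for illustration; it says nothing about the real dataset.
votes = [
    {"uid": "a", "votedate": datetime(2017, 1, 2, 9, 0), "vote": 4.0},
    {"uid": "b", "votedate": datetime(2017, 1, 2, 9, 30), "vote": 2.0},
    {"uid": "a", "votedate": datetime(2017, 1, 3, 8, 0), "vote": 3.0},
]
comments = [
    {"uid": "a", "date": datetime(2017, 1, 2, 17, 0), "emoji_count": 1},
    {"uid": "b", "date": datetime(2017, 1, 2, 18, 0), "emoji_count": 0},
    {"uid": "a", "date": datetime(2017, 1, 3, 12, 0), "emoji_count": 2},
    {"uid": "c", "date": datetime(2017, 1, 2, 12, 0), "emoji_count": 1},
]
result = mean_vote_by_emoji_use(comments, votes)
print(result)
```

A difference between the two means would only be a starting point: the same employee can appear in both groups, so a per-author comparison is probably a fairer check.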
