Yahoo! Answers Topic Classification

The Yahoo! Answers dataset is constructed using the 10 largest main categories.

@kaggle.bhavikardeshna_yahoo_email_classification

About this Dataset

The Yahoo! Answers topic classification dataset is constructed using the 10 largest main categories. Each class contains 140,000 training samples and 6,000 testing samples, so the dataset contains 10 × 140,000 = 1,400,000 training samples and 10 × 6,000 = 60,000 testing samples in total. From all the answers and other meta-information, we only used the best answer content and the main category information.

The 10 main categories are:

  • Society & Culture
  • Science & Mathematics
  • Health
  • Education & Reference
  • Computers & Internet
  • Sports
  • Business & Finance
  • Entertainment & Music
  • Family & Relationships
  • Politics & Government
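
The class labels in the tables appear to be 1-based indices into this list: the sample records visible in the table schemas on this page carry label 9 for a question about friendship and label 5 for a question about optical mice, which line up with the 9th and 5th entries. A minimal Python sketch of that mapping, assuming the list order is the label order (the names `CATEGORIES` and `label_to_category` are illustrative and not part of the dataset):

```python
# Category names in the order listed above.
# Assumption: class index k (1-based) refers to the k-th entry of this list.
CATEGORIES = [
    "Society & Culture",
    "Science & Mathematics",
    "Health",
    "Education & Reference",
    "Computers & Internet",
    "Sports",
    "Business & Finance",
    "Entertainment & Music",
    "Family & Relationships",
    "Politics & Government",
]


def label_to_category(label: int) -> str:
    """Map a 1-based class index to its category name."""
    if not 1 <= label <= len(CATEGORIES):
        raise ValueError(f"label must be in 1..{len(CATEGORIES)}, got {label}")
    return CATEGORIES[label - 1]


print(label_to_category(5))
print(label_to_category(9))
```

Most training libraries expect 0-based class indices, so subtracting 1 from the stored label before building targets is a common preprocessing step.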

The Yahoo! Answers topic classification dataset was constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the Yahoo! Answers corpus. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

Tables

Test

@kaggle.bhavikardeshna_yahoo_email_classification.test
  • 21.3 MB
  • 59,999 rows
  • 4 columns
CREATE TABLE test (
  "n_9" BIGINT,  -- 9
  "what_makes_friendship_click" VARCHAR,  -- What Makes Friendship Click?
  "how_does_the_spark_keep_going" VARCHAR,  -- How Does The Spark Keep Going?
  "good_communication_is_what_does_it_can_you_move_beyond_71f3920b" VARCHAR  -- Good Communication Is What Does It. Can You Move Beyond Small Talk And Say What's Really On Your Mind. If You Start Doing This, My Expereince Is That Potentially Good Friends Will Respond Or Shun You. Then You Know Who The Really Good Friends Are.
);

Train

@kaggle.bhavikardeshna_yahoo_email_classification.train
  • 495.63 MB
  • 1,399,999 rows
  • 4 columns
CREATE TABLE train (
  "n_5" BIGINT,  -- 5
  "why_doesn_t_an_optical_mouse_work_on_a_glass_table" VARCHAR,  -- Why Doesn't An Optical Mouse Work On A Glass Table?
  "or_even_on_some_surfaces" VARCHAR,  -- Or Even On Some Surfaces?
  "optical_mice_use_an_led_and_a_camera_to_rapidly_captur_76243c37" VARCHAR  -- Optical Mice Use An LED And A Camera To Rapidly Capture Images Of The Surface Beneath The Mouse. The Infomation From The Camera Is Analyzed By A DSP (Digital Signal Processor) And Used To Detect Imperfections In The Underlying Surface And Determine Motion. Some Materials, Such As Glass, Mirrors Or Other Very Shiny, Uniform Surfaces Interfere With The Ability Of The DSP To Accurately Analyze The Surface Beneath The Mouse. Since Glass Is Transparent And Very Uniform, The Mouse Is Unable To Pick Up Enough Imperfections In The Underlying Surface To Determine Motion. Mirrored Surfaces Are Also A Problem, Since They Constantly Reflect Back The Same Image, Causing The DSP Not To Recognize Motion Properly. When The System Is Unable To See Surface Changes Associated With Movement, The Mouse Will Not Work Properly.
);
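
Both tables are one row short of the sizes stated in the description (59,999 vs. 60,000 and 1,399,999 vs. 1,400,000), and the column names in the two schemas read like a data record. That pattern is consistent with the source CSV files having no header line, so that the first record was taken as the header. A hedged Python sketch of reading such a file without a header follows. The column names `label`, `question_title`, `question_content` and `best_answer` are assumptions chosen for illustration, and the inline sample stands in for the real file:

```python
import csv
import io

# Assumed names for the four columns; the source files appear to carry no header.
FIELDS = ["label", "question_title", "question_content", "best_answer"]


def read_headerless(stream):
    """Read every line of a header-less CSV as data, including the first one."""
    # Passing fieldnames explicitly stops DictReader from consuming row 1 as a header.
    reader = csv.DictReader(stream, fieldnames=FIELDS)
    rows = []
    for row in reader:
        row["label"] = int(row["label"])  # class index stored as an integer
        rows.append(row)
    return rows


# Tiny stand-in for the real file. The first record is abbreviated from the
# test schema shown above; the second record is entirely hypothetical.
SAMPLE = (
    '"9","What makes friendship click?","How does the spark keep going?",'
    '"Good communication is what does it."\n'
    '"5","Hypothetical title","Hypothetical content","Hypothetical answer"\n'
)

rows = read_headerless(io.StringIO(SAMPLE))
print(len(rows))
print(rows[0]["label"], rows[0]["question_title"])
```

Read this way, the first record stays in the data, which would bring the row counts back in line with the sizes given in the description.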
