Baselight

HanziDB

List of simplified Chinese characters ordered by frequency rank.

@kaggle.ruddfawcett_hanzidb

About this Dataset

Content

A list of more than 9,000 simplified Chinese characters, ranked by frequency.

Acknowledgements

All data scraped from HanziDB.org, which is based on Jun Da's Modern Chinese Character Frequency List.

Inspiration

Some possible questions:

  • What is the distribution of radicals among the 100 most frequent characters? The top 500? The top 1,000?
  • Does stroke count affect usage?
  • Is there an association between the number of strokes and the HSK level of characters?
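
As a starting point for the questions above, here is a minimal sketch in plain Python. It assumes the rows have already been loaded as dictionaries keyed by the table's column names. The sample rows are hand-made placeholders, not values taken from the dataset, and the helper names `radical_distribution` and `mean_strokes_by_hsk` are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hand-made illustrative rows (NOT taken from the dataset). Keys follow the
# table's column names; stroke_count is text because the schema declares it
# as VARCHAR.
SAMPLE_ROWS = [
    {"frequency_rank": 1, "radical": "白", "stroke_count": "8", "hsk_levl": 1.0},
    {"frequency_rank": 2, "radical": "一", "stroke_count": "1", "hsk_levl": 1.0},
    {"frequency_rank": 3, "radical": "日", "stroke_count": "9", "hsk_levl": 1.0},
    {"frequency_rank": 4, "radical": "一", "stroke_count": "4", "hsk_levl": 1.0},
    {"frequency_rank": 5, "radical": "口", "stroke_count": "12", "hsk_levl": 3.0},
    {"frequency_rank": 6, "radical": "心", "stroke_count": "13", "hsk_levl": None},
]


def radical_distribution(rows, top_n):
    """Count how often each radical occurs among the top_n ranked characters."""
    top = [r for r in rows if r["frequency_rank"] <= top_n]
    return Counter(r["radical"] for r in top)


def mean_strokes_by_hsk(rows):
    """Average stroke count per HSK level, skipping rows without a usable value."""
    groups = defaultdict(list)
    for r in rows:
        if r["hsk_levl"] is None:
            continue  # character has no HSK level recorded
        try:
            strokes = int(r["stroke_count"])
        except (TypeError, ValueError):
            continue  # stroke_count is stored as text and may not parse
        groups[int(r["hsk_levl"])].append(strokes)
    return {level: mean(vals) for level, vals in sorted(groups.items())}


if __name__ == "__main__":
    print(radical_distribution(SAMPLE_ROWS, top_n=4).most_common())
    print(mean_strokes_by_hsk(SAMPLE_ROWS))
```

Calling `radical_distribution` with `top_n` set to 100, 500, and 1,000 on the real rows would give the three distributions the first question asks about. A per-level average only hints at an association; a proper answer would need a statistical test.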

Tables

Hanzidb

@kaggle.ruddfawcett_hanzidb.hanzidb
  • 394.37 KB
  • 10000 rows
  • 9 columns

CREATE TABLE hanzidb (
  "frequency_rank" BIGINT,
  "charcter" VARCHAR,
  "pinyin" VARCHAR,
  "definition" VARCHAR,
  "radical" VARCHAR,
  "radical_code" DOUBLE,
  "stroke_count" VARCHAR,
  "hsk_levl" DOUBLE,
  "general_standard_num" DOUBLE
);
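
One detail of this schema is worth noting: "stroke_count" is declared as VARCHAR, so numeric work on it needs an explicit cast. The sketch below illustrates this by loading the schema into an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in here (the page does not state which engine hosts the table, and SQLite maps these type names onto its own type affinities), the rows are hand-made placeholders, and the function names are hypothetical.

```python
import sqlite3

# Schema copied from the table definition above. Column names, including
# "charcter" and "hsk_levl", are kept exactly as the dataset spells them.
DDL = """
CREATE TABLE hanzidb (
  "frequency_rank" BIGINT,
  "charcter" VARCHAR,
  "pinyin" VARCHAR,
  "definition" VARCHAR,
  "radical" VARCHAR,
  "radical_code" DOUBLE,
  "stroke_count" VARCHAR,
  "hsk_levl" DOUBLE,
  "general_standard_num" DOUBLE
);
"""

# Hand-made placeholder rows (NOT from the dataset):
# (frequency_rank, stroke_count stored as text).
TOY_ROWS = [(1, "8"), (2, "1"), (3, "9"), (4, "4"), (5, "12"), (6, "13")]


def build_demo_db():
    """Create an in-memory database with the schema and the toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.executemany(
        'INSERT INTO hanzidb ("frequency_rank", "stroke_count") VALUES (?, ?)',
        TOY_ROWS,
    )
    return conn


def avg_strokes_by_band(conn, band_size):
    """Average stroke count per frequency band of band_size consecutive ranks."""
    query = """
        SELECT ("frequency_rank" - 1) / ? AS band,   -- integer division
               COUNT(*) AS n,
               AVG(CAST("stroke_count" AS INTEGER)) AS avg_strokes
        FROM hanzidb
        GROUP BY band
        ORDER BY band
    """
    return conn.execute(query, (band_size,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for band, n, avg_strokes in avg_strokes_by_band(conn, band_size=2):
        print(band, n, avg_strokes)
```

On the full table, a `band_size` of 1000 would compare average stroke counts across successive blocks of 1,000 ranks, which is one rough way to look at whether stroke count relates to usage.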
