Baselight

Persian News Dataset

Ideal for NLP tasks, sentiment analysis, topic modeling, and more.

@kaggle.amirzenoozi_persian_news_dataset

About this Dataset

This dataset gives you access to 391,749 news articles from five agencies: FarsNews (178,480), MehrNews (87,471), MashreghNews (53,414), ISNA (51,779), and KhabarOnline (20,605). Each record includes a title, description, publish date, service, category, and tags. We plan to update this dataset in the future.
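
As a quick sanity check, the per-agency counts quoted in the description can be summed to confirm the stated total. The sketch below only restates those figures; the variable names are illustrative and not part of the dataset.

```python
# Per-agency record counts as quoted in the dataset description.
agency_counts = {
    "FarsNews": 178_480,
    "MehrNews": 87_471,
    "MashreghNews": 53_414,
    "ISNA": 51_779,
    "KhabarOnline": 20_605,
}

total = sum(agency_counts.values())
print(total)  # 391749

# Share of each agency in the total, largest first.
for name, count in sorted(agency_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {count / total:.1%}")
```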

Supported Agencies

  • FarsNews
  • MehrNews
  • KhabarOnline
  • ISNA
  • MashreghNews

Changelog

Version 1

  • Contains only FarsNews Agency records

Version 2

  • MehrNews Agency records added

Version 3

  • KhabarOnline Agency records added

Version 4

  • ISNA Agency records added
  • PublishDateTime format updated for FarsNews records

Version 5

  • MashreghNews Agency records added

Tables

Archive V2

@kaggle.amirzenoozi_persian_news_dataset.archive_v2
  • 472.71 MB
  • 265991 rows
  • 10 columns

CREATE TABLE archive_v2 (
  "id" BIGINT,
  "title" VARCHAR,
  "short_link" VARCHAR,
  "service" VARCHAR,
  "subgroup" VARCHAR,
  "abstract" VARCHAR,
  "body" VARCHAR,
  "tags" VARCHAR,
  "published_datetime" VARCHAR,
  "agency_name" VARCHAR
);
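
To illustrate how this schema might be queried, here is a minimal sketch that counts rows per agency. It uses Python's built-in sqlite3 module as a local stand-in rather than the hosting platform's own query engine. The sample rows are hypothetical placeholders, not real records, and the helper name rows_per_agency is an assumption introduced for this example.

```python
import sqlite3

# In-memory database; the schema mirrors archive_v2 as listed above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE archive_v2 (
      "id" BIGINT,
      "title" VARCHAR,
      "short_link" VARCHAR,
      "service" VARCHAR,
      "subgroup" VARCHAR,
      "abstract" VARCHAR,
      "body" VARCHAR,
      "tags" VARCHAR,
      "published_datetime" VARCHAR,
      "agency_name" VARCHAR
    )
    """
)

# Hypothetical placeholder rows, NOT real records from the dataset.
sample_rows = [
    (1, "title a", "link a", "service x", "group x", "", "", "", "", "FarsNews"),
    (2, "title b", "link b", "service y", "group y", "", "", "", "", "FarsNews"),
    (3, "title c", "link c", "service x", "group x", "", "", "", "", "MehrNews"),
]
conn.executemany(
    "INSERT INTO archive_v2 VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", sample_rows
)


def rows_per_agency(connection):
    """Return (agency_name, row_count) pairs, largest count first."""
    query = """
        SELECT agency_name, COUNT(*) AS n
        FROM archive_v2
        GROUP BY agency_name
        ORDER BY n DESC, agency_name
    """
    return connection.execute(query).fetchall()


print(rows_per_agency(conn))  # [('FarsNews', 2), ('MehrNews', 1)]
```

The same GROUP BY query should carry over to the other archive tables, since they share these column names.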

Archive V3

@kaggle.amirzenoozi_persian_news_dataset.archive_v3
  • 506.32 MB
  • 287129 rows
  • 10 columns

CREATE TABLE archive_v3 (
  "id" VARCHAR,
  "title" VARCHAR,
  "short_link" VARCHAR,
  "service" VARCHAR,
  "subgroup" VARCHAR,
  "abstract" VARCHAR,
  "body" VARCHAR,
  "tags" VARCHAR,
  "published_datetime" VARCHAR,
  "agency_name" VARCHAR
);

Archive V4

@kaggle.amirzenoozi_persian_news_dataset.archive_v4
  • 591 MB
  • 339011 rows
  • 10 columns

CREATE TABLE archive_v4 (
  "id" VARCHAR,
  "title" VARCHAR,
  "short_link" VARCHAR,
  "service" VARCHAR,
  "subgroup" VARCHAR,
  "abstract" VARCHAR,
  "body" VARCHAR,
  "tags" VARCHAR,
  "published_datetime" VARCHAR,
  "agency_name" VARCHAR
);

Archive V5

@kaggle.amirzenoozi_persian_news_dataset.archive_v5
  • 671.37 MB
  • 392532 rows
  • 10 columns

CREATE TABLE archive_v5 (
  "id" VARCHAR,
  "title" VARCHAR,
  "short_link" VARCHAR,
  "service" VARCHAR,
  "subgroup" VARCHAR,
  "abstract" VARCHAR,
  "body" VARCHAR,
  "tags" VARCHAR,
  "published_datetime" VARCHAR,
  "agency_name" VARCHAR
);
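
The row counts listed for the four archive tables can be compared to see how much each version adds. This is a small arithmetic sketch using only the figures shown on this page; the function name growth_between_versions is illustrative.

```python
# Row counts as listed for each archive table on this page.
row_counts = {
    "archive_v2": 265_991,
    "archive_v3": 287_129,
    "archive_v4": 339_011,
    "archive_v5": 392_532,
}


def growth_between_versions(counts):
    """Return {later_table: rows added relative to the previous table}."""
    names = list(counts)  # dicts preserve insertion order
    return {
        later: counts[later] - counts[earlier]
        for earlier, later in zip(names, names[1:])
    }


growth = growth_between_versions(row_counts)
print(growth)
# {'archive_v3': 21138, 'archive_v4': 51882, 'archive_v5': 53521}
```

These differences are close to, but not identical with, the per-agency counts given in the description (20,605, 51,779, and 53,414). The page does not explain the gap.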

N, Output

@kaggle.amirzenoozi_persian_news_dataset.n__output
  • 348.57 MB
  • 178480 rows
  • 10 columns

CREATE TABLE n__output (
  "id" BIGINT,
  "title" VARCHAR,
  "short_link" VARCHAR,
  "service" VARCHAR,
  "subgroup" VARCHAR,
  "abstract" VARCHAR,
  "body" VARCHAR,
  "tags" VARCHAR,
  "published_datetime" TIMESTAMP,
  "agency_name" VARCHAR
);
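
This table stores published_datetime as TIMESTAMP, while the archive tables store it as VARCHAR. Below is a minimal sketch of converting such text values to datetime objects. The candidate formats are assumptions for illustration only, since the page does not document the actual string format, and parse_published is a hypothetical helper.

```python
from datetime import datetime

# ASSUMPTION: these formats are illustrative guesses; the page does not
# document how published_datetime is formatted in the archive tables.
CANDIDATE_FORMATS = (
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%Y/%m/%d %H:%M",
)


def parse_published(value):
    """Return a datetime for the first matching format, or None."""
    if value is None:
        return None
    text = value.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next candidate format
    return None


print(parse_published("2020-01-15 08:30:00"))  # 2020-01-15 08:30:00
print(parse_published("not a date"))           # None
```

If the strings turn out to use the Solar Hijri (Jalali) calendar, strptime would still accept a year such as 1399 but treat it as Gregorian, so a separate calendar conversion step would be needed.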
