Baselight

Handwritten Text Recognition (Bongabdo)

Can be taught to recognise whole pages of handwritten Bangla (Bengali) text

@kaggle.joebeachcapital_handwritten_text_recognition_bongabdo


About this Dataset

Overview

This work presents an Offline Handwritten Text Recognition (HTR) model architecture based on neural networks that can be taught to recognise whole pages of handwritten Bangla (Bengali) text without image segmentation. Because Bengali is a resource-constrained Indic language, there is a lack of properly annotated datasets consisting of scanned images of handwritten Bangla scripts. In this work, I have introduced a new dataset, 'Bongabdo', which consists of full-page handwritten scripts collected from a wide variety of contributors of various age groups, occupations and genders. Further, a recently proposed state-of-the-art Image-to-Sequence architecture has been applied to these images under different hyperparameter settings, and the results have been evaluated in terms of Character Error Rate (CER), Word Error Rate (WER) and Sequence Error Rate (SER) to produce a comparative study.
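The three metrics named above can be illustrated with a minimal Python sketch. It assumes the common definitions: CER and WER as Levenshtein edit distance divided by the reference length in characters or whitespace-separated words, and SER as the fraction of sequences that are not an exact match. The paper's exact definitions and text normalisation may differ. The function names `edit_distance` and `error_rates` are hypothetical, and characters are counted as Unicode code points, which for Bangla is not the same as counting visible grapheme clusters.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Corpus-level CER, WER and SER under the assumed definitions."""
    if len(references) != len(hypotheses) or not references:
        raise ValueError("need equally sized, non-empty lists")
    char_edits = char_total = word_edits = word_total = seq_errors = 0
    for ref, hyp in zip(references, hypotheses):
        # Character level: compare code point by code point.
        char_edits += edit_distance(ref, hyp)
        char_total += len(ref)
        # Word level: compare whitespace-separated tokens.
        ref_words, hyp_words = ref.split(), hyp.split()
        word_edits += edit_distance(ref_words, hyp_words)
        word_total += len(ref_words)
        # Sequence level: any mismatch counts as one error.
        seq_errors += int(ref != hyp)
    return {
        "cer": char_edits / char_total if char_total else 0.0,
        "wer": word_edits / word_total if word_total else 0.0,
        "ser": seq_errors / len(references),
    }


# Toy strings for illustration only, not taken from the dataset.
rates = error_rates(["abc def", "xyz"], ["abd def", "xyz"])
```

Here `rates` holds one substitution out of ten reference characters, one wrong word out of three, and one imperfect sequence out of two.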

Introductory Paper

Towards Full-page Offline Bangla Handwritten Text Recognition using Image-to-Sequence Architecture
By Ayanabha Ghosh. 2023
Published in IEEE Silchar Subsection Conference, Silchar, Assam, India

Tables

Bongabdo Metadata

@kaggle.joebeachcapital_handwritten_text_recognition_bongabdo.bongabdo_metadata
  • 14.45 KB
  • 111 rows
  • 12 columns

CREATE TABLE bongabdo_metadata (
  "sn" BIGINT,
  "filename" VARCHAR,
  "username" VARCHAR,
  "age" BIGINT,
  "gender" VARCHAR,
  "occupation" VARCHAR,
  "category" VARCHAR,
  "char_count" DOUBLE,
  "article_link" VARCHAR,
  "strike" BOOLEAN,
  "bangla_english" BOOLEAN,
  "multi_paragraph" BOOLEAN
);
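As a rough sketch of how this schema might be explored, the Python snippet below loads it into an in-memory SQLite database and aggregates pages per category. Several things here are assumptions: SQLite stands in for whatever engine hosts the table, the three inserted rows are made-up placeholders rather than real dataset records, `strike` is assumed to flag pages containing struck-through text, and `summarise_by_category` is a hypothetical helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE bongabdo_metadata (
  "sn" BIGINT,
  "filename" VARCHAR,
  "username" VARCHAR,
  "age" BIGINT,
  "gender" VARCHAR,
  "occupation" VARCHAR,
  "category" VARCHAR,
  "char_count" DOUBLE,
  "article_link" VARCHAR,
  "strike" BOOLEAN,
  "bangla_english" BOOLEAN,
  "multi_paragraph" BOOLEAN
)
"""


def summarise_by_category(conn):
    """Return (category, pages, average char_count, pages flagged strike)."""
    return conn.execute(
        "SELECT category, COUNT(*), AVG(char_count), SUM(strike) "
        "FROM bongabdo_metadata GROUP BY category ORDER BY category"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Placeholder rows for illustration only; they are NOT real dataset records.
placeholder_rows = [
    (1, "page_001.jpg", "user_a", 25, "F", "student", "news", 500.0, None, 0, 0, 1),
    (2, "page_002.jpg", "user_b", 40, "M", "teacher", "news", 600.0, None, 1, 0, 1),
    (3, "page_003.jpg", "user_c", 31, "F", "engineer", "story", 300.0, None, 0, 1, 0),
]
conn.executemany(
    "INSERT INTO bongabdo_metadata VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
    placeholder_rows,
)

summary = summarise_by_category(conn)
```

SQLite stores the BOOLEAN columns as integers, which is why `SUM(strike)` can count flagged pages directly; other engines may need an explicit cast.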
