Arabic Handwritten Digits Dataset

@kaggle.mloey1_ahdd1

About this Dataset

Please cite our papers:

• A. El-Sawy, M. Loey, and H. EL-Bakry, “Arabic handwritten characters recognition using convolutional neural network,” WSEAS Transactions on Computer Research, vol. 5, pp. 11–19, 2017.

https://www.wseas.org/multimedia/journals/computerresearch/2017/a045818-075.php

• A. El-Sawy, H. EL-Bakry, and M. Loey, “CNN for handwritten arabic digits recognition based on lenet-5,” in Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016, vol. 533, pp. 566–575, Springer International Publishing, 2016.

https://doi.org/10.1007/978-3-319-48308-5_54

https://link.springer.com/chapter/10.1007/978-3-319-48308-5_54

• M. Loey, A. El-Sawy, and H. EL-Bakry, “Deep learning autoencoder approach for handwritten arabic digits recognition,” arXiv preprint arXiv:1706.06720, 2017.

https://arxiv.org/abs/1706.06720

Abstract

In recent years, handwritten digit recognition has become an important research area because of its applications in several fields. This work focuses on the recognition stage of handwritten Arabic digit recognition, which faces several challenges, including the unlimited variation in human handwriting and the large public databases. The paper presents a deep learning technique that can be applied effectively to recognizing Arabic handwritten digits. LeNet-5, a Convolutional Neural Network (CNN), is trained and tested on the MADBase database of Arabic handwritten digit images, which contains 60,000 training and 10,000 testing images. The results are compared, and they show that the CNN leads to significant improvements over different machine-learning classification algorithms.

On MADBase, the CNN gives an average recognition accuracy of 99.15%.
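As a rough illustration of what the reported figure means, the sketch below computes accuracy as the fraction of correctly classified images. The helper name `accuracy` and the example counts are assumptions made for illustration; the source reports only the 99.15% average and does not state how that average was taken.

```python
def accuracy(num_correct: int, num_total: int) -> float:
    """Return the fraction of images that were classified correctly."""
    if num_total <= 0:
        raise ValueError("num_total must be positive")
    return num_correct / num_total


# Hypothetical counts: 9,915 correct predictions out of 10,000 test
# images would correspond to an accuracy of 99.15%.
example = accuracy(9915, 10000)
print(f"{example:.2%}")
```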

Context

The motivation of this study is to use cross knowledge learned from multiple works to enhance the performance of Arabic handwritten digit recognition. In recent years, Arabic handwritten digit recognition has had to handle many different handwriting styles, making it important to find and work on new and advanced solutions for handwriting recognition. A deep learning system needs a large amount of data (images) to make good decisions.

Content

MADBase is a modified Arabic handwritten digits database containing 60,000 training images and 10,000 test images. The digits were written by 700 writers, each of whom wrote every digit (0 to 9) ten times. To cover different writing styles, the database was gathered from different institutions: Colleges of Engineering and Law, a School of Medicine, the Open University (whose students span a wide range of ages), a high school, and a governmental institution.
MADBase is available for free and can be downloaded from (http://datacenter.aucegypt.edu/shazeem/).
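The figures in this description are mutually consistent, which the short check below makes explicit: 700 writers, 10 digit classes, and 10 repetitions give 70,000 images, matching the 60,000 training plus 10,000 test images. The variable names are illustrative only.

```python
WRITERS = 700        # from the dataset description
DIGIT_CLASSES = 10   # digits 0 through 9
REPETITIONS = 10     # each writer wrote each digit ten times

total_images = WRITERS * DIGIT_CLASSES * REPETITIONS
train_images, test_images = 60_000, 10_000

# The collection size should equal the size of the two splits combined.
assert total_images == train_images + test_images

# Every writer contributes equally to every digit, so the full
# collection holds the same number of samples per digit.
images_per_digit = total_images // DIGIT_CLASSES
print(total_images, images_per_digit)
```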

Inspiration

Creating the proposed database presents additional challenges because it involves many issues, such as writing style, stroke thickness, and the number and position of dots. Some characters have different shapes even when written in the same position. For example, the teh character has different shapes in the isolated position.

Arabic Handwritten Characters Dataset

https://www.kaggle.com/mloey1/ahcd1

Benha University

http://bu.edu.eg/staff/mloey

https://mloey.github.io/

Tables

Testimages 10k X 784

@kaggle.mloey1_ahdd1.testimages_10k_x_784
  • 3.58 MB
  • 9999 rows
  • 784 columns

CREATE TABLE testimages_10k_x_784 (
  "n_0" BIGINT,
  "n_0_1" BIGINT,
  "n_0_2" BIGINT,
  "n_0_3" BIGINT,
  "n_0_4" BIGINT,
  "n_0_5" BIGINT,
  "n_0_6" BIGINT,
  "n_0_7" BIGINT,
  "n_0_8" BIGINT,
  "n_0_9" BIGINT,
  "n_0_10" BIGINT,
  "n_0_11" BIGINT,
  "n_0_12" BIGINT,
  "n_0_13" BIGINT,
  "n_0_14" BIGINT,
  "n_0_15" BIGINT,
  "n_0_16" BIGINT,
  "n_0_17" BIGINT,
  "n_0_18" BIGINT,
  "n_0_19" BIGINT,
  "n_0_20" BIGINT,
  "n_0_21" BIGINT,
  "n_0_22" BIGINT,
  "n_0_23" BIGINT,
  "n_0_24" BIGINT,
  "n_0_25" BIGINT,
  "n_0_26" BIGINT,
  "n_0_27" BIGINT,
  "n_0_28" BIGINT,
  "n_0_29" BIGINT,
  "n_0_30" BIGINT,
  "n_0_31" BIGINT,
  "n_0_32" BIGINT,
  "n_0_33" BIGINT,
  "n_0_34" BIGINT,
  "n_0_35" BIGINT,
  "n_0_36" BIGINT,
  "n_0_37" BIGINT,
  "n_0_38" BIGINT,
  "n_0_39" BIGINT,
  "n_0_40" BIGINT,
  "n_0_41" BIGINT,
  "n_0_42" BIGINT,
  "n_0_43" BIGINT,
  "n_0_44" BIGINT,
  "n_0_45" BIGINT,
  "n_0_46" BIGINT,
  "n_0_47" BIGINT,
  "n_0_48" BIGINT,
  "n_0_49" BIGINT,
  "n_0_50" BIGINT,
  "n_0_51" BIGINT,
  "n_0_52" BIGINT,
  "n_0_53" BIGINT,
  "n_0_54" BIGINT,
  "n_0_55" BIGINT,
  "n_0_56" BIGINT,
  "n_0_57" BIGINT,
  "n_0_58" BIGINT,
  "n_0_59" BIGINT,
  "n_0_60" BIGINT,
  "n_0_61" BIGINT,
  "n_0_62" BIGINT,
  "n_0_63" BIGINT,
  "n_0_64" BIGINT,
  "n_0_65" BIGINT,
  "n_0_66" BIGINT,
  "n_0_67" BIGINT,
  "n_0_68" BIGINT,
  "n_0_69" BIGINT,
  "n_0_70" BIGINT,
  "n_0_71" BIGINT,
  "n_0_72" BIGINT,
  "n_0_73" BIGINT,
  "n_0_74" BIGINT,
  "n_0_75" BIGINT,
  "n_0_76" BIGINT,
  "n_0_77" BIGINT,
  "n_0_78" BIGINT,
  "n_0_79" BIGINT,
  "n_0_80" BIGINT,
  "n_0_81" BIGINT,
  "n_0_82" BIGINT,
  "n_0_83" BIGINT,
  "n_0_84" BIGINT,
  "n_0_85" BIGINT,
  "n_0_86" BIGINT,
  "n_0_87" BIGINT,
  "n_0_88" BIGINT,
  "n_0_89" BIGINT,
  "n_0_90" BIGINT,
  "n_0_91" BIGINT,
  "n_0_92" BIGINT,
  "n_0_93" BIGINT,
  "n_1" BIGINT,
  "n_0_94" BIGINT,
  "n_0_95" BIGINT,
  "n_0_96" BIGINT,
  "n_215" BIGINT,
  "n_150" BIGINT
);
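
The table reports 9999 rows although the description states 10,000 test images, and column names such as n_215 and n_150 look like pixel values, not labels. One possible explanation, which is an inference and not something the source confirms, is that the original CSV has no header row and its first image was consumed as column names on import. The sketch below reads such a file with Python's csv module without treating any row as a header. The function name `read_headerless_pixels` and the sample data are hypothetical.

```python
import csv
import io


def read_headerless_pixels(text_stream):
    """Read every CSV row as a list of integer pixel values.

    No row is treated as a header, so the first image is kept.
    Blank lines are skipped.
    """
    reader = csv.reader(text_stream)
    return [[int(value) for value in row] for row in reader if row]


# Tiny stand-in for the real file: 3 "images" of 4 pixels each.
sample = io.StringIO("0,0,215,150\n0,12,0,255\n7,0,0,0\n")
rows = read_headerless_pixels(sample)
print(len(rows))
```

With a real file, the same function can be given a handle opened with `open(path, newline="")`, where `path` is wherever the CSV was saved.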

Testlabel 10k X 1

@kaggle.mloey1_ahdd1.testlabel_10k_x_1
  • 4.05 KB
  • 9999 rows
  • 2 columns

CREATE TABLE testlabel_10k_x_1 (
  "unnamed_0" DOUBLE,
  "unnamed_1" VARCHAR
);

Trainimages 60k X 784

@kaggle.mloey1_ahdd1.trainimages_60k_x_784
  • 16.79 MB
  • 59999 rows
  • 784 columns

CREATE TABLE trainimages_60k_x_784 (
  "n_0" BIGINT,
  "n_0_1" BIGINT,
  "n_0_2" BIGINT,
  "n_0_3" BIGINT,
  "n_0_4" BIGINT,
  "n_0_5" BIGINT,
  "n_0_6" BIGINT,
  "n_0_7" BIGINT,
  "n_0_8" BIGINT,
  "n_0_9" BIGINT,
  "n_0_10" BIGINT,
  "n_0_11" BIGINT,
  "n_0_12" BIGINT,
  "n_0_13" BIGINT,
  "n_0_14" BIGINT,
  "n_0_15" BIGINT,
  "n_0_16" BIGINT,
  "n_0_17" BIGINT,
  "n_0_18" BIGINT,
  "n_0_19" BIGINT,
  "n_0_20" BIGINT,
  "n_0_21" BIGINT,
  "n_0_22" BIGINT,
  "n_0_23" BIGINT,
  "n_0_24" BIGINT,
  "n_0_25" BIGINT,
  "n_0_26" BIGINT,
  "n_0_27" BIGINT,
  "n_0_28" BIGINT,
  "n_0_29" BIGINT,
  "n_0_30" BIGINT,
  "n_0_31" BIGINT,
  "n_0_32" BIGINT,
  "n_0_33" BIGINT,
  "n_0_34" BIGINT,
  "n_0_35" BIGINT,
  "n_0_36" BIGINT,
  "n_0_37" BIGINT,
  "n_0_38" BIGINT,
  "n_0_39" BIGINT,
  "n_0_40" BIGINT,
  "n_0_41" BIGINT,
  "n_0_42" BIGINT,
  "n_0_43" BIGINT,
  "n_0_44" BIGINT,
  "n_0_45" BIGINT,
  "n_0_46" BIGINT,
  "n_0_47" BIGINT,
  "n_0_48" BIGINT,
  "n_0_49" BIGINT,
  "n_0_50" BIGINT,
  "n_0_51" BIGINT,
  "n_0_52" BIGINT,
  "n_0_53" BIGINT,
  "n_0_54" BIGINT,
  "n_0_55" BIGINT,
  "n_0_56" BIGINT,
  "n_0_57" BIGINT,
  "n_0_58" BIGINT,
  "n_0_59" BIGINT,
  "n_0_60" BIGINT,
  "n_0_61" BIGINT,
  "n_0_62" BIGINT,
  "n_0_63" BIGINT,
  "n_0_64" BIGINT,
  "n_0_65" BIGINT,
  "n_0_66" BIGINT,
  "n_0_67" BIGINT,
  "n_0_68" BIGINT,
  "n_0_69" BIGINT,
  "n_0_70" BIGINT,
  "n_0_71" BIGINT,
  "n_0_72" BIGINT,
  "n_0_73" BIGINT,
  "n_0_74" BIGINT,
  "n_0_75" BIGINT,
  "n_0_76" BIGINT,
  "n_0_77" BIGINT,
  "n_0_78" BIGINT,
  "n_0_79" BIGINT,
  "n_0_80" BIGINT,
  "n_0_81" BIGINT,
  "n_0_82" BIGINT,
  "n_0_83" BIGINT,
  "n_0_84" BIGINT,
  "n_0_85" BIGINT,
  "n_0_86" BIGINT,
  "n_0_87" BIGINT,
  "n_0_88" BIGINT,
  "n_0_89" BIGINT,
  "n_0_90" BIGINT,
  "n_0_91" BIGINT,
  "n_0_92" BIGINT,
  "n_0_93" BIGINT,
  "n_0_94" BIGINT,
  "n_0_95" BIGINT,
  "n_0_96" BIGINT,
  "n_0_97" BIGINT,
  "n_0_98" BIGINT,
  "n_0_99" BIGINT
);
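
Each row of this table holds 784 values, which matches a 28 × 28 image flattened into a single row (28 × 28 = 784). The sketch below turns one row back into a grid. The image size and the row-major pixel order are assumptions, since the source does not document the layout; if digits look transposed when displayed, swapping rows and columns may be needed. The function names are hypothetical.

```python
SIDE = 28  # assumed image width and height (28 * 28 == 784)


def row_to_grid(pixels, side=SIDE):
    """Split a flat list of side*side pixel values into `side` rows."""
    if len(pixels) != side * side:
        raise ValueError(f"expected {side * side} values, got {len(pixels)}")
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def transpose(grid):
    """Swap rows and columns, for data stored in column-major order."""
    return [list(column) for column in zip(*grid)]


flat = list(range(784))  # placeholder values, not real pixel data
grid = row_to_grid(flat)
print(len(grid), len(grid[0]))
```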

Trainlabel 60k X 1

@kaggle.mloey1_ahdd1.trainlabel_60k_x_1
  • 13.65 KB
  • 59999 rows
  • 2 columns

CREATE TABLE trainlabel_60k_x_1 (
  "unnamed_0" DOUBLE,
  "unnamed_1" VARCHAR
);
