To transform the prescription images into a structured dataset suitable for machine learning, we applied a word detection program that segmented each prescription image into individual words. This step converts the images into a format suited to the recognition task and allows a machine learning model to analyze and classify prescription elements accurately.
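The word detection program itself is not part of this description. As an illustration only, the sketch below shows one common way to split a line of handwriting into words: a vertical projection profile on an already binarized line. The function name `segment_words`, the `min_gap` threshold, and the 0/1 list-of-lists input format are assumptions made for this example and may differ from the method actually used.

```python
from typing import List, Tuple


def segment_words(line: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges of words in a binarized text line.

    `line` is a 2D list of rows where 1 = ink and 0 = background.
    A run of at least `min_gap` empty columns is treated as a word
    boundary (an assumed threshold, to be tuned per dataset).
    `end` is exclusive, so a word spans columns start..end-1.
    """
    if not line:
        return []
    width = len(line[0])

    # Vertical projection: does each column contain any ink?
    has_ink = [any(row[c] for row in line) for c in range(width)]

    words = []
    start = None      # first column of the word being built
    last_ink = None   # most recent column that contained ink
    for c, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = c
        elif c - last_ink - 1 >= min_gap:
            # The gap is wide enough: close the current word, start a new one.
            words.append((start, last_ink + 1))
            start = c
        last_ink = c

    if start is not None:
        words.append((start, last_ink + 1))
    return words
```

Each returned column range can then be cropped from the line image and saved as one word sample. Small gaps inside a word (for example between disconnected letters) are kept together as long as they are narrower than `min_gap`.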
Each extracted word was then screened manually, and only the pharmaceutical names were retained; words that are not medicine names were dropped. Next, the medication names were labelled by visual inspection. Multiple team members independently cross-checked the labels to make sure each term was labelled correctly.
The resulting dataset consists of 4,680 individual words extracted from the prescription images. These words are documented in separate Excel and CSV files.
This dataset contains a total of 78 classes. The class names are: Beklo, Maxima, Leptic, Esoral, Omastin, Esonix, Canazole, Fixal, Progut, Diflu, Montair, Flexilax, Maxpro, Vifas, Conaz, Fexofast, Fenadin, Telfast, Dinafex, Ritch, Renova, Flugal, Axodin, Sergel, Nexum, Opton, Nexcap, Fexo, Montex, Exium, Lumona, Napa, Azithrocin, Atrizin, Monas, Nidazyl, Metsina, Baclon, Rozith, Bicozin, Ace, Amodis, Alatrol, Napa Extend, Rivotril, Montene, Filmet, Aceta, Tamen, Bacmax, Disopan, Rhinil, Flamyd, Metro, Zithrin, Candinil, Lucan-R, Backtone, Bacaid, Etizin, Az, Romycin, Azyth, Cetisoft, Dancel, Tridosil, Nizoder, Ketoral, Ketocon, Ketotab, Ketozol, Denixil, Provair, Odmon, Baclofen, MKast, Trilock, Flexibac.
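For training a classifier, these class names usually need to be mapped to integer labels. The sketch below is one possible encoding, not the dataset's own: the index order simply follows the listing above, and the names `CLASS_NAMES`, `LABEL_TO_INDEX`, and `encode` are hypothetical helpers introduced for illustration.

```python
# Class names copied from the list above, in the same order.
CLASS_NAMES = [
    "Beklo", "Maxima", "Leptic", "Esoral", "Omastin", "Esonix", "Canazole",
    "Fixal", "Progut", "Diflu", "Montair", "Flexilax", "Maxpro", "Vifas",
    "Conaz", "Fexofast", "Fenadin", "Telfast", "Dinafex", "Ritch", "Renova",
    "Flugal", "Axodin", "Sergel", "Nexum", "Opton", "Nexcap", "Fexo",
    "Montex", "Exium", "Lumona", "Napa", "Azithrocin", "Atrizin", "Monas",
    "Nidazyl", "Metsina", "Baclon", "Rozith", "Bicozin", "Ace", "Amodis",
    "Alatrol", "Napa Extend", "Rivotril", "Montene", "Filmet", "Aceta",
    "Tamen", "Bacmax", "Disopan", "Rhinil", "Flamyd", "Metro", "Zithrin",
    "Candinil", "Lucan-R", "Backtone", "Bacaid", "Etizin", "Az", "Romycin",
    "Azyth", "Cetisoft", "Dancel", "Tridosil", "Nizoder", "Ketoral",
    "Ketocon", "Ketotab", "Ketozol", "Denixil", "Provair", "Odmon",
    "Baclofen", "MKast", "Trilock", "Flexibac",
]

# Assumed encoding: the integer label is the position in the list above.
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASS_NAMES)}


def encode(name: str) -> int:
    """Map a medicine name to its integer class label.

    Matching is exact after trimming surrounding whitespace; an unknown
    name raises KeyError rather than being silently assigned a label.
    """
    return LABEL_TO_INDEX[name.strip()]


def decode(index: int) -> str:
    """Map an integer class label back to its medicine name."""
    return CLASS_NAMES[index]
```

Exact matching is deliberate here: several names are prefixes of others (for example Napa and Napa Extend, or Ace and Aceta), so fuzzy or prefix matching could assign the wrong class.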
A model trained on this dataset should be able to recognize any of the 78 drug names listed above.
This dataset is free to use for educational practice with machine learning algorithms.