This dataset contains symptoms of various skin diseases. For each disease, some rows contain a direct description of the symptoms, and other rows contain text in the words people used to describe their symptoms to a doctor. This is a binary text classification dataset.
To read the file, use the following code:
**Just change the file path.**
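import pandas as pd

# Try each encoding in turn until the file decodes without error.
encodings_to_try = ['utf-8', 'Latin-1', 'ISO-8859-1']
for encoding in encodings_to_try:
    try:
        df = pd.read_csv('F:/Skin text classifier.csv', encoding=encoding)
        print("File read successfully with encoding:", encoding)
        break
    except UnicodeDecodeError:
        # This encoding failed; move on to the next one.
        pass

df.head()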
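The same encoding fallback can also be wrapped in a small helper that reports which encoding worked. This is only a sketch: the function name `detect_encoding` and the extra `cp1252` candidate are assumptions added for illustration and are not part of the dataset. Because `latin-1` can decode any byte sequence, it is placed last as a catch-all.

```python
from pathlib import Path


def detect_encoding(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Return the first encoding in `encodings` that decodes the whole file.

    Hypothetical helper for illustration; the candidate list is an assumption.
    """
    raw = Path(path).read_bytes()
    for encoding in encodings:
        try:
            raw.decode(encoding)  # raises UnicodeDecodeError on a mismatch
            return encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError(f"None of {encodings} could decode {path}")
```

The returned name can then be passed to pandas, for example `pd.read_csv('F:/Skin text classifier.csv', encoding=detect_encoding('F:/Skin text classifier.csv'))`.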
The diseases are:
1. Vitiligo
2. Scabies
3. Hives (Urticaria)
4. Folliculitis
5. Eczema
6. Ringworm (Tinea Corporis)
7. Athlete's Foot (Tinea Pedis)
8. Rosacea
9. Psoriasis
10. Shingles (Herpes Zoster)
11. Impetigo
12. Contact Dermatitis
13. Acne
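
For model training, the disease names listed above usually need to be mapped to integer ids. The sketch below builds that mapping in plain Python. It assumes the label column stores these names verbatim, which the dataset description does not confirm; the variable names `DISEASES`, `label_to_id`, and `id_to_label` are hypothetical.

```python
# Disease names copied from the list above (assumed to match the label column).
DISEASES = [
    "Vitiligo",
    "Scabies",
    "Hives (Urticaria)",
    "Folliculitis",
    "Eczema",
    "Ringworm (Tinea Corporis)",
    "Athlete's Foot (Tinea Pedis)",
    "Rosacea",
    "Psoriasis",
    "Shingles (Herpes Zoster)",
    "Impetigo",
    "Contact Dermatitis",
    "Acne",
]

# Forward and reverse lookups between label names and integer ids.
label_to_id = {name: i for i, name in enumerate(DISEASES)}
id_to_label = {i: name for name, i in label_to_id.items()}
```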