Chinese Medical Dialogue by Kaggle | Healthcare

About this Dataset

Chinese Medical Dialogue

Deep Learning for Intelligent Healthcare

By Huggingface Hub

This dataset is designed for training a deep learning language model for intelligent healthcare on Chinese medical dialogue. It includes pretraining, finetuning, and reward data, which let the model learn to produce more accurate answers in a medical context. The reward data consists of questions paired with chosen and rejected responses, so the model can compare alternative answers to the same question when constructing a conversation. This is intended to make the model more precise and better able to handle medical dialogue, which makes the dataset a useful resource for businesses, researchers, or individuals developing their own intelligent healthcare system.

How to use the dataset

This dataset can be used to train an intelligent language model for medical dialogue in Chinese. A typical workflow has the following steps:

Pretraining - Use the pretraining data provided in the dataset to build a base language model. This helps the model learn the basic elements of medical dialogue and acquire general medical knowledge expressed in Chinese.

Finetuning - Use the finetune data and apply transfer learning techniques such as multi-task learning to further improve accuracy on specific tasks, such as answering medical questions.

Reward - Use the reward data, in which each question is paired with a chosen and a rejected response, to train the model to prefer better answers. According to the source, this feedback reflects the judgment of healthcare professionals or patients and is meant to guide the model in longer dialogues, interviews, or discussions.

Evaluation - After training on the pretraining, finetuning, and reward data, evaluate the trained model on unseen data using the reward_validation file provided with the dataset.
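
The reward and evaluation steps above can be sketched in a few lines. This is a minimal illustration, not the dataset authors' method: it assumes a Bradley-Terry style pairwise loss, and the scorer `length_score` and the rows in `toy_rows` are made-up placeholders standing in for a trained reward model and real rows from reward_validation.csv.

```python
import math
from typing import Callable, Iterable, Mapping

Scorer = Callable[[str, str], float]


def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_chosen - score_rejected)), computed stably."""
    margin = score_chosen - score_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # Equivalent form that avoids overflow when the margin is very negative.
    return -margin + math.log1p(math.exp(margin))


def preference_accuracy(rows: Iterable[Mapping[str, str]], score: Scorer) -> float:
    """Fraction of rows where the chosen response outscores the rejected one."""
    rows = list(rows)
    if not rows:
        return 0.0
    correct = sum(
        1
        for row in rows
        if score(row["question"], row["response_chosen"])
        > score(row["question"], row["response_rejected"])
    )
    return correct / len(rows)


def length_score(question: str, response: str) -> float:
    """Hypothetical placeholder scorer: longer responses score higher."""
    return float(len(response))


# Made-up rows using the column names of the reward files.
toy_rows = [
    {
        "question": "q1",
        "response_chosen": "a longer, detailed answer",
        "response_rejected": "short",
    },
    {
        "question": "q2",
        "response_chosen": "brief",
        "response_rejected": "a much longer but wrong answer",
    },
]

accuracy = preference_accuracy(toy_rows, length_score)
```

In practice `length_score` would be replaced by the model being evaluated, and `pairwise_loss` would be averaged over the training pairs; the accuracy on held-out pairs is one simple way to read the evaluation step.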

Research Ideas

Utilizing reinforcement learning with the reward data for training a dialogue model that rewards correct responses.

Employing few-shot learning methods to quickly adapt the pretraining data for new and unseen medical dialogues.

Exploring transfer learning techniques to apply knowledge learned from one medical domain to another.

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

Columns

File: reward_train.csv

Column name	Description
question	The question asked in the medical dialogue. (String)
response_chosen	The response labelled as the preferred (correct) answer. (String)
response_rejected	The response labelled as the rejected (incorrect) answer. (String)

File: reward_test.csv

Column name	Description
question	The question asked in the medical dialogue. (String)
response_chosen	The response labelled as the preferred (correct) answer. (String)
response_rejected	The response labelled as the rejected (incorrect) answer. (String)

File: reward_validation.csv

Column name	Description
question	The question asked in the medical dialogue. (String)
response_chosen	The response labelled as the preferred (correct) answer. (String)
response_rejected	The response labelled as the rejected (incorrect) answer. (String)
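
As a rough sketch of how the three reward files listed above might be read, the snippet below parses a CSV with the standard library and checks that the expected columns are present. The helper name `read_reward_rows` and the in-memory sample are illustrative assumptions; the real files may need a specific encoding (UTF-8 is assumed in the comment).

```python
import csv
import io
from typing import Dict, List, TextIO

EXPECTED_COLUMNS = ("question", "response_chosen", "response_rejected")


def read_reward_rows(handle: TextIO) -> List[Dict[str, str]]:
    """Read reward rows from an open CSV handle and validate the header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Keep only the documented columns, in a fixed order.
    return [{name: row[name] for name in EXPECTED_COLUMNS} for row in reader]


# Made-up in-memory sample so the sketch runs without the real files.
sample = io.StringIO(
    "question,response_chosen,response_rejected\n"
    "q1,good answer,bad answer\n"
)
rows = read_reward_rows(sample)

# With a downloaded file (path and UTF-8 encoding are assumptions):
# with open("reward_train.csv", newline="", encoding="utf-8") as handle:
#     rows = read_reward_rows(handle)
```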

Tables

Finetune Test

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_test

503.5 KB
1000 rows
3 columns


CREATE TABLE finetune_test (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);

Finetune Train

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_train

844.08 MB
2066589 rows
3 columns


CREATE TABLE finetune_train (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);

Finetune Validation

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_validation

512.83 KB
1000 rows
3 columns


CREATE TABLE finetune_validation (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);
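
One possible way to turn a finetune row into a prompt and completion pair is sketched below. How "instruction" and "input" are meant to be combined is not stated in the source; the template here (instruction, then a blank line, then the input when it is non-empty) is an assumption, and `build_example` is a hypothetical helper.

```python
from typing import Dict, Mapping, Optional


def build_example(row: Mapping[str, Optional[str]]) -> Dict[str, str]:
    """Combine instruction and optional input into a prompt; output is the target."""
    instruction = (row.get("instruction") or "").strip()
    context = (row.get("input") or "").strip()
    # Assumed template: append the input only when it carries content.
    prompt = instruction if not context else f"{instruction}\n\n{context}"
    return {"prompt": prompt, "completion": (row.get("output") or "").strip()}


# Made-up rows using the column names of the finetune tables.
with_input = build_example(
    {"instruction": "Answer the question.", "input": "What is a fever?", "output": " toy answer "}
)
without_input = build_example(
    {"instruction": "Answer the question.", "input": "", "output": "toy answer"}
)
```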

Reward Test

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_test

80.58 KB
100 rows
3 columns


CREATE TABLE reward_test (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);

Reward Train

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_train

1.78 MB
3800 rows
3 columns


CREATE TABLE reward_train (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);

Reward Validation

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_validation

55.15 KB
100 rows
3 columns


CREATE TABLE reward_validation (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);
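
The schemas above can be tried out locally with SQLite through Python's built-in sqlite3 module. The sketch below recreates the reward_validation table in memory, loads made-up rows, and counts pairs whose chosen and rejected responses are identical, which is one simple sanity check before training. The function name `count_degenerate_pairs` and the sample rows are assumptions for illustration.

```python
import sqlite3
from typing import Iterable, Tuple

SCHEMA = """
CREATE TABLE reward_validation (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);
"""


def count_degenerate_pairs(rows: Iterable[Tuple[str, str, str]]) -> Tuple[int, int]:
    """Return (total rows, rows where chosen and rejected responses are equal)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO reward_validation VALUES (?, ?, ?)", list(rows))
        total = conn.execute("SELECT COUNT(*) FROM reward_validation").fetchone()[0]
        degenerate = conn.execute(
            "SELECT COUNT(*) FROM reward_validation "
            "WHERE response_chosen = response_rejected"
        ).fetchone()[0]
        return total, degenerate
    finally:
        conn.close()


# Made-up rows; real rows would come from reward_validation.csv.
toy_rows = [("q1", "answer a", "answer b"), ("q2", "same", "same")]
total, degenerate = count_degenerate_pairs(toy_rows)
```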