Baselight

Chinese Medical Dialogue

Deep Learning for Intelligent Healthcare

@kaggle.thedevastator_chinese_medical_dialogue_model

By Huggingface Hub [source]


About this dataset

This dataset is designed for training a deep learning language model for intelligent healthcare on Chinese medical dialogue. It includes pretraining, finetuning and reward data, which together let a model learn to produce more accurate answers in a medical context. The reward data consists of questions paired with a chosen and a rejected response, so the model sees both a preferred and a dispreferred answer to the same question. This makes the dataset a useful resource for businesses, researchers or individuals developing their own intelligent healthcare system.

How to use the dataset

This dataset can be used to train a language model for medical dialogue in Chinese. A typical workflow has the following steps:

  • Pretraining - Use the pretraining data provided in the dataset to build a base language model. This helps the model learn the basic elements of medical dialogue and acquire general medical knowledge in Chinese.

  • Finetuning - Use the finetune data (instruction, input and output columns) together with transfer learning techniques such as multi-task learning to further improve accuracy on specific tasks, such as answering medical questions.

  • Reward - Use the reward data, which pairs each question with a chosen and a rejected response, to guide the model toward correct answers. The preferences are intended to reflect feedback from patients or experienced healthcare professionals, including in long dialogues, interviews or discussions.

  • Evaluation - After training on the pretraining, finetuning and reward data, evaluate the trained model on unseen data. The reward_validation file provided with the dataset can be used for this.
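The evaluation step can be sketched as a pairwise accuracy check: for each row, a scoring function should rate response_chosen above response_rejected. This is only a sketch under stated assumptions. The `length_score` function is a hypothetical placeholder for a trained reward model, and the sample rows are made up for illustration, not taken from reward_validation.csv.

```python
import csv
import io
from typing import Callable, Iterable


def pairwise_accuracy(rows: Iterable[dict], score: Callable[[str, str], float]) -> float:
    """Fraction of rows where the chosen response outscores the rejected one."""
    total = 0
    correct = 0
    for row in rows:
        chosen = score(row["question"], row["response_chosen"])
        rejected = score(row["question"], row["response_rejected"])
        correct += chosen > rejected
        total += 1
    return correct / total if total else 0.0


def length_score(question: str, response: str) -> float:
    # Hypothetical placeholder scorer: longer responses score higher.
    # A real setup would call a trained reward model here instead.
    return float(len(response))


# Made-up rows standing in for reward_validation.csv (same column names).
SAMPLE_CSV = (
    "question,response_chosen,response_rejected\n"
    "q1,a detailed answer,short\n"
    "q2,ok,a much longer but rejected answer\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
accuracy = pairwise_accuracy(rows, length_score)
print(accuracy)
```

To run this against the real file, the `io.StringIO` sample could be swapped for `open("reward_validation.csv", newline="", encoding="utf-8")`, assuming the file sits in the working directory.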

Research Ideas

  • Utilizing reinforcement learning with the reward data for training a dialogue model that rewards correct responses.
  • Employing few-shot learning methods to quickly adapt the pretraining data for new and unseen medical dialogues.
  • Exploring transfer learning techniques to apply knowledge learned from one medical domain to another.
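For the first idea, one common way to turn chosen/rejected pairs into a training signal is a pairwise ranking loss, -log(sigmoid(s_chosen - s_rejected)), where the two scores come from a reward model. The dataset description does not prescribe this loss. It is shown here as an assumption about how the reward data might be used, and `pairwise_reward_loss` is a hypothetical helper name.

```python
import math


def pairwise_reward_loss(score_chosen: float, score_rejected: float) -> float:
    """Return -log(sigmoid(score_chosen - score_rejected)) in a stable form."""
    margin = score_chosen - score_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() is never called with a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss shrinks as the chosen response is scored further above the rejected one.
for chosen, rejected in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]:
    print(chosen, rejected, pairwise_reward_loss(chosen, rejected))
```

The loss is small when the chosen response is scored well above the rejected one and grows roughly linearly when the ordering is reversed.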

Acknowledgements

If you use this dataset in your research, please credit the original authors.
Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: reward_train.csv

Column name         Description
question            The question asked in the medical dialogue. (String)
response_chosen     The response chosen by the model as the correct answer. (String)
response_rejected   The response rejected by the model as the incorrect answer. (String)
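A minimal sketch of reading a reward file with these three columns, using only the Python standard library. The sample rows are made-up placeholders rather than real dataset content, and `read_reward_rows` is a hypothetical helper name. Skipping rows with an empty question is an assumption about sensible cleaning, not something the dataset requires.

```python
import csv
import io

EXPECTED_COLUMNS = ("question", "response_chosen", "response_rejected")


def read_reward_rows(handle) -> list:
    """Read reward rows from an open CSV handle and check the header."""
    reader = csv.DictReader(handle)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Keep only rows that actually carry a question.
    return [row for row in reader if row["question"].strip()]


# Made-up placeholder rows with the same header as reward_train.csv.
SAMPLE_CSV = (
    "question,response_chosen,response_rejected\n"
    "placeholder question 1,placeholder chosen 1,placeholder rejected 1\n"
    ",orphan chosen,orphan rejected\n"
    "placeholder question 2,placeholder chosen 2,placeholder rejected 2\n"
)

reward_rows = read_reward_rows(io.StringIO(SAMPLE_CSV))
print(len(reward_rows))
```

For the real file, the handle would come from `open("reward_train.csv", newline="", encoding="utf-8")`, assuming the CSV is UTF-8 encoded and stored locally.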

File: reward_test.csv

Column name         Description
question            The question asked in the medical dialogue. (String)
response_chosen     The response chosen by the model as the correct answer. (String)
response_rejected   The response rejected by the model as the incorrect answer. (String)

File: reward_validation.csv

Column name         Description
question            The question asked in the medical dialogue. (String)
response_chosen     The response chosen by the model as the correct answer. (String)
response_rejected   The response rejected by the model as the incorrect answer. (String)

Acknowledgements

If you use this dataset in your research, please credit Huggingface Hub.

Tables

Finetune Test

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_test
  • 503.5 KB
  • 1000 rows
  • 3 columns

CREATE TABLE finetune_test (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);
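Rows with this schema are usually turned into prompt/completion pairs before finetuning. The sketch below shows one possible way to do that. The template (instruction, then the optional input separated by a blank line) is an assumption, as are the `build_example` name and the placeholder record. The dataset itself does not specify a prompt format.

```python
def build_example(record: dict) -> dict:
    """Turn one finetune row into a prompt/completion pair."""
    instruction = record["instruction"].strip()
    # The input column may be empty or missing; treat both the same way.
    extra = (record.get("input") or "").strip()
    prompt = instruction if not extra else f"{instruction}\n\n{extra}"
    return {"prompt": prompt, "completion": record["output"].strip()}


# Made-up placeholder record with the finetune_test column names.
sample = {
    "instruction": "placeholder instruction",
    "input": "",
    "output": " placeholder answer ",
}
example = build_example(sample)
print(example)
```

Keeping the formatting in one small function makes it easy to swap in whatever template the target model expects.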

Finetune Train

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_train
  • 844.08 MB
  • 2066589 rows
  • 3 columns

CREATE TABLE finetune_train (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);

Finetune Validation

@kaggle.thedevastator_chinese_medical_dialogue_model.finetune_validation
  • 512.83 KB
  • 1000 rows
  • 3 columns

CREATE TABLE finetune_validation (
  "instruction" VARCHAR,
  "input" VARCHAR,
  "output" VARCHAR
);

Reward Test

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_test
  • 80.58 KB
  • 100 rows
  • 3 columns

CREATE TABLE reward_test (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);

Reward Train

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_train
  • 1.78 MB
  • 3800 rows
  • 3 columns

CREATE TABLE reward_train (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);

Reward Validation

@kaggle.thedevastator_chinese_medical_dialogue_model.reward_validation
  • 55.15 KB
  • 100 rows
  • 3 columns

CREATE TABLE reward_validation (
  "question" VARCHAR,
  "response_chosen" VARCHAR,
  "response_rejected" VARCHAR
);
