PMC-LLaMA: Further Finetuning LLaMA on Medical Papers
AI-generated Key Points
- Large Language Models (LLMs) have shown remarkable capabilities in natural language understanding across various domains
- In areas that require precision, such as medical applications, these models often exhibit unsatisfactory performance due to a lack of domain-specific knowledge
- PMC-LLaMA is an open-source language model that is fine-tuned on 4.8 million biomedical academic papers to inject medical knowledge and enhance its capability in the medical domain
- The authors conducted a preliminary investigation by fine-tuning the existing LLaMA model with the aforementioned dataset and demonstrated that PMC-LLaMA is more suitable for medical tasks compared to LLaMA
- The evaluation was conducted on three biomedical QA datasets: PubMedQA, MedMCQA, and USMLE, showing better understanding of biomedical domain-specific concepts and achieving high performance on QA benchmarks after fine-tuning
- The authors outline their fine-tuning procedure on the S2ORC dataset, with these training details: a maximum context length of 512, a batch size of 128, the AdamW optimizer with a learning rate of 2e-5, the Fully Sharded Data Parallel (FSDP) acceleration strategy, and the bf16 (Brain Floating Point) data format.
- The model is trained for five epochs on eight A100 GPUs in around seven days; in each epoch, 512 continuous tokens are randomly sampled from each paper for training.
- The authors also provide a detailed description of the evaluation benchmark, which includes three QA datasets: PubMedQA, MedMCQA, and USMLE.
- Overall, PMC-LLaMA offers an open-source language model that enhances LLaMA's capability in the medical domain by injecting domain-specific knowledge through fine-tuning on biomedical academic papers.
- The model and codes are publicly available along with an online demo for further exploration.
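The per-epoch sampling described in the key points (512 continuous tokens drawn at random from each paper) can be sketched as follows. This is a minimal illustration, not the authors' code: `TrainConfig`, `sample_window`, and `steps_per_epoch` are hypothetical names, and the step count assumes that each paper contributes exactly one window per epoch and that 128 is the global batch size, neither of which the summary states explicitly.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values taken from the key points above; the class itself is hypothetical.
    max_context_length: int = 512
    batch_size: int = 128
    learning_rate: float = 2e-5
    epochs: int = 5
    num_papers: int = 4_800_000


def sample_window(token_ids, window, rng):
    """Return `window` contiguous tokens starting at a random offset.

    Papers shorter than `window` are returned whole; padding is left
    to the caller (an assumption, the summary does not cover this case).
    """
    if len(token_ids) <= window:
        return list(token_ids)
    start = rng.randrange(len(token_ids) - window + 1)
    return list(token_ids[start:start + window])


def steps_per_epoch(cfg):
    # Assumes one window per paper per epoch and a global batch of
    # cfg.batch_size sequences (both are assumptions).
    return -(-cfg.num_papers // cfg.batch_size)  # ceiling division


if __name__ == "__main__":
    cfg = TrainConfig()
    rng = random.Random(0)
    paper = list(range(2000))  # stand-in for a tokenized paper
    window = sample_window(paper, cfg.max_context_length, rng)
    print(len(window), steps_per_epoch(cfg), steps_per_epoch(cfg) * cfg.epochs)
```

Under those assumptions, 4.8 million papers at a batch size of 128 give 37,500 optimizer steps per epoch and 187,500 steps over five epochs. Because a fresh offset is drawn each epoch, the model sees a different slice of a long paper on each pass.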
Authors: Chaoyi Wu, Xiaoman Zhang, Ya Zhang, Yanfeng Wang, Weidi Xie
Abstract: Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding in various domains. These models usually behave well in daily dialog or question answering scenarios; however, in areas that value precision, for example, in medical applications, they often exhibit unsatisfactory performance due to a lack of domain-specific knowledge. In this report, we introduce PMC-LLaMA, an open-source language model that is acquired by fine-tuning an open-source language model on a total of 4.8 million biomedical academic papers for further injecting medical knowledge, enhancing its capability in the medical domain. Our preliminary evaluations are conducted on three biomedical QA datasets, including PubMedQA, MedMCQA, and USMLE, showing that our model after finetuning, i.e., PMC-LLaMA, demonstrates better understanding of biomedical domain-specific concepts, thus achieving high performance on QA benchmarks. The model and codes, along with an online demo, are publicly available.