Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature

AI-generated keywords: Lay Summarisation; Automatic Approaches; Datasets; Public Access; Scientific Literature

AI-generated Key Points

  • Lay summarisation simplifies complex texts for non-experts
  • Automatic approaches are crucial for broadening access to scientific literature
  • Current datasets for lay summarisation are limited in size and scope
  • Introduction of new lay summarisation datasets: PLOS (large-scale) and eLife (medium-scale)
  • Characterization of lay summaries in the datasets, noting differences in readability and abstractiveness
  • Benchmarking of datasets using mainstream summarisation approaches and manual evaluation with domain experts
  • Public availability of code and datasets provided by researchers
  • Discussion on previous attempts at automatically summarising scientific content for non-experts
  • Highlighting limitations in existing datasets and models, emphasizing the need for comprehensive resources like PLOS and eLife

Authors: Tomas Goldsack, Zhihao Zhang, Chenghua Lin, Carolina Scarton

16 pages, 9 figures. Accepted to EMNLP 2022
License: CC BY 4.0

Abstract: Lay summarisation aims to jointly summarise and simplify a given text, thus making its content more comprehensible to non-experts. Automatic approaches for lay summarisation can provide significant value in broadening access to scientific literature, enabling a greater degree of both interdisciplinary knowledge sharing and public understanding when it comes to research findings. However, current corpora for this task are limited in their size and scope, hindering the development of broadly applicable data-driven approaches. Aiming to rectify these issues, we present two novel lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale), each of which contains biomedical journal articles alongside expert-written lay summaries. We provide a thorough characterisation of our lay summaries, highlighting differing levels of readability and abstractiveness between datasets that can be leveraged to support the needs of different applications. Finally, we benchmark our datasets using mainstream summarisation approaches and perform a manual evaluation with domain experts, demonstrating their utility and casting light on the key challenges of this task.
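The abstract does not spell out how abstractiveness is measured. One common proxy is the fraction of summary n-grams that never appear in the source article: the higher the fraction, the more the summary rewords rather than copies. The sketch below assumes that proxy; the function names and the simple regex tokeniser are illustrative choices, not taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a deliberately simple tokeniser)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the set of distinct n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_ratio(article, summary, n=1):
    """Fraction of distinct summary n-grams that do not occur in the article.

    0.0 means every summary n-gram is copied from the article;
    1.0 means none of them are.
    """
    article_ngrams = ngrams(tokenize(article), n)
    summary_ngrams = ngrams(tokenize(summary), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to measure
    return len(summary_ngrams - article_ngrams) / len(summary_ngrams)


if __name__ == "__main__":
    # Toy example (hypothetical text, not drawn from PLOS or eLife).
    article = "The protein binds to the receptor and triggers signalling"
    summary = "The protein attaches to the receptor"
    for n in (1, 2):
        print(n, novel_ngram_ratio(article, summary, n))
```

Comparing this ratio across two corpora, for several values of n, is one way to see which set of lay summaries stays closer to the wording of its source articles.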

Submitted to arXiv on 18 Oct. 2022

Results of the summarizing process for the arXiv paper: 2210.09932v1

This study addresses lay summarisation: jointly summarising and simplifying a complex text so that non-experts can understand it. The researchers argue that automatic approaches can broaden access to scientific literature and support both interdisciplinary knowledge sharing and public understanding of research findings. Existing datasets for the task, however, are limited in size and scope, which hinders the development of broadly applicable data-driven approaches.

To address these limitations, the researchers introduce two new lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale), each containing biomedical journal articles paired with expert-written lay summaries. They characterise the lay summaries in detail and find that the two datasets differ in readability and abstractiveness, differences that can be matched to the needs of different applications.

The researchers then benchmark both datasets using mainstream summarisation approaches and conduct a manual evaluation with domain experts. The results demonstrate the utility of the datasets and highlight key challenges of the task. The code and datasets are publicly available.

In the related work, the researchers discuss previous attempts to automatically summarise scientific content for non-experts, including the LaySumm subtask of the CL-SciSumm 2020 shared task series and efforts that draw on sources such as The Cochrane Database of Systematic Reviews and science news websites. They point to limitations in existing datasets and models, which motivate more comprehensive resources such as PLOS and eLife.

Overall, the study contributes two new datasets, a benchmark of existing summarisation approaches, and an analysis of the key challenges in making scientific literature accessible to a wider audience.
Created on 19 Mar. 2024
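
The summary above does not say which readability metric the researchers use. As an illustration of how readability can be quantified, the sketch below computes the Flesch-Kincaid Grade Level (FKGL), a widely used formula based on average sentence length and average syllables per word. The syllable counter is a crude vowel-group heuristic assumed here for brevity, so the scores are approximate, and the function names are illustrative.

```python
import re


def count_syllables(word):
    """Approximate syllables as the number of vowel groups (crude heuristic, at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fkgl(text):
    """Flesch-Kincaid Grade Level: higher scores indicate harder text.

    FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # no measurable text
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    # Hypothetical sentences, not drawn from the datasets.
    plain = "The cat sat on the mat."
    technical = "Mitochondrial dysfunction exacerbates neurodegenerative pathophysiology."
    print(fkgl(plain), fkgl(technical))
```

Averaging such a score over all lay summaries in a corpus gives a single number that can be compared between datasets, or between lay summaries and the abstracts of the same articles.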
