Text Summarization Using Large Language Models: A Comparative Study of MPT-7b-instruct, Falcon-7b-instruct, and OpenAI Chat-GPT Models

AI-generated keywords: Text summarization, Large Language Models, NLP applications, Generative AI solutions, LLM performance

AI-generated Key Points

  • The paper explores text summarization with Large Language Models (LLMs), focusing on their capabilities and limitations.
  • The study evaluates several LLMs under different hyperparameter settings, scoring the generated summaries with BLEU, ROUGE, and BERTScore.
  • Text summarization methods are categorized into abstractive (rephrasing content) and extractive (selecting important sentences/phrases) approaches.
  • Supervised summarization relies on labeled training data, while unsupervised summarization extracts information based on factors like sentence importance and coherence.
  • In experiments on the CNN Daily Mail and XSum datasets, OpenAI ChatGPT text-davinci-003 outperformed MPT-7b-instruct and Falcon-7b-instruct.
  • The research provides valuable insights for leveraging LLMs in NLP applications and lays the groundwork for advanced Generative AI solutions to address diverse business challenges.

Authors: Lochan Basyal, Mihir Sanghvi

4 pages, 2 tables
License: CC BY 4.0

Abstract: Text summarization is a critical Natural Language Processing (NLP) task with applications ranging from information retrieval to content generation. Leveraging Large Language Models (LLMs) has shown remarkable promise in enhancing summarization techniques. This paper embarks on an exploration of text summarization with a diverse set of LLMs, including MPT-7b-instruct, falcon-7b-instruct, and OpenAI ChatGPT text-davinci-003 models. The experiment was performed with different hyperparameters and evaluated the generated summaries using widely accepted metrics such as the Bilingual Evaluation Understudy (BLEU) Score, Recall-Oriented Understudy for Gisting Evaluation (ROUGE) Score, and Bidirectional Encoder Representations from Transformers (BERT) Score. According to the experiment, text-davinci-003 outperformed the others. This investigation involved two distinct datasets: CNN Daily Mail and XSum. Its primary objective was to provide a comprehensive understanding of the performance of Large Language Models (LLMs) when applied to different datasets. The assessment of these models' effectiveness contributes valuable insights to researchers and practitioners within the NLP domain. This work serves as a resource for those interested in harnessing the potential of LLMs for text summarization and lays the foundation for the development of advanced Generative AI applications aimed at addressing a wide spectrum of business challenges.
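To make the overlap-based metrics named in the abstract more concrete, here is a minimal sketch of ROUGE-1 (unigram overlap between a generated summary and a reference). It is a simplified illustration, not the evaluation code used in the paper: the whitespace tokenizer and the function names `tokenize` and `rouge_1` are assumptions introduced here, and established implementations typically add stemming and more careful tokenization.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen for simplicity.
    return text.lower().split()


def rouge_1(candidate, reference):
    """Return unigram-overlap precision, recall, and F1 for one summary pair."""
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))

    # Counter intersection keeps the minimum count per token, so a word
    # repeated in the candidate is credited at most as often as it
    # appears in the reference (clipped counts).
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

Recall is the share of reference unigrams recovered by the summary, which is why ROUGE is described as recall-oriented; the precision term is closely related to the clipped unigram precision that BLEU builds on.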

Submitted to arXiv on 16 Oct. 2023

Results of the summarizing process for the arXiv paper: 2310.10449v1

This paper explores text summarization with Large Language Models (LLMs), examining their capabilities and limitations. The study investigates several LLMs and experiments with different hyperparameters, evaluating the quality of the generated summaries with established metrics: BLEU, ROUGE, and BERTScore. Its primary aim is to offer insights for applying LLMs in NLP applications and to lay the groundwork for advanced Generative AI solutions that address diverse business challenges. The paper covers text summarization methods, supervised and unsupervised techniques, datasets, evaluation metrics, inference with different LLMs, and suggestions for future enhancements.

Text summarization methods are categorized into abstractive and extractive approaches. Abstractive summarization generates concise summaries by understanding context and rephrasing content, typically with advanced language models such as LLMs. Extractive summarization, by contrast, selects important sentences or phrases directly from the source text without rephrasing.

Supervised summarization relies on labeled training data in which human annotators provide summaries for source texts; machine learning models are trained on this data to learn mappings between texts and summaries. Unsupervised summarization does not require labeled data; it extracts relevant information based on factors like sentence importance and coherence.

The paper also discusses the performance of MPT-7b-instruct, Falcon-7b-instruct, and OpenAI ChatGPT text-davinci-003 in summarization experiments on the CNN Daily Mail and XSum datasets. The results show that text-davinci-003 outperformed the others on BLEU, ROUGE, and BERTScore.
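The unsupervised, extractive approach described above (selecting sentences by importance, without rephrasing) can be illustrated with a small word-frequency sketch. This is a classic baseline shown for illustration only, not a method taken from the paper; the stopword list, the regex-based sentence splitter, and the names `split_sentences`, `content_words`, and `extractive_summary` are all assumptions introduced here.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list, kept short for readability.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "is", "are",
    "was", "were", "it", "that", "this", "for", "with", "as", "by",
}


def split_sentences(text):
    # Naive split on whitespace that follows ., ! or ?
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def content_words(text):
    # Lowercased alphanumeric tokens with stopwords removed.
    return [w for w in re.findall(r"[a-z0-9']+", text.lower())
            if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Pick the highest-scoring sentences and return them in source order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average document-level frequency of the sentence's content words.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])  # restore original ordering
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    doc = ("Solar power is growing fast. Solar power costs keep falling. "
           "My cat likes naps.")
    print(extractive_summary(doc, num_sentences=2))
```

Because the output consists only of sentences copied from the source, it cannot paraphrase or merge ideas, which is the main contrast with the abstractive summaries that the LLMs in the study generate.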
Overall, this research serves as a valuable resource for researchers and practitioners in NLP by providing insights into the effectiveness of LLMs in text summarization across various datasets. It sets a foundation for developing advanced Generative AI applications that can tackle a wide range of business challenges effectively.
Created on 03 Feb. 2025
