Application of Large Language Models in Automated Question Generation: A Case Study on ChatGLM's Structured Questions for National Teacher Certification Exams

AI-generated keywords: Large Language Models, ChatGLM, Automated Question Generation, National Teacher Certification Exams (NTCE), Educational Assessment

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Study by Ling He, Yanxin Chen, and Xiaoqiang Hu explores application potential of large language models (LLMs) like ChatGLM in automated question generation for National Teacher Certification Exams (NTCE)
  • Simulated questions generated by ChatGLM were compared with questions recalled by past examinees, and education experts evaluated both sets
  • Results show high rationality, scientificity, and practicality of ChatGLM-generated questions similar to real exam questions
  • Model demonstrates accuracy and reliability in question generation, but the study identifies limitations in how it accounts for different rating criteria
  • Research validates ChatGLM's potential in educational assessment and supports development of more efficient automated generation systems
  • Findings contribute to advancing field of automated question generation and improving educational assessment processes

Authors: Ling He, Yanxin Chen, Xiaoqiang Hu

Abstract: This study delves into the application potential of the large language model (LLM) ChatGLM in the automatic generation of structured questions for National Teacher Certification Exams (NTCE). Through meticulously designed prompt engineering, we guided ChatGLM to generate a series of simulated questions and conducted a comprehensive comparison with questions recalled by past examinees. To ensure the objectivity and professionalism of the evaluation, we invited experts in the field of education to assess these questions and their scoring criteria. The research results indicate that the questions generated by ChatGLM exhibit a high level of rationality, scientificity, and practicality similar to those of the real exam questions across most evaluation criteria, demonstrating the model's accuracy and reliability in question generation. Nevertheless, the study also reveals limitations in the model's consideration of various rating criteria when generating questions, suggesting the need for further optimization and adjustment. This research not only validates the application potential of ChatGLM in the field of educational assessment but also provides crucial empirical support for the development of more efficient and intelligent educational automated generation systems in the future.

Submitted to arXiv on 19 Aug. 2024

Results of the summarizing process for the arXiv paper: 2408.09982v2

The study by Ling He, Yanxin Chen, and Xiaoqiang Hu explores the application potential of the large language model (LLM) ChatGLM in automated question generation for National Teacher Certification Exams (NTCE). Through careful prompt design, the authors guided ChatGLM to generate a series of simulated questions, which were then compared comprehensively with questions recalled by past examinees. To ensure objectivity, experts in education evaluated these questions and their scoring criteria. The results indicate that the questions generated by ChatGLM exhibit levels of rationality, scientificity, and practicality similar to those of real exam questions across most evaluation criteria, which the authors take as evidence of the model's accuracy and reliability in question generation. However, the study also identified limitations in how the model accounts for different rating criteria during question generation, indicating the need for further optimization. Overall, this research validates ChatGLM's potential in educational assessment and provides empirical support for developing more efficient and intelligent automated generation systems for education. These findings contribute to advancing the field of automated question generation and improving educational assessment processes.
Created on 20 Oct. 2024
