Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence

AI-generated keywords: Large Language Models (LLMs)

AI-generated Key Points

  • A better understanding of the legal analysis abilities of Large Language Models (LLMs) can help improve legal services, inform AI governance, and identify inconsistencies in tax law.
  • Tax law was chosen as the focus for experiments due to its relevance and the ability to set up automated validation pipelines.
  • LLMs have emerging legal understanding capabilities, with improved performance in each subsequent model release by OpenAI.
  • Providing additional legal context and using few-shot prompting techniques enhances the performance of LLMs.
  • LLMs can perform at high levels of accuracy when provided with correct legal texts but are not yet at expert tax lawyer levels.
  • Advancements in LLMs could have significant implications for the legal profession and AI governance.
  • Other studies highlight various applications of LLMs, such as generating code, improving models with external knowledge, and answering open-domain questions.
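
The automated validation mentioned in the key points can be illustrated with a minimal sketch. It assumes each generated tax question comes with a single known correct answer label, so a model's reply can be graded without human review. The `Example` class and the `normalize` and `accuracy` helpers are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """A generated question with its known correct answer (assumed format)."""
    question: str
    correct: str  # ground-truth label, e.g. "B"


def normalize(answer: str) -> str:
    """Strip whitespace, upper-case, and drop trailing periods before comparing."""
    return answer.strip().upper().rstrip(".")


def accuracy(examples: list[Example], predictions: list[str]) -> float:
    """Return the fraction of predictions that match the ground-truth labels."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        return 0.0
    hits = sum(
        normalize(pred) == normalize(ex.correct)
        for ex, pred in zip(examples, predictions)
    )
    return hits / len(examples)
```

Because grading reduces to a string comparison, the same function could score thousands of examples per model release, which is what makes comparing successive models practical.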

Authors: John J. Nay, David Karamardian, Sarah B. Lawsky, Wenting Tao, Meghana Bhat, Raghav Jain, Aaron Travis Lee, Jonathan H. Choi, Jungo Kasai

License: CC BY 4.0

Abstract: Better understanding of Large Language Models' (LLMs) legal analysis abilities can contribute to improving the efficiency of legal services, governing artificial intelligence, and leveraging LLMs to identify inconsistencies in law. This paper explores LLM capabilities in applying tax law. We choose this area of law because it has a structure that allows us to set up automated validation pipelines across thousands of examples, requires logical reasoning and maths skills, and enables us to test LLM capabilities in a manner relevant to real-world economic lives of citizens and companies. Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release. We experiment with retrieving and utilising the relevant legal authority to assess the impact of providing additional legal context to LLMs. Few-shot prompting, presenting examples of question-answer pairs, is also found to significantly enhance the performance of the most advanced model, GPT-4. The findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels. As LLMs continue to advance, their ability to reason about law autonomously could have significant implications for the legal profession and AI governance.
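
The two prompting enhancements described in the abstract, supplying the relevant legal text and adding few-shot question-answer pairs, can be sketched as simple prompt assembly. This is an illustrative sketch only: the `build_prompt` function, its parameters, and the prompt layout are assumptions, not the authors' actual templates.

```python
from typing import Iterable, Optional, Tuple


def build_prompt(
    question: str,
    shots: Iterable[Tuple[str, str]] = (),
    legal_context: Optional[str] = None,
) -> str:
    """Assemble a prompt from optional legal text, few-shot pairs, and the question."""
    parts = []
    if legal_context:
        # Retrieved legal authority goes first so the examples can rely on it.
        parts.append(f"Relevant legal text:\n{legal_context}")
    for shot_question, shot_answer in shots:
        # Each few-shot example is a solved question-answer pair.
        parts.append(f"Question: {shot_question}\nAnswer: {shot_answer}")
    # The target question ends with an open "Answer:" for the model to complete.
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)
```

Calling `build_prompt(question)` with no other arguments gives the plain baseline prompt, so the effect of each enhancement can be measured by toggling `shots` and `legal_context` independently.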

Submitted to arXiv on 12 Jun. 2023

Results of the summarizing process for the arXiv paper: 2306.07075v1

This paper explores the capabilities of Large Language Models (LLMs) in applying tax law, and how a better understanding of those capabilities can contribute to improving legal services, governing AI, and identifying inconsistencies in the law. The authors chose tax law as their focus because it allows them to set up automated validation pipelines, requires logical reasoning and math skills, and is relevant to the real-world economic lives of citizens and companies. Through experiments, they demonstrate that LLMs have emerging legal understanding capabilities, with improved performance in each subsequent model release by OpenAI. They also find that providing additional legal context to LLMs enhances their performance, particularly when combined with few-shot prompting techniques. However, while LLMs can perform at high levels of accuracy when provided with the correct legal texts, they are not yet at expert tax lawyer levels. As LLMs continue to improve their ability to reason about law autonomously, their advancement could have significant implications for the legal profession and AI governance.

Several related studies have been conducted on language models. These include research on discovering distributional differences through language descriptions, improving science question answering through supervised reasoning processes, active prompting with chain-of-thought for large language models, scalable prompt generation for semi-supervised learning with language models, using language models for computer tasks, measuring and narrowing the compositionality gap in language models, synergizing reasoning and acting in language models (ReAct), exploring language models as accounts of human moral judgment, and bounding the capabilities of large language models in open text generation with prompt constraints.

Other studies focus on augmented language models, either through surveys or through specific applications such as generating code by retrieving documentation (DocPrompting), improving large language models with external knowledge and automated feedback, learning to play Atari games with the help of instruction manuals (Read and Reap the Rewards), and open-domain question answering techniques. There are also studies on recitation-augmented language models, composing retrieval and language models for knowledge-intensive NLP (Demonstrate-Search-Predict), compositional exemplars for in-context learning, and in-context retrieval-augmented language models. Overall, these studies highlight the potential of LLMs in various applications and demonstrate a need to further explore their capabilities and limitations. The advances made so far show promise for the legal profession and AI governance, but further progress is needed before LLMs can match the expertise of human professionals.
Created on 01 Jul. 2023
