Transitioning Tax Law into Computational Systems:

From AI-Generated Inputs to Deterministic Decisions using Knowledge Graphs

Borchuluun Yadamsuren, PhD, Assistant Director of Analytics, Quinlan School of Business, Loyola University Chicago

Preparing income taxes in the United States is becoming an increasingly complex process. As rules expand and interact in intricate ways, manual compliance is becoming more difficult for individuals and organizations. As a result, taxpayers increasingly rely on automated tools to manage this complexity. However, ensuring that these systems produce consistent and verifiable decisions remains a significant challenge. In this article, we demonstrate that combining generative AI with knowledge graph-based reasoning offers a more reliable and transparent approach for handling complex, rule-based decisions in tax law. In our evaluation on 77 scenarios involving Internal Revenue Code (IRC) §121, this approach produced consistent, transparent, and deterministic outcomes.

According to the Internal Revenue Service (IRS), more than 165 million individual income tax returns were processed in 2025, with about 94% submitted electronically, reflecting the widespread use of tax preparation tools and the shift toward digital filing [1]. Tax preparation has grown into a multi-billion-dollar industry supported by both professional services and software platforms.

However, this widespread reliance on automated tools also carries risks that are often overlooked. Most tax software, whether commercial or open-source, is delivered on an “as-is” basis. These systems may not undergo rigorous validation or formal verification. While these tools are generally effective for simple tax cases, prior research shows that errors in logic, implementation, or updates can remain hidden until they produce incorrect outcomes, particularly in complex scenarios [2].

This creates a fundamental mismatch: decisions are increasingly automated, but accountability remains human. Most users cannot inspect or verify how these systems arrive at their conclusions because the logic of legal reasoning is hidden, and step-by-step checks are limited. Taxpayers remain responsible for errors, even if those errors originate from software. Automated tax preparation therefore needs to be not only efficient but also accurate and transparent.

According to the 2025 State of Tax Professionals report by Thomson Reuters, improving efficiency through technology remains a top priority for tax and accounting firms, as they navigate increasing complexity, workload pressure, and regulatory change [3].

The report also highlights growing interest in generative AI across the tax profession, with firms exploring its potential to improve efficiency, automate routine tasks, and support research. At the same time, it points to ongoing concerns around accuracy, reliability, and the need for human oversight when applying these tools in complex, rule-based contexts.

Suppose, for example, that a taxpayer asks a generative AI tool whether the gain from selling a home can be excluded under Internal Revenue Code (IRC) Section 121. The system may respond with a clear and confident answer. However, when the same question is asked again with slightly different wording or details, the response may change. The difference is not dramatic, but it is enough to raise doubt. One answer suggests full eligibility, another points to only a partial exclusion, and a third introduces an exception that had not been considered before.

This behavior reflects how generative AI works. These systems generate responses based on probabilistic patterns rather than strict rule execution. As a result, the same input can produce different outputs, reasoning paths can vary, and consistency is not guaranteed. This stochastic behavior of LLMs is risky for rule-based decisions, such as those in the legal field. Small inconsistencies can change eligibility, outcomes, and financial consequences.

Our work combining LLMs with knowledge graphs addresses this problem. A knowledge graph is a graph-based representation of real-world information, where entities are represented as nodes and their relationships as connections between them. Beyond simply storing data, it provides a structured way to accumulate, organize, and derive knowledge by combining factual statements with rules and relationships [4]. It includes entities such as people, places, or concepts, along with relationships that define how they are connected. It can also incorporate attributes that describe these entities and an ontology or schema that provides structure and ensures consistency. In practice, knowledge graphs allow organizations to combine structured, semi-structured, and unstructured data into a unified representation that supports reasoning, search, and analytics.
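To make the node-and-relationship idea concrete, here is a minimal sketch of a knowledge graph held as subject-predicate-object triples in plain Python. It is an illustration only: the `ex:` names and the `match` helper are hypothetical and are not the vocabulary or tooling used in our system, which stores facts in an RDF graph.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts; the "ex:" names are placeholders for illustration.
GRAPH: set[Triple] = {
    ("ex:alice", "rdf:type", "ex:Taxpayer"),
    ("ex:home1", "rdf:type", "ex:Property"),
    ("ex:alice", "ex:owns", "ex:home1"),
    ("ex:alice", "ex:filingStatus", "single"),
    ("ex:home1", "ex:usedAsPrincipalResidenceBy", "ex:alice"),
}


def match(
    graph: set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in sorted(graph):  # sorted so the result order is stable
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# Which things does ex:alice own?
owned = [obj for _, _, obj in match(GRAPH, s="ex:alice", p="ex:owns")]
print(owned)
```

Because every fact is an explicit triple, a question about the taxpayer becomes a pattern lookup rather than a free-text interpretation.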

This approach is particularly useful for rule-based domains such as tax law, where decisions depend on the consistent application of complex rules. In our work, initially focusing on IRC §121, which governs the exclusion of gain from the sale of a primary residence, we first convert a taxpayer’s situation into structured data by extracting key facts such as ownership, residency, and timing. These facts are then organized into a knowledge graph, where entities and their relationships are explicitly represented. Figure 1 illustrates how these structured elements, such as the taxpayer, property, ownership period, and sale, are connected and used to evaluate eligibility conditions and determine the allowable exclusion amount.

Figure 1. Knowledge graph representation of IRC §121 home sale exclusion, illustrating key entities and relationships used to evaluate eligibility and determine the allowable exclusion amount (created by the author with AI assistance).

The system separates the automation process into two layers. In the first layer, a large language model reduces manual coding by translating real-world taxpayer scenarios into structured data and generating SPARQL queries to check eligibility conditions and apply rule logic. This means capturing the relevant facts, such as ownership, residency, sale timing, and filing status, from natural-language user input, and organizing the key inputs required for evaluating eligibility into Turtle (TTL) files.
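The hand-off from the first layer can be sketched as follows. This is an assumption-laden illustration, not our pipeline: it supposes the LLM returns a small JSON object, and the field names, the `ex:` namespace, and the helpers `parse_llm_output` and `to_turtle` are all hypothetical. The point it shows is that model output is validated and then serialized by ordinary code, so the same facts always yield the same Turtle text.

```python
import json
from dataclasses import dataclass
from datetime import date

ALLOWED_STATUS = {"single", "married_filing_jointly"}  # assumed value set


@dataclass(frozen=True)
class SaleFacts:
    taxpayer_id: str
    filing_status: str
    ownership_start: date
    sale_date: date
    gain: int  # whole dollars


def parse_llm_output(raw: str) -> SaleFacts:
    """Validate the (assumed) JSON emitted by the LLM before it enters the graph."""
    data = json.loads(raw)
    facts = SaleFacts(
        taxpayer_id=str(data["taxpayer_id"]),
        filing_status=str(data["filing_status"]),
        ownership_start=date.fromisoformat(data["ownership_start"]),
        sale_date=date.fromisoformat(data["sale_date"]),
        gain=int(data["gain"]),
    )
    if not (facts.taxpayer_id.isascii() and facts.taxpayer_id.isidentifier()):
        raise ValueError("taxpayer_id must be a simple ASCII identifier")
    if facts.filing_status not in ALLOWED_STATUS:
        raise ValueError(f"unknown filing status: {facts.filing_status}")
    if facts.sale_date < facts.ownership_start:
        raise ValueError("sale_date precedes ownership_start")
    return facts


def to_turtle(f: SaleFacts) -> str:
    """Serialize facts to Turtle text; identical input yields identical output."""
    return "\n".join([
        "@prefix ex: <http://example.org/tax#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "",
        f"ex:{f.taxpayer_id} a ex:Taxpayer ;",
        f'    ex:filingStatus "{f.filing_status}" ;',
        f'    ex:ownershipStart "{f.ownership_start.isoformat()}"^^xsd:date ;',
        f'    ex:saleDate "{f.sale_date.isoformat()}"^^xsd:date ;',
        f'    ex:gain "{f.gain}"^^xsd:integer .',
    ])


RAW = (
    '{"taxpayer_id": "tp1", "filing_status": "single", '
    '"ownership_start": "2019-06-01", "sale_date": "2024-06-15", "gain": 180000}'
)
print(to_turtle(parse_llm_output(RAW)))
```

Rejecting malformed or out-of-range values at this boundary keeps probabilistic output from reaching the rule layer unchecked.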

In the second layer, the system determines whether a taxpayer qualifies for the exclusion and calculates the allowable amount by applying rule-based logic and structured queries to the knowledge graph. While the rules appear simple, applying them correctly requires evaluating multiple interdependent conditions, including ownership duration, residency, prior use, and potential exceptions. The system evaluates these conditions step by step, making each stage of the process explicit, traceable, and repeatable.
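As a simplified sketch of the kind of logic the second layer applies, the function below checks the core §121 conditions and caps the exclusion. It is written in plain Python rather than SPARQL and makes several labeled simplifications: periods are counted in whole months, and the joint-return conditions are assumed to have been resolved upstream into a single flag. The `Facts` class and `full_exclusion` function are hypothetical names.

```python
from dataclasses import dataclass

REQUIRED_MONTHS = 24      # 2 years within the 5 years ending on the sale date
LIMIT_SINGLE = 250_000    # statutory limit for most filers
LIMIT_JOINT = 500_000     # statutory limit for a qualifying joint return


@dataclass(frozen=True)
class Facts:
    months_owned_in_last_5_years: int
    months_used_in_last_5_years: int
    claimed_exclusion_in_prior_2_years: bool
    joint_return_qualifies: bool  # assumption: determined by earlier checks
    gain: int                     # whole dollars


def full_exclusion(f: Facts) -> int:
    """Return the excludable gain under the full (non-reduced) exclusion, else 0."""
    eligible = (
        f.months_owned_in_last_5_years >= REQUIRED_MONTHS
        and f.months_used_in_last_5_years >= REQUIRED_MONTHS
        and not f.claimed_exclusion_in_prior_2_years
    )
    if not eligible:
        return 0
    limit = LIMIT_JOINT if f.joint_return_qualifies else LIMIT_SINGLE
    # A loss (negative gain) produces no exclusion.
    return max(0, min(f.gain, limit))


single = Facts(60, 36, False, False, 300_000)
print(full_exclusion(single))
```

The same facts always produce the same amount, which is the property the rule layer is meant to guarantee.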

The system evaluates the rules using predefined queries, checking eligibility conditions, applying thresholds, and accounting for exceptions in a consistent sequence. Because the process is structured, each decision is traceable, reproducible, and directly tied to the underlying facts.
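One way to picture a fixed rule sequence with a traceable result is the sketch below. It is a hypothetical stand-in in plain Python, not our SPARQL queries: the rule names, the dictionary keys, and the `evaluate` function are illustrative. Every rule is evaluated in the same order on every run, and each outcome is recorded so the final decision can be tied back to individual checks.

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    rule: str
    passed: bool


Rule = tuple[str, Callable[[dict], bool]]

# Fixed order: the sequence never depends on the input.
RULES: list[Rule] = [
    ("ownership_test", lambda f: f["months_owned"] >= 24),
    ("use_test", lambda f: f["months_used"] >= 24),
    ("no_recent_exclusion", lambda f: not f["prior_exclusion_within_2_years"]),
]


def evaluate(facts: dict) -> tuple[bool, list[Step]]:
    """Run every rule (no short-circuiting) and return the decision plus its trace."""
    trace = [Step(name, check(facts)) for name, check in RULES]
    return all(step.passed for step in trace), trace


facts = {
    "months_owned": 40,
    "months_used": 18,
    "prior_exclusion_within_2_years": False,
}
eligible, trace = evaluate(facts)
for step in trace:
    print(f"{step.rule}: {'pass' if step.passed else 'fail'}")
print("eligible:", eligible)
```

Evaluating all rules instead of stopping at the first failure is a design choice made here so that the trace lists every condition, including those that would also have blocked eligibility.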

We evaluated the system using 77 structured scenarios derived from real tax guidance, including examples from IRS publications, Treasury Regulations under § 121, and federal regulatory examples. The system consistently applied the relevant rules, correctly determined whether taxpayers qualified for the exclusion, and calculated the allowable exclusion amount based on ownership, residency, and filing status. It also correctly applied statutory thresholds and handled exceptions, such as partial exclusions, producing consistent and reproducible outcomes across all cases. In addition, the use of SPARQL enabled visibility into intermediate reasoning steps by explicitly identifying which rules were applied at each stage, allowing the system to flag applicable conditions and provide a transparent, step-by-step trace of how each outcome was derived.
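The partial exclusion mentioned above follows a proration that is easy to state in code. The sketch below assumes the taxpayer already qualifies for a reduced exclusion (for example, a sale due to a change in employment, health, or unforeseen circumstances) and only computes the reduced limit. Counting in whole months and rounding down are simplifying assumptions, and `reduced_limit` is a hypothetical helper rather than a function from our system.

```python
from typing import Optional

FULL_PERIOD_MONTHS = 24  # the 2-year requirement, expressed in months


def reduced_limit(
    limit: int,
    months_owned: int,
    months_used: int,
    months_since_prior_exclusion: Optional[int] = None,
) -> int:
    """Prorate the dollar limit by the shortest qualifying period over 24 months."""
    periods = [months_owned, months_used]
    if months_since_prior_exclusion is not None:
        periods.append(months_since_prior_exclusion)
    # Clamp to the range 0..24 so the ratio never exceeds 1 or goes negative.
    qualifying = max(0, min(min(periods), FULL_PERIOD_MONTHS))
    # Assumption: round down to whole dollars.
    return limit * qualifying // FULL_PERIOD_MONTHS


# 12 qualifying months against the $250,000 limit.
print(reduced_limit(250_000, 12, 12))  # 125000
```

The excludable amount is then the smaller of the gain and this reduced limit.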

The potential of combining knowledge graphs with generative AI extends well beyond tax applications. In areas such as compliance, finance, and legal operations, organizations must apply complex rules consistently and at scale. Knowledge graphs provide the structure needed for reliable decision-making, while generative AI helps translate real-world scenarios into usable inputs. This combination offers a practical direction for building systems that are not only automated but also transparent, consistent, and trustworthy.

Dr. Borchuluun Yadamsuren

Author

Dr. Borchuluun Yadamsuren is an Assistant Director of Analytics at the Quinlan School of Business, Loyola University Chicago. Her research focuses on applying large language models to legal and medical domains, including tax law, FHIR-based healthcare ontologies, and hybrid AI approaches that combine generative models with symbolic logic for reliable decision-making.

Article References

[1] Internal Revenue Service, “National Taxpayer Advocate delivers Annual Report to Congress,” Jan. 28, 2026.

[2] S. Tizpaz-Niari, Y. Zhang, et al., “Metamorphic Testing and Debugging of Tax Preparation Software,” in Proc. ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2023.

[3] Thomson Reuters, “2025 State of Tax Professionals Report,” 2025.

[4] A. Hogan, E. Blomqvist, M. Cochez, C. d’Amato, G. de Melo, C. Gutierrez, S. Kirrane, J. E. L. Gayo, R. Navigli, S. Neumaier, A.-C. N. Ngomo, A. Polleres, S. M. Rashid, A. Rula, L. Schmelzeisen, J. Sequeda, S. Staab, and A. Zimmermann, “Knowledge graphs,” ACM Computing Surveys, vol. 54, no. 4, pp. 1–37, 2021.
