Bridging Accuracy and Interpretability in Large Language Models: A Hybrid AI Approach


Zhang Hao
Parsa Mazaheri
Ece Arslan

Abstract


Large language models (LLMs) deliver strong performance across many natural language processing tasks, yet their deployment in high-stakes environments remains constrained by limited interpretability. This paper revisits that tension and develops a hybrid AI framework that combines a transformer encoder with a concept bottleneck, a lightweight rule engine, and a rationale-alignment objective. The goal is not merely to generate accurate predictions, but to provide compact decision traces that domain experts can inspect, challenge, and refine.
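To make the hybrid design concrete, the sketch below shows one way a concept bottleneck and a rationale-alignment objective might be wired around a transformer encoder, assuming a PyTorch setting. Every name here (ConceptBottleneckHead, rationale_alignment_loss, the concept count, the 0.5 loss weight) is an illustrative assumption rather than the paper's implementation, and the rule engine is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptBottleneckHead(nn.Module):
    """Predicts a task label only through a small layer of named concepts."""

    def __init__(self, hidden_dim: int, n_concepts: int, n_labels: int):
        super().__init__()
        self.to_concepts = nn.Linear(hidden_dim, n_concepts)  # concept logits
        self.to_labels = nn.Linear(n_concepts, n_labels)      # label from concepts only

    def forward(self, pooled: torch.Tensor):
        concept_logits = self.to_concepts(pooled)
        # The label head sees only concept activations, so each prediction
        # can be traced back to a handful of inspectable concepts.
        label_logits = self.to_labels(torch.sigmoid(concept_logits))
        return concept_logits, label_logits


def rationale_alignment_loss(attn: torch.Tensor, rationale_mask: torch.Tensor) -> torch.Tensor:
    """KL term pushing attention mass onto human-annotated rationale tokens.

    attn: (batch, seq) attention weights that sum to 1 per example.
    rationale_mask: (batch, seq) binary mask marking rationale tokens.
    """
    target = rationale_mask / rationale_mask.sum(dim=-1, keepdim=True).clamp(min=1.0)
    return F.kl_div(attn.clamp(min=1e-8).log(), target, reduction="batchmean")


# Toy forward/loss pass with random stand-ins for encoder outputs and labels.
head = ConceptBottleneckHead(hidden_dim=768, n_concepts=32, n_labels=3)
pooled = torch.randn(4, 768)                       # stand-in for pooled encoder states
concept_logits, label_logits = head(pooled)
labels = torch.randint(0, 3, (4,))
gold_concepts = torch.randint(0, 2, (4, 32)).float()
attn = torch.softmax(torch.randn(4, 16), dim=-1)   # stand-in token attention
rationales = torch.randint(0, 2, (4, 16)).float()

loss = (F.cross_entropy(label_logits, labels)
        + F.binary_cross_entropy_with_logits(concept_logits, gold_concepts)
        + 0.5 * rationale_alignment_loss(attn, rationales))
```

The key property of the bottleneck is that the label head consumes only concept activations, so the concept layer doubles as the compact decision trace that experts can inspect, challenge, and refine.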
We organize the paper around three practical questions: whether a hybrid design can preserve the predictive strength of transformer-only systems, which architectural components contribute most to explanation quality, and how researchers should report the trade-off between accuracy and interpretability. To make the discussion concrete, we include a pilot evaluation on three representative tasks (sentiment classification, named entity recognition, and extractive question answering), together with tables, charts, and an ablation analysis that can be updated as a full experimental benchmark becomes available.
In the pilot study, the proposed hybrid model achieves an average task score of 90.5, closely matching the strongest transformer baseline at 90.9, while substantially improving explanation fidelity, decision-trace coverage, and human-rated clarity. The aggregate interpretability score rises from 38 for the transformer-only baseline to 82 for the hybrid configuration, suggesting that much of the lost transparency in modern LLM pipelines can be recovered without a material reduction in task quality. The ablation results further show that the rule engine and rationale-alignment loss are the dominant contributors to explanation quality.
Beyond the numerical comparison, this manuscript provides a structured reporting template for future empirical studies on hybrid LLM systems. The paper argues that trustworthy LLM deployment requires paired evidence on predictive performance, explanation behavior, and operational constraints. By framing hybrid AI as both a modeling strategy and an evaluation discipline, we aim to support the design of language technologies that are accurate, auditable, and practical for real-world decision support.
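As one way to operationalize this paired-evidence requirement, the fragment below sketches a possible report row. The field names and structure are our assumptions, not the paper's actual template, and the only filled value is the pilot task score quoted above.

```python
# One possible shape for a paired report row; every field name is an
# illustrative assumption. None marks values to be filled in per study.
report_row = {
    "task": "sentiment_classification",
    "system": "hybrid (encoder + concept bottleneck + rule engine)",
    "predictive": {
        "task_score": 90.5,           # pilot-study figure quoted above
    },
    "explanation": {
        "fidelity": None,             # does the trace match model behavior?
        "trace_coverage": None,       # share of decisions with a full trace
        "human_clarity": None,        # mean expert rating of explanations
    },
    "operational": {
        "latency_ms": None,           # inference cost in deployment
        "rules_fired_per_input": None,
    },
}
```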

How to Cite

Zhang, H., Mazaheri, P., & Arslan, E. (2026). Bridging Accuracy and Interpretability in Large Language Models: A Hybrid AI Approach. International Journal of Computational Health & Machine Learning, 4(1). https://ijchml.com/index.php/ijchml/article/view/204
