TokenSHAP

Interpreting Large Language Models with Monte Carlo Shapley Value Estimation

Roni Goldshmidt (Nexar) roni.goldshmidt@getnexar.com

Miriam Horovicz (NI) miryam.hor@gmail.com

About TokenSHAP

TokenSHAP introduces an efficient way to estimate the importance of individual tokens in a prompt using Monte Carlo Shapley value estimation, providing interpretable, quantitative measures of token importance. It sidesteps the combinatorial cost of evaluating every token subset and demonstrates efficacy across a range of prompts and LLM architectures.


TokenSHAP represents a significant advancement in making AI more transparent and trustworthy, particularly in critical applications such as healthcare diagnostics, legal analysis, and automated decision-making systems.

[Image: TokenSHAP architecture diagram]

Key Features

Interpretability for LLMs

Delivers a methodical approach to understanding how individual components of input affect LLM outputs.

Monte Carlo Shapley Estimation

Utilizes a Monte Carlo approach to efficiently compute Shapley values for tokens, suitable for extensive texts and large models.
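The Shapley value of a token averages its marginal contribution across orderings of the other tokens; Monte Carlo estimation samples random orderings instead of enumerating all 2^(n-1) subsets. The following is a minimal, self-contained sketch of that estimator, with a toy value function standing in for the actual model-response comparison TokenSHAP performs:

```python
import random

def monte_carlo_shapley(tokens, value_fn, num_samples=200, seed=0):
    """Estimate Shapley values by averaging each token's marginal
    contribution over randomly sampled token orderings."""
    rng = random.Random(seed)
    shapley = {t: 0.0 for t in tokens}  # assumes tokens are unique
    for _ in range(num_samples):
        order = tokens[:]
        rng.shuffle(order)
        subset = []
        prev = value_fn(subset)
        for tok in order:
            # Marginal contribution of tok given the tokens added so far
            subset.append(tok)
            cur = value_fn(subset)
            shapley[tok] += cur - prev
            prev = cur
    return {t: v / num_samples for t, v in shapley.items()}

# Toy value function: counts how many "important" tokens are present
important = {"sky", "blue"}
def value_fn(subset):
    return sum(1.0 for t in subset if t in important)

tokens = "Why is the sky blue ?".split()
values = monte_carlo_shapley(tokens, value_fn)
```

With this additive toy value function, "sky" and "blue" each receive an estimated value of 1.0 and the remaining tokens 0.0; with a real model, the value function compares generated responses instead.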

Versatile Application

Applicable across various LLM architectures and prompt types, from factual questions to complex multi-sentence inputs.

Installation

Install from PyPI:

pip install tokenshap

Or install from source:

git clone https://github.com/ronigold/TokenSHAP.git
cd TokenSHAP
pip install -r requirements.txt

Usage Examples

Local Model Usage

# Import the classes used below
from token_shap import TokenSHAP, LocalModel, StringSplitter

# Load a local Hugging Face model and split prompts on whitespace
model_name_or_path = "meta-llama/Llama-3.2-3B-Instruct"
model = LocalModel(model_name_or_path)
splitter = StringSplitter()
token_shap = TokenSHAP(model, splitter)

# Analyze token importance; sampling_ratio controls how many extra
# token combinations are sampled beyond the essential ones
prompt = "Why is the sky blue?"
df = token_shap.analyze(prompt, sampling_ratio=0.0, print_highlight_text=True)
[Image: local model output with token-importance highlighting]

API Model Usage

# Import the classes used below
from token_shap import TokenSHAP, OllamaModel, StringSplitter

# Point at a model served by a local Ollama instance (default port 11434)
model_name = "llama3.2:3b"
api_url = "http://localhost:11434"

api_model = OllamaModel(model_name=model_name, api_url=api_url)
splitter = StringSplitter()
token_shap_api = TokenSHAP(api_model, splitter, debug=False)

# Analyze token importance
prompt = "Why is the sky blue?"
df = token_shap_api.analyze(prompt, sampling_ratio=0.0, print_highlight_text=True)
[Image: API model output with token-importance highlighting]
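Conceptually, each token subset is scored by how similar the model's response to the ablated prompt is to its response to the full prompt. The sketch below illustrates that idea with a stubbed model and a simple bag-of-words cosine similarity; both the stub and the similarity choice are illustrative assumptions, not the library's exact implementation:

```python
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two responses."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stub "model": echoes which key tokens it saw (stands in for an LLM call)
def stub_model(prompt: str) -> str:
    keys = [w for w in ("sky", "blue") if w in prompt.split()]
    return "scattering answer about " + " ".join(keys) if keys else "unrelated answer"

full_prompt = "Why is the sky blue ?"
baseline = stub_model(full_prompt)

# Value of a subset = similarity of its response to the full-prompt response
def subset_value(kept_tokens):
    return cosine_sim(stub_model(" ".join(kept_tokens)), baseline)

# Dropping a content-bearing token shifts the response more than
# dropping a filler token
without_sky = subset_value([t for t in full_prompt.split() if t != "sky"])
without_why = subset_value([t for t in full_prompt.split() if t != "Why"])
```

Here removing "sky" lowers the subset's value more than removing "Why", which is exactly the signal the Shapley aggregation turns into per-token importance scores.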

Citation

If you use TokenSHAP in your research, please cite our paper:

@article{goldshmidt2024tokenshap,
  title={TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation},
  author={Goldshmidt, Roni and Horovicz, Miriam},
  journal={arXiv preprint arXiv:2407.10114},
  year={2024}
}