A Surprisingly Effective Way to Estimate Token Importance in LLM Prompts

cs702 · on Sept 12, 2023

Simple and, in hindsight, obvious:

1. Run the text through a document embedding model and save the embedding.

2. Remove one token at a time, run the shortened text through the model, and compute the cosine similarity of each new embedding with the original one.

3. Compute importance as a function of the change in cosine similarity.
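
A rough sketch of those three steps in Python, in case it helps. Everything here is an assumption on my part: `toy_embed` is a hypothetical hashed bag-of-words stand-in for a real document embedding model (it ignores word order, which a real model would not), and scoring importance as `1 - cosine similarity` is just one possible choice of "function of the change".

```python
import hashlib
import math


def toy_embed(tokens, dim=64):
    """Hypothetical stand-in for a document embedding model.

    Hashes each token into one of `dim` buckets and counts occurrences.
    Swap in a real embedding model for anything beyond a demo.
    """
    vec = [0.0] * dim
    for tok in tokens:
        bucket = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def token_importance(tokens, embed=toy_embed):
    """Leave-one-out importance: one score per token.

    Step 1: embed the full text once.
    Step 2: drop each token in turn and re-embed.
    Step 3: score = 1 - cosine(original, ablated)  (assumed scoring rule).
    """
    base = embed(tokens)
    scores = []
    for i in range(len(tokens)):
        ablated = tokens[:i] + tokens[i + 1:]
        scores.append(1.0 - cosine(base, embed(ablated)))
    return scores


if __name__ == "__main__":
    prompt = "summarize the following report in three bullet points".split()
    for tok, score in zip(prompt, token_importance(prompt)):
        print(f"{tok:>12}  {score:.4f}")
```

Note the cost: one extra forward pass per token, so N tokens means N + 1 embedding calls.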

Nice. I like it and expect it will work well in many scenarios.

Also check out https://github.com/glassroom/heinsen_routing . It takes N embeddings and outputs M embeddings (instead of one), and can optionally give you an N×M matrix with credit assignments, without having to remove tokens one by one, which can be prohibitively slow for long texts.