MIT researchers developed Attention Matching to reduce AI memory use by up to 50 times without losing accuracy. LLMs store conversation data in large KV caches, causing high memory demands and costs. Attention Matching identifies and keeps only key information, shrinking memory from 1 GB to about 20 MB.
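The article names the technique but not its mechanics. As a rough illustration only, one common family of approaches keeps the cache entries that receive the most attention and discards the rest; the sketch below implements that generic idea (the function name, shapes, and 2% keep-ratio are illustrative assumptions, not the researchers' actual method):

```python
import numpy as np

def prune_kv_cache(keys, values, attn_scores, keep_ratio=0.02):
    """Keep only the cache entries that receive the most attention.

    keys, values: (seq_len, d) arrays -- the KV cache for one head.
    attn_scores: (seq_len,) cumulative attention each past token received.
    keep_ratio: fraction retained (2% here gives a ~50x reduction,
                matching the 1 GB -> ~20 MB figure in the article).
    """
    seq_len = keys.shape[0]
    n_keep = max(1, int(seq_len * keep_ratio))
    # Indices of the most-attended tokens, restored to original order
    # so positional structure is preserved.
    top = np.sort(np.argsort(attn_scores)[-n_keep:])
    return keys[top], values[top], top

# Toy example: a cache of 1000 tokens with 64-dim heads.
rng = np.random.default_rng(0)
keys = rng.normal(size=(1000, 64))
values = rng.normal(size=(1000, 64))
scores = rng.random(1000)
k, v, idx = prune_kv_cache(keys, values, scores)
print(k.shape)  # (20, 64): 50x fewer entries to store
```

In practice, systems that prune this way must score tokens cheaply during decoding; how Attention Matching identifies "key information" is not detailed in the article.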