Infini-Attention and Compressive Memory: Unbounded Context with Bounded Memory
Google's Infini-Attention combines standard local attention within each segment with a compressive memory that persists across segments, allowing context that is unbounded in principle while the memory state stays a fixed size (O(1) in sequence length).
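
To make the "bounded memory" claim concrete, here is a minimal pure-Python sketch of a compressive memory of this kind. It assumes a linear-attention style update with an ELU + 1 feature map, which is how the mechanism is commonly described; the class name CompressiveMemory and the helper gate_outputs are hypothetical names chosen for illustration, and this is not Google's implementation.

```python
import math


def elu_plus_one(x):
    # Feature map sigma(x) = ELU(x) + 1; always positive (assumed choice).
    return x + 1.0 if x > 0 else math.exp(x)


class CompressiveMemory:
    """Hypothetical fixed-size memory shared across segments."""

    def __init__(self, d_key, d_value):
        self.d_value = d_value
        # M is d_key x d_value, z is d_key: neither grows with context length.
        self.M = [[0.0] * d_value for _ in range(d_key)]
        self.z = [0.0] * d_key

    def update(self, keys, values):
        # Fold one segment in: M += sigma(K)^T V, z += sum of sigma(K) rows.
        for k, v in zip(keys, values):
            phi = [elu_plus_one(x) for x in k]
            for i, p in enumerate(phi):
                self.z[i] += p
                for j, vj in enumerate(v):
                    self.M[i][j] += p * vj

    def retrieve(self, query):
        # Read out: sigma(q) M / (sigma(q) . z).
        phi = [elu_plus_one(x) for x in query]
        denom = sum(p * zi for p, zi in zip(phi, self.z))
        if denom == 0.0:
            return [0.0] * self.d_value  # nothing stored yet
        return [
            sum(phi[i] * self.M[i][j] for i in range(len(phi))) / denom
            for j in range(self.d_value)
        ]


def gate_outputs(memory_out, local_out, beta):
    # Blend memory readout with local attention output via a sigmoid gate.
    g = 1.0 / (1.0 + math.exp(-beta))
    return [g * m + (1.0 - g) * a for m, a in zip(memory_out, local_out)]
```

However many segments are passed to update, the state remains one d_key x d_value matrix plus one d_key vector, which is the sense in which memory is constant. The trade-off is that the readout is a lossy, normalized summary of past key-value pairs rather than exact attention over them.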