Step by step: how the Lightning Indexer works, from query token through index score to the final sparse attention.
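The Lightning Indexer is the heart of DSA: it cheaply scores token pairs and selects the most important ones before the actual attention is computed. Pre-filtering instead of post-filtering.
Step 4/5 in Chapter 2 "Modern Architecture Variants"
Detailed view of the DSA implementation. Shows how the indexer decides which tokens are relevant.
The indexer uses learned routing for efficient candidate search: a few lightweight heads compute an index score for every query-key pair, and each query keeps only its top-k keys. In the published DeepSeek-V3.2 design this is a learned scorer rather than Locality-Sensitive Hashing (LSH). The main attention then costs O(n·k) instead of O(n²); the indexer itself still scans all pairs in O(n²), but with a much smaller constant.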
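The three stages named above (query token, index score, sparse attention) can be sketched for a single query token as follows. This is a minimal illustration in plain Python, not the actual DSA implementation. The function names, the toy vectors, and the choice k = 2 are assumptions made for the example. The scoring rule used here, a weighted sum over indexer heads of ReLU(q · k), is the form described for the lightning indexer in DeepSeek-V3.2; a real system would run this batched on GPU in low precision.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def index_scores(q_heads, head_weights, index_keys):
    """Index score of one query token against every preceding token s:
    I[s] = sum_j w_j * ReLU(q_j . k_s), summed over the indexer heads j."""
    return [
        sum(w * max(0.0, dot(q, k)) for q, w in zip(q_heads, head_weights))
        for k in index_keys
    ]

def top_k_indices(scores, k):
    """Positions of the k highest index scores, in ascending position order."""
    ranked = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
    return sorted(ranked[:k])

def sparse_attention(query, keys, values, selected):
    """Softmax attention computed only over the selected positions."""
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * dot(query, keys[s]) for s in selected]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    dim = len(values[0])
    return [
        sum(e / total * values[s][d] for e, s in zip(exps, selected))
        for d in range(dim)
    ]

# Toy data (hypothetical): four preceding tokens, two indexer heads.
index_keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]
q_heads = [[1.0, 0.0], [0.0, 1.0]]   # one small query vector per indexer head
head_weights = [0.5, 1.0]            # per-head weights for this query token

# Stage 1: index scores. Stage 2: keep the top-k positions (pre-filtering).
scores = index_scores(q_heads, head_weights, index_keys)
selected = top_k_indices(scores, k=2)

# Stage 3: the main attention only touches the selected tokens.
query = [1.0, 0.0]
keys = index_keys
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [2.0, 2.0]]
output = sparse_attention(query, keys, values, selected)

print(scores)    # [0.5, 1.0, 0.0, 1.5]
print(selected)  # [1, 3]
```

Token 2 gets an index score of 0 and is dropped before the main attention runs, so its value vector never contributes to the output. With n tokens and a fixed k, the loop in sparse_attention touches k positions per query instead of n, which is where the O(n·k) cost of the main attention comes from.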