igormolybog's Collections: Long context
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces (arXiv:2312.00752, 140 upvotes)
- SparQ Attention: Bandwidth-Efficient LLM Inference (arXiv:2312.04985, 39 upvotes)
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models (arXiv:2401.04658, 27 upvotes)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models (arXiv:2401.06951, 26 upvotes)
- Extending LLMs' Context Window with 100 Samples (arXiv:2401.07004, 16 upvotes)
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model (arXiv:2401.09417, 60 upvotes)
- LongAlign: A Recipe for Long Context Alignment of Large Language Models (arXiv:2401.18058, 21 upvotes)
- Scavenging Hyena: Distilling Transformers into Long Convolution Models (arXiv:2401.17574, 16 upvotes)
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks (arXiv:2402.04248, 31 upvotes)
- The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry (arXiv:2402.04347, 14 upvotes)
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens (arXiv:2402.13753, 115 upvotes)
- Training-Free Long-Context Scaling of Large Language Models (arXiv:2402.17463, 21 upvotes)
- Resonance RoPE: Improving Context Length Generalization of Large Language Models (arXiv:2403.00071, 23 upvotes)