Submitted by akhaliq 34 Kosmos-2: Grounding Multimodal Large Language Models to the World · 7 authors 9
Submitted by akhaliq 20 DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing · 6 authors 5
Submitted by akhaliq 15 Faster Segment Anything: Towards Lightweight SAM for Mobile Applications · 7 authors 1
Submitted by akhaliq 12 H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models · 12 authors 1
Submitted by akhaliq 11 Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data · 3 authors 1
Submitted by akhaliq 8 Supervised Pretraining Can Learn In-Context Reinforcement Learning · 7 authors
Submitted by akhaliq 8 Thinking Like an Annotator: Generation of Dataset Labeling Instructions · 5 authors 1
Submitted by akhaliq 6 RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools · 5 authors
Submitted by akhaliq 6 DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image Generation using Limited Data · 4 authors
Submitted by akhaliq 6 Zero-shot spatial layout conditioning for text-to-image diffusion models · 5 authors 1
Submitted by akhaliq 5 Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window · 5 authors
Submitted by akhaliq 1 SEEDS: Emulation of Weather Forecast Ensembles with Diffusion Models · 5 authors