- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 118
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 389k • 2.86k
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 51
- HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
  Paper • 2311.12454 • Published • 30
Collections
Collections including paper arxiv:2312.06550
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 57
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
  Paper • 2311.12092 • Published • 22
- DREAM: Diffusion Rectification and Estimation-Adaptive Models
  Paper • 2312.00210 • Published • 15
- HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
  Paper • 2312.00079 • Published • 15

- DualMix: Unleashing the Potential of Data Augmentation for Online Class-Incremental Learning
  Paper • 2303.07864 • Published • 1
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
  Paper • 2305.13547 • Published • 1
- MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
  Paper • 2304.09402 • Published • 2
- LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning
  Paper • 2305.18169 • Published • 1

- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  Paper • 2211.05100 • Published • 29
- CsFEVER and CTKFacts: Acquiring Czech data for fact verification
  Paper • 2201.11115 • Published
- Training language models to follow instructions with human feedback
  Paper • 2203.02155 • Published • 16
- FinGPT: Large Generative Models for a Small Language
  Paper • 2311.05640 • Published • 28

- A technical note on bilinear layers for interpretability
  Paper • 2305.03452 • Published • 1
- Interpreting Transformer's Attention Dynamic Memory and Visualizing the Semantic Information Flow of GPT
  Paper • 2305.13417 • Published • 1
- Explainable AI for Pre-Trained Code Models: What Do They Learn? When They Do Not Work?
  Paper • 2211.12821 • Published • 1
- The Linear Representation Hypothesis and the Geometry of Large Language Models
  Paper • 2311.03658 • Published • 1

- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 10
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 15
- Attention Is All You Need
  Paper • 1706.03762 • Published • 50

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 63
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7

- AlpaGasus: Training A Better Alpaca with Fewer Data
  Paper • 2307.08701 • Published • 23
- The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
  Paper • 2303.03915 • Published • 7
- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 23
- SlimPajama-DC: Understanding Data Combinations for LLM Training
  Paper • 2309.10818 • Published • 10