- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 21
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 12
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
Collections
Collections including paper arxiv:2405.11143
- RLHF Workflow: From Reward Modeling to Online RLHF
  Paper • 2405.07863 • Published • 67
- Understanding and Diagnosing Deep Reinforcement Learning
  Paper • 2406.16979 • Published • 9
- Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
  Paper • 2404.03715 • Published • 61
- Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
  Paper • 2407.00617 • Published • 7

- mDPO: Conditional Preference Optimization for Multimodal Large Language Models
  Paper • 2406.11839 • Published • 38
- Pandora: Towards General World Model with Natural Language Actions and Video States
  Paper • 2406.09455 • Published • 15
- WPO: Enhancing RLHF with Weighted Preference Optimization
  Paper • 2406.11827 • Published • 14
- In-Context Editing: Learning Knowledge from Self-Induced Distributions
  Paper • 2406.11194 • Published • 15

- Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
  Paper • 2406.06525 • Published • 67
- Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
  Paper • 2406.06469 • Published • 25
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
  Paper • 2406.04271 • Published • 29
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
  Paper • 2406.02657 • Published • 38