6 45 44

Pengxiang Li

pengxiang

pixeli

AI & ML interests

Video generation, Image editing, AD

Recent Activity

updated a dataset about 2 hours ago

pengxiang/nuplan_maps

upvoted a paper about 17 hours ago

The Curse of Depth in Large Language Models

authored a paper about 18 hours ago

The Curse of Depth in Large Language Models

View all activity

Organizations

None yet

pengxiang's activity

upvoted a paper about 17 hours ago

The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published 3 days ago • 9

upvoted a paper 20 days ago

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Paper • 2501.12326 • Published 21 days ago • 49

upvoted a paper 24 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 26 days ago • 67

upvoted a paper 29 days ago

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Paper • 2501.06842 • Published about 1 month ago • 15

upvoted 4 papers about 1 month ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 255

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

Paper • 2501.04575 • Published Jan 8 • 23

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 89

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 82

upvoted a paper about 2 months ago

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

Paper • 2412.13795 • Published Dec 18, 2024 • 19

upvoted a paper 2 months ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 77

upvoted 4 papers 3 months ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 80

DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting

Paper • 2411.17223 • Published Nov 26, 2024 • 5

MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Paper • 2411.13807 • Published Nov 21, 2024 • 11

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Paper • 2411.08380 • Published Nov 13, 2024 • 25

upvoted a collection 3 months ago

🍃 MINT-1T

Collection

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 58

upvoted 3 papers 5 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6, 2024 • 44

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 123

upvoted 2 papers 6 months ago

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Paper • 2408.06072 • Published Aug 12, 2024 • 38

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 118