hallucinations-leaderboard

community

https://www.neuralnoise.com

pminervini





hallucinations-leaderboard's activity

pminervini

authored 6 papers 1 day ago

FLARE: Faithful Logic-Aided Reasoning and Exploration

Paper • 2410.11900 • Published Oct 14, 2024 • 3

SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages

Paper • 2406.14425 • Published Jun 20, 2024 • 1

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7

Mixtures of In-Context Learners

Paper • 2411.02830 • Published Nov 5, 2024 • 1

Aligning Generalisation Between Humans and Machines

Paper • 2411.15626 • Published Nov 23, 2024

Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs

Paper • 2502.05092 • Published 4 days ago • 6

clefourrier

authored a paper 5 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 7 days ago • 153

pingnieuk

authored a paper 6 days ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published 8 days ago • 23

clefourrier

authored a paper 2 months ago

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Paper • 2412.03304 • Published Dec 4, 2024 • 17

acDante

authored 4 papers 3 months ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8, 2024 • 8

Are We Done with MMLU?

Paper • 2406.04127 • Published Jun 6, 2024 • 38

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 19

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7

pminervini

updated 2 datasets 3 months ago

hallucinations-leaderboard/requests

Preview • Updated Oct 31, 2024 • 280k

hallucinations-leaderboard/results

Updated Oct 31, 2024 • 343k • 2

pminervini

authored a paper 3 months ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 19

pminervini

authored a paper 4 months ago

Adapting Neural Link Predictors for Data-Efficient Complex Query Answering

Paper • 2301.12313 • Published Jan 29, 2023

aryopg

authored 3 papers 4 months ago

CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Paper • 2410.10336 • Published Oct 14, 2024 • 2

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 19

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7