
This is a high-level summary; the full analysis is further below, and you can read our detailed analysis in our docs.
Llama 3.1 8B Instruct has an MMLU 5-shot accuracy of 67.8% using a naive MMLU implementation. However, we find that Llama tokenizes "A" and " A" (the letter with a leading space) as different token ids. If we consider both the spaced and non-spaced tokens, accuracy rises to 68.2% (+0.4%).
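A minimal sketch of the pitfall, assuming the Hugging Face `transformers` tokenizer for `meta-llama/Llama-3.1-8B-Instruct` (a gated checkpoint; any Llama 3.1 tokenizer you have access to should behave the same way). The scoring helper at the end is a hypothetical illustration of one way to combine the two variants, not the exact implementation discussed above.

```python
# Sketch: "A" and " A" map to different token ids in the Llama 3.1 tokenizer,
# so an MMLU scorer that only looks at the bare letter token misses probability
# mass assigned to the leading-space variant.
from transformers import AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

for letter in ["A", "B", "C", "D"]:
    bare = tok.encode(letter, add_special_tokens=False)        # ids for "A"
    spaced = tok.encode(" " + letter, add_special_tokens=False)  # ids for " A"
    print(letter, bare, spaced)  # the two encodings differ


def choice_score(last_logits: torch.Tensor, bare_id: int, spaced_id: int) -> float:
    """Illustrative fix: score an answer letter by taking the larger of the two
    logits (bare vs. leading-space token) at the final position, so neither
    tokenization of the letter is ignored."""
    return max(last_logits[bare_id].item(), last_logits[spaced_id].item())
```

Whether to take the max or the sum of the two logits is a design choice; either way, the answer is then picked as the argmax over the four combined choice scores.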