gemma-4-E2B-it-qx86-hi-mlx

Brainwaves

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.389  0.465  0.762  0.486  0.372  0.707  0.641
mxfp8    0.376  0.464  0.743  0.490  0.378  0.709  0.622
q8-hi    0.392  0.462  0.762  0.487  0.376  0.706  0.636
qx86-hi  0.387  0.461  0.766  0.483  0.392  0.699  0.623
mxfp4    0.380  0.451  0.762  0.494  0.374  0.699  0.594
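
To compare the quants at a glance, the table above can be reduced to one number per row. The sketch below is only an illustration: the unweighted mean across the seven tasks and the per-task difference from bf16 are summaries assumed here for convenience, not metrics reported by the model card, and the names `SCORES`, `mean_scores`, and `delta_vs_baseline` are hypothetical.

```python
# Scores copied from the benchmark table above, in column order.
TASKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]
SCORES = {
    "bf16":    [0.389, 0.465, 0.762, 0.486, 0.372, 0.707, 0.641],
    "mxfp8":   [0.376, 0.464, 0.743, 0.490, 0.378, 0.709, 0.622],
    "q8-hi":   [0.392, 0.462, 0.762, 0.487, 0.376, 0.706, 0.636],
    "qx86-hi": [0.387, 0.461, 0.766, 0.483, 0.392, 0.699, 0.623],
    "mxfp4":   [0.380, 0.451, 0.762, 0.494, 0.374, 0.699, 0.594],
}


def mean_scores(scores):
    """Unweighted mean across tasks per quant (an assumed summary, not an official metric)."""
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


def delta_vs_baseline(scores, baseline="bf16"):
    """Per-task difference from the baseline row, rounded to three decimals."""
    base = scores[baseline]
    return {
        name: [round(v - b, 3) for v, b in zip(vals, base)]
        for name, vals in scores.items()
    }


means = mean_scores(SCORES)
deltas = delta_vs_baseline(SCORES)

# Print quants from highest to lowest mean score.
for name, mean in sorted(means.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} {mean:.3f}")
```

An unweighted mean treats every task equally, which may not match how much each benchmark matters for a given use case; the per-task deltas in `deltas` keep that detail visible.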

         Perplexity       Peak Memory  Tokens/sec
mxfp8    170.519 ± 3.170  11.78 GB     2174
q8-hi    133.388 ± 2.383  11.21 GB     1889
qx86-hi  125.278 ± 2.215  11.87 GB     1856
mxfp4    140.693 ± 2.546   9.48 GB     2352

See the parent model for instructions on installation and use with Transformers.