How to use nightmedia/gemma-4-E2B-it-qx86-hi-mlx with MLX:
    # Download the model from the Hub
    # (the quotes keep shells such as zsh from expanding the square brackets)
    pip install "huggingface_hub[hf_xet]"
    huggingface-cli download --local-dir gemma-4-E2B-it-qx86-hi-mlx nightmedia/gemma-4-E2B-it-qx86-hi-mlx
Brainwaves
    quant    arc    arc/e  boolq  hswag  obkqa  piqa   wino
    bf16     0.389  0.465  0.762  0.486  0.372  0.707  0.641
    mxfp8    0.376  0.464  0.743  0.490  0.378  0.709  0.622
    q8-hi    0.392  0.462  0.762  0.487  0.376  0.706  0.636
    qx86-hi  0.387  0.461  0.766  0.483  0.392  0.699  0.623
    mxfp4    0.380  0.451  0.762  0.494  0.374  0.699  0.594
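
To compare the quants at a glance, the scores above can be averaged and set against the bf16 row. The following is only a rough sketch: the unweighted mean across the seven tasks is an assumption made here for illustration, not a metric the card itself reports, and the names `SCORES` and `summarize` are hypothetical.

```python
from statistics import mean

# Task order as listed in the table: arc, arc/e, boolq, hswag, obkqa, piqa, wino
SCORES = {
    "bf16":    [0.389, 0.465, 0.762, 0.486, 0.372, 0.707, 0.641],
    "mxfp8":   [0.376, 0.464, 0.743, 0.490, 0.378, 0.709, 0.622],
    "q8-hi":   [0.392, 0.462, 0.762, 0.487, 0.376, 0.706, 0.636],
    "qx86-hi": [0.387, 0.461, 0.766, 0.483, 0.392, 0.699, 0.623],
    "mxfp4":   [0.380, 0.451, 0.762, 0.494, 0.374, 0.699, 0.594],
}


def summarize(scores, baseline="bf16"):
    """Return {quant: (mean score, difference from the baseline mean)}.

    The unweighted mean is an illustrative choice, not the card's metric.
    """
    base = mean(scores[baseline])
    return {name: (mean(vals), mean(vals) - base) for name, vals in scores.items()}


summary = summarize(SCORES)

# Print the quants from highest to lowest mean score.
for name, (avg, delta) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:8s} mean={avg:.4f} delta_vs_bf16={delta:+.4f}")
```

A single average hides per-task differences (for example, qx86-hi has the highest obkqa score in the table), so it is best read alongside the full rows.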
    quant    Perplexity       Peak Memory  Tokens/sec
    mxfp8    170.519 ± 3.170  11.78 GB     2174
    q8-hi    133.388 ± 2.383  11.21 GB     1889
    qx86-hi  125.278 ± 2.215  11.87 GB     1856
    mxfp4    140.693 ± 2.546   9.48 GB     2352
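
The trade-off between perplexity, memory, and speed can be read off with a few lines of Python. This is a minimal sketch over the figures listed above; the card does not state the evaluation dataset or hardware, so the numbers are only comparable with each other, and the names `STATS` and `best` are hypothetical.

```python
# (perplexity, peak memory in GB, tokens/sec) per quant, copied from the table
STATS = {
    "mxfp8":   (170.519, 11.78, 2174),
    "q8-hi":   (133.388, 11.21, 1889),
    "qx86-hi": (125.278, 11.87, 1856),
    "mxfp4":   (140.693, 9.48, 2352),
}


def best(stats, column, lowest=True):
    """Return the quant with the lowest (or highest) value in the given column."""
    pick = min if lowest else max
    return pick(stats, key=lambda name: stats[name][column])


lowest_ppl = best(STATS, 0)               # lower perplexity is better
smallest = best(STATS, 1)                 # lower peak memory is better
fastest = best(STATS, 2, lowest=False)    # higher tokens/sec is better

print(f"lowest perplexity: {lowest_ppl}")
print(f"smallest peak memory: {smallest}")
print(f"fastest: {fastest}")
```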
See the parent model for instructions on installing and using the model with Transformers.
-G