Gemma-4 E2B Gemini 3.1 Pro Reasoning Distill - GGUF

GGUF quantized versions of Ayodele01/gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill.

Model Description

This is Google's Gemma-4 2B (E2B) instruction-tuned model, fine-tuned on Gemini 3.1 Pro reasoning datasets to improve chain-of-thought reasoning capabilities.

Training Data

Training Configuration

  • Base Model: google/gemma-4-E2B-it (via unsloth/gemma-4-E2B-it)
  • Method: LoRA fine-tuning
  • LoRA Config: r=8, alpha=8, dropout=0.1
  • Learning Rate: 5e-5
  • Epochs: 0.5
  • Framework: Unsloth + TRL

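For readers unfamiliar with the LoRA settings above, the block below works out what r=8 and alpha=8 imply. It assumes the standard LoRA formulation with scaling alpha/r (not a rank-stabilized variant); the symbols W_0, A, B, d and k are illustrative, and this card does not state which layers were adapted.

```latex
% Standard LoRA update for one adapted weight matrix W_0 \in \mathbb{R}^{d \times k}
h = W_0 x + \frac{\alpha}{r} B A x,
\qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}

% With the configuration listed above (r = 8, \alpha = 8):
\frac{\alpha}{r} = \frac{8}{8} = 1

% Trainable parameters added per adapted matrix:
r\,(d + k) = 8\,(d + k)
```

Under this assumption the low-rank update is applied unscaled, and each adapted matrix contributes only 8(d + k) trainable parameters.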
Available Quantizations

| Filename | Quant Type | Size | Description |
|---|---|---|---|
| gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill.gguf | BF16 | ~5GB | Full precision, best quality |
| gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q8_0.gguf | Q8_0 | ~2.5GB | High quality |
| gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf | Q5_K_M | ~2GB | Balanced (recommended) |
| gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q4_K_M.gguf | Q4_K_M | ~1.7GB | Good quality, smaller |
| gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q3_K_M.gguf | Q3_K_M | ~1.4GB | Smallest |
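As a rough aid for choosing a file, the sketch below picks the largest quantization whose approximate size fits a given budget. The `pick_quant` helper is hypothetical, the sizes are the approximate figures listed above converted to MB (1 GB taken as 1000 MB), and actual memory use at runtime will be higher than the file size because of context and KV-cache overhead.

```shell
# Hypothetical helper: choose the largest quantization whose approximate
# file size (in MB, taken from the sizes listed above) fits the budget.
pick_quant() {
  local budget_mb="$1"
  local prefix="gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill"
  local entry suffix size
  # Entries are "filename-suffix:approx_size_mb", ordered largest first.
  for entry in ":5000" "-Q8_0:2500" "-Q5_K_M:2000" "-Q4_K_M:1700" "-Q3_K_M:1400"; do
    suffix="${entry%%:*}"
    size="${entry##*:}"
    if [ "$budget_mb" -ge "$size" ]; then
      echo "${prefix}${suffix}.gguf"
      return 0
    fi
  done
  echo "no quantization fits in ${budget_mb} MB" >&2
  return 1
}
```

For example, `pick_quant 2100` selects the Q5_K_M file, while a budget below 1400 MB returns a non-zero status.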

Usage with llama.cpp

# Download a quantized model
wget https://huggingface.co/Ayodele01/gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-GGUF/resolve/main/gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf

# Run with llama.cpp
./llama-cli -m gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf \
  -p "What is the sum of all prime numbers between 1 and 50?" \
  -n 512
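Before running the model, it can be worth confirming that the download is really a GGUF file rather than, say, an HTML error page saved under the same name. The sketch below relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`; the `is_gguf` function name is hypothetical and not part of llama.cpp.

```shell
# Hypothetical helper: succeed only if the file starts with the GGUF magic.
is_gguf() {
  local magic
  # Read the first four bytes; strip NULs so the shell does not warn
  # when the file turns out to be some other binary format.
  magic="$(head -c 4 "$1" 2>/dev/null | tr -d '\0')"
  [ "$magic" = "GGUF" ]
}
```

A typical use would be `is_gguf gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf || echo "download looks corrupt"`. This checks only the header, not the integrity of the whole file.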

Usage with Ollama

Create a Modelfile:

FROM ./gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q5_K_M.gguf

TEMPLATE """<bos><start_of_turn>user
{{ .Prompt }}<end_of_turn>
<start_of_turn>model
"""

PARAMETER stop "<end_of_turn>"
PARAMETER temperature 0.7

Then:

ollama create gemma4-e2b-reasoning -f Modelfile
ollama run gemma4-e2b-reasoning
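To use a different quantization with Ollama, only the FROM line of the Modelfile needs to change. The sketch below generates the same Modelfile shown above for an arbitrary GGUF path; `write_modelfile` is a hypothetical helper, not an Ollama command, and the template and parameters are copied unchanged from this card.

```shell
# Hypothetical helper: write a Modelfile for the given GGUF path.
# Usage: write_modelfile <path-to-gguf> [output-file]
write_modelfile() {
  local gguf="$1" out="${2:-Modelfile}"
  {
    printf 'FROM %s\n\n' "$gguf"
    # Quoted heredoc delimiter keeps the template text literal.
    cat <<'EOF'
TEMPLATE """<bos><start_of_turn>user
{{ .Prompt }}<end_of_turn>
<start_of_turn>model
"""

PARAMETER stop "<end_of_turn>"
PARAMETER temperature 0.7
EOF
  } > "$out"
}
```

For instance, `write_modelfile ./gemma-4-E2B-Gemini-3.1-Pro-Reasoning-Distill-Q4_K_M.gguf` writes a Modelfile in the current directory that can then be passed to `ollama create` as above.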

License

This model is released under the Gemma License.

Related Models
