MiniLM-L6 Yat FFN Swap
This repo contains Yat student replacements for all 6 feed-forward blocks of sentence-transformers/all-MiniLM-L6-v2.
The checkpoints were produced on Kaggle with a three-phase pipeline:
- Phase 1: data-free random/on-shell Yat distillation for every FFN block.
- Phase 2: real-activation fine-tuning for every FFN block.
- Phase 3: patch all six blocks and run a small MTEB STS evaluation.
The published model is a lightweight patch over the base MiniLM model. Loader code in yat_minilm.py downloads the base model, loads the Phase-2 Yat checkpoints, and replaces every BERT feed_forward_chunk.
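The patching step described above can be illustrated with a toy stand-in. This is a minimal sketch, not the code in yat_minilm.py: `ToyLayer`, `patch_feed_forward`, and the `student` callables are hypothetical names, and the real loader works on the Transformers BERT layers with the Phase-2 checkpoints. It only shows the general idea of overriding `feed_forward_chunk` on each layer instance so that the rest of the forward pass stays unchanged.

```python
class ToyLayer:
    """Hypothetical stand-in for a BERT layer; not the Transformers class."""

    def feed_forward_chunk(self, attention_output):
        # Original FFN path (a trivial doubling, purely for illustration).
        return [2.0 * x for x in attention_output]

    def forward(self, attention_output):
        # The layer always calls self.feed_forward_chunk, so an
        # instance-level override is picked up automatically.
        return self.feed_forward_chunk(attention_output)


def patch_feed_forward(layer, student):
    """Replace layer.feed_forward_chunk with a call to `student`."""

    def feed_forward_chunk(attention_output):
        return student(attention_output)

    # Assign on the instance so only this layer is affected.
    layer.feed_forward_chunk = feed_forward_chunk
    return layer


# Six blocks, as in MiniLM-L6; each gets its own (toy) student.
layers = [ToyLayer() for _ in range(6)]
students = [lambda xs, k=k: [x + k for x in xs] for k in range(6)]
for layer, student in zip(layers, students):
    patch_feed_forward(layer, student)

outputs = [layer.forward([1.0]) for layer in layers]
```

Because the override lives on the instance, an unpatched `ToyLayer()` still uses the original path, which makes it easy to compare the two side by side.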
Results
Phase 1 mean rho: 0.005847
Phase 2 mean rho: 0.098501 -> 0.001715
MTEB STS scores:
| Task | Baseline | Yat-swapped |
|---|---|---|
| STSBenchmark | 0.820325 | 0.816818 |
| STS12 | 0.723690 | 0.720878 |
| STS16 | 0.789895 | 0.789110 |
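To make the gap between the two columns easier to read, the differences can be computed directly from the table. The snippet below is a small illustrative calculation using only the numbers reported above; the variable names are arbitrary.

```python
# (baseline, Yat-swapped) scores copied from the table above.
scores = {
    "STSBenchmark": (0.820325, 0.816818),
    "STS12": (0.723690, 0.720878),
    "STS16": (0.789895, 0.789110),
}

# Absolute drop per task, and the drop relative to the baseline score.
drops = {task: base - swapped for task, (base, swapped) in scores.items()}
relative = {task: drops[task] / scores[task][0] for task in scores}
mean_drop = sum(drops.values()) / len(drops)

for task in scores:
    print(f"{task}: drop={drops[task]:.6f} ({100 * relative[task]:.2f}%)")
print(f"mean drop={mean_drop:.6f}")
```

On these three tasks the largest absolute gap is about 0.0035 (STSBenchmark), and every task stays within 0.5% of its baseline score.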
Usage
from yat_minilm import load_model
model = load_model("azettaai/minilm-l6-yat-ffn-swap")
emb = model.encode(["hello world", "yat swapped minilm"])
print(emb.shape)
Files
- phase2/block0.safetensors...phase2/block5.safetensors: final Yat FFN replacements.
- phase1/: random/on-shell warm-start checkpoints.
- scripts/: Kaggle scripts used to train and evaluate the model.
- yat_minilm.py: loader and patching code.