How to use bigscience-data/sgpt-bloom-1b7-nli with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bigscience-data/sgpt-bloom-1b7-nli")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
```
```yaml
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
model-index:
- name: sgpt-bloom-1b7-nli
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (fr)
      config: fr
      split: test
      revision: c379a6705fec24a2493fa68e011692605f44e119
    metrics:
    - type: accuracy
      value: 39.286
    - type: f1
      value: 38.87078070073539
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (zh)
      config: zh
      split: test
      revision: c379a6705fec24a2493fa68e011692605f44e119
    metrics:
    - type: accuracy
      value: 37.634
    - type: f1
      value: 36.86046604093418
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_domain
      name: MTEB MTOPDomainClassification (fr)
      config: fr
      split: test
      revision: a7e2a951126a26fc8c6a69f835f33a346ba259e3
    metrics:
    - type: accuracy
      value: 83.79893517068588
    - type: f1
      value: 83.72326662566203
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_intent
      name: MTEB MTOPIntentClassification (fr)
      config: fr
      split: test
      revision: 6299947a7777084cc2d4b64235bf7190381ce755
    metrics:
    - type: accuracy
      value: 63.36047604134043
    - type: f1
      value: 44.261707019308126
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_intent
      name: MTEB MassiveIntentClassification (fr)
      config: fr
      split: test
      revision: 072a486a144adf7f4479a4a0dddb2152e161e1ea
    metrics:
    - type: accuracy
      value: 64.57632817753867
    - type: f1
      value: 62.60453982786661
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_scenario
      name: MTEB MassiveScenarioClassification (fr)
      config: fr
      split: test
      revision: 7d571f92784cd94a019292a1f45445077d0ef634
    metrics:
    - type: accuracy
      value: 69.59986550100874
    - type: f1
      value: 69.71803697939914
  - task:
      type: STS
    dataset:
      type: mteb/sts22-crosslingual-sts
      name: MTEB STS22 (zh)
      config: zh
      split: test
      revision: 2de6ce8c1921b71a755b262c6b57fef195dd7906
    metrics:
    - type: cos_sim_pearson
      value: 59.71781185663265
    - type: cos_sim_spearman
      value: 58.538648447630514
    - type: euclidean_pearson
      value: 53.53848180206165
    - type: euclidean_spearman
      value: 56.33730262964236
    - type: manhattan_pearson
      value: 54.62109820575505
    - type: manhattan_spearman
      value: 57.223846291318914
  - task:
      type: STS
    dataset:
      type: mteb/sts22-crosslingual-sts
      name: MTEB STS22 (fr)
      config: fr
      split: test
      revision: 2de6ce8c1921b71a755b262c6b57fef195dd7906
    metrics:
    - type: cos_sim_pearson
      value: 73.44021434651606
    - type: cos_sim_spearman
      value: 73.13412769502769
    - type: euclidean_pearson
      value: 68.16368597409867
    - type: euclidean_spearman
      value: 72.44964781564485
    - type: manhattan_pearson
      value: 69.42307032478939
    - type: manhattan_spearman
      value: 73.3523195012387
```
# sgpt-bloom-1b7-nli

## Usage

For usage instructions, refer to: https://github.com/Muennighoff/sgpt#symmetric-semantic-search

The model was trained with the following command:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch examples/training/nli/training_nli_v2.py \
  --model_name bigscience/bloom-1b3 \
  --freezenonbias \
  --train_batch_size 128 \
  --lr 32e-5 \
  --pooling weightedmean \
  --wandb \
  --wandbwatchlog gradients \
  --gradcache \
  --chunksize 4
```

## Evaluation Results

`{'askubuntu': 57.44, 'cqadupstack': 14.18, 'twitterpara': 73.99, 'scidocs': 74.74, 'avg': 55.087500000000006}`
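
The `avg` entry is consistent with the unweighted mean of the four dataset scores. A minimal check, with illustrative variable names:

```python
# Scores copied from the results dictionary above, without the "avg" entry.
scores = {
    "askubuntu": 57.44,
    "cqadupstack": 14.18,
    "twitterpara": 73.99,
    "scidocs": 74.74,
}

# Unweighted mean over the four datasets.
average = sum(scores.values()) / len(scores)
print(round(average, 4))  # 55.0875
```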

## Training

The model was trained with the parameters:

**DataLoader**:

`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 4403 with parameters:

```
{'batch_size': 128}
```

The model uses BitFit, weighted-mean pooling, and GradCache. For details, see: https://arxiv.org/abs/2202.08904
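
BitFit restricts training to bias terms, which is what the `--freezenonbias` flag in the training command suggests. The sketch below shows that selection rule, assuming a parameter counts as a bias when its name contains `bias`. `ToyModel` and its parameter names are hypothetical stand-ins so the block runs without PyTorch; this is not the SGPT implementation itself.

```python
from types import SimpleNamespace


def freeze_non_bias(model):
    """Leave only parameters whose name contains 'bias' trainable.

    Returns the names of the parameters that stay trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = "bias" in name
        if param.requires_grad:
            trainable.append(name)
    return trainable


class ToyModel:
    """Hypothetical stand-in exposing a named_parameters() iterator."""

    def __init__(self):
        # Illustrative parameter names; each value mimics a tensor's flag.
        self.params = {
            "h.0.input_layernorm.weight": SimpleNamespace(requires_grad=True),
            "h.0.input_layernorm.bias": SimpleNamespace(requires_grad=True),
            "h.0.mlp.dense.weight": SimpleNamespace(requires_grad=True),
            "h.0.mlp.dense.bias": SimpleNamespace(requires_grad=True),
        }

    def named_parameters(self):
        return self.params.items()


toy = ToyModel()
print(freeze_non_bias(toy))
```

The same function should also work on a `torch.nn.Module`, since its `named_parameters()` yields `(name, parameter)` pairs with a `requires_grad` attribute.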

**Loss**:

`sentence_transformers.losses.MultipleNegativesRankingLoss.MNRLGradCache`
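
Multiple negatives ranking loss treats the other positives in a batch as negatives for each anchor. The plain-Python sketch below illustrates that idea, assuming cosine similarity and a scale of 20 (the defaults of `MultipleNegativesRankingLoss` in sentence-transformers). The `MNRLGradCache` variant presumably optimizes the same objective while chunking the backward pass, which is not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, positive) for positive in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [0.0, 3.0]]  # each positive points the same way as its anchor
swapped = [[0.0, 3.0], [2.0, 0.0]]  # positives paired with the wrong anchor

print(multiple_negatives_ranking_loss(anchors, aligned))  # close to 0
print(multiple_negatives_ranking_loss(anchors, swapped))  # close to 20
```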

Parameters of the `fit()` method:

```
{
    "epochs": 1,
    "evaluation_steps": 440,
    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'transformers.optimization.AdamW'>",
    "optimizer_params": {
        "lr": 0.00032
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 441,
    "weight_decay": 0.01
}
```
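
The values above are consistent with a warmup of roughly 10% of the 4403 training steps, followed by linear decay. The sketch below computes the learning rate under such a schedule, assuming `WarmupLinear` behaves like a standard linear schedule with warmup. `learning_rate_at` is an illustrative helper, not a library function.

```python
import math

TOTAL_STEPS = 4403   # DataLoader length times 1 epoch
WARMUP_STEPS = 441   # "warmup_steps" from the fit() parameters
PEAK_LR = 0.00032    # "lr" from the optimizer parameters


def learning_rate_at(step, peak_lr=PEAK_LR,
                     warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Linear ramp from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# 10% of the steps, rounded up and down, matches the reported values.
print(math.ceil(0.1 * TOTAL_STEPS))  # 441 (warmup_steps)
print(int(0.1 * TOTAL_STEPS))        # 440 (evaluation_steps)

for step in (0, 220, 441, 2422, 4403):
    print(step, learning_rate_at(step))
```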

## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 75, 'do_lower_case': False}) with Transformer model: BloomModel
  (1): Pooling({'word_embedding_dimension': 2048, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': True, 'pooling_mode_lasttoken': False})
)
```
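
With `pooling_mode_weightedmean_tokens` enabled, later tokens contribute more to the sentence embedding than earlier ones. The plain-Python sketch below shows position-weighted mean pooling, assuming weights proportional to the 1-based token position and padding positions masked out. `weighted_mean_pool` is an illustrative name, not the library's API.

```python
def weighted_mean_pool(token_embeddings, attention_mask):
    """Pool one sequence of token vectors into a single vector.

    token_embeddings: list of equal-length vectors, one per token.
    attention_mask: list of 1 (real token) or 0 (padding), same length.
    """
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    weight_sum = 0.0
    for position, (vector, keep) in enumerate(
        zip(token_embeddings, attention_mask), start=1
    ):
        # Weight grows linearly with position; padding gets weight 0.
        weight = position * keep
        weight_sum += weight
        for d in range(dim):
            pooled[d] += weight * vector[d]
    if weight_sum == 0:
        raise ValueError("attention_mask masks out every token")
    return [value / weight_sum for value in pooled]


tokens = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
mask = [1, 1, 0]  # the third token is padding and is ignored
print(weighted_mean_pool(tokens, mask))  # approximately [0.333, 0.667]
```

In the example, the second token has twice the weight of the first, so the pooled vector sits closer to it.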

## Citing & Authors

```bibtex
@article{muennighoff2022sgpt,
  title={SGPT: GPT Sentence Embeddings for Semantic Search},
  author={Muennighoff, Niklas},
  journal={arXiv preprint arXiv:2202.08904},
  year={2022}
}
```