WpythonW/rubert-tiny2-vllm

Sentence Similarity · sentence-transformers · inference-optimized · text-embeddings-inference

WpythonW committed 12 days ago

Commit e4ff9e0 (verified) · 1 parent: c53d4db

Update README.md

Files changed (1)

README.md +1 -7

@@ -108,10 +108,4 @@ Tested on Google Colab Tesla T4 with:
 ## Original Model
-For standard PyTorch/Transformers usage, see the original model: [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2)
-This vLLM version is optimized for deployment scenarios requiring:
-- High throughput batch processing
-- Low latency inference
-- OpenAI API compatibility
-- Production-grade serving infrastructure
+For standard PyTorch/Transformers usage, see the original model: [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2)