### Model Optimizations

This model was obtained by quantizing the activations and weights of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) to the FP8 data type. This optimization reduces the number of bits used to represent weights and activations from 16 to 8, reducing GPU memory requirements (by approximately 50%) and increasing matrix-multiply compute throughput (by approximately 2x). Weight quantization also reduces disk size requirements by approximately 50%.
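To make the 16-bit to 8-bit reduction concrete, the sketch below simulates per-tensor scaled quantization in plain Python. Note the simplification: real FP8 (E4M3) stores values on a non-uniform 8-bit floating-point grid with a maximum finite value of 448, whereas this stand-in rounds to a uniform integer grid after scaling. The function names and the per-tensor scaling scheme are illustrative assumptions, not the actual pipeline used to produce this checkpoint.

```python
# Illustrative sketch only: uniform-grid stand-in for FP8 (E4M3) quantization.
# Real FP8 uses a non-uniform 8-bit floating-point grid; the scale/round/clip
# flow shown here is the same idea, not this model's actual quantization code.

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def quantize(weights):
    """Map a list of floats onto a clipped, scaled grid (hypothetical helper)."""
    amax = max(abs(w) for w in weights)
    # Per-tensor scale: the largest magnitude lands exactly on the FP8 max.
    scale = amax / FP8_E4M3_MAX if amax else 1.0
    q = [
        max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, round(w / scale)))
        for w in weights
    ]
    return q, scale


def dequantize(q, scale):
    """Recover approximate original values from quantized codes + scale."""
    return [v * scale for v in q]


weights = [1.0, -2.0, 0.5, 4.48]
q, scale = quantize(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(max_err)
```

Because only one floating-point scale is kept per tensor, each weight can be stored in 8 bits instead of 16, which is where the roughly 50% memory and disk savings in the section above come from.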