Do you have any plan to quantize such models to fp8/nvfp4?
#4 · opened by luoxiao9231
Just want to know if you are planning to quantize these models, because I'm using an RTX 5000 series card and may benefit from NVFP4/FP8.
https://huggingface.co/docs/transformers/quantization/finegrained_fp8
Due to limited storage space on hf.co, I can no longer upload models freely, so you will need to convert them yourself. If you really cannot perform the conversion, I will try again.
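A minimal sketch of the self-conversion route, based on the fine-grained FP8 docs linked above: it assumes the model loads with `AutoModelForCausalLM`, that your GPU supports FP8 (Ada/Hopper/Blackwell), and the repo id is a placeholder you would replace with the actual model.

```python
# Sketch: load the model with on-the-fly fine-grained FP8 quantization
# via the transformers API documented at the link above.
from transformers import AutoModelForCausalLM, AutoTokenizer, FineGrainedFP8Config

model_id = "your-username/your-model"  # placeholder: replace with the actual repo id

# FineGrainedFP8Config quantizes weights to FP8 with block-wise scaling factors.
quantization_config = FineGrainedFP8Config()

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quantization_config,
)

# Optionally save the quantized checkpoint locally for reuse or re-upload.
model.save_pretrained("model-fp8")
tokenizer.save_pretrained("model-fp8")
```

NVFP4 is not covered by that page; it would need a separate tool/flow, so the sketch above only shows the FP8 path.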