microsoft/Phi-4-multimodal-instruct-onnx

kvaishnavi committed on Feb 27

Commit 29e40c6 · verified · 1 Parent(s): ec0ed95

Update README.md

Files changed (1)

README.md +6 -7

@@ -15,13 +15,13 @@ tags:
   - phi-4-mini
 ---
-## microsoft/Phi-4-multimodal-instruct-onnx
+## Phi-4 Multimodal Instruct ONNX models
 ### Introduction
-ONNX version of Phi4 multi modal to accelerate inference with ONNX Runtime.
-This modal is quantized to int4 precision and runs on CUDA devices.
+This is an ONNX version of the Phi-4 multimodal model to accelerate inference with ONNX Runtime.
+This model is quantized to int4 precision and runs on GPU devices.
 To run this model with ONNX Runtime:
@@ -40,13 +40,12 @@ curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/ma
 Run the script
 ```bash
-python phi4-mm.py -m Phi-4-multimodal-instruct-onnx/cuda/cuda-int4-rtn-blocksize-32 -e cuda
+python phi4-mm.py -m Phi-4-multimodal-instruct-onnx/gpu/gpu-int4-rtn-block-32 -e cuda
 ```
-You will be prompted for images, audio files and a prompt.
-The performance of the text component is similar the [phi4 mini ONNX models](https://huggingface.co/microsoft/Phi-4-mini-instruct-onnx/blob/main/README.md)
+You will be prompted to provide any images, audios, and a prompt.
+The performance of the text component is similar to the [Phi-4 mini ONNX models](https://huggingface.co/microsoft/Phi-4-mini-instruct-onnx/blob/main/README.md)
 ### Model Description