Llama-3.2-1B-Executorch-SpinQuant
This repository contains the llama3_2_1b_spinquant.pte model, exported for use with ExecuTorch.
Details
- Model: Llama 3.2 1B Instruct
- Format: .pte (ExecuTorch)
- Quantization: Llama 3.2 1B Instruct model exported for ExecuTorch with SpinQuant (4-bit). Compatible with React Native.
Usage
This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.
- Download tokenizer.model and llama3_2_1b_spinquant.pte.
- Place them in your app's asset folder.
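The two steps above can be scripted as part of a build. The following is a minimal sketch in Python, assuming both files have already been downloaded into a local folder. The helper name stage_model_assets and the directory arguments are hypothetical and are not part of ExecuTorch or react-native-executorch; only the two file names come from this repository.

```python
import shutil
from pathlib import Path

# File names as listed in this README.
REQUIRED_FILES = ("tokenizer.model", "llama3_2_1b_spinquant.pte")


def stage_model_assets(download_dir, asset_dir):
    """Copy the tokenizer and model into asset_dir and return the copied paths.

    Hypothetical helper: both directories are chosen by the caller.
    """
    download_dir = Path(download_dir)
    asset_dir = Path(asset_dir)

    # Fail early if either file is missing, so the app is not shipped without it.
    missing = [name for name in REQUIRED_FILES if not (download_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {download_dir}: {', '.join(missing)}")

    # Create the asset folder if it does not exist yet, then copy both files.
    asset_dir.mkdir(parents=True, exist_ok=True)
    return [
        Path(shutil.copy2(download_dir / name, asset_dir / name))
        for name in REQUIRED_FILES
    ]
```

For example, stage_model_assets("downloads", "app/assets") copies both files from a downloads folder into app/assets. Both folder names are placeholders; the actual asset location depends on how your iOS, Android, or React Native project is set up.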