Llama-3.2-1B-Executorch-SpinQuant

This repository contains the llama3_2_1b_spinquant.pte model, exported for use with ExecuTorch.

Details

  • Model: Llama 3.2 1B Instruct
  • Format: .pte (ExecuTorch)
  • Quantization: SpinQuant (4-bit)
  • Compatibility: React Native

Usage

This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.

  1. Download tokenizer.model and llama3_2_1b_spinquant.pte.
  2. Place them in your app's asset folder.
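
The two steps above can be scripted on a development machine. The following is a minimal sketch, not part of this repository: the helper name stage_model_assets and both directory paths are hypothetical, and the right asset folder depends on how your app is set up. It checks that both files were downloaded and then copies them into the asset folder.

```python
import shutil
from pathlib import Path

# File names taken from the usage steps above.
REQUIRED_FILES = ("tokenizer.model", "llama3_2_1b_spinquant.pte")


def stage_model_assets(download_dir, assets_dir):
    """Copy the tokenizer and model from download_dir into assets_dir.

    Hypothetical helper for illustration. Raises FileNotFoundError if
    either required file is missing from download_dir.
    """
    src = Path(download_dir)
    dst = Path(assets_dir)

    # Fail early, before copying anything, if a file is missing.
    missing = [name for name in REQUIRED_FILES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {src}: {', '.join(missing)}")

    # Create the asset folder if it does not exist yet.
    dst.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src / name, dst / name)) for name in REQUIRED_FILES]


# Example call (both paths are assumptions; adjust to your project):
# stage_model_assets("downloads", "android/app/src/main/assets")
```

Checking for both files before copying avoids shipping an app with only one of them, since the model and tokenizer are both listed as required in step 1.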