metadata
license: apache-2.0
library_name: executorch
tags:
  - android
  - ios
  - on-device
  - pytorch
  - react-native
  - qwen
  - qwen2
base_model: Qwen/Qwen2.5-1.5B-Instruct

Qwen2.5-1.5B-Executorch-Q8DA4W

This repository contains the qwen2_5_1_5b_q8da4w.pte model, exported for use with ExecuTorch.

Details

  • Base Model: Qwen/Qwen2.5-1.5B-Instruct
  • Format: .pte (ExecuTorch)
  • Quantization: Q8DA4W (4-bit linear weights, 8-bit dynamic activations)
  • Architecture: Qwen2
  • File Size: ~1.6 GB
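To make the Q8DA4W label more concrete, here is a minimal pure-Python sketch of its two ingredients: group-wise 4-bit weight quantization and 8-bit activation quantization computed at run time. The function names, the group size of 4, and the symmetric/asymmetric choices are illustrative assumptions. The kernels ExecuTorch actually runs may differ in detail.

```python
def quantize_weights_4bit(weights, group_size=4):
    """Symmetric 4-bit quantization: ints in [-8, 7] plus one scale per group."""
    q, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7 if max_abs > 0 else 1.0  # avoid dividing by zero
        scales.append(scale)
        q.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q, scales


def dequantize_weights(q, scales, group_size=4):
    """Map 4-bit integers back to floats using each group's scale."""
    return [v * scales[i // group_size] for i, v in enumerate(q)]


def quantize_activations_8bit(acts):
    """Asymmetric 8-bit quantization; the range is measured from the data itself."""
    lo = min(min(acts), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(acts), 0.0)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(a / scale) + zero_point)) for a in acts]
    return q, scale, zero_point


def dequantize_activations(q, scale, zero_point):
    """Map 8-bit integers back to floats."""
    return [(v - zero_point) * scale for v in q]


weights = [0.7, -0.35, 0.1, 0.0, 2.0, -1.0, 0.5, 0.25]
wq, wscales = quantize_weights_4bit(weights)
acts = [0.0, 1.5, -0.5, 3.0]
aq, ascale, azp = quantize_activations_8bit(acts)
```

"Dynamic" here means the activation scale and zero point are recomputed for each input at inference time, while the weight scales are fixed at export time and stored in the .pte file.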

Features

  • 🚀 Optimized for mobile/edge devices
  • 📱 Compatible with react-native-executorch
  • 🌍 Multilingual support inherited from Qwen2.5 (over 29 languages, including Vietnamese)
  • 💬 Strong instruction-following capabilities
  • 🧠 Built on Alibaba's Qwen2.5, which is known for strong reasoning at this model size

Usage

This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.

  1. Download qwen2_5_1_5b_q8da4w.pte and the tokenizer files (tokenizer.json, vocab.json, merges.txt).
  2. Place them in your app's asset folder.
  3. Load the model with the ExecuTorch runtime.
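Before wiring the model into an app, it can help to confirm that all four files from step 1 actually landed in the asset folder. The sketch below is a hypothetical pre-flight check and not part of ExecuTorch or react-native-executorch. The helper name `missing_assets` and the folder path in the usage note are assumptions.

```python
from pathlib import Path

# File names taken from the download step above.
REQUIRED_FILES = (
    "qwen2_5_1_5b_q8da4w.pte",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
)


def missing_assets(asset_dir):
    """Return the required files that are absent from asset_dir or empty."""
    root = Path(asset_dir)
    missing = []
    for name in REQUIRED_FILES:
        path = root / name
        # A zero-byte file usually means an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

For example, `missing_assets("app/assets")` returns an empty list when everything is in place, and otherwise the names of the files that still need to be downloaded.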

Notes

  • Qwen2 uses a byte-level BPE tokenizer (similar to GPT-2), not SentencePiece.
  • Tokenizer files are: tokenizer.json, vocab.json, merges.txt
  • Vocab size: 151,936 tokens (the vocab_size value in the base model's config)
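To illustrate what "byte-level" means in practice, the sketch below reproduces the GPT-2-style byte-to-character table. It assumes Qwen2 follows the same scheme, as the note above suggests, and it covers only the byte mapping, not the BPE merges themselves. Because every possible byte has a printable stand-in, any UTF-8 text (Vietnamese diacritics included) can be represented without an unknown token.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each of the 256 bytes to a printable character."""
    # Bytes that are already printable keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = keep[:]
    code_points = keep[:]
    shift = 0
    for b in range(256):
        if b not in keep:
            # Control, whitespace, and other awkward bytes are moved above U+00FF.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


table = bytes_to_unicode()
reverse = {ch: b for b, ch in table.items()}

text = "Vi\u1ec7t Nam"  # "\u1ec7" is the precomposed Vietnamese letter e with circumflex and dot below
mapped = "".join(table[b] for b in text.encode("utf-8"))
restored = bytes(reverse[ch] for ch in mapped).decode("utf-8")
```

The `mapped` string is the form that appears in vocab.json and merges.txt, which is why entries in those files often start with a "Ġ" character standing in for a leading space.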