---
license: mit
tags:
- causal-lm
- instruction-following
- lora
- qlora
- quantized
language: en
library_name: transformers
base_model: microsoft/phi-2
---
# Phi-2 QLoRA Fine-Tuned Model
**Model:** `mishrabp/phi2-custom-response-qlora-adapter`
**Base Model:** [`microsoft/phi-2`](https://huggingface.co/microsoft/phi-2)
**Fine-Tuning Method:** QLoRA (4-bit quantized LoRA)
**Task:** Instruction-following / Customer Support Responses
---
## Model Description
This repository contains a **Phi-2 language model fine-tuned using QLoRA** on a synthetic dataset of customer support instructions and responses. The fine-tuning uses **4-bit quantized LoRA adapters** for memory-efficient training and can run on GPU or CPU (slower on CPU).
The model is designed for **instruction-following tasks** such as customer support, FAQ answering, and other dialogue generation.
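For GPU inference with the same memory footprint used during training, the base model can be loaded in 4-bit before attaching the adapter. The snippet below is a minimal sketch assuming `bitsandbytes` is installed and a CUDA GPU is available; the full-precision loading path is shown in the How to Use section further down.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization config, mirroring a typical QLoRA setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model, then attach the fine-tuned adapter
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "mishrabp/phi2-custom-response-qlora-adapter")
tokenizer = AutoTokenizer.from_pretrained("mishrabp/phi2-custom-response-qlora-adapter")
```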
---
## Training Data
The fine-tuning dataset is synthetic, consisting of 3000 instruction-response pairs:
**Example:**
```text
Instruction: "Customer asks about refund window #1"
Response: "Our refund window is 30 days from delivery."
```
The dataset used for fine-tuning is available here:
https://huggingface.co/datasets/mishrabp/customer-support-responses/resolve/main/train.csv
You can replace the dataset with your own CSV/JSON file to train on real-world data.
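To swap in your own data, the CSV can be loaded with the `datasets` library and mapped into the same prompt template used in the evaluation snippet below. This is a sketch only; the `instruction`/`response` column names are assumed from the example above and may need adjusting for your file.

```python
from datasets import load_dataset

# Load the CSV directly from the Hub (or point data_files at your own CSV/JSON)
dataset = load_dataset(
    "csv",
    data_files="https://huggingface.co/datasets/mishrabp/customer-support-responses/resolve/main/train.csv",
)["train"]

def to_prompt(example):
    # Same "### Instruction / ### Response" template used at inference time
    return {
        "text": f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['response']}"
    }

dataset = dataset.map(to_prompt)
print(dataset[0]["text"])
```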
---
## Intended Use
* Generate responses to instructions in customer support scenarios.
* Small-scale instruction-following experiments.
* Educational or research purposes.
---
## How to Use
### Load the Fine-Tuned Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# -----------------------------
# Load fine-tuned model from HF
# -----------------------------
model_name = "mishrabp/phi2-custom-response-qlora-adapter"
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
model = PeftModel.from_pretrained(base_model, model_name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
# -----------------------------
# Sample evaluation dataset
# -----------------------------
eval_data = [
    {"instruction": "Customer asks about refund window", "reference": "Our refund window is 30 days from delivery."},
    {"instruction": "Order arrived late", "reference": "Sorry for the delay. A delivery credit has been applied."},
    {"instruction": "Wrong item received", "reference": "We’ll ship the correct item and provide a return label."},
]
# -----------------------------
# Evaluation loop
# -----------------------------
for i, example in enumerate(eval_data, 1):
    prompt = f"### Instruction:\n{example['instruction']}\n\n### Response:"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=50)
    generated = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(f"Example {i}")
    print("Instruction:", example["instruction"])
    print("Generated Response:", generated.split("### Response:")[-1].strip())
    print("Reference Response:", example["reference"])
    print("-" * 50)
# -----------------------------
# Optional: compute simple token-level accuracy or BLEU
# -----------------------------
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Smoothing avoids zero scores when short responses share no higher-order n-grams
smooth = SmoothingFunction().method1
bleu_scores = []
for example in eval_data:
    prompt = f"### Instruction:\n{example['instruction']}\n\n### Response:"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=50)
    generated = tokenizer.decode(output_ids[0], skip_special_tokens=True).split("### Response:")[-1].strip()
    reference_tokens = example["reference"].split()
    generated_tokens = generated.split()
    bleu_scores.append(sentence_bleu([reference_tokens], generated_tokens, smoothing_function=smooth))
print("Average BLEU score:", sum(bleu_scores) / len(bleu_scores))
```
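If you want a standalone checkpoint that does not require `peft` at inference time, the adapter weights can be merged back into the full-precision base model loaded above. A brief sketch using PEFT's standard `merge_and_unload` method; the output directory name is only illustrative.

```python
# Merge the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()

# Save a plain transformers checkpoint (directory name is just an example)
merged_model.save_pretrained("./phi2-customer-support-merged")
tokenizer.save_pretrained("./phi2-customer-support-merged")
```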
---
## Training Script
The training script performs the following steps (a condensed sketch of the core setup follows the requirements below):
1. Loads the **Phi-2 base model**.
2. Creates a **synthetic dataset** of instruction-response pairs.
3. Tokenizes and formats the dataset for causal language modeling.
4. Applies a **LoRA adapter**.
5. Trains using **QLoRA** if a GPU is available, otherwise full-precision LoRA on CPU.
6. Saves the adapter and tokenizer to `./phi2-qlora`.
7. Pushes the adapter and tokenizer to Hugging Face Hub.
### Requirements
```bash
pip install torch transformers peft datasets huggingface_hub python-dotenv
```
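The sketch below condenses the steps above. It is not the original script: it assumes a CUDA GPU (with `bitsandbytes`) for the 4-bit path, assumes `instruction`/`response` columns in a local `train.csv`, and uses placeholder values for epochs and batch size; the LoRA settings and learning rate mirror the Parameters section.

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token

# 1-2. Load the base model (4-bit on GPU, full precision otherwise) and the dataset
use_gpu = torch.cuda.is_available()
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.float16) if use_gpu else None
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb, device_map="auto" if use_gpu else None)
dataset = load_dataset("csv", data_files="train.csv")["train"]  # instruction/response columns assumed

# 3. Tokenize with the same prompt template used at inference time
def tokenize(example):
    text = f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['response']}"
    return tokenizer(text, truncation=True, max_length=512)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

# 4-5. Attach the LoRA adapter (QLoRA when the base model is quantized)
if use_gpu:
    model = prepare_model_for_kbit_training(model)
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./phi2-qlora", learning_rate=2e-4,
                           num_train_epochs=1,              # placeholder
                           per_device_train_batch_size=1,   # placeholder
                           fp16=use_gpu, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# 6. Save the adapter and tokenizer locally (push_to_hub can follow)
model.save_pretrained("./phi2-qlora")
tokenizer.save_pretrained("./phi2-qlora")
```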
---
## Parameters
* `r=8`, `lora_alpha=16`, `lora_dropout=0.05`
* `target_modules=["q_proj","v_proj"]` (adjust for different base models)
* Learning rate: `2e-4`
* Batch size: