DeepSeek PHI De-identification Adapter
This repository hosts a LoRA adapter fine-tuned for safe detection and redaction of Protected Health Information (PHI) in clinical text.
The model is trained on a large synthetic and de-identified corpus derived from MIMIC-III-style clinical notes and is designed to operate as part of a configurable, explainable medical text de-identification pipeline.
Model Details
- Developed by: Iftakhar Khandokar (Marquette University)
- Funded by: Academic research (EECE Department, Marquette University)
- Shared by: Iftakhar Khandokar
- Model type: LoRA adapter (PEFT)
- Base model: deepseek-ai/deepseek-llm-7b-base
- Language: English (clinical / biomedical NLP)
- License: Apache 2.0
Intended Use
This adapter is intended for:
✅ Research on medical data de-identification
✅ Benchmarking privacy-preserving NLP pipelines
✅ Safety and explainability evaluation for clinical LLM workflows
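For the benchmarking use case, a minimal sketch of exact-match span scoring is shown here. It is not part of this repository: the `Span` tuple layout, the label names, and the `span_scores` helper are assumptions made for illustration, and the card does not specify the adapter's output format or an official evaluation protocol.

```python
from typing import Dict, Set, Tuple

# Hypothetical span representation: (start_offset, end_offset, label).
Span = Tuple[int, int, str]


def span_scores(gold: Set[Span], pred: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over PHI spans."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example with made-up offsets and labels.
gold = {(3, 11, "NAME"), (17, 27, "DATE")}
pred = {(3, 11, "NAME"), (30, 35, "ID")}
scores = span_scores(gold, pred)
```

In de-identification work recall usually matters most, since a missed span leaks PHI, so it is worth reporting separately rather than only as part of F1.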
Not Intended For
❌ Automated medical diagnosis
❌ Direct patient care deployment without regulatory review
❌ Generating synthetic patient records for real-world use
Loading the Model
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model first; the adapter only holds the LoRA weights.
base = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base", trust_remote_code=True
)

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(
    base,
    "Iftakhar/deepseek-phi-adapter"
)
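Once the adapter is loaded, inference could look roughly like the sketch below. The prompt template in `build_prompt` is an assumption: this card does not document the format the adapter was trained on, so treat the wording as a placeholder. `build_prompt` and `redact_note` are hypothetical helper names, not functions provided by this repository.

```python
def build_prompt(note: str) -> str:
    """Hypothetical instruction template; the real training format is not documented."""
    return (
        "Redact all Protected Health Information in the clinical note below.\n\n"
        f"Note:\n{note}\n\nRedacted note:\n"
    )


def redact_note(note: str, model, tokenizer, max_new_tokens: int = 256) -> str:
    """Generate a completion for `note` and return only the newly generated text."""
    import torch  # imported lazily so build_prompt can be used without torch installed

    inputs = tokenizer(build_prompt(note), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage, assuming `model` was loaded as shown above:
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
#   print(redact_note("Pt seen on 01/02/2020 by Dr. Smith.", model, tokenizer))
```

Greedy decoding (`do_sample=False`) is used here so that repeated runs on the same note give the same output, which makes redaction results easier to audit.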