EpistemeAI/Episteme-gptoss-20b-RL

legolasyiu committed on Oct 20

Commit 9319889 · verified · 1 Parent(s): 39c3e22

Update README.md

Files changed (1)

README.md +1 -1

@@ -13,7 +13,7 @@ language:
 # Model card
 ## Cookbook
+[EpistemeAI Cookbook](https://github.com/tomtyiu/EpistemeAI-series-Cookbook-SDK)
 # Summary
 This EpistemeAI model is based on GPT-OSS-20B and has been fine-tuned using the Unsloth RL framework to optimize inference efficiency while mitigating vulnerabilities such as reward hacking during reinforcement learning from human feedback (RLHF)–style training. The fine-tuning process emphasizes alignment robustness and efficiency, ensuring the model preserves its reasoning depth without incurring excessive computational overhead.