Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ language:
 # Model card
 
 ## Cookbook
-
+[EpistemeAI Cookbook](https://github.com/tomtyiu/EpistemeAI-series-Cookbook-SDK)
 
 # Summary
 This EpistemeAI model is based on GPT-OSS-20B and has been fine-tuned using the Unsloth RL framework to optimize inference efficiency while mitigating vulnerabilities such as reward hacking during reinforcement learning from human feedback (RLHF)–style training. The fine-tuning process emphasizes alignment robustness and efficiency, ensuring the model preserves its reasoning depth without incurring excessive computational overhead.