rewardfm/jesse-alldata-rfm-qwen-4gpu-bs16-pref-prog-sim-succ
Model Details
- Base Model: Qwen/Qwen3-VL-4B-Instruct
- Model Type: qwen3_vl
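
As a minimal sketch of how these details could be checked locally, the snippet below downloads the repository files with `huggingface_hub.snapshot_download` and reads the `model_type` field from the checkpoint's configuration. The helper names `fetch_checkpoint` and `read_model_type` are hypothetical, and the presence of a `config.json` file at the repository root is an assumption, not something the card states.

```python
import json
import os

# Identifiers taken from this model card.
REPO_ID = "rewardfm/jesse-alldata-rfm-qwen-4gpu-bs16-pref-prog-sim-succ"
BASE_MODEL = "Qwen/Qwen3-VL-4B-Instruct"


def fetch_checkpoint(repo_id: str = REPO_ID) -> str:
    """Download the repository files and return the local directory path.

    Hypothetical helper: requires network access and the huggingface_hub package.
    """
    # Imported lazily so the rest of the module works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


def read_model_type(local_dir: str):
    """Return the `model_type` entry of config.json, or None if it is absent.

    Assumes the checkpoint ships a config.json at its root (not confirmed by the card).
    """
    config_path = os.path.join(local_dir, "config.json")
    if not os.path.exists(config_path):
        return None
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("model_type")
```

Calling `read_model_type(fetch_checkpoint())` would be expected to return `qwen3_vl` if the configuration matches the Model Type listed above. Loading the weights themselves may need project-specific code, which this sketch does not cover.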
Training Run
- Wandb Run: jesse_alldata_rfm_qwen_4gpu_bs16_pref_prog_sim_succ
- Wandb ID: ng2lt4sm
- Project: rfm
Citation
If you use this model, please cite: