# hindi-maths
This is a merge of pre-trained language models created using mergekit.
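To reproduce the merge, the YAML in the Configuration section below can be fed to mergekit's `mergekit-yaml` command. A minimal sketch, assuming mergekit is installed (`pip install mergekit`) and the config has been saved locally; the file name and output path are illustrative, not part of this repo:

```python
import subprocess

# Run mergekit on the saved YAML config (see the Configuration section below).
# "hindi-maths.yml" and "./hindi-maths-merged" are assumed names.
subprocess.run(
    ["mergekit-yaml", "hindi-maths.yml", "./hindi-maths-merged"],
    check=True,
)
```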
## Merge Details

### Merge Method
This model was merged using the SLERP merge method, with subhrokomol/Llama2-7B-Hindi-finetuned as the base model.
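SLERP interpolates each pair of parent tensors along the arc between them rather than along a straight line, which keeps the norm of the merged weights close to that of the parents. A minimal sketch of the interpolation on flattened weight tensors, assuming NumPy; mergekit's actual implementation additionally handles per-filter `t` schedules:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    # Angle between the two weight vectors, measured on normalized copies.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if np.sin(omega) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1
    # t = 0 returns v0, t = 1 returns v1, t = 0.5 lands halfway along the arc.
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)
```

With `t: 0.5`, as in the active configuration below, every merged tensor sits halfway along the arc between the Hindi base model and WizardMath.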
### Models Merged

The following models were included in the merge:
* subhrokomol/Llama2-7B-Hindi-finetuned
* WizardLM/WizardMath-7B-V1.0
### Configuration
The following YAML configuration was used to produce this model. The commented-out blocks are earlier configurations kept for reference; only the uncommented SLERP block at the end is active:
```yaml
# slices:
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [0, 23]
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [24, 28]
#       - model: WizardLM/WizardMath-7B-V1.0
#         layer_range: [24, 28]
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [29, 32]
# merge_method: linear
# base_model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
# tokenizer_source: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
# parameters:
#   t:
#     - filter: self_attn
#       value: 0.05
#     - filter: mlp
#       value: 0.05
#     - filter: norm
#       value: 0.02
#     - filter: embed_tokens
#       value: 0.0
#     - filter: lm_head
#       value: 0.0
#     - value: 0.02
# dtype: bfloat16

# slices:
#   - sources:
#       - model: DeepSeek-R1-0528
#         layer_range: [0, 32]
#       - model: subhrokomol/Llama2-7B-Hindi-finetuned
#         layer_range: [0, 32]
# merge_method: slerp
# base_model: DeepSeek-R1-0528
# tokenizer_source: subhrokomol/Llama2-7B-Hindi-finetuned
# parameters:
#   t:
#     - filter: self_attn
#       value: [0.7, 0.8, 1.0, 0.8, 0.7]
#     - filter: mlp
#       value: [0.6, 0.5, 0.3, 0.5, 0.6]
#     - value: 0.6
# dtype: bfloat16

slices:
  - sources:
      - model: subhrokomol/Llama2-7B-Hindi-finetuned
        layer_range: [0, 32]
      - model: WizardLM/WizardMath-7B-V1.0
        layer_range: [0, 32]
merge_method: slerp
base_model: subhrokomol/Llama2-7B-Hindi-finetuned
parameters:
  t:
    - value: 0.5
dtype: bfloat16
```
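The merged output can be loaded like any other Llama-family checkpoint. A minimal usage sketch with transformers, assuming the merge was written to `./hindi-maths-merged` (the path and the prompt are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./hindi-maths-merged"  # assumed local output directory of the merge
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

# Illustrative Hindi arithmetic word problem:
# "A shopkeeper bought 15 apples for 60 rupees. What is the price of one apple?"
prompt = "एक दुकानदार ने 15 सेब 60 रुपये में खरीदे। एक सेब की कीमत कितनी है?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```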