# hindi-maths

This is a merge of pre-trained language models created using mergekit.

## Merge Details

### Merge Method

This model was merged using the SLERP merge method.
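SLERP interpolates along the great-circle arc between two flattened weight tensors instead of along the straight line used by plain linear averaging. The sketch below is a minimal NumPy illustration of that idea, not mergekit's exact implementation; the function name and the linear-interpolation fallback threshold are illustrative choices.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns `a`, t=1 returns `b`; intermediate values follow the
    great-circle arc between the flattened, direction-normalized tensors.
    Falls back to plain linear interpolation when the tensors are nearly
    colinear, where the SLERP coefficients become numerically unstable.
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    a_dir = a_flat / np.linalg.norm(a_flat)
    b_dir = b_flat / np.linalg.norm(b_flat)
    dot = np.clip(np.dot(a_dir, b_dir), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two tensors
    if np.sin(omega) < eps:  # nearly parallel: lerp is safer
        return (1.0 - t) * a + t * b
    coef_a = np.sin((1.0 - t) * omega) / np.sin(omega)
    coef_b = np.sin(t * omega) / np.sin(omega)
    return (coef_a * a_flat + coef_b * b_flat).reshape(a.shape)
```

With `t: 0.5` (as in the configuration below), every tensor lands at the midpoint of the arc between the two parent models' weights.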

### Models Merged

The following models were included in the merge:

- subhrokomol/Llama2-7B-Hindi-finetuned
- WizardLM/WizardMath-7B-V1.0

### Configuration

The following YAML configuration was used to produce this model:

```yaml
# Earlier configurations, kept commented out for reference:

# slices:
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [0, 23]
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [24, 28]
#       - model: WizardLM/WizardMath-7B-V1.0
#         layer_range: [24, 28]
#   - sources:
#       - model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#         layer_range: [29, 32]
#
# merge_method: linear
# base_model: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
# tokenizer_source: Cognitive-Lab/LLama3-Gaja-Hindi-8B-v0.1
#
# parameters:
#   t:
#     - filter: self_attn
#       value: 0.05
#     - filter: mlp
#       value: 0.05
#     - filter: norm
#       value: 0.02
#     - filter: embed_tokens
#       value: 0.0
#     - filter: lm_head
#       value: 0.0
#     - value: 0.02
#
# dtype: bfloat16

# slices:
#   - sources:
#       - model: DeepSeek-R1-0528
#         layer_range: [0, 32]
#       - model: subhrokomol/Llama2-7B-Hindi-finetuned
#         layer_range: [0, 32]
# merge_method: slerp
# base_model: DeepSeek-R1-0528
# tokenizer_source: subhrokomol/Llama2-7B-Hindi-finetuned
# parameters:
#   t:
#     - filter: self_attn
#       value: [0.7, 0.8, 1.0, 0.8, 0.7]
#     - filter: mlp
#       value: [0.6, 0.5, 0.3, 0.5, 0.6]
#     - value: 0.6
# dtype: bfloat16

# Active configuration:

slices:
  - sources:
      - model: subhrokomol/Llama2-7B-Hindi-finetuned
        layer_range: [0, 32]
      - model: WizardLM/WizardMath-7B-V1.0
        layer_range: [0, 32]
merge_method: slerp
base_model: subhrokomol/Llama2-7B-Hindi-finetuned
parameters:
  t:
    - value: 0.5
dtype: bfloat16
```
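A merge like this is reproduced with mergekit's `mergekit-yaml` command. The config filename and output directory below are assumptions for illustration; mergekit itself is assumed to be installed (`pip install mergekit`).

```shell
# Hypothetical paths: save the YAML above as config.yaml, then run
mergekit-yaml config.yaml ./hindi-maths
```

The merged weights and the base model's tokenizer are written to the output directory, ready to load with `transformers`.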
