Base Model for TransMLA
mengfanxu
fxmeng
AI & ML interests
None yet
Recent Activity
Updated a model 14 days ago: fxmeng/TransMLA-llama3-8b-32k
Updated a model 14 days ago: fxmeng/TransMLA-llama3-8b-8k
Updated a collection about 1 month ago: TransMLA-base
Organizations
None yet
Models (53)
fxmeng/TransMLA-llama3-8b-32k • 8B • Updated • 38
fxmeng/TransMLA-llama3-8b-8k • 8B • Updated • 47
fxmeng/PiSSA-llama-7b-commonsense-148k • 7B • Updated • 5
fxmeng/PiSSA-Llama-3-8b-commonsense-148k • 8B • Updated • 4
fxmeng/PiSSA-Llama-2-7b-commonsense-148k • 7B • Updated • 4
fxmeng/PiSSA-llama-13b-commonsense-148k • 13B • Updated • 6
fxmeng/CLOVER-llama-3-8b-commonsense-148k • 8B • Updated • 5
fxmeng/CLOVER-llama-2-7b-commonsense-148k • 7B • Updated • 6
fxmeng/CLOVER-llama-13b-commonsense-148k • 13B • Updated • 6
fxmeng/CLOVER-llama-7b-commonsense-148k • 7B • Updated • 10
Datasets (12)
fxmeng/transmla_pretrain_100m_tokens • Viewer • Updated • 100k • 21
fxmeng/transmla_pretrain_1B_tokens • Viewer • Updated • 1.14M • 146
fxmeng/transmla_pretrain_6B_tokens • Viewer • Updated • 5.94M • 2.14k
fxmeng/pissa-dataset • Viewer • Updated • 844k • 1.94k • 3
fxmeng/big-bench-hard-continue-finetuning • Viewer • Updated • 10.3k • 74 • 1
fxmeng/commonsense_filtered • Viewer • Updated • 170k • 111 • 1
fxmeng/MetaMath-GSM240K • Viewer • Updated • 240k • 24 • 1
fxmeng/MetaMath-MATH155K • Viewer • Updated • 155k • 46
fxmeng/CodeFeedback-Python105K • Viewer • Updated • 105k • 157 • 6
fxmeng/llava_finetune_336x336 • Preview • Updated • 25