KOTA SASATANI PRO

sasa2000 · jagaimo010

AI & ML interests

None yet

Recent Activity

liked a model about 10 hours ago

mergekit-community/UltraLong-Thinking

liked a model about 11 hours ago

nightmedia/Qwen3-14B-Data-qx86-hi-mlx

Organizations

None yet

upvoted a collection about 12 hours ago

Mind Over Matter

Emergent behavior • 74 items • Updated 3 days ago • 1

upvoted a paper 1 day ago

GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

Paper • 2511.15705 • Published 21 days ago • 92

upvoted a collection 1 day ago

GeoVista

7 items • Updated 7 days ago • 1

upvoted an article 1 day ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 7 days ago • 433

upvoted a collection 3 days ago

DeepSeek-R1-ReDistill

Re-distilled DeepSeek R1 models • 4 items • Updated Jan 30 • 15

upvoted a collection 7 days ago

ReasonLite

4 items • Updated 6 days ago • 1

upvoted a collection 8 days ago

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with Model Optimizer. • 45 items • Updated 2 days ago • 63

upvoted an article 13 days ago

Norm-Preserving Biprojected Abliteration

Article • Nov 6 • 52

upvoted an article 15 days ago

Qianfan-VL: A Milestone Achievement in Chinese Multimodal AI with Domestic Chips

Article • Sep 24 • 9

upvoted a paper 15 days ago

Virtual Width Networks

Paper • 2511.11238 • Published 27 days ago • 35

upvoted a paper 26 days ago

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published 29 days ago • 194

upvoted a collection 30 days ago

Pre-training Dataset Samples

A collection of pre-training datasets samples of sizes 10M, 100M and 1B tokens. Ideal for use in quick experimentation and ablations. • 19 items • Updated 30 days ago • 15

upvoted a paper about 1 month ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6 • 208

upvoted a collection about 1 month ago

Apriel-H1

Introducing Apriel-H1 hybrids each blending Attention and Mamba State Space layers in varying proportions. • 8 items • Updated Nov 5 • 7

upvoted a paper about 1 month ago

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

Paper • 2510.27492 • Published Oct 30 • 81

upvoted a collection about 1 month ago

Cerebras REAP

Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 1 day ago • 52

upvoted a paper 3 months ago

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29 • 93

upvoted 2 collections 3 months ago

InternVL3.5

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 103

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 175

upvoted an article 4 months ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 733