Sangmin Bae

raymin0223

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 13 hours ago

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 30 days ago • 113

upvoted a paper 15 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9 • 129

upvoted a paper 25 days ago

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published Oct 29 • 45

upvoted an article 26 days ago

Article

Why Did MiniMax M2 End Up as a Full Attention Model?

Oct 30

upvoted a paper 29 days ago

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Paper • 2511.05664 • Published Nov 7 • 35

upvoted a paper about 1 month ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 220

upvoted 2 papers about 2 months ago

Dr.LLM: Dynamic Layer Routing in LLMs

Paper • 2510.12773 • Published Oct 14 • 31

Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models

Paper • 2510.11057 • Published Oct 13 • 30

commented on a paper about 2 months ago

Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models

Paper • 2510.11057 • Published Oct 13 • 30

upvoted 2 papers about 2 months ago

KORMo: Korean Open Reasoning Model for Everyone

Paper • 2510.09426 • Published Oct 10 • 82

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13 • 165

upvoted 3 papers 2 months ago

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 124

MemMamba: Rethinking Memory Patterns in State Space Model

Paper • 2510.03279 • Published Sep 28 • 72

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1 • 117

authored 6 papers 2 months ago

Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning

Paper • 2303.11101 • Published Mar 20, 2023 • 1

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding

Paper • 2310.05424 • Published Oct 9, 2023 • 1

Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA

Paper • 2410.20672 • Published Oct 28, 2024 • 6

Why In-Context Learning Transformers are Tabular Data Classifiers

Paper • 2405.13396 • Published May 22, 2024

Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models

Paper • 2410.10166 • Published Oct 14, 2024

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper • 2507.10524 • Published Jul 14 • 70
