Guanzhou Ke

guanzhouk

Guanzhou-Ke

AI & ML interests

Multi-modal learning

Recent Activity

upvoted a paper 5 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

upvoted a paper 5 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

upvoted a paper 8 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Organizations

None yet

upvoted 2 papers 5 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 7 days ago • 185

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published 16 days ago • 246

upvoted 3 papers 8 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published 12 days ago • 167

Can Agent Conquer Web? Exploring the Frontiers of ChatGPT Atlas Agent in Web Games

Paper • 2510.26298 • Published Oct 30 • 45

Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions

Paper • 2511.06876 • Published 29 days ago • 26

upvoted a paper 14 days ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published 19 days ago • 109

upvoted a paper 20 days ago

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7 • 49

upvoted 5 papers about 2 months ago

UniFusion: Vision-Language Model as Unified Encoder in Image Generation

Paper • 2510.12789 • Published Oct 14 • 18

Glyph: Scaling Context Windows via Visual-Text Compression

Paper • 2510.17800 • Published Oct 20 • 67

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20 • 69

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Training-Free Group Relative Policy Optimization

Paper • 2510.08191 • Published Oct 9 • 44

upvoted 3 papers 2 months ago

TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them

Paper • 2509.21117 • Published Sep 25 • 29

Tree Search for LLM Agent Reinforcement Learning

Paper • 2509.21240 • Published Sep 25 • 87

Video models are zero-shot learners and reasoners

Paper • 2509.20328 • Published Sep 24 • 98

upvoted 5 papers 3 months ago

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22 • 139

Virtual Agent Economies

Paper • 2509.10147 • Published Sep 12 • 26

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9 • 83

Reconstruction Alignment Improves Unified Multimodal Models

Paper • 2509.07295 • Published Sep 8 • 40

Interleaving Reasoning for Better Text-to-Image Generation

Paper • 2509.06945 • Published Sep 8 • 14