Hiroto N. PRO

hironow

AI & ML interests

AI Agent, LLM, Audio, Animate

Recent Activity

liked a model 3 days ago

microsoft/VibeVoice-Realtime-0.5B

upvoted a paper 7 days ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

liked a Space 7 days ago

ResembleAI/Chatterbox-Multilingual-TTS

upvoted a paper 7 days ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8 • 10

upvoted a collection 13 days ago

ShieldGemma Release

Collection

A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated Jul 10 • 14

upvoted an article 13 days ago

Article

Introducing SynthID Text

Oct 23, 2024

upvoted a paper 15 days ago

Wan-Animate: Unified Character Animation and Replacement with Holistic Replication

Paper • 2509.14055 • Published Sep 17 • 16

upvoted 2 articles 17 days ago

Article

Training Flux Locally on Mac

Sep 12, 2024

Article

Make your ZeroGPU Spaces go brrr with ahead-of-time compilation

Sep 2

upvoted an article 2 months ago

Article

Nemotron-Personas-Japan: Synthesized Data for Sovereign AI

Sep 23

upvoted a collection 3 months ago

Mem-Agent

Collection

Small sized agents from Dria trained on interacting with an obsidian-like memory system using python tools. Trained on Qwen3-4B-Thinking-2507. • 4 items • Updated Sep 5 • 3

upvoted an article 3 months ago

Article

mem-agent: Persistent, Human Readable Memory Agent Trained with Online RL

Sep 11

upvoted 6 articles 4 months ago

Article

Introducing HELMET: Holistically Evaluating Long-context Language Models

Apr 16

Article

17 Reasons Why Gradio Isn't Just Another UI Library

Apr 16

Article

Tiny Agents in Python: a MCP-powered agent in ~70 lines of code

May 23 • 170

Article

ScreenSuite - The most comprehensive evaluation suite for GUI Agents!

Jun 6

Article

ScreenEnv: Deploy your full stack Desktop Agent

Jul 10

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

Aug 5 • 509

upvoted a paper 5 months ago

MemOS: A Memory OS for AI System

Paper • 2507.03724 • Published Jul 4 • 157

upvoted an article 7 months ago

Article

How to Build an MCP Server with Gradio

Apr 30 • 200

upvoted an article 8 months ago

Article

FastRTC: The Real-Time Communication Library for Python

Feb 25 • 172

upvoted 2 papers 10 months ago

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 72

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 90
