Yilun Zhao PRO

yilunzhao

AI & ML interests

None yet

Recent Activity

upvoted a paper 17 days ago

What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity

Paper • 2511.15593 • Published 18 days ago • 55

upvoted 4 papers about 1 month ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6 • 208

PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity

Paper • 2510.23603 • Published Oct 27 • 22

LimRank: Less is More for Reasoning-Intensive Information Reranking

Paper • 2510.23544 • Published Oct 27 • 8

E^2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker

Paper • 2510.22733 • Published Oct 26 • 31

upvoted 3 papers about 2 months ago

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published Oct 22 • 14

FinTrust: A Comprehensive Benchmark of Trustworthiness Evaluation in Finance Domain

Paper • 2510.15232 • Published Oct 17 • 5

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 105

published a dataset about 2 months ago

ya-ir/limrank-data

Viewer • Updated Oct 13 • 22k • 18

New activity in yale-nlp/SciArena about 2 months ago

adding citation

#4 opened about 2 months ago by

upvoted 4 papers about 2 months ago

Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research

Paper • 2510.06056 • Published Oct 7 • 5

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Paper • 2510.08559 • Published Oct 9 • 8

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Paper • 2510.06499 • Published Oct 7 • 31

MRMR: A Realistic and Expert-Level Multidisciplinary Benchmark for Reasoning-Intensive Multimodal Retrieval

Paper • 2510.09510 • Published Oct 10 • 7

updated a dataset about 2 months ago

ytbv/main

Viewer • Updated Oct 11 • 481 • 103

published 2 datasets about 2 months ago

yilunzhao/ytbv

Updated Oct 10 • 20

yilunzhao/test-repo

Updated Oct 10 • 14

upvoted 2 papers about 2 months ago

FinLFQA: Evaluating Attributed Text Generation of LLMs in Financial Long-Form Question Answering

Paper • 2510.06426 • Published Oct 7 • 2

PuzzlePlex: Benchmarking Foundation Models on Reasoning and Planning with Puzzles

Paper • 2510.06475 • Published Oct 7 • 1

published a dataset about 2 months ago

ytbv/main

Viewer • Updated Oct 11 • 481 • 103