Jason Weston

spermwhale

AI & ML interests

None yet

Recent Activity

upvoted a paper 29 days ago

Scaling Agent Learning via Experience Synthesis

commented on a paper about 1 month ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Organizations

None yet

upvoted a paper 29 days ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published about 1 month ago • 80

commented on 2 papers about 1 month ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30 • 115

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30 • 115

upvoted a paper about 1 month ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28 • 15

commented on a paper about 1 month ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28 • 15

upvoted a paper about 2 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9 • 41

upvoted a paper 2 months ago

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published Sep 29 • 18

upvoted a paper 3 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16

commented on a paper 3 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16

upvoted a paper 3 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2 • 24

authored a paper 3 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2 • 24

upvoted a paper 3 months ago

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Paper • 2508.19229 • Published Aug 26 • 20

upvoted a paper 6 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2 • 10

commented on a paper 6 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2 • 10

authored a paper 7 months ago

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15 • 24

authored a paper 8 months ago

Multi-Token Attention

Paper • 2504.00927 • Published Apr 1 • 55

authored a paper 9 months ago

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Paper • 2503.15478 • Published Mar 19 • 13

authored a paper 11 months ago

Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18 • 15

authored 2 papers 12 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 90
