

oceansweep

AI & ML interests

None yet

Recent Activity

liked a model about 20 hours ago

jordand/echo-tts-base

liked a model 2 days ago

meituan-longcat/LongCat-Image

upvoted a paper 3 days ago

In-Context Representation Hijacking


Organizations

None yet

upvoted 2 papers 3 days ago

In-Context Representation Hijacking

Paper • 2512.03771 • Published 4 days ago • 3

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 11 days ago • 106

upvoted a paper 4 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 5 days ago • 171

upvoted an article 6 days ago

Transformers v5: Simple model definitions powering the AI ecosystem

Article • 7 days ago • 224

upvoted a paper about 1 month ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5 • 124

upvoted a collection about 1 month ago

MDGA

Collection

Make Diffusion Great Again. The resource list for Super Data Learners, Quokka, and OpenMoE 2. • 16 items • Updated Nov 4 • 8

upvoted 5 papers about 2 months ago

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published Oct 6 • 113

UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG

Paper • 2510.03663 • Published Oct 4 • 15

upvoted 9 papers 2 months ago

Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs

Paper • 2509.22582 • Published Sep 26 • 10

Learning to Reason for Hallucination Span Detection

Paper • 2510.02173 • Published Oct 2 • 18

F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data

Paper • 2510.02294 • Published Oct 2 • 45

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

Paper • 2510.02286 • Published Oct 2 • 28

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Paper • 2509.22067 • Published Sep 26 • 27

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2 • 26

LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1 • 108

jina-reranker-v3: Last but Not Late Interaction for Document Reranking

Paper • 2509.25085 • Published Sep 29 • 6

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
