Collections

Collections including paper arxiv:1905.07830

Benchmarks and Evals

Awesome Collection of Benchmarks and Evaluation Papers

Measuring Massive Multitask Language Understanding

Paper • 2009.03300 • Published Sep 7, 2020 • 3
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Paper • 2406.01574 • Published Jun 3, 2024 • 51
GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Paper • 2311.12022 • Published Nov 20, 2023 • 33
HellaSwag: Can a Machine Really Finish Your Sentence?

Paper • 1905.07830 • Published May 19, 2019 • 6

Papers - Reasoning - Commonsense

SocialIQA: Commonsense Reasoning about Social Interactions

Paper • 1904.09728 • Published Apr 22, 2019 • 4
PIQA: Reasoning about Physical Commonsense in Natural Language

Paper • 1911.11641 • Published Nov 26, 2019 • 5
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions

Paper • 1905.10044 • Published May 24, 2019 • 2
HellaSwag: Can a Machine Really Finish Your Sentence?

Paper • 1905.07830 • Published May 19, 2019 • 6

Papers - Institute - Allen Institute

The Curious Case of Neural Text Degeneration

Paper • 1904.09751 • Published Apr 22, 2019 • 3
PIQA: Reasoning about Physical Commonsense in Natural Language

Paper • 1911.11641 • Published Nov 26, 2019 • 5
SocialIQA: Commonsense Reasoning about Social Interactions

Paper • 1904.09728 • Published Apr 22, 2019 • 4
HellaSwag: Can a Machine Really Finish Your Sentence?

Paper • 1905.07830 • Published May 19, 2019 • 6

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022 • 1
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 24
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021 • 2
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 2

Papers - University - University of Washington

The Curious Case of Neural Text Degeneration

Paper • 1904.09751 • Published Apr 22, 2019 • 3
Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Paper • 2404.01197 • Published Apr 1, 2024 • 31
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions

Paper • 1905.10044 • Published May 24, 2019 • 2
PIQA: Reasoning about Physical Commonsense in Natural Language

Paper • 1911.11641 • Published Nov 26, 2019 • 5

gemma_knowledg_tree

Gemini: A Family of Highly Capable Multimodal Models

Paper • 2312.11805 • Published Dec 19, 2023 • 47
Measuring Massive Multitask Language Understanding

Paper • 2009.03300 • Published Sep 7, 2020 • 3
HellaSwag: Can a Machine Really Finish Your Sentence?

Paper • 1905.07830 • Published May 19, 2019 • 6
PIQA: Reasoning about Physical Commonsense in Natural Language

Paper • 1911.11641 • Published Nov 26, 2019 • 5
