add dataset details
app.py
CHANGED
@@ -6,6 +6,26 @@ TITLE = """<h1 align="center" id="space-title">🇹🇭 Thai Sentence Embedding
 
 INTRODUCTION_TEXT = """
 The 🇹🇭 Thai Sentence Embedding Leaderboard aims to track, rank and evaluate open embedding models on Thai sentence embedding tasks. The evaluation source code is available at https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark. Feel free to submit your own score at https://huggingface.co/spaces/panuthept/thai_sentence_embedding_benchmark/discussions.
+## Dataset
+The evaluation is conducted on 8 datasets across 4 tasks:
+1. Semantic Textual Similarity (STS)
+    - Translated STS-B, contains 1,379 test samples, https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark
+2. Text Classification
+    - Wisesight, contains 2,671 test samples, https://huggingface.co/datasets/pythainlp/wisesight_sentiment
+    - Wongnai, contains 6,203 test samples, https://huggingface.co/datasets/Wongnai/wongnai_reviews
+    - Generated Review, contains 17,453 test samples, https://huggingface.co/datasets/airesearch/generated_reviews_enth
+3. Pair Classification
+    - XNLI (Thai only), contains 3,340 test samples, https://github.com/facebookresearch/XNLI
+4. Retrieval
+    - XQuAD (Thai only), contains 1,190 test samples, https://huggingface.co/datasets/google/xquad
+    - MIRACL (Thai only), contains 733 test samples, https://huggingface.co/datasets/miracl/miracl
+    - TyDiQA (Thai only), contains 763 test samples, https://huggingface.co/datasets/chompk/tydiqa-goldp-th
+## Metrics
+The evaluation metrics for each task are as follows:
+1. STS -> Spearman correlation
+2. Text Classification -> F1
+3. Pair Classification -> Average Precision
+4. Retrieval -> MRR@10
 ## Tagging
 🟢 Open sourced 🟦 API
 """
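The retrieval metric listed in the Metrics section is read here as mean reciprocal rank at cutoff 10 (MRR@10). That reading is an assumption, and the sketch below is not taken from the benchmark repository. The function name `mrr_at_k` and its argument layout are hypothetical, chosen only to illustrate how such a score could be computed from ranked document ids.

```python
from typing import Hashable, Sequence, Set


def mrr_at_k(
    ranked_ids: Sequence[Sequence[Hashable]],
    relevant_ids: Sequence[Set[Hashable]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank over all queries, truncated at rank k.

    ranked_ids[i]   -- document ids returned for query i, best first
    relevant_ids[i] -- set of ids judged relevant for query i
    (hypothetical input layout, assumed for illustration)
    """
    if len(ranked_ids) != len(relevant_ids):
        raise ValueError("one relevance set is required per query")
    if not ranked_ids:
        return 0.0

    total = 0.0
    for ranking, relevant in zip(ranked_ids, relevant_ids):
        # Only the first k results count; the first relevant hit sets the score.
        for rank, doc_id in enumerate(ranking[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
        # A query with no relevant hit in the top k contributes 0.
    return total / len(ranked_ids)
```

For example, with two queries where the first has its relevant document at rank 2 and the second has none in the top 10, the reciprocal ranks are 0.5 and 0, so the mean is 0.25.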