Leaderboard Spanish Language Benchmark for Artificial Intelligence Models (TELEIA)

Source: NIAID Data Ecosystem. Indexed: 2026-05-02

Download link:

https://zenodo.org/record/13643393


Description:

TELEIA Datasets Leaderboard

This dataset contains the answers of different LLMs to the TELEIA (Spanish Language Benchmark for Artificial Intelligence Models) dataset.

LLMs evaluated:
- Yi-6B-Chat
- Meta-Llama-3-8B-Instruct
- Llama-2-7b-chat-hf
- gemma-7b-it
- Mistral-7B-Instruct-v0.1
- occiglot-7b-es-en-instruct
- GPT3.5
- GPT4

Files:
- TELEIA_Cervantes_AVE_results.xlsx: vocabulary and grammatical structures, following the format of the Cervantes AVE exam
- TELEIA_PCE_results.xlsx: test on morphology and semantics in the style of the PCE exam, consisting of short questions or sentences to be completed
- TELEIA_SIELE_results.xlsx: different texts with questions related to them, based on the reading comprehension task of the SIELE exam

Each .xlsx file contains a sheet with the results of each model and the following columns:
- question: question from TELEIA
- option_a: possible answer from TELEIA
- option_b: possible answer from TELEIA
- option_c: possible answer from TELEIA
- option_d: possible answer from TELEIA
- correct_answer: correct answer from TELEIA
- llm_question: complete question posed to the LLM
- tokens_in: list of tokens that make up the question
- tokens_in_count: number of tokens in the question
- llm_answer: raw answer from the LLM
- llm_answer_filtered: answer from the LLM in the format {A,B,C,D}
- tokens_out: list of tokens that make up the raw answer
- tokens_out_count: number of tokens in the raw answer
- word_count: number of words in the raw answer
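The column layout above suggests a simple way to turn one results file into a per-model accuracy figure: compare llm_answer_filtered with correct_answer row by row. The following is a minimal sketch of that idea, not the authors' scoring procedure. It assumes that both columns hold a single letter, that case and surrounding whitespace can be ignored, and that each sheet name identifies one model. The helper names (normalize, accuracy, accuracy_by_model) are hypothetical and introduced only for illustration.

```python
from typing import Dict, Iterable, Mapping


def normalize(answer: object) -> str:
    """Upper-case an answer and strip whitespace (assumed to be a safe normalization)."""
    return str(answer).strip().upper()


def accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Fraction of rows where llm_answer_filtered matches correct_answer.

    Missing or unparsable answers (e.g. NaN from an empty cell) simply
    fail to match and therefore count as wrong.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(
        normalize(r["llm_answer_filtered"]) == normalize(r["correct_answer"])
        for r in rows
    )
    return hits / len(rows)


def accuracy_by_model(path: str) -> Dict[str, float]:
    """Score every sheet of one results file (assumption: one sheet per model).

    Needs pandas plus openpyxl to read .xlsx; imported lazily so the
    helpers above stay usable without them.
    """
    import pandas as pd

    # sheet_name=None loads all sheets as a dict: sheet name -> DataFrame
    sheets = pd.read_excel(path, sheet_name=None)
    return {name: accuracy(df.to_dict("records")) for name, df in sheets.items()}
```

With a downloaded copy of the file, a call such as accuracy_by_model("TELEIA_PCE_results.xlsx") would return a dictionary keyed by sheet name. If the files store correct_answer in another form (for example the full option text), the comparison in accuracy would need to be adapted.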

Created:

2024-09-03
