
Leaderboard Spanish Language Benchmark for Artificial Intelligence Models (TELEIA)

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13643393
Resource description:

TELEIA Datasets Leaderboard

This dataset contains the answers of different LLMs to the TELEIA (Spanish Language Benchmark for Artificial Intelligence Models) dataset.

LLMs evaluated:
- Yi-6B-Chat
- Meta-Llama-3-8B-Instruct
- Llama-2-7b-chat-hf
- gemma-7b-it
- Mistral-7B-Instruct-v0.1
- occiglot-7b-es-en-instruct
- GPT3.5
- GPT4

Files:
- TELEIA_Cervantes_AVE_results.xlsx: vocabulary and grammatical structures, following the format of the Cervantes AVE exam
- TELEIA_PCE_results.xlsx: test on morphology and semantics in the style of the PCE exam, consisting of short questions or sentences to be completed
- TELEIA_SIELE_results.xlsx: different texts with questions related to them, based on the reading comprehension task of the SIELE exam

Each .xlsx file contains one sheet per model with that model's results, and the following columns:
- question: question from TELEIA
- option_a: possible answer from TELEIA
- option_b: possible answer from TELEIA
- option_c: possible answer from TELEIA
- option_d: possible answer from TELEIA
- correct_answer: correct answer from TELEIA
- llm_question: complete question posed to the LLM
- tokens_in: list of tokens that make up the question
- tokens_in_count: number of tokens that make up the question
- llm_answer: raw answer from the LLM
- llm_answer_filtered: answer from the LLM reduced to the format {A,B,C,D}
- tokens_out: list of tokens that make up the raw answer
- tokens_out_count: number of tokens that make up the raw answer
- word_count: number of words that make up the raw answer
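The column layout above suggests a straightforward way to derive a per-model score: compare llm_answer_filtered with correct_answer on each sheet. The following is a minimal sketch, not the authors' scoring procedure. It assumes that both columns hold a single letter from {A,B,C,D} (the description does not state how correct_answer is encoded), that an empty or missing filtered answer counts as wrong, and that pandas and openpyxl are installed. The helper names normalize, accuracy, and leaderboard are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def normalize(answer: object) -> str:
    """Reduce an answer cell to a bare upper-case string, e.g. ' a ' -> 'A'."""
    return str(answer).strip().upper()


def accuracy(rows: Iterable[Mapping[str, object]]) -> float:
    """Fraction of rows whose llm_answer_filtered matches correct_answer.

    Assumption: both columns hold a letter; anything else (including a
    missing value) simply fails the comparison and counts as wrong.
    """
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(
        normalize(r["llm_answer_filtered"]) == normalize(r["correct_answer"])
        for r in rows
    )
    return hits / len(rows)


def leaderboard(path: str) -> Dict[str, float]:
    """Accuracy per sheet (one sheet per model) for one results workbook."""
    import pandas as pd  # imported lazily; .xlsx reading also needs openpyxl

    # sheet_name=None loads every sheet into a dict: sheet name -> DataFrame
    sheets = pd.read_excel(path, sheet_name=None)
    return {name: accuracy(df.to_dict("records")) for name, df in sheets.items()}
```

With a downloaded copy of one of the files, leaderboard("TELEIA_PCE_results.xlsx") would return a dictionary keyed by sheet name. Keeping accuracy independent of pandas makes it easy to check on a few hand-written rows before running it on the real workbooks.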
Created:
2024-09-03