LLM Research Repository

Mendeley Data · Updated 2024-05-28 · Indexed 2024-06-29

Download link:

https://zenodo.org/records/11244018


Description:

Overview

Welcome to the Large Language Models (LLM) Repository, a curated collection aimed at researchers, practitioners, and enthusiasts in the field of Natural Language Processing (NLP). This repository offers resources related to Large Language Models, including research papers, theses, tools, datasets, courses, open-source models, and benchmarks.

1. Research Papers

A compilation of seminal and cutting-edge research papers that shape the field of Large Language Models. This section includes:

- Foundational Papers: Groundbreaking papers that laid the framework for LLM research.
- Recent Advances: Latest research examining novel architectures, training techniques, and applications.
- Survey and Review Articles: Comprehensive surveys and reviews that aggregate findings and offer insightful analysis on various aspects of LLMs.

2. Theses

A collection of master's and doctoral theses that focus on various facets of LLMs, providing in-depth explorations of core concepts, novel methodologies, and empirical studies from all over the world.

3. Tools

A list of tools and libraries essential for working with Large Language Models. This section encompasses:

- Development Frameworks: Popular libraries and frameworks.
- Utilities: Tools for data preprocessing, model deployment, and inference acceleration.

4. Datasets

A collection of datasets tailored for training and evaluating Large Language Models. This section includes:

- Text Corpora: Large-scale text datasets from diverse domains such as news articles, books, and social media.
- Annotated Datasets: Datasets with human annotations for tasks such as named entity recognition, sentiment analysis, and machine translation.

5. Courses

A list of university courses related to Large Language Models.

6. Open Source Models

Access to state-of-the-art open-source Large Language Models, allowing you to leverage pre-trained models for various applications. This section includes:

- Model Repositories: Links to GitHub repositories and model zoos hosting popular LLMs such as GPT, BERT, T5, and their variants.
- Pre-trained Models: Ready-to-use models available through platforms like the Hugging Face Model Hub, including detailed usage instructions and licensing information.
- Customized Implementations: Specialized versions and fine-tuned models tailored for specific tasks or domains.

7. Benchmarks

A suite of benchmarks designed to evaluate the performance and robustness of Large Language Models. This section features:

- Standard Benchmarks: Widely-accepted benchmarks like GLUE, SuperGLUE, and the LAMBADA dataset.
- Challenge Sets: Curated datasets that test specific capabilities of models, such as commonsense reasoning, multilingual understanding, or adversarial robustness.

Created:

2024-05-24
