KingfernJohn/kfj-pypi-packages-metadata
Source: Hugging Face | Updated: 2024-05-01 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/KingfernJohn/kfj-pypi-packages-metadata
Resource description:
---
license: apache-2.0
language:
- en
tags:
- PyPi
- package
- dataset
- Large
pretty_name: kfj pypi
---
# kfj-pypi Dataset
This dataset contains metadata for packages scraped from PyPI. The metadata for each package includes its name, version, description, author, license, and more. The dataset is intended for research and development in natural language processing (NLP) applications such as named entity recognition and text classification.
## Usage
To use this dataset, you can download it from [Hugging Face Datasets](https://huggingface.co/datasets/KingfernJohn/kfj-pypi-packages-metadata) using the `datasets` library in Python:
```python
from datasets import load_dataset
dataset = load_dataset("KingfernJohn/kfj-pypi-packages-metadata")
```
This will load the kfj-pypi dataset into a Python variable, which you can then use to access the metadata for each package.
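As a rough illustration of working with the loaded records, the sketch below counts the most common `license` values. It assumes each record behaves like a dictionary keyed by the field names listed under Structure. The helper `count_licenses` and the sample rows are hypothetical and exist only for this example, so it runs without downloading anything.

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple


def count_licenses(records: Iterable[Mapping], top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the top_n most common non-empty `license` values."""
    counts: Counter = Counter()
    for record in records:
        # Treat a missing or None license the same as an empty string.
        value = (record.get("license") or "").strip()
        if value:
            counts[value] += 1
    return counts.most_common(top_n)


# Hypothetical sample rows, not taken from the real dataset.
sample_records = [
    {"name": "pkg-a", "license": "MIT"},
    {"name": "pkg-b", "license": "Apache-2.0"},
    {"name": "pkg-c", "license": "MIT"},
    {"name": "pkg-d", "license": ""},
]

top_licenses = count_licenses(sample_records)
print(top_licenses)
```

With the real dataset, a call such as `count_licenses(dataset["train"])` may work, assuming the data is exposed under a `train` split. The dataset card does not state the split name, so check `print(dataset)` first.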
## Info
The dataset contains metadata for 161,346 packages, with a total size of 743 MB (304 MB as a .zip archive).
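For a sense of scale, the figures above work out to roughly 4.6 KB of metadata per package, with the zip archive at about 41% of the uncompressed size. The short calculation below reproduces that estimate. It assumes 1 MB means 10^6 bytes, which the dataset card does not specify.

```python
# Figures quoted in the dataset card.
PACKAGE_COUNT = 161_346
UNCOMPRESSED_MB = 743
ZIPPED_MB = 304

# Assumption: decimal megabytes (10^6 bytes); the card does not say which unit is meant.
BYTES_PER_MB = 1_000_000

avg_bytes_per_package = UNCOMPRESSED_MB * BYTES_PER_MB / PACKAGE_COUNT
compression_ratio = ZIPPED_MB / UNCOMPRESSED_MB

print(f"average metadata size: {avg_bytes_per_package:.0f} bytes per package")
print(f"zip size relative to uncompressed: {compression_ratio:.1%}")
```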
## Versions
- version 0.1
## Structure
```json
{
  "name": "",
  "version": "",
  "description": "",
  "author": "",
  "author_email": "",
  "maintainer": "",
  "maintainer_email": "",
  "license": "",
  "keywords": "",
  "classifiers": "",
  "download_url": "",
  "platform": "",
  "homepage": "",
  "project_urls": "",
  "requires_python": "",
  "requires_dist": "",
  "provides_dist": "",
  "obsoletes_dist": "",
  "summary": ""
}
```
Provider:
KingfernJohn
Summary of original information
kfj-pypi Dataset overview
Dataset contents
- Type: collection of PyPI package metadata
- Included information: package name, version, description, author, license, and more
- Purpose: natural language processing (NLP) research, such as named entity recognition and text classification
Dataset details
- Number of metadata records: 161,346 packages
- Total size: 743 MB (304 MB as a .zip archive)
Dataset structure
```json
{
  "name": "",
  "version": "",
  "description": "",
  "author": "",
  "author_email": "",
  "maintainer": "",
  "maintainer_email": "",
  "license": "",
  "keywords": "",
  "classifiers": "",
  "download_url": "",
  "platform": "",
  "homepage": "",
  "project_urls": "",
  "requires_python": "",
  "requires_dist": "",
  "provides_dist": "",
  "obsoletes_dist": "",
  "summary": ""
}
```
Usage
- Load the dataset with Python's `datasets` library:
```python
from datasets import load_dataset
dataset = load_dataset("KingfernJohn/kfj-pypi-packages-metadata")
```



