ismailcemsahin/job-titles-descriptions
Source: Hugging Face | Updated: 2026-04-22 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/ismailcemsahin/job-titles-descriptions
Description:
This dataset is a synthetic expansion of the gpriday/job-titles dataset, containing 65,248 structured job descriptions with a pre-computed FAISS index for semantic search. It is designed for NLP pipelines, career-tech applications, and Retrieval-Augmented Generation (RAG) systems. The generation methodology uses the Llama 3.1-8B model to synthesize descriptions for each job title from the base dataset. The dataset includes a train.parquet file (storing job titles and descriptions) and a job_index.faiss file (for semantic search). Use cases include semantic job search, career platform recommendations, RAG pipelines, and NLP research.
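The two files described above could be combined for semantic search roughly as follows. This is a minimal sketch, not the dataset author's code, and it rests on several assumptions: that both files sit at the root of the Hugging Face dataset repository, that row i of train.parquet corresponds to vector i in job_index.faiss, and that the query vector comes from the same embedding model used to build the index, which the description does not name. The function names load_resources, pair_hits, and search are hypothetical helpers introduced for illustration.

```python
def load_resources(repo_id="ismailcemsahin/job-titles-descriptions"):
    """Download train.parquet and job_index.faiss, then open both.

    Assumption: both files live at the root of the dataset repository.
    """
    # Third-party imports stay local so pair_hits() below has no dependencies.
    import faiss
    import pandas as pd
    from huggingface_hub import hf_hub_download

    parquet_path = hf_hub_download(
        repo_id=repo_id, filename="train.parquet", repo_type="dataset"
    )
    index_path = hf_hub_download(
        repo_id=repo_id, filename="job_index.faiss", repo_type="dataset"
    )
    return pd.read_parquet(parquet_path), faiss.read_index(index_path)


def pair_hits(distances, indices, records):
    """Pair one query's FAISS output with the records it points at.

    Assumption: records[i] describes the same job as vector i in the index.
    """
    hits = []
    for dist, idx in zip(distances, indices):
        if idx < 0:
            # FAISS fills missing neighbours with the label -1.
            continue
        hits.append((float(dist), records[int(idx)]))
    return hits


def search(index, records, query_vec, k=5):
    """Return up to k (distance, record) pairs for a single query vector.

    query_vec must be produced by the embedding model that built the index;
    that model is not stated in the description, so it is left to the caller.
    """
    import numpy as np

    # FAISS expects a float32 matrix of shape (n_queries, dimension).
    query = np.asarray(query_vec, dtype="float32").reshape(1, -1)
    if query.shape[1] != index.d:
        raise ValueError(
            f"query has dimension {query.shape[1]}, index expects {index.d}"
        )
    distances, indices = index.search(query, k)
    return pair_hits(distances[0], indices[0], records)
```

A typical call would load the data with load_resources(), convert the DataFrame to a list with df.to_dict("records"), and pass that list to search() along with an embedded query. No column names are hard-coded, since the description does not list them.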
Provided by:
ismailcemsahin



