Unique Ingredient Identifier
Source: Databricks · Indexed: 2024-05-09
Download link:
https://marketplace.databricks.com/details/003d9aad-aae3-4ccf-a0dd-a4a19f22e4b9/John-Snow-Labs_Unique-Ingredient-Identifier
Resource description:
**Overview**
This data package contains the details of substances in drugs, biologics, foods, and devices registered with a Unique Ingredient Identifier (UNII) through the joint FDA/USP Substance Registration System (SRS). It also contains a list of the names used for each UNII and the changes made to UNII descriptions up to the latest update.
**Description**
The Unique Ingredient Identifier (UNII) is a non-proprietary, free, unique, unambiguous, non-semantic, alphanumeric identifier based on a substance's molecular structure and/or descriptive information. The UNII is:
- One of the core components of the United States Federal Medication Terminology.
- Used in the FDA's Structured Product Labeling.
- Used to assist in the generation of the National Library of Medicine's (NLM's) RxNorm.
- A US government standard for drug ingredient and food allergen identifiers.
- A component of the Environmental Protection Agency's Substance Registry System (future).

The overall purpose of the joint FDA/USP Substance Registration System (SRS) is to support health information technology initiatives by generating unique ingredient identifiers (UNIIs) for substances in drugs, biologics, foods, and devices.
The procedures and management of the SRS are provided by the SRS Board, which includes experts from both FDA and USP. The SRS operating procedures defined by the SRS Board are detailed in the SRS Manual. The UNII is a core component of the US Federal Medication Terminology. It is used for product labeling, to assist in the generation of RxNorm, and as an identifier for drug ingredients and allergens, and in the future it will be a component of the Environmental Protection Agency's Substance Registry System. The UNII is useful for understanding data contained in NLM's Unified Medical Language System, the National Cancer Institute Enterprise Vocabulary Service, the FDA Data Standards Council website, the VA National Drug File Reference Terminology, the FDA Inactive Ingredient Query Application and, soon, the USP Dictionary of USAN and International Drug Names.
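
Because the UNII is described above as a non-semantic alphanumeric identifier, a simple shape check can catch obviously malformed values before they are used as keys. The sketch below is illustrative only: the 10-character length and the uppercase normalization are assumptions not stated in this description, and the function name `looks_like_unii` is hypothetical. It checks shape only and does not confirm that a code is actually registered in the SRS.

```python
import re

# Assumption (not stated in the description above): a UNII is written as
# exactly 10 characters drawn from uppercase letters and digits.
UNII_PATTERN = re.compile(r"[A-Z0-9]{10}")


def looks_like_unii(value: str) -> bool:
    """Return True if the value has the assumed UNII shape.

    Surrounding whitespace is ignored and letters are upper-cased first,
    so the check is about shape only, not about registration in the SRS.
    """
    return UNII_PATTERN.fullmatch(value.strip().upper()) is not None
```

A check like this would typically run once when loading identifiers from an external source, before joining them against the datasets listed further down.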
**Benefits**
- The overall purpose of the joint FDA/USP Substance Registration System (SRS) is to support health information technology initiatives by generating unique ingredient identifiers (UNIIs) for substances in drugs, biologics, foods, and devices.
**License Information**
The use of John Snow Labs datasets is free for personal and research purposes. For commercial use, please subscribe to the [Data Library](https://www.johnsnowlabs.com/marketplace/) on the John Snow Labs website. The subscription allows you to use all John Snow Labs datasets and data packages for commercial purposes.
**Included Datasets**
- [Unique Ingredient Identifier Changes](https://www.johnsnowlabs.com/marketplace/unique-ingredient-identifier-changes)
  - This dataset displays the changes made to UNII descriptions up to the latest update (2019). Content in this dataset is related to the Unique Ingredient Identifier Records dataset.
- [Unique Ingredient Identifier Names](https://www.johnsnowlabs.com/marketplace/unique-ingredient-identifier-names)
  - This dataset contains a list of the names used for each UNII (Unique Ingredient Identifier). The contents of this dataset are related to the UNII Records dataset.
- [Unique Ingredient Identifier Records](https://www.johnsnowlabs.com/marketplace/unique-ingredient-identifier-records)
  - This dataset contains the details of substances in drugs, biologics, foods, and devices registered with a Unique Ingredient Identifier (UNII) through the joint FDA/USP Substance Registration System (SRS).
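
Since the Names and Changes datasets are both described as related to the Records dataset, a typical first step is to attach every known name to its record using the UNII as the shared key. The sketch below shows one way to do that with the standard library. The column names (`unii`, `preferred_name`, `name`) and the sample rows are made-up placeholders, not the published schema or real UNII codes.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature extracts. Column names and values are placeholders,
# not the real schema and not real UNII codes.
RECORDS_CSV = """unii,preferred_name
AAAAAAAAA1,Example Substance A
BBBBBBBBB2,Example Substance B
"""

NAMES_CSV = """unii,name
AAAAAAAAA1,Example Substance A
AAAAAAAAA1,Substance A synonym
BBBBBBBBB2,Example Substance B
CCCCCCCCC3,Name without a matching record
"""


def names_by_unii(names_file):
    """Group every name in the Names extract under its UNII."""
    grouped = defaultdict(list)
    for row in csv.DictReader(names_file):
        grouped[row["unii"]].append(row["name"])
    return grouped


def attach_names(records_file, grouped):
    """Return each record with a 'names' list; records without names get []."""
    return [
        {**row, "names": grouped.get(row["unii"], [])}
        for row in csv.DictReader(records_file)
    ]


grouped = names_by_unii(io.StringIO(NAMES_CSV))
joined = attach_names(io.StringIO(RECORDS_CSV), grouped)
```

Driving the join from the Records extract means names whose UNII has no record are silently left out of `joined`; comparing the key sets of the two extracts would surface them if that matters.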
**Data Engineering Overview**
**We deliver high-quality data**
- Each dataset goes through 3 levels of quality review
  - 2 manual reviews are done by domain experts
  - Then an automated set of 60+ validations ensures that every datum matches the metadata and defined constraints
- Data is normalized into one unified type system
  - All dates, units, codes, and currencies share a consistent format
  - All null values are normalized to the same value
  - All dataset and field names are SQL and Hive compliant
- Data and Metadata
  - Data is available in both CSV and Apache Parquet format, optimized for high read performance on distributed Hadoop, Spark & MPP clusters
  - Metadata is provided in the open Frictionless Data standard, and every metadata field is normalized and validated
- Data Updates
  - Data updates support replace-on-update: outdated foreign keys are deprecated, not deleted
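
To make two of the normalization claims above concrete, here is a minimal sketch of what a consumer-side check might look like: one function tests whether a field name is a plain SQL/Hive-style identifier, and another maps assorted missing-value spellings to a single null. Both the identifier rule and the list of null spellings are assumptions chosen for illustration; the vendor's actual validation rules are not described here.

```python
import re

# Assumed rule for a "SQL and Hive compliant" name: starts with a letter,
# followed only by letters, digits, or underscores.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Assumed spellings of a missing value (compared case-insensitively).
NULL_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def is_compliant_name(name):
    """True if the name matches the assumed identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


def normalize_null(value, null_value=None):
    """Map any assumed missing-value spelling to a single null value."""
    if value is None or value.strip().lower() in NULL_TOKENS:
        return null_value
    return value


def normalize_row(row):
    """Reject non-compliant field names, then unify nulls in the values."""
    bad = [key for key in row if not is_compliant_name(key)]
    if bad:
        raise ValueError(f"non-compliant field names: {bad}")
    return {key: normalize_null(value) for key, value in row.items()}
```

For example, `normalize_row({"unii": "X", "inchikey": "N/A"})` keeps `"X"` and turns `"N/A"` into `None`, while a key such as `"Display Name"` raises a `ValueError`.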
**Our data is curated and enriched by domain experts**
Each dataset is manually curated by our team of doctors, pharmacists, public health & medical billing experts:
- Field names, descriptions, and normalized values are chosen by people who actually understand their meaning
- Healthcare & life science experts add categories, search keywords, descriptions and more to each dataset
- Both manual and automated data enrichment supported for clinical codes, providers, drugs, and geo-locations
- The data is always kept up to date – even when the source requires manual effort to get updates
- Support for data subscribers is provided directly by the domain experts who curated the datasets
- Every data source’s license is manually verified to allow for royalty-free commercial use and redistribution
**Need Help?**
If you have questions about our products, contact us at [info@johnsnowlabs.com](mailto:info@johnsnowlabs.com).
Provider:
John Snow Labs
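Collected summary
Dataset introduction

Background and challenges
Background overview
This dataset provides detailed information on Unique Ingredient Identifiers (UNIIs) registered through the FDA/USP Substance Registration System (SRS), covering substances in drugs, biologics, foods, and devices. The UNII is a non-proprietary, unique identifier based on molecular structure or descriptive information, used to support health information technology initiatives such as drug labeling and RxNorm generation. The dataset comprises three subsets (UNII Records, UNII Names, and UNII Changes) that have undergone expert review and quality validation to ensure accuracy and consistency.
The content above was collected and summarized by 遇见数据集.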



