
Analysis of the contents of Equipment.data.ac.uk

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10396716
Description:
This data sample was downloaded from https://equipment.data.ac.uk/institutions for data analysis in my bachelor's thesis "Equipment.data.ac.uk – The Linked Open Catalogue of Scientific Equipment" (in Czech: "Otevřený propojený katalog vědeckých přístrojů Equipment.data.ac.uk"), submitted in 2023 at the Institute of Information Studies and Librarianship, Charles University, Prague, Czech Republic. The practical part of the thesis (written in Czech) analyses the data quality of the Equipment.data.ac.uk catalog in terms of compliance with the UNIQUIP standard when data are entered; the attached data and a PHP script are used for this analysis.

The ed_sample_2023-12-12.zip archive contains the data sample mentioned above and the complete folder structure needed by the PHP script "import.php".

The root folder of the archive contains "import.php", the PHP script used for data import and analysis, and "style.css", the CSS stylesheet for the HTML output generated by the script.

The "data_c" folder contains the downloaded data files, in CSV or JSON format, used in the analysis.

The "defs" folder contains the parametrisation for the import.php script:

"uniquip.csv": the definition of the UNIQUIP standard from https://equipment.data.ac.uk/uniquip, with the following columns:
  column heading [varchar]: name of the data field
  code [varchar]: short code assigned by the thesis author to simplify the array index names
  gorf [integer]: group of required fields; each group is assigned a unique number

"kitcat.csv": the conversion table from the KitCat standard to the UNIQUIP standard, with the following columns:
  column heading [varchar]: name of the data field
  code [varchar]: short code assigned by the thesis author to simplify the array index names
  gorf [integer]: group of required fields; each group is assigned a unique number
  kitcat [varchar]: source field in the KitCat standard that is converted to the UNIQUIP field described by the previous three columns

The "institutions" folder contains the file "20231212.csv", the table of institutions from https://equipment.data.ac.uk/institutions, with these columns:
  instituce [varchar]: name of the institution
  ror [varchar]: ROR identifier of the institution
  záznamy [integer]: count of records stated in the table (used to check that the script imported the data correctly)
  typ [varchar]: data standard ("Uniquip" or "Kitcat"); this controls the import method and the file extension
  ignorovat [char]: "t" if the import script should ignore this institution for some reason
  poznámka [varchar]: reason for ignoring the institution (in Czech)

When the import.php script is executed, it produces an HTML report with the following four tables, one per test:
  VO1: Summary of non-empty required field groups. Tests whether each required group of fields is present in the records. The required field groups are defined in the "gorf" column of "uniquip.csv".
  VO2: Content validity summary. Tests whether the content of selected fields is valid for its context (email, URL). The PHP "FILTER_VALIDATE..." filters are used.
  VO3: Summary of taxonomy usage. Tests whether records use a taxonomy in the "Technique" field.
  VO4: Summary of types of items. Tests whether the "Type" field contains "equipment", "facility", or another, unsupported value.

Additional tables follow:
  VO2-IVL: Listing of invalid values detected by the VO2 test.
  INS: The institutions table, with the record counts stated in the table and the counts of records imported by the script.

For convenience, the complete HTML output of the script is included as "output.html" in the root directory of the archive.

Thesis citation: FLOHR, Martin. Otevřený propojený katalog vědeckých přístrojů Equipment.data.ac.uk. Praha, 2023. Bakalářská práce. Univerzita Karlova, Filozofická fakulta, Ústav informačních studií a knihovnictví. Vedoucí bakalářské práce Dr. Jan Dvořák.
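
To make the VO1 and VO4 tests described above more concrete, here is a minimal Python sketch of the kind of check they describe. It is not the author's import.php and does not reproduce its logic. Several things are assumptions made only for illustration: the sample rows and field codes in DEFS_CSV are made up, the header names are taken from the column list above, an empty "gorf" is treated as marking an optional field, a group is treated as present when at least one of its fields is non-empty, and the "Type" comparison is case-insensitive.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt in the shape of "uniquip.csv"; rows and codes are made up.
DEFS_CSV = """column heading,code,gorf
Type,type,1
Name,name,2
Contact Name,cname,3
Contact Email,cemail,3
Technique,tech,
"""


def load_groups(defs_text):
    """Map each gorf number to the list of field codes that belong to it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(defs_text)):
        gorf = (row["gorf"] or "").strip()
        if gorf:  # assumption: an empty gorf marks an optional field
            groups[int(gorf)].append(row["code"])
    return dict(groups)


def missing_groups(record, groups):
    """Return the gorf numbers for which every field of the group is empty.

    Assumption: one non-empty field is enough for a group to count as present.
    """
    return sorted(
        gorf
        for gorf, codes in groups.items()
        if not any((record.get(code) or "").strip() for code in codes)
    )


def classify_type(value):
    """VO4-style bucket: 'equipment', 'facility', or 'other'.

    Assumption: the comparison ignores case and surrounding whitespace.
    """
    normalized = (value or "").strip().lower()
    return normalized if normalized in ("equipment", "facility") else "other"


GROUPS = load_groups(DEFS_CSV)

# A made-up record keyed by field code: the name is empty, one contact field is filled.
record = {"type": "Equipment", "name": "", "cemail": "lab@example.org"}

print(missing_groups(record, GROUPS))  # group 2 (name) is the only one missing
print(classify_type(record["type"]))
```

Keying the groups by their "gorf" number keeps the check independent of how many fields a group contains, so a single-field group and a multi-field group are handled by the same code path.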
Created:
2023-12-21