DASCH DR7 Digital Inventory
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14563520
Resource description:
These files define a "digital inventory" of all of the files archived as part of DASCH Data Release 7 (DR7). DASCH (Digital Access to a Sky Century @ Harvard) was the project to digitize the Harvard College Observatory’s Astronomical Photographic Glass Plate Collection for scientific applications. This irreplaceable resource provides a means for systematic study of the sky on 100-year time scales.
This inventory does not contain the actual DASCH data. Rather, it contains an exhaustive index of all of the DASCH data, covering virtually all aspects of DASCH's digital existence throughout the project's entire history, up through the DR7 release date (December 2024). The complete inventory documents 33,791,530 files totaling 745,627,062,858,355 bytes (around 678 TiB) of data. The inventory itself is about 10 GiB in size (decompressed), spread across 3,946 files.
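As a quick sanity check on the figures above, the byte count can be converted to binary units with a few lines of Python. This is only an illustrative sketch; the variable names are made up here, and the mean file size is a derived quantity that the inventory description does not itself state.

```python
# Totals quoted in the inventory description.
TOTAL_BYTES = 745_627_062_858_355
TOTAL_FILES = 33_791_530

# Binary (IEC) units: 1 MiB = 2**20 bytes, 1 TiB = 2**40 bytes.
MIB = 2**20
TIB = 2**40

total_tib = TOTAL_BYTES / TIB
mean_file_mib = TOTAL_BYTES / TOTAL_FILES / MIB

print(f"total size: {total_tib:.1f} TiB")
print(f"mean file size: {mean_file_mib:.1f} MiB")
```

The first value should agree with the "around 678 TiB" figure quoted in the description.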
The actual underlying data are currently archived in a set of Amazon AWS S3 buckets and on magnetic tapes held by Harvard College Observatory. Most DASCH users are encouraged to access DASCH data via the project's data access services; this inventory should only be of interest to those planning large-scale duplication of the DASCH data.
The DASCH archive, which is indexed by this inventory, includes:
- Full-plate "mosaic" FITS images of more than 428,000 plates, as well as photographs of the plates and their jackets
- Astrometric solution data for about 97% of the plates
- Photometric calibration data for about 89% of the plates
- Lightcurves for all sources extracted from the plates, matched to two separate reference catalogs:
  - 23,574,404,199 measurements calibrated to the APASS DR8 catalog
  - 27,966,413,880 measurements calibrated to the ATLAS-refcat2 catalog
- About 166,000 photographs of observing logbooks documenting the plates, and a selection of historical astronomer notebooks discussing them
- Derived products, generated from the above, needed to operate the DASCH data access services
- Raw "tile" data from two decades of DASCH scanning, as well as supporting calibration and telemetry files
- All of the source code behind the DASCH software systems, from scanning to pipeline processing to data access services to end-user analysis
- Logs relating to all modern DASCH pipeline processing, data management, and other operations tasks
- All available project documentation
- All other data files supporting DASCH operations
See the README.md file within the collection for more information about the structure and contents of this inventory. In summary, it organizes the DASCH data files into a virtual hierarchy of names. Associated with each name is a size (in bytes), an MD5 digest, and one or more "data URLs" recording locations where that file is archived as of DR7. Every file has a data URL indicating a location on Amazon's AWS S3 storage service; many files also have one or more copies on magnetic backup tapes held by Harvard College Observatory.
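Because each inventory entry records a size and an MD5 digest, a duplicated file can be checked against its entry after copying. The following is a minimal sketch, assuming the digest is available as a hexadecimal string (the exact encoding is documented in README.md and is not confirmed here). The function name file_matches_record and its parameters are hypothetical and are not part of any DASCH tool.

```python
import hashlib
import os


def file_matches_record(path: str, expected_size: int, expected_md5: str,
                        chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the given size and MD5 digest.

    Assumption: `expected_md5` is a hexadecimal digest string.
    """
    # Cheap check first: a size mismatch means there is no need to hash.
    if os.path.getsize(path) != expected_size:
        return False

    # Hash in fixed-size chunks so large files never sit fully in memory.
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)

    return digest.hexdigest() == expected_md5.strip().lower()
```

Comparing the size before hashing lets truncated copies be rejected without reading them, which matters at the scale of this archive.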
The inventory is expressed as a collection of plain-text (UTF-8) files using Markdown syntax. There is approximately one such file for each "folder" or "subtree" of the virtual name hierarchy. Each file contains a human-readable preamble describing the folder contents, an optional Markdown table listing any direct-descendant subfolders, and an optional Markdown table documenting any files contained directly within that folder. The intention is that it should be fairly straightforward both for humans to navigate these files and for software to process them. While most files are human-scale in size, the largest (Inventory.pipeline_astrometry.md) is about 280 MiB and contains about 1.5 million records.
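Since the largest inventory file runs to hundreds of MiB, software that reads these tables is probably best written to stream them line by line. The sketch below is one hedged way to do that for generic pipe-delimited Markdown tables. It assumes each table has a header row followed by a divider row such as "| --- | --- |", and it does not handle escaped pipe characters inside cells. The function names and the column names used in the test data are hypothetical; the real column layout is described in README.md.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def _split_row(line: str) -> List[str]:
    """Split one pipe-delimited table row into whitespace-stripped cells."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def _is_divider(cells: List[str]) -> bool:
    """True for the header/body divider row, e.g. | --- | ---: |."""
    return all(cell and set(cell) <= set("-:") for cell in cells)


def iter_table_records(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per table body row, keyed by that table's header cells.

    Works on any iterable of lines (such as an open file), so the whole
    document is never held in memory at once.
    """
    header: Optional[List[str]] = None
    candidate: Optional[List[str]] = None  # possible header awaiting a divider
    for raw in lines:
        line = raw.strip()
        if not line.startswith("|"):
            # Any non-table line ends the current table.
            header, candidate = None, None
            continue
        cells = _split_row(line)
        if header is None:
            if (candidate is not None and _is_divider(cells)
                    and len(cells) == len(candidate)):
                header, candidate = candidate, None
            else:
                candidate = cells
            continue
        if len(cells) == len(header):
            yield dict(zip(header, cells))
```

A caller would typically pass an open file object, for example `with open(path, encoding="utf-8") as f: for record in iter_table_records(f): ...`. Records from the subfolder table and the file table can then be told apart by their keys.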
As of the DR7 release, only some DASCH archive files are directly accessible by third parties. The Starglass website (https://starglass.cfa.harvard.edu/) makes many photographs and "mosaics" (full-plate FITS images) available, and the web APIs supporting this site and the DASCH data access services (see the DASCH site, https://dasch.cfa.harvard.edu/) provide access to additional resources. To duplicate other portions of the archive, you may need to contact Harvard College Observatory. It is hoped that over time, more and more of the DASCH archive will become available for direct download. It is also hoped that additional copies of the DASCH archive will be created and publicized; the best way to ensure the long-term preservation of this dataset is to duplicate it. A major goal of this inventory is to make such duplication tractable.
To the greatest extent possible, it is believed that all of the files documented as part of this archive can be duplicated free of legal encumbrances. Unless documented otherwise, the copyright owner of all copyrightable elements is the President and Fellows of Harvard College. Please see the DASCH website for the most up-to-date guidance regarding image credits and any legal topics relating to this dataset.
Acknowledgments
The DASCH scanning project was the work of literally hundreds of people over multiple decades. Out of the many people who have devoted their time and energy to the project, the essential contributions of a few deserve special recognition: Prof. Jonathan (Josh) Grindlay; Bob Simcoe; Edward Los; Lindsay Smith Zrull; and Alison Doane.
The DASCH project at Harvard is grateful for partial support from NSF grants AST-0407380, AST-0909073, and AST-1313370, which should be acknowledged in all papers making use of DASCH data.
We acknowledge the one-time gift of the Cornel and Cynthia K. Sarosdy Fund for DASCH, and thank Grzegorz Pojmanski of the ASAS project for providing some of the source code on which the DASCH scientific data access portal was based.
The ongoing AAVSO Photometric All-Sky Survey (APASS) has improved DASCH photometric calibration and is funded by the Robert Martin Ayers Sciences Fund.
This inventory and DASCH Data Release 7 were prepared by Peter K. G. Williams in December 2024.
Created:
2024-12-27



