
ruiyicheng/LenghuSky-8

Source: Hugging Face. Updated 2026-03-18. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/ruiyicheng/LenghuSky-8
Description:
---
license: apache-2.0
tags:
- climate
- astronomy
pretty_name: LenghuSky-8
size_categories:
- 10B<n<100B
---

# Data Overview

This repo contains the data of LenghuSky-8. The code for this project is available at https://github.com/ruiyicheng/LenghuSky-8 . The paper for this dataset is available at https://arxiv.org/abs/2603.16429 .

## bkg_mask

Contains the data related to background mask annotation.

bkg_mask/bkg_change.csv contains the start time (in yyyy-mm-dd-HH-MM-SS format) of each background change event in the data. "l" represents lower and "u" represents upper.

bkg_mask/bkg_binary_classification_merged.csv contains the time, class, and probability given by the binary classifier for all images captured after 2023-09-27-18-09-48, where 1 means the roof is in the upper part of the image and 0 means the roof is in the lower part of the image. Results are obtained by code/background_classify.

bkg_mask/masks/ contains all the JSON files annotated using labelme. Each JSON file corresponds to one frame where a background change happens.

bkg_mask/mask_mat/ contains the npy files of the background masks for each start time in bkg_change.csv.

bkg_mask/bkg_map.txt provides the mapping between each logits file and its background mask npy file in bkg_mask/mask_mat/.

## calibration

Contains the data related to astrometric calibration.

yyyy-mm-dd-HH-MM-SS_calibration.json contains the calibration polynomial coefficients for the images captured from yyyy-mm-dd-HH-MM-SS onward. Obtained by code/calibration/Jia25_ensemble.py + code/calibration/calibrate_and_save.py .

calibration_index.json contains the pointers to the calibration files for each image timestamp. These data are obtained by code/calibration/calibrate_and_save.py .

## images logits

Contains the images for cloud segmentation and the corresponding logits, which are obtained by a linear probe of DINOv3 local features. The source material is raw sample images captured by the cloud camera at different timestamps. These raw data will not be published because of their huge size (~5TB).
Images for cloud segmentation are cropped from the central part of the cloud camera frame. A [mean-1sigma, mean+3sigma] clip and a resize to 512*512 are performed on them, which produces image/. These data are obtained by code/preprocess/preprocess.py. The volume of this data is ~20GB.

logits contains the corresponding cloud segmentation logits for each sample image in image/. These data are obtained by code/inference_segmentation_dinov3/inference.py. The volume of this data is ~40GB, which is available in the GitHub repo.

## interrupt.csv

Contains the data related to interrupt events during data collection. It contains the [start, end) interval of each interrupt event in the data collection. Times are in yyyy-mm-dd-HH-MM-SS format. These data are obtained by code/preprocess/find_interrupt.py. Some interrupts end with a discontinuity in time, so statistics of interrupt duration can be slightly overestimated.
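
The files described above share the yyyy-mm-dd-HH-MM-SS timestamp convention, so a small helper may make them easier to work with. The following is a minimal sketch, not the project's own code: the function names (`parse_ts`, `interrupt_seconds`, `latest_start`), the `start`/`end` column names, and the sample rows are all assumptions made for illustration. It shows how the timestamps could be parsed, how durations of half-open [start, end) interrupt intervals could be computed, and how one might pick the latest start time that applies to a given frame. For real lookups, bkg_mask/bkg_map.txt and calibration_index.json are the mappings the dataset provides.

```python
import csv
import io
from bisect import bisect_right
from datetime import datetime

TS_FORMAT = "%Y-%m-%d-%H-%M-%S"  # yyyy-mm-dd-HH-MM-SS, as used in this dataset


def parse_ts(text: str) -> datetime:
    """Parse one yyyy-mm-dd-HH-MM-SS timestamp string."""
    return datetime.strptime(text.strip(), TS_FORMAT)


def interrupt_seconds(rows):
    """Return the duration in seconds of each half-open [start, end) interval."""
    return [(parse_ts(end) - parse_ts(start)).total_seconds() for start, end in rows]


def latest_start(starts, frame_ts):
    """Return the latest start time <= frame_ts, or None if none applies.

    `starts` must be sorted in ascending order.
    """
    i = bisect_right(starts, frame_ts)
    return starts[i - 1] if i > 0 else None


# Made-up sample rows; the column names of the real interrupt.csv are assumed.
sample_csv = io.StringIO(
    "start,end\n"
    "2024-01-01-00-00-00,2024-01-01-00-10-00\n"
    "2024-01-02-12-00-00,2024-01-02-12-00-30\n"
)
rows = [(r["start"], r["end"]) for r in csv.DictReader(sample_csv)]
durations = interrupt_seconds(rows)
print(durations)  # [600.0, 30.0]

# Made-up start times standing in for entries of bkg_change.csv.
starts = [parse_ts("2023-10-01-00-00-00"), parse_ts("2024-03-01-00-00-00")]
chosen = latest_start(starts, parse_ts("2024-01-15-08-30-00"))
print(chosen.strftime(TS_FORMAT))  # 2023-10-01-00-00-00
```

Because some interrupts end with a discontinuity in time, durations computed this way are upper bounds rather than exact values.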
Provider:
ruiyicheng