Mouse data for whole-embryo lineage reconstruction with linajea

Source: DataCite Commons. Updated 2024-05-13; indexed 2024-07-13.
Download link:
https://janelia.figshare.com/articles/dataset/Mouse_data_for_whole-embryo_lineage_reconstruction_with_linajea/24768798
Description:
This article provides access to the mouse dataset (140521) for "Automated reconstruction of whole-embryo cell lineages by learning from sparse annotations" (Malin-Mayor et al. 2023, DOI: https://doi.org/10.1038/s41587-022-01427-7). Here we provide the ground truth tracks used to train the deep learning model, the trained networks, and the predicted tracks. Additionally, we provide information on how to access the image data, although it is not uploaded here due to size. Related artifacts include the source code for experiments and methods.

Image Data

The image dataset in n5/zarr format (as used in Malin-Mayor et al. 2023) can be accessed at the following Dropbox link: https://www.dropbox.com/scl/fi/2mt7jxmtl80s3zf2byfyr/140521_mouse.tar.gz?rlkey=n5r311whn8ky4gdabybjdekcc&dl=0. This image dataset was originally published in "<i>In Toto</i> Imaging and Reconstruction of Post-Implantation Mouse Development at the Single-Cell Level" (McDole et al. 2018, DOI: https://doi.org/10.1016/j.cell.2018.09.031), and can also be accessed in the Image Data Resource (IDR) in .klb format, along with associated metadata, at https://idr.openmicroscopy.org/webclient/?show=project-502.

Ground Truth Tracks

Inside <code>gt_tracks.zip</code> there are several files containing different subsets of tracks. Each file has the following tab-separated columns: <code>time, z, y, x, cell_id, parent_id, track_id</code>.

<code>tracks.txt</code> is the main file. It contains the manual annotations used to train the model, in which individual cells are followed from the start to the end of the video. These tracks are sparse, but each cell included in <code>tracks.txt</code> had its whole lineage traced as completely as possible from the start to the end of the video.

<code>division_tracks.txt</code> is a different set of manually annotated tracks, where each track is around 5 frames long and is centered on a division.

<code>daughter_cells.txt</code> is a subset of <code>division_tracks.txt</code> containing only the cells directly after a division event. It was generated for convenient and efficient training of models in which divisions are oversampled.

<code>full_frame_divisions.txt</code> is a set of manually annotated division points (points right before the cell divides) that is as complete as possible for target time points 120, 240, and 360 and the adjacent time frames. This set was used for evaluation, not for model training.

Trained Models

<code>trained_networks.zip</code> includes all networks trained on the mouse dataset. The model we suggest using for best performance is described in <code>140521_mouse_simple_train_all_config.json</code>, and its weights are included in <code>train_net_checkpoint_400000.*</code>. This model was trained and validated on all available ground truth data, and as such is NOT the same as the models used to report results in the paper.

<code>supp_figure_2</code> includes the configs and models used to report results in Supplemental Figure 2a of the paper and in Figure 2a of the main text. We separated the data into train/validation/test splits on "early" (times 50-100), "middle" (times 225-275), and "late" (times 400-450). Each model has two time splits held out for validation and testing, and was therefore trained on the remaining split as well as on all time frames that are not in any of the splits. For the mouse, this resulted in 3 trained networks for the main ("setup11_simple") architecture.

<code>supp_figure_6b</code> contains the configs and trained models presented in the ablation study in Supplemental Figure 6B.

Predicted Tracks

<code>predicted_tracks.zip</code> contains both the TGMM baseline results and the results for the linajea method.

<code>tgmm/140521_shifted_TGMM.xml</code> contains the TGMM results provided to us by the authors of the TGMM method.

The linajea results are organized similarly to the trained models.

<code>mouse_all_results_071621.txt</code> contains the tracks predicted by the model trained on all ground truth tracks (<code>140521_mouse_simple_train_all</code>). Again, these are NOT the tracks evaluated in the paper, but they are likely to be the most correct, since the model that produced them was trained on the most data.

<code>supp_figure_2</code> contains the tracks used in Figure 2a of the main text and in Supplemental Figure 2a.

<code>supp_figure_6b</code> contains the tracks used in Supplemental Figure 6B (ablation study).
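
As an illustration of the track file layout and the time splits described above, the following Python sketch reads tab-separated rows with the columns <code>time, z, y, x, cell_id, parent_id, track_id</code> and labels each time point as "early", "middle", or "late". It is not taken from the linajea source code. The names <code>TrackPoint</code>, <code>read_tracks</code>, <code>time_split</code>, and <code>is_training_frame</code> are hypothetical, and several details are assumptions: that split ranges are inclusive at both ends, that ids are written as integers, and that any non-numeric first row is a header to be skipped. The sample rows are made up for the demonstration and do not come from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, Optional, Sequence, TextIO

# Time ranges quoted in the description. Treating both ends as
# inclusive is an assumption, not something the description states.
SPLITS = {"early": (50, 100), "middle": (225, 275), "late": (400, 450)}


@dataclass(frozen=True)
class TrackPoint:
    """One row of a track file (hypothetical container for this sketch)."""
    time: int
    z: float
    y: float
    x: float
    cell_id: int
    parent_id: int
    track_id: int


def read_tracks(handle: TextIO) -> Iterator[TrackPoint]:
    """Yield one TrackPoint per tab-separated row of `handle`."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 7:
            continue  # skip blank or malformed lines
        try:
            time = int(float(row[0]))
        except ValueError:
            continue  # assumed header row (non-numeric first field)
        z, y, x = (float(v) for v in row[1:4])
        cell_id, parent_id, track_id = (int(v) for v in row[4:7])
        yield TrackPoint(time, z, y, x, cell_id, parent_id, track_id)


def time_split(time: int) -> Optional[str]:
    """Return the split name containing `time`, or None if it is in no split."""
    for name, (start, end) in SPLITS.items():
        if start <= time <= end:
            return name
    return None


def is_training_frame(time: int, held_out: Sequence[str]) -> bool:
    """True if `time` is usable for training when `held_out` splits are reserved.

    This follows the description: a model is trained on the remaining
    split plus every time frame that is in none of the splits.
    """
    return time_split(time) not in held_out


# Made-up sample rows; a real file would be opened with open(path, newline="").
sample = (
    "60\t10.0\t20.5\t30.5\t1\t0\t1\n"
    "250\t10.5\t21.0\t31.0\t2\t1\t1\n"
    "300\t11.0\t21.5\t31.5\t3\t2\t1\n"
)
points = list(read_tracks(io.StringIO(sample)))
for p in points:
    usable = is_training_frame(p.time, held_out=("middle", "late"))
    print(p.time, p.cell_id, time_split(p.time), usable)
```

With "middle" and "late" held out, the sample rows at times 60 and 300 count as training frames while the row at time 250 does not. How the real files encode a cell without a parent is not stated in the description, so the sketch leaves <code>parent_id</code> uninterpreted.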
Provider:
Janelia Research Campus
Created:
2024-05-09