Measuring hidden phenotype: quantifying the shape of barley seeds using the Euler characteristic transform
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.rxwdbrv93
Resource description:
Shape plays a fundamental role in biology. Traditional phenotypic analysis methods measure some features but fail to comprehensively measure the information embedded in shape. To extract, compare and analyse this information in a robust and concise way, we turn to topological data analysis (TDA), specifically the Euler characteristic transform. TDA measures shape comprehensively using mathematical representations based on algebraic topology features. To study its use, we compute both traditional and topological shape descriptors to quantify the morphology of 3121 barley seeds scanned with X-ray computed tomography (CT) technology at 127 μm resolution. The Euler characteristic transform measures shape by analysing topological features of an object at thresholds across a number of directional axes. A Kruskal–Wallis analysis of the information encoded by the topological signature reveals that the Euler characteristic transform successfully picks up the shape of the crease and bottom of the seeds. Moreover, while traditional shape descriptors can cluster the seeds based on their accession, topological shape descriptors can cluster them further based on their panicle. We then successfully train a support vector machine to classify 28 different accessions of barley based exclusively on the shape of their grains. We observe that combining traditional and topological descriptors classifies barley seeds better than using traditional descriptors alone. This improvement suggests that TDA is a powerful complement to traditional morphometrics, able to comprehensively describe a multitude of ‘hidden’ shape nuances which are otherwise not detected.
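The idea of analysing topological features "at thresholds across a number of directional axes" can be made concrete with a small sketch. The code below is not the authors' implementation; it is a minimal NumPy illustration that assumes a binary voxel image in which every foreground voxel is a vertex and adjacent voxels span edges, squares and cubes. The function names (cell_heights, euler_characteristic_curve, euler_characteristic_transform) and the choice of directions and thresholds are hypothetical.

```python
from itertools import combinations, product

import numpy as np


def cell_heights(height, axes):
    """Filtration value of every cell spanning `axes`: the max height of its corners.

    Background voxels carry height +inf, so any cell touching the
    background also ends up at +inf and is ignored later.
    """
    views = []
    for offsets in product((0, 1), repeat=len(axes)):
        index = [slice(None)] * height.ndim
        for axis, off in zip(axes, offsets):
            index[axis] = slice(1, None) if off else slice(None, -1)
        views.append(height[tuple(index)])
    return np.maximum.reduce(views)


def euler_characteristic_curve(binary, direction, thresholds):
    """Euler characteristic of the sublevel sets of a voxel object along one direction."""
    binary = np.asarray(binary, dtype=bool)
    direction = np.asarray(direction, dtype=float)
    direction = direction / np.linalg.norm(direction)
    thresholds = np.asarray(thresholds, dtype=float)

    # Height of each voxel = projection of its coordinates onto the direction.
    coords = np.indices(binary.shape).astype(float)
    height = np.tensordot(direction, coords, axes=1)
    height = np.where(binary, height, np.inf)

    curve = np.zeros(len(thresholds), dtype=np.int64)
    for k in range(binary.ndim + 1):            # k = dimension of the cells
        for axes in combinations(range(binary.ndim), k):
            h = cell_heights(height, axes).ravel()
            h = np.sort(h[np.isfinite(h)])
            # Number of k-cells that have appeared at or below each threshold.
            counts = np.searchsorted(h, thresholds, side="right")
            curve += (-1) ** k * counts          # alternating sum V - E + F - C
    return curve


def euler_characteristic_transform(binary, directions, thresholds):
    """Stack one Euler characteristic curve per direction into a 2D array."""
    return np.stack(
        [euler_characteristic_curve(binary, d, thresholds) for d in directions]
    )


# Toy example: a hollow 3x3x3 shell is sphere-like, so its final value is 2.
shell = np.ones((3, 3, 3), dtype=bool)
shell[1, 1, 1] = False
print(euler_characteristic_curve(shell, (1, 0, 0), [0, 1, 2]))
```

Concatenating the curves over many directions gives one fixed-length vector per seed, which is the kind of topological descriptor that can then be fed to a statistical test or a classifier.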
Methods
We selected 28 barley accessions with diverse spike morphologies and geographical origins for our analysis (Harlan and Martini 1929, 1936, 1940). In November of 2016, seeds from each accession were stratified at 4 °C on wet paper towels for a week, and germinated on the bench at room temperature. Four-day-old seedlings were transferred into pots in triplicate and arranged in a completely randomized design in a greenhouse. Day length was extended throughout the experiment using artificial lighting (minimum 16 h light / 8 h dark). After the plants reached maturity and dried, a single spike was collected from each replicate for scanning at Michigan State University.
The scans were produced using the North Star Imaging X3000 system and the included efX software, with 720 projections per scan and 3 frames averaged per projection. The data were acquired in continuous mode. The X-ray source was set to a voltage of 75 kV, a current of 100 μA, and a focal spot size of 7.5 μm. The 3D reconstruction of the spikes was computed with the efX-CT software, obtaining a final voxel size of 127 μm. The intensity values for all raw reconstructions were standardized as a first step to guarantee that the air and the barley material had the same density values across all scans. Next, the air and debris were thresholded out, and the awns were digitally removed.
Finally, the seed coat of the caryopses was digitally removed, leaving only the embryo and endosperm due to their high water content. We did not have enough resolution in the raw scans to clearly distinguish the endosperm from the embryo. Hereafter, we will refer to these embryo-endosperm unions simply as seeds. Due to the large volume of data, we used an in-house SciPy-based Python script to automate the image processing pipeline for all panicles and grains.
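The in-house script itself is not reproduced here, so the following is only a rough sketch of what the first steps described above (intensity standardization, thresholding out air and debris, isolating individual objects) might look like with scipy.ndimage. The linear rescaling scheme, the function names, and the parameters air_level, tissue_level, threshold and min_voxels are all assumptions made for illustration, not values from the study.

```python
import numpy as np
from scipy import ndimage


def standardize_intensity(volume, air_level, tissue_level):
    """Linearly rescale a scan so air maps to 0 and plant tissue to 1.

    `air_level` and `tissue_level` are hypothetical reference intensities
    that would have to be estimated for each raw reconstruction.
    """
    volume = np.asarray(volume, dtype=float)
    return (volume - air_level) / (tissue_level - air_level)


def extract_components(volume, threshold=0.5, min_voxels=10):
    """Threshold out air, drop small debris, return one cropped mask per object."""
    mask = volume > threshold
    labels, n_found = ndimage.label(mask)        # face-connected components
    sizes = np.bincount(labels.ravel(), minlength=n_found + 1)
    pieces = []
    for label_id, box in enumerate(ndimage.find_objects(labels), start=1):
        if sizes[label_id] >= min_voxels:        # anything smaller counts as debris
            pieces.append(labels[box] == label_id)
    return pieces


# Synthetic scan: air at intensity 100, two "seeds" and one debris voxel at 200.
scan = np.full((20, 20, 20), 100.0)
scan[2:5, 2:5, 2:5] = 200.0
scan[10:13, 10:13, 10:13] = 200.0
scan[18, 18, 18] = 200.0

normalized = standardize_intensity(scan, air_level=100.0, tissue_level=200.0)
seeds = extract_components(normalized, threshold=0.5, min_voxels=10)
print(len(seeds), [int(s.sum()) for s in seeds])
```

Rescaling every scan to the same reference range before thresholding is what allows a single threshold to be reused across all scans; removing awns and seed coats would need additional, data-specific steps that are not sketched here.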
Created:
2023-05-31



