Data from: Investigating the spatial, demographic, and genetic structures of Cylicodiscus gabunensis Harms, a light-demanding African timber species
Source: Mendeley Data. Updated: 2024-04-13. Indexed: 2024-06-29.
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.0zpc8674f
Description:
# Dataset for investigating the spatial, demographic, and genetic structures of *Cylicodiscus gabunensis* Harms, a light-demanding African timber species

[https://doi.org/10.5061/dryad.0zpc8674f](https://doi.org/10.5061/dryad.0zpc8674f)

The dataset comprises files with information on *Cylicodiscus gabunensis* individuals sampled in two 400 ha forest plots (sites A and B) and one 839 ha plot (site C). It also includes NMpi and SPAGeDi input data files for the three plots (sites A and C only for NMpi). The dataset can be used to examine diameter distribution, spatial patterns, spatial genetic structure, mating system, gene flow, and the determinants of reproductive success.

## Description of the data and file structure

Three types of data files are provided.

The first type consists of lists of the *Cylicodiscus gabunensis* individuals that were sampled. The first row of each file is the header. The columns contain, in order: the individual name (first column), longitude (x coordinate, second column), latitude (y coordinate, third column), diameter at breast height (dbh, fourth column), and canopy dominance status as Dawkins' crown illumination index (fifth column, site A only). These files can be used to create size-class maps for individuals categorized as saplings (dbh < 10 cm), juveniles (10 cm ≤ dbh < 20 cm), and trees (20 cm ≤ dbh < 200 cm, grouped in 10 cm intervals). They can also be used to generate histograms of the frequency distribution per dbh class and to calculate stand density. In addition, they allow the spatial distribution of trees (dbh ≥ 20 cm) to be characterized with the 'spatstat' R package (Baddeley et al., 2015) through the pair correlation function g (PCF), a distance-dependent correlation function related to the derivative of the widely used K-function (Ripley, 1976).

The second type consists of the NMpi input data files. The header of each file has the structure `np no nl nf`, where `np` is the number of parents, `no` is the number of progeny, `nl` is the number of loci (limited to nuclear genetic markers), and `nf` is the number of phenotypic characters. The subsequent lines contain individual data. Each parent or progeny line starts with 0 (indicating the generation), followed by the ID, the X and Y coordinates, the cytotype, the genotype, the phenotypic characters, and the femaleness index.

The third type consists of the SPAGeDi input data files, which follow a fixed format. The first line contains six format numbers: the number of individuals, the number of categories, the number of spatial coordinates, the number of loci, the number of digits used to code alleles, and the ploidy level. The second line defines the distance intervals, and the third line lists the column labels. From the fourth line onward, each line holds the data for one individual: name, category, spatial coordinates (either x/y coordinates or latitude and longitude), and the genotype at each locus. Each file concludes with the word "END". Optional lines for dominant markers or polyploid data may follow this structure.
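The size classes and stand density described for the individual lists can be sketched in Python as below. This is only an illustration under stated assumptions: the function names, the class labels, and the example dbh values are hypothetical and do not come from the dataset, and the 400 ha area is the plot size given above for sites A and B.

```python
from collections import Counter


def size_class(dbh_cm: float) -> str:
    """Return a size-class label for a diameter at breast height (cm).

    Classes follow the description: saplings (< 10 cm), juveniles
    (10 to < 20 cm), and trees from 20 to < 200 cm in 10 cm intervals.
    The label strings themselves are hypothetical.
    """
    if dbh_cm < 0:
        raise ValueError("dbh must be non-negative")
    if dbh_cm < 10:
        return "sapling (<10)"
    if dbh_cm < 20:
        return "juvenile (10-<20)"
    if dbh_cm < 200:
        lower = int(dbh_cm // 10) * 10  # lower bound of the 10 cm interval
        return f"tree ({lower}-<{lower + 10})"
    # Not one of the classes named in the description; kept as a catch-all.
    return "tree (>=200)"


def class_counts(dbh_values):
    """Count individuals per size class (input for a frequency histogram)."""
    return Counter(size_class(d) for d in dbh_values)


def stand_density(n_individuals: int, area_ha: float) -> float:
    """Stand density in individuals per hectare."""
    return n_individuals / area_ha


# Hypothetical dbh values (cm), not taken from the dataset.
example_dbh = [4.2, 12.0, 19.9, 20.0, 57.3, 199.9]
counts = class_counts(example_dbh)
n_trees = sum(1 for d in example_dbh if d >= 20)

for label, n in sorted(counts.items()):
    print(label, n)
print("trees per ha:", stand_density(n_trees, 400.0))
```

In practice the dbh values would be read from the fourth column of the individual lists; the file names and delimiter are not stated in the description, so that step is left out here.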
Created:
2023-11-05



