Sources of prey availability data alter interpretation of outputs from prey choice null networks

Source: NIAID Data Ecosystem (indexed 2026-05-01)

Download link:

https://zenodo.org/record/7908186


Description:

Spider surveys

Data collection was described previously by Cuff, Tercel, et al. (2022). This study pertains to a subset of those data, collected between 1st May and 9th July 2018 at 19 separate locations, for which paired sticky trap and vacuum sample data were collected (described below). Briefly, money spiders (Araneae: Linyphiidae) and wolf spiders (Araneae: Lycosidae) were visually located along transects in two adjacent barley fields at Burdons Farm, Wenvoe in South Wales (51°26'24.8"N, 3°16'17.9"W) and collected from webs and the ground. Transects were randomly distributed across the entire field. Along these transects, separate 4 m² quadrats, at least 10 m apart, were searched and all observed linyphiids and lycosids were collected. Spiders were placed in 100 % ethanol using an aspirator, with the meshing changed regularly to limit potential cross-contamination. Linyphiids occupying webs were prioritised for collection, but ground-active spiders were also collected. Spiders were taken to Cardiff University, transferred to fresh ethanol and stored at -80 °C in 100 % ethanol until DNA extraction. Extraction, amplification and sequencing of DNA, and bioinformatic analysis are described by Cuff, Tercel, et al. (2022) and Drake et al. (2022), and are also detailed below.

Extraction and high-throughput sequencing of spider gut DNA

Given their prevalence in field collections, dietary analysis was carried out for the linyphiid genera Erigone, Tenuiphantes, Bathyphantes and Microlinyphia (Araneae: Linyphiidae), and the lycosid genus Pardosa (Araneae: Lycosidae). Spiders were transferred to and washed in fresh 100 % ethanol to reduce external contaminants prior to identification via morphological key (Roberts, 1993). Abdomens were removed from spiders and again transferred to and washed in fresh 100 % ethanol.
DNA was extracted from the abdomens via Qiagen TissueLyser II and DNeasy Blood & Tissue Kit (Qiagen) as per the manufacturer's protocol, but with an extended lysis time of 12 hours to account for the complex and branched gut system in spider abdomens (Krehenwinkel et al., 2017).

For amplification of DNA, two primer pairs were used. BerenF-LuthienR (Cuff et al., 2021) amplified a broad range of invertebrates including spiders, and TelperionF-LaureR (Cuff et al., 2022) amplified a range of invertebrates but fewer spiders. Primers were labelled with unique 10 bp molecular identifier tags (MID-tags) so that each individual had a unique pairing of forward and reverse tags for identification of each spider post-sequencing. PCR reactions of 25 µl contained 12.5 µl Qiagen PCR Multiplex kit, 0.2 µM (2.5 µl of 2 µM) of each primer and 5 µl template DNA. Reactions were carried out in the same thermocycler, optimised via temperature gradient, with an initial 15 minutes at 95 °C, 35 cycles of 95 °C for 30 seconds, the primer-specific annealing temperature for 90 seconds and 72 °C for 90 seconds, followed by a final extension at 72 °C for 10 minutes. BerenF-LuthienR and TelperionF-LaureR used annealing temperatures of 52 °C and 42 °C, respectively.

Within each 96-well PCR plate, 12 negative controls (extraction and PCR), 2 blank controls and 2 positive controls were included (i.e. 80 samples per plate), based on Taberlet et al. (2018). Positive controls were mixtures of invertebrate DNA composed of non-native Asiatic species in four different proportions, and blanks were empty wells within each plate used to identify tag-jumping into unused MID-tag combinations. PCR negative controls were DNase-free water treated identically to DNA samples. A negative control was present for each MID-tag to identify any contamination of primers.
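As a quick check of the plate layout and primer arithmetic given above, the following Python sketch recomputes the number of sample wells per plate and the primer concentration in the final reaction. The helper name `final_concentration_um` is illustrative, and treating "2.5 µl of 2 µM" as a dilution into the 25 µl reaction is an interpretation of the text rather than a statement from the protocol.

```python
# Plate layout: wells left for samples after the controls listed in the text.
PLATE_WELLS = 96
controls = {"negative": 12, "blank": 2, "positive": 2}
samples_per_plate = PLATE_WELLS - sum(controls.values())


def final_concentration_um(stock_um, added_ul, reaction_ul):
    """Concentration (µM) after diluting a stock into the reaction volume."""
    return stock_um * added_ul / reaction_ul


# 2.5 µl of a 2 µM primer stock in a 25 µl reaction (values from the text).
primer_final_um = final_concentration_um(2.0, 2.5, 25.0)

print(samples_per_plate)   # 80
print(primer_final_um)     # 0.2
```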
All PCR products were visualised in a 2 % agarose gel with SYBRSafe (Thermo Fisher Scientific, Paisley, UK) and placed in categories based on their relative brightness. The concentration of these brightness categories was quantified via Qubit dsDNA High-sensitivity Assay Kits (Thermo Fisher Scientific, Waltham, MA, USA) with at least three representatives of each category per plate. The PCR products were then proportionally pooled according to these concentrations. Each pool was cleaned via SPRIselect beads (Beckman Coulter, Brea, USA), with a left-side size selection using a 1:1 ratio (retaining ~300-1000 bp fragments). The concentration of each cleaned pool was then determined via Qubit dsDNA High-sensitivity Assay Kits, and the pools were combined into one library per primer pair. Library preparation for Illumina sequencing was carried out on the cleaned libraries via NEXTflex Rapid DNA-Seq Kit (Bioo Scientific, Austin, USA) and samples were sequenced on an Illumina MiSeq via a V3 chip with 300-bp paired-end reads (expected capacity ≤25,000,000 reads). Bioinformatic analysis followed Drake et al. (2022).

Bioinformatic analysis

The Illumina run generated 11,165,405 and 10,959,010 reads for BerenF-LuthienR and TelperionF-LaureR, respectively, which were quality-checked and paired via FastP (Chen et al., 2018) to retain only sequences of at least 200 bp with a quality threshold of 33, resulting in 10,561,874 and 9,355,112 paired reads. The paired reads were demultiplexed and assigned to their respective spider sample according to their MID-tags via the “trim.seqs” command in Mothur v1.39.5 (Schloss et al., 2009), leaving 7,854,610 and 7,437,929 reads with exact matches to the primer and MID-tags. Replicates were removed, and denoising and clustering to zero-radius operational taxonomic units (ZOTUs; clustered without % identity to avoid multiple species represented within a single operational taxonomic unit (OTU)) were completed via Unoise3 in Usearch11 (Edgar, 2010).
The resultant sequences were assigned a taxonomic identity from GenBank via BLASTn v2.7.1 (Camacho et al., 2009) using a 97 % identity threshold (Alberdi et al., 2017). The BLAST output was analysed in MEGAN v6.15.2 (Huson et al., 2016). Where the top BLAST hit, determined by lowest e-value, was resolved at a higher taxonomic level than species-level, the results were checked; where possibly erroneous entries were preventing species-level assignment (e.g., poorly resolved identifications on GenBank), finer resolution was assigned based on the next-closest match. Where ZOTUs were assigned the same taxon, these were aggregated.

Data clean-up used the optimal minimum sequence copy thresholds identified by Drake et al. (2022). The maximum value for a ZOTU present in blank or negative controls was identified and subtracted from all read counts for that ZOTU to remove background contaminants. Simultaneously, known lab contaminants (e.g., German cockroach Blattella germanica), artefacts and errors of the sequencing process, unexpected reads in positive controls and positive control taxon reads in dietary samples were identified. These were calculated as a percentage of their respective sample’s read count and any read counts lower than the highest of these percentages for their respective sample were removed to eliminate additional instances of contamination. These thresholds were defined as 0.38 % and 0.39 % for BerenF-LuthienR and TelperionF-LaureR, respectively. The data from the two libraries (i.e., from each primer pair) were then aggregated together by sample and aggregated again by taxon. Non-target taxa (e.g., fungi) and instances in which predator DNA was amplified (i.e., ZOTUs with high read counts matching the individual’s morphological identity) were removed.
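The two clean-up steps described above (subtracting the maximum control read count per ZOTU, then dropping read counts below a percentage of the sample total) can be sketched in Python as follows. This is a minimal illustration, not the pipeline of Drake et al. (2022): the function name, the toy read counts and the choice to compute the sample total after the subtraction step are all assumptions.

```python
def clean_read_counts(samples, controls, threshold_pct):
    """Apply control subtraction and a per-sample percentage threshold.

    samples, controls: dict of sample name -> dict of ZOTU -> read count.
    threshold_pct: minimum share of a sample's reads (in %) a ZOTU must reach.
    """
    # Highest read count of each ZOTU in any blank or negative control.
    max_control = {}
    for counts in controls.values():
        for zotu, reads in counts.items():
            max_control[zotu] = max(max_control.get(zotu, 0), reads)

    cleaned = {}
    for sample, counts in samples.items():
        # Step 1: subtract background contamination, flooring at zero.
        adjusted = {z: max(r - max_control.get(z, 0), 0) for z, r in counts.items()}
        # Step 2: drop counts below the threshold share of the sample total
        # (assumption: the total is taken after subtraction).
        cutoff = sum(adjusted.values()) * threshold_pct / 100
        cleaned[sample] = {z: r for z, r in adjusted.items() if r > 0 and r >= cutoff}
    return cleaned


# Toy data; 0.38 % is the BerenF-LuthienR threshold quoted in the text.
samples = {"spider_1": {"Z1": 1000, "Z2": 3, "Z3": 40}}
controls = {"neg_1": {"Z3": 30}, "blank_1": {"Z1": 2, "Z3": 10}}
cleaned = clean_read_counts(samples, controls, 0.38)
print(cleaned)  # {'spider_1': {'Z1': 998, 'Z3': 10}}
```

In this toy case Z2 is removed because its 3 reads fall below 0.38 % of the 1,011 reads that remain after subtraction.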
The resultant sequencing read counts were converted into relative proportions (all values made to sum to one within each sample) and a mean value across the two primer pairs retained for each taxon within each sample. Relative read abundances were converted to presence-absence data of each detected prey taxon in each individual spider, but relative read abundance data were also retained for separate analyses to compare experimental outcomes between treatments.

Invertebrate surveys

To estimate prey availability using sticky traps, we placed one white dry 100 mm × 125 mm trap (Oecos) in the 4 m² quadrat centred at the position where the spider was captured. The trap was suspended with wire approximately 25 mm above the ground to catch falling, crawling and flying invertebrates, and left in place for 72 hours. Invertebrates were identified on the traps under a stereomicroscope. To estimate prey availability using suction sampling, ground and crop stems were sampled using a ‘G-vac’ for approximately 30 seconds at each location. The collected material was emptied into a bag, any organisms were immediately killed with ethyl acetate, and the material was frozen for storage before sorting into 70 % ethanol in the lab.

All invertebrates were identified to family level to match the resolution of the least resolved of the metabarcoding-derived trophic interaction data, and due to difficulties associated with identification to finer taxonomic resolution for many taxa. Exceptions included springtails of the superfamily Sminthuroidea (Sminthuridae and Bourletiellidae were often indistinguishable following suction sampling and preservation due to the fine features necessary to distinguish them), which were left at superfamily level; mites (many of which were immature or in poor condition), which were identified to order level; and wasps of the superfamily Ichneumonoidea, which were identified no further because wing venation was obscured by damage following suction sampling.
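For the dietary data described above, the conversion from read counts to relative read abundance and then to presence-absence can be sketched in Python as follows. The function names and family-level read counts are hypothetical, and two details are assumptions: proportions are calculated per primer pair before averaging, and a taxon missing from one primer pair counts as zero there.

```python
def relative_read_abundance(counts):
    """Scale a sample's read counts so that they sum to one."""
    total = sum(counts.values())
    return {taxon: reads / total for taxon, reads in counts.items()} if total else {}


def mean_across_primers(rra_a, rra_b):
    """Mean proportion per taxon across two primer pairs (absent taxa count as 0)."""
    taxa = sorted(set(rra_a) | set(rra_b))
    return {t: (rra_a.get(t, 0.0) + rra_b.get(t, 0.0)) / 2 for t in taxa}


def presence_absence(rra):
    """Convert proportions to 1 (detected) or 0 (not detected)."""
    return {taxon: int(value > 0) for taxon, value in rra.items()}


# Hypothetical read counts for one spider under each primer pair.
beren = {"Aphididae": 300, "Sciaridae": 100}
telperion = {"Aphididae": 50, "Entomobryidae": 150}

mean_rra = mean_across_primers(
    relative_read_abundance(beren), relative_read_abundance(telperion)
)
print(mean_rra)                    # {'Aphididae': 0.5, 'Entomobryidae': 0.375, 'Sciaridae': 0.125}
print(presence_absence(mean_rra))  # {'Aphididae': 1, 'Entomobryidae': 1, 'Sciaridae': 1}
```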
Statistical Analysis

All analyses were conducted in R v4.0.3 (R Core Team, 2021) and carried out on invertebrate data at the family or superfamily level. Alongside the dietary data derived from metabarcoding, and prey availability as determined directly by suction sampling (abundance) and sticky trapping (activity density), three additional datasets were generated, two of which were designed to combine data from the two trapping methods. The first approach simply set all invertebrate taxa detected in the field to have equal abundance, to provide a baseline against which to assess the effects of different prey abundance estimates. When generating the two combined datasets, it was apparent that simply adding them together would underrepresent one of the datasets, as abundance and activity density are measured in different units. Therefore, a ‘proportional combined’ dataset was generated by converting counts to relative proportions of each sample (to equally weight the two methods), which were then combined by summing proportions between the two methods for each sample, multiplied by the total count of individuals across both methods for each sample (to create realistic abundance values), and then rounded to the nearest integer (to return count data). In addition, a ‘frequency of occurrence (FOO) combined’ dataset was generated by converting counts to binary presence-absence values of each sample, which were then summed between the two methods for each sample.

To assess the diversity represented by the two sampling methods and their combinations, and the completeness of those datasets, coverage-based rarefaction and extrapolation were carried out, and Hill diversity calculated (Chao et al., 2014; Roswell et al., 2021) using the ‘iNEXT’ package with families represented by frequency-of-occurrence across samples (Chao et al., 2014; Hsieh et al., 2016).
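The two combined prey availability datasets described in this section can be sketched for a single sample in Python as follows. The sketch follows the written description step by step; the function names and counts are hypothetical, and it is not the authors' R code. Note that Python's built-in `round` rounds exact halves to the nearest even integer.

```python
def proportions(counts):
    """Relative proportion of each taxon within one method's sample."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


def proportional_combined(vacuum, sticky):
    """Sum per-method proportions, rescale by the total count, round to integers."""
    p_vac, p_sticky = proportions(vacuum), proportions(sticky)
    total = sum(vacuum.values()) + sum(sticky.values())
    taxa = sorted(set(vacuum) | set(sticky))
    return {t: round((p_vac.get(t, 0.0) + p_sticky.get(t, 0.0)) * total) for t in taxa}


def foo_combined(vacuum, sticky):
    """Sum binary presence-absence across the two methods (0, 1 or 2 per taxon)."""
    taxa = sorted(set(vacuum) | set(sticky))
    return {t: int(vacuum.get(t, 0) > 0) + int(sticky.get(t, 0) > 0) for t in taxa}


# Hypothetical counts from one location.
vacuum = {"Aphididae": 6, "Entomobryidae": 4}   # suction sample (abundance)
sticky = {"Aphididae": 2, "Sciaridae": 8}       # sticky trap (activity density)

print(proportional_combined(vacuum, sticky))  # {'Aphididae': 16, 'Entomobryidae': 8, 'Sciaridae': 16}
print(foo_combined(vacuum, sticky))           # {'Aphididae': 2, 'Entomobryidae': 1, 'Sciaridae': 1}
```

Read literally, the summed proportions total two whenever both methods caught something, so the rescaled counts in this sketch sum to about twice the number of individuals collected (40 versus 20 here).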
The remaining analyses were performed using both presence-absence and relative read abundance dietary data separately to show how differences in the treatment of the observed data are reflected in the outcomes of the analyses. Figures and outputs given in the main text relate to the presence-absence data, while relative read abundance figures and outputs are presented in the Supplementary Information.

Prey preferences of spiders were analysed using network-based null models in the ‘econullnetr’ package (Vaughan et al., 2018) with the ‘generate_null_net’ function. Econullnetr generates null models based on prey availability to predict how consumers would forage if foraging were based on the availability of resources alone. These null models are then compared against the observed interactions of consumers (e.g., interactions of spiders with their prey based on dietary metabarcoding) to ascertain the extent to which resource consumption deviated from random. In five separate null models, prey availability was represented separately by the datasets described above: abundance (suction sampling), activity density (sticky trapping), proportional combined, FOO combined and equal prey abundance. To compare effect sizes between null models for each resource taxon, mean prey preference standardised effect size (SES) values were calculated from the individual spiders per model. The SES values were plotted and joined between taxa to visualise paired differences using ‘ggplot2’ (Wickham, 2016). Null model-predicted trophic interactions were generated via an econullnetr null model with 999 simulations with outputs extended to allow the comparison of the null interactions for individual consumers (generate_null_net_indiv; Cuff, Kitson, et al., 2023).
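The general idea of an availability-based null model and its standardised effect size can be sketched in Python as follows. This is a simplified illustration and not the econullnetr implementation: the function name and data are hypothetical, each spider's interactions are redrawn with replacement in proportion to availability, and the SES is computed on interaction totals pooled across spiders rather than averaged over individuals.

```python
import random
import statistics


def null_model_ses(observed, availability, n_sim=999, seed=1):
    """SES per prey taxon: (observed - mean of null) / SD of null.

    observed: dict of consumer -> dict of taxon -> number of interactions.
    availability: dict of taxon -> availability estimate (any positive unit).
    """
    rng = random.Random(seed)
    taxa = sorted(availability)
    weights = [availability[t] for t in taxa]
    simulated = {t: [] for t in taxa}

    for _ in range(n_sim):
        totals = dict.fromkeys(taxa, 0)
        for diet in observed.values():
            # Redraw this consumer's interactions in proportion to availability.
            for taxon in rng.choices(taxa, weights=weights, k=sum(diet.values())):
                totals[taxon] += 1
        for t in taxa:
            simulated[t].append(totals[t])

    ses = {}
    for t in taxa:
        obs = sum(diet.get(t, 0) for diet in observed.values())
        sd = statistics.stdev(simulated[t])
        ses[t] = (obs - statistics.fmean(simulated[t])) / sd if sd > 0 else float("nan")
    return ses


# Hypothetical case: 20 spiders that all ate aphids although aphids were scarce.
observed = {f"spider_{i}": {"Aphididae": 1} for i in range(20)}
availability = {"Aphididae": 5, "Sciaridae": 95}
ses = null_model_ses(observed, availability)
print({taxon: round(value, 1) for taxon, value in ses.items()})
```

A positive SES indicates a taxon consumed more often than availability alone predicts, and a negative SES indicates one consumed less often. Swapping in a different `availability` dictionary (for example a combined dataset) is what changes the null expectation between models.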
A visualisation of the per-individual differences between null model and observed data was generated via non-metric multi-dimensional scaling (NMDS) using the ‘metaMDS’ function in the ‘vegan’ package (Oksanen et al., 2016) in two dimensions with 9999 simulations and Euclidean distance. Centroid coordinates for each null model and the observed data were extracted and pairwise distances calculated between model centroids.

The ‘observed’ network (i.e., the network determined solely by dietary data, not necessarily the objectively ‘true’ network) and each null network were visualised with the associated prey choice effect sizes as a bipartite network using ‘ggnetwork’ (Briatte, 2021; Wickham, 2016) via an ‘igraph’ object (Csardi & Nepusz, 2006). The degree of each prey node, weighted nestedness and linkage density were generated using the ‘bipartite’ package (Dormann et al., 2008) for each network and compared visually via ggplot2.
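The centroid step of the NMDS comparison can be sketched in Python as follows, assuming the two-dimensional ordination coordinates have already been exported per individual. The function names, group labels and coordinates are hypothetical; the ordination itself is not reproduced.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def centroids(points):
    """Mean (x, y) position of each group's individuals in ordination space."""
    return {
        group: (fmean([x for x, _ in xy]), fmean([y for _, y in xy]))
        for group, xy in points.items()
    }


def pairwise_centroid_distances(points):
    """Euclidean distance between every pair of group centroids."""
    c = centroids(points)
    return {(a, b): dist(c[a], c[b]) for a, b in combinations(sorted(c), 2)}


# Hypothetical NMDS coordinates for two individuals per group.
points = {
    "observed": [(0, 0), (2, 0)],
    "abundance": [(4, 3), (4, 5)],
    "activity_density": [(1, 0), (1, 8)],
}
distances = pairwise_centroid_distances(points)
print(distances[("abundance", "observed")])          # 5.0
print(distances[("activity_density", "observed")])   # 4.0
```

Under this reading, a smaller distance to the observed centroid means that null model's predicted diets sit closer to the observed diets in ordination space.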

Created: 2023-07-21
