Early-Screening Detection Data for Leukemia-Related Cellular Carcinogenesis

Source: Zhejiang Data Intellectual Property Registration Platform (浙江省数据知识产权登记平台). Updated 2024-10-09; indexed 2024-10-10.
Download link:
https://www.zjip.org.cn/home/announce/trends/68622
Resource description:
This early-screening assay tests DNA fragments in a 15 mL blood sample for cancer-specific mutations, so that signals of cellular carcinogenesis can be identified early. Using a patented molecular capture technology together with high-throughput sequencing, it detects small DNA fragments in the blood that originate from cancerous cells of different tissues, and it covers more than 1,900 specific mutation indicators in key tumor suppressor genes. From these it calculates whether cancerous clonal cells (precursors of cancer cells) are present in the body, how many there are, and where they are located, which enables quantitative monitoring of the progression of normal cells toward cancer. The aim is to raise a warning before cells become cancerous, so that leukemia patients can be detected and treated early, reducing the high medical costs and the waste of medical resources associated with late-stage treatment.

The dataset is produced as follows:

1. Data preprocessing. Quality control: the raw FASTQ files are first quality-checked to assess sequence quality and ensure data accuracy; low-quality reads are filtered out and contaminating sequences are removed so that downstream analysis is reliable. Adapter trimming: adapter sequences and low-quality bases are removed so that only high-quality reads are used in the analysis.
2. Sequence alignment. Mapping to the reference genome: an alignment tool maps the quality-controlled reads to the reference genome.
3. Variant detection. Mutation site detection: a selected variant-calling tool is run on the aligned reads to detect single nucleotide polymorphisms (SNPs) and insertions and deletions (indels), and the variant type of each mutation site is annotated (for example, missense or synonymous).
4. Filtering of significant variant sites. Detected sites are filtered by variant allele frequency, sequencing depth, and statistical significance (for example, P-value). False-positive and low-confidence sites are removed and significant candidate mutation sites are retained.
5. Clinical information association. Site functional annotation: the significant variant sites are functionally annotated against the selected BED file to determine where they fall in the genome (for example, whether they lie in functionally important regions or at known pathogenic sites).
6. Cancer correlation analysis. Annotation information and clinical databases are used to analyze the association between mutation sites and different types of cancer.
7. Final output. The dataset contains leukemia-related information such as significant mutation sites, gene names, and somatic mutations (amino acid changes). Combined with clinical data, it provides a reference for early cancer screening.
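To make steps 4 and 5 more concrete, the following is a minimal Python sketch of threshold-based variant filtering followed by BED-interval annotation. It is not the provider's pipeline: the listing names neither the tools nor the cutoffs, so the `Variant` record, the threshold constants, the helper functions, and the toy coordinates and region names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-site record; field names are assumptions."""
    chrom: str
    pos: int         # 1-based position, as in VCF
    ref: str
    alt: str
    depth: int       # total reads covering the site
    alt_reads: int   # reads supporting the alternate allele
    p_value: float   # significance reported by the (unspecified) caller

    @property
    def vaf(self) -> float:
        """Variant allele frequency = alt reads / total depth."""
        return self.alt_reads / self.depth if self.depth else 0.0


# Illustrative thresholds only; the source does not state actual values.
MIN_DEPTH = 100
MIN_VAF = 0.01
MAX_P_VALUE = 0.05


def passes_filters(v: Variant) -> bool:
    """Step 4: keep sites with enough depth, frequency, and significance."""
    return v.depth >= MIN_DEPTH and v.vaf >= MIN_VAF and v.p_value <= MAX_P_VALUE


# (chrom, start, end, name); BED starts are 0-based and ends are exclusive.
BedInterval = Tuple[str, int, int, str]


def parse_bed(lines: Iterable[str]) -> List[BedInterval]:
    """Parse tab-separated BED lines, skipping blank and header lines."""
    intervals: List[BedInterval] = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else "."
        intervals.append((fields[0], int(fields[1]), int(fields[2]), name))
    return intervals


def annotate(v: Variant, intervals: List[BedInterval]) -> Optional[str]:
    """Step 5: return the name of the first BED region containing the site."""
    pos0 = v.pos - 1  # convert the 1-based position to BED's 0-based system
    for chrom, start, end, name in intervals:
        if chrom == v.chrom and start <= pos0 < end:
            return name
    return None


# Toy inputs: coordinates and region names are invented for the example.
bed_lines = ["chr1\t1000\t1100\tGENE_A_exon1", "chr2\t400\t600\tGENE_B_exon2"]
variants = [
    Variant("chr1", 1050, "C", "T", depth=800, alt_reads=24, p_value=0.001),
    Variant("chr1", 1060, "G", "A", depth=40, alt_reads=10, p_value=0.001),
    Variant("chr2", 500, "A", "G", depth=1000, alt_reads=5, p_value=0.2),
]

intervals = parse_bed(bed_lines)
kept = [v for v in variants if passes_filters(v)]
annotated = [(v.chrom, v.pos, annotate(v, intervals)) for v in kept]
print(annotated)
```

In this toy run only the first variant survives: the second fails the depth cutoff and the third fails both the frequency and P-value cutoffs. A linear scan over the intervals is adequate for a small panel; a larger BED file would normally call for an interval index.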
Provider:
嘉兴金弗康医学检验实验室有限公司
Created:
2024-09-13
Collected summary
Dataset introduction
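Features
This dataset contains early-screening detection data for leukemia-related cellular carcinogenesis. It comprises 3,559 records covering mutation information for multiple genes and proteins, intended for use in early cancer screening. The data are updated every six months and are generated with patented molecular capture technology and high-throughput sequencing.
The above content was collected and summarized by 遇见数据集.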