
Processed unique sorted BAM files of High Risk HPV types 16, 18, 45 and 68b transcripts

Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/fh47g3dp6t
Description:
Raw RNA-seq data from RNA extracted from five cervical cancer cell lines were mapped to the HPV16 (NC_001526.2 or NC_001526.4), HPV18 (NC_001357.1), HPV68b (FR751039.1) and HPV45 (X74479.1) reference sequences. The processed BAM files were further analysed using the Hisat2 v2.1.0 aligner, Cufflinks v2.2.1, Cuffmerge and Cuffdiff. The processed, sorted BAM files containing the viral transcript sequences are provided as ‘sort BAM files’. The HPV16, 18, 45 and 68b transcript data and their expression levels, reported as FPKM, can be visualized using IGV software. The numbers of total reads and viral reads after mapping to the viral references are given in the ‘HR-HPV genes and isoforms FPKM tracking file’; spliced transcripts are named CUFF. The integration sites of the four high-risk HPV types were analysed and are given in the ‘fusion file’.

According to IGV visualization, the splicing junctions of all four HPV types were found within the E6 and E1 regions. In the E6 region, one splice donor (SD) at the 5′ end was joined to different splice acceptor (SA) positions at the 3′ end, as follows. Three splicing junctions were found in the HPV16-positive cervical cancer cell lines CaSki and SiHa: SD226^SA409(E6*I), SD226^SA526(E6*II) and SD226^SA742(E6*X). Two splicing junctions were found in each of the other types: HPV18 (HeLa), SD233^SA416(E6*I) and SD233^SA635; HPV45 (MS751), SD230^SA412(E6*I) and SD230^SA640; and HPV68b (ME180), SD129^SA311(E6*I) and SD129^SA406.

Splicing junctions within the E1 region found in CaSki and SiHa were SD880^SA3358, SD880^SA3361, SD880^SA3391, SD880^SA1726, SD880^SA2405, SD880^SA2582(E1C), SD880^SA2709(E2*), SD880^SA3020, SD880^SA3078, SD880^SA3329, SD577^SA6810, SD898^SA1725, SD1302^SA2709, SD1302^SA3358(E2C), SD1760^SA3391, SD1263^SA3391 and SD2309^SA3461. The other forms were SD96^SA1063, SD226^SA2709 (E6*IV), SD226^SA3329, SD226^SA3358(E6*III), SD226^SA3361, SD226^SA3391 and SD579^SA6809.

Splicing junctions within the E1 region in HPV18 (HeLa) were SD929^SA2779, SD977^SA1836, SD1342^SA1436 and SD1987^SA2047, together with one splicing event within the E7 region, SD599^SA619. Those in HPV68b (ME180) were SD839^SA2586, SD683^SA2586 and SD839^SA2586. No E1 splicing junctions were found in HPV45 (MS751).
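The junction labels in the description follow a regular pattern (SD, donor position, a caret, SA, acceptor position, and an optional isoform name in parentheses), so they can be split into numeric coordinates for comparison with IGV or a junction table. The following is a minimal sketch in Python using only the standard library. The names `Junction`, `parse_junction` and `skipped_length` are hypothetical helpers, not part of the dataset or of any tool named above, and the length calculation assumes that the SD position is the last retained nucleotide before the splice and the SA position is the first retained nucleotide after it, which the description does not state explicitly.

```python
import re
from typing import NamedTuple, Optional


class Junction(NamedTuple):
    """One splice junction label split into its parts (hypothetical helper type)."""
    donor: int            # position given after "SD"
    acceptor: int         # position given after "SA"
    name: Optional[str]   # isoform name in parentheses, if any


# Matches labels such as "SD226^SA409(E6*I)", "SD226^SA2709 (E6*IV)" or "SD233^SA635".
_LABEL = re.compile(r"SD(\d+)\^SA(\d+)\s*(?:\(([^)]+)\))?")


def parse_junction(label: str) -> Junction:
    """Parse one junction label; raise ValueError if it does not fit the pattern."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a junction label: {label!r}")
    return Junction(int(match.group(1)), int(match.group(2)), match.group(3))


def skipped_length(junction: Junction) -> int:
    """Number of nucleotides between donor and acceptor.

    Assumption: donor is the last retained base and acceptor is the first
    retained base, so the removed segment runs from donor+1 to acceptor-1.
    """
    return junction.acceptor - junction.donor - 1


if __name__ == "__main__":
    # Labels copied from the HPV16 and HPV18 lists in the description.
    for text in ["SD226^SA409(E6*I)", "SD226^SA2709 (E6*IV)", "SD233^SA635"]:
        j = parse_junction(text)
        print(j.donor, j.acceptor, j.name, skipped_length(j))
```

Keeping the parser strict (it rejects anything that does not match the whole label) makes it easier to notice labels that were altered when the list was copied out of the description.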

Created: 2020-09-03