
Oxidosqualene cyclases (OSC) from sequence similarity network mining

Source: Figshare. Updated 2025-03-26; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Oxidosqualene_cyclases_OSC_from_sequence_similarity_network_mining/26826439
Description:
Contact: Jakob Franke, jakob.franke@botanik.uni-hannover.de

Description of files:

1 - OSC search script
Contains all files for searching OSC sequences from the 1,000 plant transcriptome (1KP) data (https://www.nature.com/articles/s41586-019-1693-2 and https://gigascience.biomedcentral.com/articles/10.1186/2047-217X-3-17).

1 - OSC search script / 230418_osc_mining_blast100_hmmer30.R
Script for extracting OSC sequences from 1KP data.
Important: All sequences to be screened must be provided in subfolder "sequences" as amino acid sequence fasta files. For 1KP data, there should be 1455 files named XXXX-translated-protein.fa. Assembled 1KP data is available here: https://sites.google.com/a/ualberta.ca/onekp and https://drive.google.com/drive/folders/175nB8kf1UQushuEzv7UaJLPNNwdOrxh5

1 - OSC search script / input / 1kp sample list.xlsx
Originally provided sample list from 1KP: https://sites.google.com/a/ualberta.ca/onekp/home-page

1 - OSC search script / input / reference_OSCs_Chen_NPR.fasta
Amino acid sequences of 170 reference OSCs reported by Chen et al. in this paper: https://pubs.rsc.org/en/content/articlelanding/2021/np/d1np00015b

1 - OSC search script / output / 1kp_sample_list_with_OSCs_blast100_hmmer30.xlsx
Extended version of 1kp sample list.xlsx generated by the R script. Includes the numbers of OSCs found by different search strategies.

1 - OSC search script / output / OSCs_blast100_hmmer30.xlsx
Table with names and further data for all OSC hits found by the R script.

1 - OSC search script / output / OSCs_full_length_blast100_hmmer30.fa
Fasta file with amino acid sequences of all full length OSCs found by the R script (full length is defined in the R script as full_length_cutoff; here 700 AA).

1 - OSC search script / output / OSCs_total_blast100_hmmer30.fa
Fasta file with amino acid sequences of all OSCs found by the R script, not just full length. Includes many partial sequences which are most likely artefacts.

2 - SSN analysis
Contains all files for analysing OSC sequences with sequence similarity networks.

2 - SSN analysis / 123887_240429_OSCs_references_e360_repnode-1.00_ssn.xgmml
Output from EFI-EST website generated with "Option C - Fasta" with fasta file 230425_OSCs+references.fa, an E value of 360, and representative node network at 100% ID: https://efi.igb.illinois.edu/efi-est/
Analogous xgmml files for E 350 and 370 (shown in SI) are provided.

2 - SSN analysis / 230425_OSCs+references.fa
Input fasta sequence file for EFI-EST website: https://efi.igb.illinois.edu/efi-est/
Combines reference sequences from reference_OSCs_Chen_NPR.fasta and found OSC sequences from OSCs_full_length_blast100_hmmer30.fa (see above).

2 - SSN analysis / 240429_full_length_OSCs_blast100_hmmer30+references_shared_names.xlsx
Adjusted Cytoscape node table for visual annotation of SSN.

2 - SSN analysis / 240430 SSN network.cys
Cytoscape (3.10.2) session containing networks, styles, and adjusted node table used in the manuscript. Visualisation using the yFiles Organic Layout (version 1.1.4).
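The description distinguishes full length OSCs (full_length_cutoff, here 700 AA) from the total set, which also contains partial sequences. The following is a minimal Python sketch of such a length filter, not the dataset's R script. The helper names parse_fasta and full_length_only, the example headers, and the inclusive comparison (length >= cutoff) are assumptions made for illustration; the R script may treat the boundary differently.

```python
from typing import Iterable, Iterator, List, Tuple

# Value taken from the dataset description (full_length_cutoff, 700 AA).
FULL_LENGTH_CUTOFF = 700


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def full_length_only(
    records: Iterable[Tuple[str, str]], cutoff: int = FULL_LENGTH_CUTOFF
) -> List[Tuple[str, str]]:
    """Keep records whose sequence has at least `cutoff` residues.

    Assumption: the comparison is inclusive (>=).
    """
    return [(h, s) for h, s in records if len(s) >= cutoff]


# Hypothetical two-record example: one partial hit, one full length hit
# whose sequence is wrapped over two lines.
example = (
    ">hit_partial\n" + "M" * 120 + "\n"
    ">hit_full\n" + "M" * 400 + "\n" + "A" * 350 + "\n"
)
kept = full_length_only(parse_fasta(example))
print([header for header, _ in kept])
```

Applied to the contents of OSCs_total_blast100_hmmer30.fa, a filter of this kind would be expected to yield a set comparable to OSCs_full_length_blast100_hmmer30.fa, provided the cutoff is applied the same way as in the R script.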
Created: 2025-03-26