Oxidosqualene cyclases (OSC) from sequence similarity network mining
Source: DataCite Commons. Updated: 2025-03-26. Indexed: 2025-04-20
Download link:
https://figshare.com/articles/dataset/Oxidosqualene_cyclases_OSC_from_sequence_similarity_network_mining/26826439/1
Resource description:
Contact: Jakob Franke, jakob.franke@botanik.uni-hannover.de

Description of files:

1 - OSC search script
Contains all files for searching OSC sequences from the 1,000 plant transcriptome (1KP) data (https://www.nature.com/articles/s41586-019-1693-2 and https://gigascience.biomedcentral.com/articles/10.1186/2047-217X-3-17)

1 - OSC search script / 230418_osc_mining_blast100_hmmer30.R
Script for extracting OSC sequences from 1KP data. Important: All sequences to be screened must be provided in subfolder "sequences" as amino acid sequence fasta files. For 1KP data, there should be 1455 files named XXXX-translated-protein.fa. Assembled 1KP data is available here: https://sites.google.com/a/ualberta.ca/onekp and https://drive.google.com/drive/folders/175nB8kf1UQushuEzv7UaJLPNNwdOrxh5

1 - OSC search script / input / 1kp sample list.xlsx
Originally provided sample list from 1KP: https://sites.google.com/a/ualberta.ca/onekp/home-page

1 - OSC search script / input / reference_OSCs_Chen_NPR.fasta
Amino acid sequences of 170 reference OSCs reported by Chen et al. in this paper: https://pubs.rsc.org/en/content/articlelanding/2021/np/d1np00015b

1 - OSC search script / output / 1kp_sample_list_with_OSCs_blast100_hmmer30.xlsx
Extended version of 1kp sample list.xlsx generated by the R script. Includes the numbers of OSCs found by different search strategies.

1 - OSC search script / output / OSCs_blast100_hmmer30.xlsx
Table with names and further data for all OSC hits found by the R script.

1 - OSC search script / output / OSCs_full_length_blast100_hmmer30.fa
Fasta file with amino acid sequences of all full length OSCs found by the R script (full length is defined in the R script as full_length_cutoff; here 700 AA)

1 - OSC search script / output / OSCs_total_blast100_hmmer30.fa
Fasta file with amino acid sequences of all OSCs found by the R script, not just full length. Includes many partial sequences which are most likely artefacts.

2 - SSN analysis
Contains all files for analysing OSC sequences with sequence similarity networks.

2 - SSN analysis / 123887_240429_OSCs_references_e360_repnode-1.00_ssn.xgmml
Output from EFI-EST website generated with "Option C - Fasta" with fasta file 230425_OSCs+references.fa, an E value of 360, and representative node network at 100% ID: https://efi.igb.illinois.edu/efi-est/
Analogous xgmml files for E 350 and 370 (shown in SI) are provided.

2 - SSN analysis / 230425_OSCs+references.fa
Input fasta sequence file for EFI-EST website: https://efi.igb.illinois.edu/efi-est/
Combines reference sequences from reference_OSCs_Chen_NPR.fasta and found OSC sequences from OSCs_full_length_blast100_hmmer30.fa (see above).

2 - SSN analysis / 240429_full_length_OSCs_blast100_hmmer30+references_shared_names.xlsx
Adjusted Cytoscape node table for visual annotation of SSN.

2 - SSN analysis / 240430 SSN network.cys
Cytoscape (3.10.2) session containing networks, styles, and adjusted node table used in the manuscript. Visualisation using the yFiles Organic Layout (version 1.1.4).
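
The input layout and the full-length criterion described above can be illustrated with a short Python sketch. This is not the author's R script, only a minimal illustration under stated assumptions: the function names are hypothetical, the glob pattern is inferred from the stated file naming (XXXX-translated-protein.fa), and treating "full length" as a length of at least 700 AA (rather than strictly more than 700) is an assumption, since the description gives only the cutoff value.

```python
from pathlib import Path
from typing import Iterator, List, Tuple

# Cutoff value stated in the dataset description (full_length_cutoff, 700 AA).
FULL_LENGTH_CUTOFF = 700
# Number of 1KP protein files the description says to expect.
EXPECTED_1KP_FILES = 1455


def list_input_files(folder: str = "sequences") -> List[Path]:
    """Return files in `folder` that match the XXXX-translated-protein.fa naming."""
    return sorted(Path(folder).glob("*-translated-protein.fa"))


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def full_length_only(
    records: List[Tuple[str, str]], cutoff: int = FULL_LENGTH_CUTOFF
) -> List[Tuple[str, str]]:
    """Keep records with at least `cutoff` residues (the >= is an assumption)."""
    return [(h, s) for h, s in records if len(s) >= cutoff]
```

A possible use is to compare `len(list_input_files())` with `EXPECTED_1KP_FILES` before a run, and to apply `full_length_only` to the records of OSCs_total_blast100_hmmer30.fa to see how many sequences pass the length cutoff.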
Provider:
figshare
Created:
2024-08-24



