Data for paper: Transfer learning for cross-context prediction of protein expression from 5'UTR sequence

Indexed from NIAID Data Ecosystem on 2026-05-01

Download link:

https://zenodo.org/record/11081335

Description:

This deposit contains data for the paper entitled "Transfer learning for cross-context prediction of protein expression from 5'UTR sequence". The rebeca.zip file contains a snapshot of the rebeca package, which can be used to train, fine-tune and test the CONV-LSTM model used in this study. The datasets.zip file contains the compiled sequence-to-expression datasets from all Flow-seq experiments considered in this study. The analysis.zip file contains all data files and Jupyter notebooks necessary to reproduce our analysis. Each Flow-seq study has a dedicated folder (e.g., 'fepB') with two sub-folders: 1. The 'data_split' folder, which contains the steps necessary to split the Flow-seq data for our ML experiments (a 'readme.txt' file describes the input and output files, and a Jupyter notebook is available to reproduce the data split); 2. The 'data_analysis' folder, which contains a Jupyter notebook and the necessary input files to reproduce the analysis of our experiments.
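The folder layout described above can be checked after extraction with a short script. This is only a minimal sketch: it assumes analysis.zip has been unpacked locally and that the per-study folders (such as 'fepB') sit directly under the directory passed in, which the description does not confirm. The function name `find_study_folders` is hypothetical and is not part of the rebeca package.

```python
from pathlib import Path

# Sub-folder names taken from the deposit description.
EXPECTED_SUBFOLDERS = ("data_split", "data_analysis")


def find_study_folders(root):
    """Map each directory directly under `root` to the expected sub-folders it lacks.

    Assumption: study folders are direct children of `root`. An empty list
    means both 'data_split' and 'data_analysis' are present.
    """
    report = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_dir():
            missing = [name for name in EXPECTED_SUBFOLDERS
                       if not (entry / name).is_dir()]
            report[entry.name] = missing
    return report
```

Calling `find_study_folders("analysis")` on the extracted directory (the path is an assumption) returns a dictionary keyed by folder name, so any study folder with an incomplete layout can be spotted before opening its notebooks.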

Created:

2024-04-28
