Replication Data for: Protein Secondary Structure in Spider Silk Nanofibrils
Source: DataONE. Updated: 2022-06-04. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:3878682438949a5da98d9a7802c500c3ebc430f436410cb3c258a6d52a749696
Description:
This dataset contains the raw data, processed data, and code used for data processing in our manuscript, from our Fourier-transform infrared (FTIR) spectroscopy, nuclear magnetic resonance (NMR), Raman spectroscopy, and X-ray diffraction (XRD) experiments. The data and code for the fits of our unpolarized Raman spectra to polypeptide spectra are also included. The following explains the folder structure of the data provided in this dataset, which is also explained in the file ReadMe.txt. Browsing the data in "Tree" view is recommended.

Folder contents

Codes

Raman Data Processing: The MATLAB script file RamanDecomposition.m contains the code to decompose the sub-peaks across the different polarized Raman spectra (XX, XZ, ZX, ZZ, and YY), subject to a set of pre-determined restrictions. The helper functions used in RamanDecomposition.m are included in the Helpers folder. RamanDecomposition.pdf is a PDF printout of the MATLAB code and output.

P Value Simulation: 31_helix.ipynb and a_helix.ipynb are two Jupyter Notebook files that contain the intrinsic P value simulation for the 3₁-helix and alpha-helix structures. The simulation results were used to prepare Supplementary Table 4. Vector.py, Atom.py, Amino.py, and Helpers.py are Python files that contain the class definitions used in 31_helix.ipynb and a_helix.ipynb. See the comments in each file for more details.

FTIR

FTIR Raw Transmission.opj: This Origin data file contains the raw transmission data measured on a single silk strand and used for FTIR spectra analysis. FTIR Deconvoluted Oscillators.opj: This Origin data file was generated from the data contained in the previous file using W-VASE software from J. A. Woollam, Inc. FTIR Unpolarized MultiStrand Raw Transmission.opj: This Origin data file contains the raw transmission data measured on multiple silk strands.
The datasets contained in the first two files above were used to plot Figure 2a-b, the FTIR data points in Figure 4a, and Supplementary Figure 6. The datasets contained in the third file above were used to plot Supplementary Figure 3a.

NMR

Raw data files of the ¹³C MAS NMR spectra: ascii-spec_CP.txt (cross-polarized spectrum) and ascii-spec_DP.txt (direct-polarized spectrum). The data is in ASCII format (comma-separated values) with the following columns: data point number, intensity, frequency [Hz], frequency [ppm].

Polypeptide Spectrum Fits

MATLAB scripts (.m files) and Helpers: The MATLAB script files Raman_Fitting_Process_Part_1.m and Raman_Fitting_Process_Part_2.m contain step-by-step instructions for fitting our calculated unpolarized Raman spectrum with digitized model polypeptide Raman spectra. The Helper folder contains two helper functions used by these scripts. See the scripts for further instructions and information.

Data: The files aPA.csv, bPA.csv, GlyI.csv, and GlyII.csv contain the digitized Raman spectra of poly-alanine, beta-alanine, poly-glycine-I, and poly-glycine-II. Raman_Exp_Data.mat is a MATLAB data file that contains the processed, polarized Raman spectra obtained from our experiments. The variable freq holds the wavenumber information of each collected spectrum. The variables xx, yy, zz, xz, and zx hold the collected polarized Raman spectra. These variables are used to calculate the unpolarized Raman spectrum in Raman_Fitting_Process_Part_2.m. See the scripts for further instructions and information.

Raman

Raman Raw Data.mat: This MATLAB data file contains all the raw data used for Raman spectra analysis. All variables are of the MATLAB structure data type.
Each variable has two fields, Freq and Raw: Freq contains the wavenumber information of the measured spectra, and Raw contains 5 measured Raman signal strengths. The variables XX, XZ, ZX, ZZ, and YY were used for plotting and sub-peak analysis in Figure 2c-d, the Raman data points in Figure 4a, Figure 5b, Supplementary Figure 2, and Supplementary Figure 7. The variable WideRange was used to plot Supplementary Figure 3b and identify its peaks.

X-Ray

X-Ray.mat: This MATLAB data file contains the raw X-ray data used for the diffraction analysis in Supplementary Figure 5.
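This dataset contains the raw data, processed data, and data-processing code for the Fourier-transform infrared (FTIR) spectroscopy, nuclear magnetic resonance (NMR), Raman spectroscopy, and X-ray diffraction (XRD) experiments in our manuscript, together with the data and code for fitting the unpolarized Raman spectra to polypeptide spectra.
The folder structure of the dataset is described below. The same description is available in the file ReadMe.txt, and browsing the data in "Tree" view is recommended.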
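### Codes: Raman Data Processing
The MATLAB script file `RamanDecomposition.m` contains the code that decomposes the sub-peaks across the different polarized Raman spectra (XX, XZ, ZX, ZZ, and YY), subject to a set of pre-determined restrictions. The helper functions used in `RamanDecomposition.m` are stored in the Helpers folder. `RamanDecomposition.pdf` is a PDF printout of the MATLAB code and its output.
### P Value Simulation
`31_helix.ipynb` and `a_helix.ipynb`: These two Jupyter Notebook files contain the intrinsic P value simulation for the 3₁-helix and α-helix structures. The simulation results were used to prepare Supplementary Table 4. See the comments in the files for more details.
`Vector.py`, `Atom.py`, `Amino.py`, and `Helpers.py`: These Python files contain the class definitions used in `31_helix.ipynb` and `a_helix.ipynb`. See the comments in the files for more details.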
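#### FTIR
`FTIR Raw Transmission.opj`: This Origin data file contains the raw transmission data measured on a single silk strand and used for FTIR spectra analysis.
`FTIR Deconvoluted Oscillators.opj`: This Origin data file was generated from the data in the previous file using W-VASE software from J. A. Woollam, Inc.
`FTIR Unpolarized MultiStrand Raw Transmission.opj`: This Origin data file contains the raw transmission data measured on multiple silk strands.
The datasets in the first two files above were used to plot Figure 2a-b, the FTIR data points in Figure 4a, and Supplementary Figure 6. The datasets in the third file were used to plot Supplementary Figure 3a.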
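#### NMR
Raw data files of the ¹³C MAS NMR spectra:
`ascii-spec_CP.txt`: cross-polarized spectrum
`ascii-spec_DP.txt`: direct-polarized spectrum
The data is in ASCII format (comma-separated values) with the following columns, in order: data point number, intensity, frequency [Hz], frequency [ppm].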
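The four-column layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not part of the dataset: the names `NmrPoint` and `parse_nmr_ascii` are hypothetical, and the handling of non-numeric rows (skipping them) is an assumption, since the description does not say whether the files carry a header.

```python
import csv
import io
from typing import NamedTuple


class NmrPoint(NamedTuple):
    index: int         # data point number
    intensity: float   # signal intensity
    freq_hz: float     # frequency in Hz
    shift_ppm: float   # frequency in ppm


def parse_nmr_ascii(handle):
    """Parse rows of 'point number, intensity, Hz, ppm' into NmrPoint tuples.

    Assumption: rows that do not parse as four numbers (for example a
    header or comment line) are skipped rather than treated as errors.
    """
    points = []
    for row in csv.reader(handle):
        if len(row) != 4:
            continue
        try:
            index = int(float(row[0]))
            intensity, freq_hz, shift_ppm = (float(v) for v in row[1:])
        except ValueError:
            continue
        points.append(NmrPoint(index, intensity, freq_hz, shift_ppm))
    return points


# Synthetic rows for illustration only; these are not values from the dataset.
sample = io.StringIO("1,10.5,30000.0,200.0\n2,12.0,29985.0,199.9\n")
points = parse_nmr_ascii(sample)
strongest = max(points, key=lambda p: p.intensity)
print(f"{len(points)} points; strongest at {strongest.shift_ppm} ppm")
```

For the real files, the same function could be called on `open("ascii-spec_CP.txt", newline="")`, assuming the file sits in the working directory.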
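#### Polypeptide Spectrum Fits
MATLAB scripts (.m files) and helpers:
The MATLAB script files `Raman_Fitting_Process_Part_1.m` and `Raman_Fitting_Process_Part_2.m` contain step-by-step instructions for fitting our calculated unpolarized Raman spectrum with digitized model polypeptide Raman spectra. The Helper folder contains two helper functions used by these scripts. See the scripts for further instructions and information.
Data files:
`aPA.csv`, `bPA.csv`, `GlyI.csv`, `GlyII.csv`: These CSV files contain the digitized Raman spectra of poly-alanine, β-alanine, poly-glycine-I, and poly-glycine-II.
`Raman_Exp_Data.mat`: This MATLAB data file contains the processed, polarized Raman spectra obtained from our experiments. The variable `freq` holds the wavenumber information of each collected spectrum, and the variables `xx`, `yy`, `zz`, `xz`, and `zx` hold the collected polarized Raman spectra. These variables are used to calculate the unpolarized Raman spectrum in `Raman_Fitting_Process_Part_2.m`. See the scripts for further instructions and information.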
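The actual fitting procedure is defined by the two MATLAB scripts and is not reproduced here. As a rough illustration of what fitting a spectrum with a set of model spectra can look like, the sketch below computes ordinary least-squares weights for a weighted sum of reference spectra by solving the normal equations. Everything in it is an assumption: the function names are hypothetical, the fit is unconstrained (weights may come out negative), and all spectra are assumed to have already been resampled onto one common wavenumber grid.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_weights(target, references):
    """Weights w minimizing sum_i (target[i] - sum_j w[j] * references[j][i])**2."""
    n = len(references)
    # Normal equations: (R^T R) w = R^T t
    gram = [[sum(p * q for p, q in zip(references[i], references[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(references[i], target)) for i in range(n)]
    return solve_linear_system(gram, rhs)


# Synthetic spectra for illustration only; not taken from the CSV files.
ref_a = [1.0, 0.0, 2.0, 0.0]
ref_b = [0.0, 1.0, 0.0, 3.0]
target = [0.7 * a + 0.3 * b for a, b in zip(ref_a, ref_b)]
print([round(w, 6) for w in fit_weights(target, [ref_a, ref_b])])
```

Because digitized spectra rarely share a sampling grid with measured ones, an interpolation step would normally come first, and a physically meaningful decomposition would likely need non-negative weights, which this unconstrained sketch does not enforce.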
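#### Raman
`Raman Raw Data.mat`: This MATLAB data file contains all the raw data used for Raman spectra analysis. All variables are of the MATLAB structure data type, and each has two fields, `Freq` and `Raw`: `Freq` holds the wavenumber information of the measured spectra, and `Raw` holds 5 measured Raman signal strengths.
The variables `XX`, `XZ`, `ZX`, `ZZ`, and `YY` were used for plotting and sub-peak analysis in Figure 2c-d, the Raman data points in Figure 4a, Figure 5b, Supplementary Figure 2, and Supplementary Figure 7. The variable `WideRange` was used to plot Supplementary Figure 3b and identify its peaks.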
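To make the `Freq`/`Raw` layout concrete, here is a small Python stand-in for one of these structure variables. It is a sketch under stated assumptions: the class `RamanRecord` and its method are hypothetical, and treating the five signals in `Raw` as repeated measurements that can be averaged point by point is a guess, not something the description confirms.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class RamanRecord:
    """Python stand-in for one MATLAB struct variable (for example XX)."""
    freq: list  # wavenumbers, one per sample point
    raw: list   # measured traces, each the same length as freq

    def mean_trace(self):
        """Point-by-point mean over the traces in raw.

        Assumption: the traces are repeated measurements of one spectrum.
        """
        for trace in self.raw:
            if len(trace) != len(self.freq):
                raise ValueError("trace length does not match Freq")
        return [fmean(column) for column in zip(*self.raw)]


# Synthetic numbers for illustration only; not values from the dataset.
record = RamanRecord(
    freq=[1000.0, 1001.0, 1002.0],
    raw=[
        [10.0, 12.0, 11.0],
        [11.0, 13.0, 10.0],
        [9.0, 12.0, 12.0],
        [10.0, 11.0, 11.0],
        [10.0, 12.0, 11.0],
    ],
)
print(record.mean_trace())
```

When reading the real file from Python, `scipy.io.loadmat(path, squeeze_me=True, struct_as_record=False)` exposes struct fields as attributes for MAT-files older than version 7.3. Whether `Raw` stores its five signals as rows or as columns should be checked before averaging.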
#### X-Ray
`X-Ray.mat`: This MATLAB data file contains the raw X-ray data used for the diffraction analysis in Supplementary Figure 5.
Created:
2023-11-09



