dynamicPDB_complete
Source: ModelScope community. Updated: 2026-05-12. Indexed: 2025-08-16.
Download link:
https://modelscope.cn/datasets/fudan-generative-vision/dynamicPDB_complete
Description:
## Data Usage
1. Make sure you have Git LFS installed:
```shell
sudo apt-get install git-lfs
# Initialize Git LFS
git lfs install
```
2. Navigate to your `DATA_ROOT` and clone the dataset repository:
```shell
GIT_LFS_SKIP_SMUDGE=1 git clone https://www.modelscope.cn/datasets/fudan-generative-vision/dynamicPDB_complete.git dynamicPDB_complete_raw
```
Setting `GIT_LFS_SKIP_SMUDGE=1` makes Git clone only the small pointer files for LFS-tracked files, not their contents, so the initial clone stays small.
3. Download the data for a specific `protein_id`, for example `1ab1_A` (replace `{protein_id}` in the commands below with the actual ID):
```shell
cd dynamicPDB_complete_raw
git lfs pull --include="{protein_id}/*"
```
4. Merge the split archive parts into a single file, then extract the resulting `.tar.gz` archive:
```shell
cat {protein_id}/{protein_id}.tar.gz.part* > {protein_id}/{protein_id}.tar.gz
cd ..  # back to DATA_ROOT, the directory that contains dynamicPDB_complete_raw
mkdir -p dynamicPDB_complete  # -p: no error if the directory already exists
tar -xvzf dynamicPDB_complete_raw/{protein_id}/{protein_id}.tar.gz -C dynamicPDB_complete
```
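Step 4 can be wrapped in a small shell function when several proteins need to be unpacked. The following is a minimal sketch, not part of the dataset's own tooling: `merge_and_extract` is a hypothetical helper name, and it assumes the part files sort correctly in glob (lexicographic) order. It lists the merged archive with `tar -tzf` before extracting, so an incomplete merge (for example, parts that are still LFS pointers) fails early.

```shell
# Sketch: merge the split archive parts for one protein and extract them.
# merge_and_extract is a hypothetical helper, not shipped with the dataset.
merge_and_extract() {
    local raw_dir="$1"      # path to dynamicPDB_complete_raw
    local protein_id="$2"   # e.g. 1ab1_A
    local out_dir="$3"      # e.g. dynamicPDB_complete
    local archive="${raw_dir}/${protein_id}/${protein_id}.tar.gz"

    # Concatenate the parts in glob (lexicographic) order.
    cat "${archive}".part* > "${archive}" || return 1

    # List the archive first so a corrupt or incomplete merge fails before extraction.
    tar -tzf "${archive}" > /dev/null || return 1

    mkdir -p "${out_dir}"
    tar -xzf "${archive}" -C "${out_dir}"
}
```

Called from `DATA_ROOT`, a typical invocation would be `merge_and_extract dynamicPDB_complete_raw 1ab1_A dynamicPDB_complete`.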
Finally, the dataset should be organized as follows:
```text
./dynamicPDB_complete/
|-- 1ab1_A_npt1000000.0_ts0.001
|   |-- 1ab1_A_npt_sim_data
|   |   |-- 1ab1_A_npt_sim_0.dat
|   |   `-- ...
|   |-- 1ab1_A.pdb
|   |-- 1ab1_A_minimized.pdb
|   |-- 1ab1_A_nvt_equi.dat
|   |-- 1ab1_A_npt_equi.dat
|   |-- 1ab1_A_T.dcd
|   |-- 1ab1_A_T.pkl
|   |-- 1ab1_A_F.pkl
|   |-- 1ab1_A_V.pkl
|   `-- 1ab1_A_state_npt1000000.0.xml
|-- 1d3y_B_npt1000000.0_ts0.001
|   |-- ...
|   `-- ...
`-- ...
```
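To confirm that an extracted directory matches the layout above, a quick check along these lines may help. This is a sketch under stated assumptions: `check_layout` is a hypothetical helper, the list of file suffixes is copied from the tree above, and the state `.xml` file is skipped because its name depends on the simulation parameters.

```shell
# Sketch: report entries from the documented layout that are missing
# in one simulation directory. check_layout is a hypothetical helper.
check_layout() {
    local sim_dir="$1"      # e.g. dynamicPDB_complete/1ab1_A_npt1000000.0_ts0.001
    local protein_id="$2"   # e.g. 1ab1_A
    local missing=0
    local suffix

    # File suffixes taken from the directory tree in this README.
    for suffix in .pdb _minimized.pdb _nvt_equi.dat _npt_equi.dat \
                  _T.dcd _T.pkl _F.pkl _V.pkl; do
        if [ ! -f "${sim_dir}/${protein_id}${suffix}" ]; then
            echo "missing: ${protein_id}${suffix}"
            missing=1
        fi
    done

    # The per-step simulation logs live in their own subdirectory.
    if [ ! -d "${sim_dir}/${protein_id}_npt_sim_data" ]; then
        echo "missing: ${protein_id}_npt_sim_data/"
        missing=1
    fi

    return "$missing"
}
```

The function prints one line per missing entry and returns a non-zero status if anything is absent, so it can be used in a loop over all extracted proteins.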
Provider:
maas
Created:
2025-07-29



