
Chaos Game Representation (CGR) Images of SARS-CoV-2 Variants (Alpha, Beta, Delta, Gamma, and Omicron)

Mendeley Data | Updated: 2024-03-27 | Indexed: 2024-06-26
Download link:
https://data.mendeley.com/datasets/2x546shhwk
Description:
Currently available genome sequence classification methods are based on text or sequence alignment techniques. Our aim is to build an image-based genome sequence classifier using deep learning. In 1990, H. J. Jeffrey proposed the Chaos Game Representation (CGR), a method that converts long one-dimensional sequences into two-dimensional images. This dataset contains CGR images of genomic sequences of five SARS-CoV-2 variants: alpha, beta, delta, gamma, and omicron. The dataset is divided into three folders named train, test, and validate. Each folder contains five subfolders named alpha, beta, delta, gamma, and omicron. The "train" folder has 17,500 images (3,500 per subfolder), the "test" folder has 5,000 images (1,000 per class), and the "validate" folder has 2,500 images (500 per class). Genomic sequences of these SARS-CoV-2 variants were downloaded from the GISAID database and converted to CGR images using a Python script.
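The conversion script itself is not part of this listing, so the following is only a minimal sketch of how a CGR can be computed, not the authors' implementation. It assumes one common corner convention (A, C, G, T at the corners of the unit square), skips ambiguous symbols such as N, and bins the trajectory into a square grid of counts. The names `CORNERS`, `cgr_points`, and `cgr_grid`, and the default grid size of 64, are hypothetical choices made for illustration.

```python
# Assumed corner layout; the dataset's own script may use a different one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the chaos-game trajectory for a nucleotide sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # skip ambiguous or unknown symbols such as N
        # Move halfway from the current point toward the base's corner.
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


def cgr_grid(sequence, size=64):
    """Count trajectory points in each cell of a size x size grid."""
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(sequence):
        col = min(int(x * size), size - 1)
        row = min(int((1.0 - y) * size), size - 1)  # flip y so row 0 is the top
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    demo = cgr_grid("ACGTNACGT", size=8)
    print(sum(map(sum, demo)))  # number of valid bases plotted
```

The resulting grid of counts can be scaled to grayscale intensities and saved as an image with any imaging library. Image resolution and colour mapping are further choices that the description above does not specify.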
Created:
2024-01-23