koba-jon/normal_distribution_dataset
Hugging Face | Updated 2025-12-02 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/koba-jon/normal_distribution_dataset
Description:
---
license: mit
language:
- en
size_categories:
- 100K<n<1M
---
# Normal Distribution Dataset
A one-dimensional shape dataset generated from normally distributed random numbers.
## 1. Usage
### (1) Original Dataset
#### Get
```
$ git clone https://huggingface.co/datasets/koba-jon/normal_distribution_dataset
$ cd normal_distribution_dataset/NormalDistribution
$ ls -l
```
#### Hierarchy
- train : training data (100,000 samples, 300 dimensions each)
```
train
|--0
|    |--00000.dat
|    |--00001.dat
|    |  ...
|    |--09999.dat
|--1
|    ...
|--9
     |--90000.dat
     |--90001.dat
     |  ...
     |--99999.dat
```
- test : test data (1,000 samples, 300 dimensions each)
```
test
|--000.dat
|--001.dat
| ...
|--999.dat
```
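The layout above implies that a training sample's subdirectory is the leading digit of its five-digit index (`0` holds `00000.dat` through `09999.dat`), while test files sit directly under `test`. The following is a minimal sketch of resolving a sample's path and reading it with NumPy. The helper names `sample_path` and `load_sample` are hypothetical, and the file format is an assumption: this card does not describe the contents of a `.dat` file, so the sketch assumes plain-text, whitespace-separated values. Inspect one file before relying on it.

```python
from pathlib import Path

import numpy as np

# Files per numbered train subdirectory, inferred from the tree above
# (00000.dat to 09999.dat under "0"); assumed to hold for directories 1 to 8.
FILES_PER_DIR = 10_000


def sample_path(root, split, index):
    """Return the expected path of one sample (hypothetical helper)."""
    if split == "train":
        if not 0 <= index < 100_000:
            raise ValueError("train index must be in [0, 99999]")
        subdir = str(index // FILES_PER_DIR)
        return Path(root) / "train" / subdir / f"{index:05d}.dat"
    if split == "test":
        if not 0 <= index < 1_000:
            raise ValueError("test index must be in [0, 999]")
        return Path(root) / "test" / f"{index:03d}.dat"
    raise ValueError("split must be 'train' or 'test'")


def load_sample(path):
    """Read one sample as a 1-D float32 array.

    ASSUMPTION: the file is plain text with whitespace-separated numbers.
    If the files turn out to be binary, np.fromfile would be needed instead.
    """
    return np.loadtxt(path, dtype=np.float32).reshape(-1)
```

For example, `sample_path("NormalDistribution", "train", 12345)` resolves to `NormalDistribution/train/1/12345.dat`. If the text-format assumption holds, `load_sample` on that path should return an array of length 300.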
### (2) Create Dataset
```
$ sudo apt install python3 python3-pip
$ pip3 install numpy
```
```
$ vi scripts/create.sh
```
`--dir` : output directory <br>
`--num` : total number of samples to generate <br>
`--dim` : number of dimensions per sample <br>
`--list` : file listing the mean and standard deviation of each normal distribution <br>
`--seed` : random number seed <br>
```
#!/bin/bash
MODE='create'
SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd)
python3 "${SCRIPT_DIR}/create.py" \
--dir "./NormalDistribution/${MODE}" \
--num 100 \
--dim 300 \
--list "./list/params.txt" \
--seed 0
```
```
$ sh scripts/create.sh
```
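To make the five options concrete, here is a minimal sketch of a generator that accepts the same parameters. It is not the repository's `create.py`, whose implementation is not shown on this card. Several details are assumptions: each line of the parameter list holds a mean and a standard deviation separated by whitespace, each sample draws all of its values from one (mean, standard deviation) pair chosen cyclically, and each sample is written as plain text with one value per line. The function names `read_params` and `generate` are hypothetical.

```python
import argparse
from pathlib import Path

import numpy as np


def read_params(list_path):
    """Read (mean, std) pairs. ASSUMED format: 'mean std' on each line."""
    pairs = []
    for line in Path(list_path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((float(fields[0]), float(fields[1])))
    return pairs


def generate(out_dir, num, dim, params, seed):
    """Write `num` files of `dim` normally distributed values each."""
    if not params:
        raise ValueError("parameter list is empty")
    rng = np.random.default_rng(seed)  # same seed gives the same data
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    width = max(1, len(str(num - 1)))  # zero-pad file names to equal width
    for i in range(num):
        # ASSUMPTION: cycle through the (mean, std) pairs, one per sample.
        mean, std = params[i % len(params)]
        sample = rng.normal(loc=mean, scale=std, size=dim)
        # ASSUMPTION: plain-text output, one value per line.
        np.savetxt(out / f"{i:0{width}d}.dat", sample)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--dir", required=True, help="output directory")
    parser.add_argument("--num", type=int, default=100, help="number of samples")
    parser.add_argument("--dim", type=int, default=300, help="dimensions per sample")
    parser.add_argument("--list", required=True, help="file of mean/std pairs")
    parser.add_argument("--seed", type=int, default=0, help="random seed")
    args = parser.parse_args()
    generate(args.dir, args.num, args.dim, read_params(args.list), args.seed)
```

Fixing the seed, as `--seed 0` does in the script above, makes repeated runs of this sketch produce identical files, which is useful when a generated set needs to be compared across machines.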
Provider: koba-jon



