
MikeGreen2710/location_with_extra_feature_outlier

Source: Hugging Face | Updated: 2024-05-22 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/MikeGreen2710/location_with_extra_feature_outlier
Resource description:
---
dataset_info:
  features:
  - name: house_front_std
    dtype: float64
  - name: road_wide_std
    dtype: float64
  - name: car_area_std
    dtype: float64
  - name: price_std
    dtype: float64
  - name: number_of_floors_std
    dtype: float64
  - name: street
    dtype: string
  - name: city
    dtype: string
  - name: district
    dtype: string
  - name: ward
    dtype: string
  - name: id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  - name: LAN
    sequence: string
  - name: overlapped
    dtype: float64
  - name: label
    dtype: int64
  - name: ngo
    dtype: bool
  - name: house_location_2
    dtype: string
  - name: address
    dtype: string
  - name: duong
    dtype: bool
  - name: house_front_std_is_filled
    dtype: int64
  - name: house_front_std_filled
    dtype: float64
  - name: house_front_std_normed
    dtype: float64
  - name: road_wide_std_is_filled
    dtype: int64
  - name: road_wide_std_filled
    dtype: float64
  - name: road_wide_std_normed
    dtype: float64
  - name: car_area_std_is_filled
    dtype: int64
  - name: car_area_std_filled
    dtype: float64
  - name: car_area_std_normed
    dtype: float64
  - name: price_std_is_filled
    dtype: int64
  - name: price_std_filled
    dtype: float64
  - name: price_std_normed
    dtype: float64
  - name: number_of_floors_std_is_filled
    dtype: int64
  - name: number_of_floors_std_filled
    dtype: float64
  - name: number_of_floors_std_normed
    dtype: float64
  - name: street_filled
    dtype: string
  - name: city_filled
    dtype: string
  - name: district_filled
    dtype: string
  - name: ward_filled
    dtype: string
  - name: price_median_by_location
    dtype: float64
  - name: price_median_by_location_normed
    dtype: float64
  - name: street_encoded
    dtype: float64
  - name: city_encoded
    dtype: float64
  - name: district_encoded
    dtype: float64
  - name: ward_encoded
    dtype: float64
  - name: street_encoded_normed
    dtype: float64
  - name: city_encoded_normed
    dtype: float64
  - name: district_encoded_normed
    dtype: float64
  - name: ward_encoded_normed
    dtype: float64
  - name: extra_data
    sequence: float64
  - name: final_z_score
    dtype: float64
  - name: outlier
    dtype: float64
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 19288832
    num_examples: 11999
  download_size: 8562751
  dataset_size: 19288832
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
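The metadata above names a single config (default) with one split (train) stored under data/train-*. A minimal sketch of loading it with the Hugging Face `datasets` library follows. It assumes the package is installed and the machine has network access; the helper names `load_train_split` and `summarize` are hypothetical and only serve this illustration.

```python
REPO_ID = "MikeGreen2710/location_with_extra_feature_outlier"


def load_train_split():
    """Download and return the train split (requires network access)."""
    # Imported lazily so that this module can be imported without the package.
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(REPO_ID, name="default", split="train")


def summarize(ds):
    """Return the row and column counts of a loaded split."""
    return {"num_rows": ds.num_rows, "num_columns": len(ds.column_names)}


# Example usage (not executed here because it downloads data):
#   ds = load_train_split()
#   print(summarize(ds))
```

If the metadata is accurate, the summary should report 11999 rows. Users who prefer the mirror in the download link can, as far as I know, point the `HF_ENDPOINT` environment variable at it before importing `datasets`; treat that as an assumption to verify locally.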

Dataset information:
Features:
- house frontage width std (house_front_std): float64
- road width std (road_wide_std): float64
- garage area std (car_area_std): float64
- house price std (price_std): float64
- number of floors std (number_of_floors_std): float64
- street (street): string
- city (city): string
- district (district): string
- ward (ward): string
- record ID (id): string
- title (title): string
- text (text): string
- LAN sequence (LAN): sequence of string
- overlap (overlapped): float64
- label (label): int64
- ngo flag (ngo): bool
- secondary house location (house_location_2): string
- address (address): string
- street-facing flag (duong): bool
- whether house frontage width std was filled (house_front_std_is_filled): int64
- filled house frontage width std (house_front_std_filled): float64
- normalized house frontage width std (house_front_std_normed): float64
- whether road width std was filled (road_wide_std_is_filled): int64
- filled road width std (road_wide_std_filled): float64
- normalized road width std (road_wide_std_normed): float64
- whether garage area std was filled (car_area_std_is_filled): int64
- filled garage area std (car_area_std_filled): float64
- normalized garage area std (car_area_std_normed): float64
- whether house price std was filled (price_std_is_filled): int64
- filled house price std (price_std_filled): float64
- normalized house price std (price_std_normed): float64
- whether number of floors std was filled (number_of_floors_std_is_filled): int64
- filled number of floors std (number_of_floors_std_filled): float64
- normalized number of floors std (number_of_floors_std_normed): float64
- filled street (street_filled): string
- filled city (city_filled): string
- filled district (district_filled): string
- filled ward (ward_filled): string
- median price by location (price_median_by_location): float64
- normalized median price by location (price_median_by_location_normed): float64
- street encoding (street_encoded): float64
- city encoding (city_encoded): float64
- district encoding (district_encoded): float64
- ward encoding (ward_encoded): float64
- normalized street encoding (street_encoded_normed): float64
- normalized city encoding (city_encoded_normed): float64
- normalized district encoding (district_encoded_normed): float64
- normalized ward encoding (ward_encoded_normed): float64
- extra data sequence (extra_data): sequence of float64
- final z-score (final_z_score): float64
- outlier flag (outlier): float64
- index column (__index_level_0__): int64
Dataset splits:
- train: 19288832 bytes, 11999 examples
Download size: 8562751 bytes
Total dataset size: 19288832 bytes
Configuration:
- config name: default
  data files:
  - split: train, path: data/train-*
Provided by:
MikeGreen2710
Summary of original information

Dataset feature overview

Main features and their data types

  • house_front_std: float (float64)
  • road_wide_std: float (float64)
  • car_area_std: float (float64)
  • price_std: float (float64)
  • number_of_floors_std: float (float64)
  • street: string (string)
  • city: string (string)
  • district: string (string)
  • ward: string (string)
  • id: string (string)
  • title: string (string)
  • text: string (string)
  • LAN: sequence of strings (sequence: string)
  • overlapped: float (float64)
  • label: integer (int64)
  • ngo: boolean (bool)
  • house_location_2: string (string)
  • address: string (string)
  • duong: boolean (bool)
  • house_front_std_is_filled: integer (int64)
  • house_front_std_filled: float (float64)
  • house_front_std_normed: float (float64)
  • road_wide_std_is_filled: integer (int64)
  • road_wide_std_filled: float (float64)
  • road_wide_std_normed: float (float64)
  • car_area_std_is_filled: integer (int64)
  • car_area_std_filled: float (float64)
  • car_area_std_normed: float (float64)
  • price_std_is_filled: integer (int64)
  • price_std_filled: float (float64)
  • price_std_normed: float (float64)
  • number_of_floors_std_is_filled: integer (int64)
  • number_of_floors_std_filled: float (float64)
  • number_of_floors_std_normed: float (float64)
  • street_filled: string (string)
  • city_filled: string (string)
  • district_filled: string (string)
  • ward_filled: string (string)
  • price_median_by_location: float (float64)
  • price_median_by_location_normed: float (float64)
  • street_encoded: float (float64)
  • city_encoded: float (float64)
  • district_encoded: float (float64)
  • ward_encoded: float (float64)
  • street_encoded_normed: float (float64)
  • city_encoded_normed: float (float64)
  • district_encoded_normed: float (float64)
  • ward_encoded_normed: float (float64)
  • extra_data: sequence of floats (sequence: float64)
  • final_z_score: float (float64)
  • outlier: float (float64)
  • __index_level_0__: integer (int64)
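
Several numeric columns in the list above come in triples: `*_is_filled`, `*_filled`, and `*_normed`. The listing does not say how these were computed. The sketch below shows one plausible way such a triple could be derived, assuming median imputation of missing values followed by z-score normalization; both choices, and the function name `fill_and_normalize`, are assumptions for illustration and not the dataset author's documented method.

```python
from statistics import mean, median, pstdev


def fill_and_normalize(values):
    """Return (is_filled, filled, normed) for a column where None marks a gap."""
    observed = [v for v in values if v is not None]
    fill_value = median(observed)  # assumption: gaps are imputed with the median

    # 1 where the original value was missing, 0 otherwise.
    is_filled = [int(v is None) for v in values]
    filled = [fill_value if v is None else float(v) for v in values]

    # Assumption: normalization is a z-score over the filled column.
    mu, sigma = mean(filled), pstdev(filled)
    normed = [(v - mu) / sigma if sigma else 0.0 for v in filled]
    return is_filled, filled, normed


flags, filled, normed = fill_and_normalize([4.0, None, 6.0, 8.0])
print(flags)   # [0, 1, 0, 0]
print(filled)  # [4.0, 6.0, 6.0, 8.0]
```

Keeping the `*_is_filled` flag next to the imputed value lets a downstream model distinguish observed values from imputed ones.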

Dataset splits

  • train: 11999 examples, 19288832 bytes of data.

Dataset size

  • Download size: 8562751 bytes
  • Dataset size: 19288832 bytes
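
As a quick sanity check on the figures above, the following sketch derives the average size per example and the ratio of download size to dataset size. The three constants are taken from the listing; the variable names are hypothetical.

```python
NUM_EXAMPLES = 11_999
DATASET_BYTES = 19_288_832
DOWNLOAD_BYTES = 8_562_751

# Average number of bytes per training example.
bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"{bytes_per_example:.0f} bytes per example")
print(f"download is {download_ratio:.1%} of the dataset size")
```

This works out to roughly 1.6 KB per example, with the download being about 44% of the stated dataset size.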