US Birth Data (2000-2014) | Demographic Dataset | Birth Data Dataset
Overview
- Data Source: Provided by the Social Security Administration.
- Time Period: 2000 to 2014.
Data Structure
Each record in the dataset includes the following fields:
- year: Year of birth.
- month: Month of birth (January is denoted by 1).
- date_of_month: Day number of the month.
- day_of_week: Day of the week (Monday is 1, Sunday is 7).
- births: Number of births on that specific day.
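The field conventions above differ from those of JavaScript's built-in `Date`, which counts months from 0 and days of the week from Sunday as 0. The following is a minimal sketch of how a record might be converted and cross-checked; `toDate` and `isoDayOfWeek` are hypothetical helper names, not part of the package, and the sample record is constructed for illustration only.

```javascript
// Hypothetical helper: build a UTC Date from a record.
// The dataset's `month` is 1-based, whereas Date.UTC expects a 0-based month.
function toDate( record ) {
    return new Date( Date.UTC( record.year, record.month - 1, record.date_of_month ) );
}

// Hypothetical helper: map getUTCDay() (Sunday = 0 ... Saturday = 6)
// onto the dataset's convention (Monday = 1 ... Sunday = 7).
function isoDayOfWeek( date ) {
    var d = date.getUTCDay();
    return ( d === 0 ) ? 7 : d;
}

// Illustrative record in the documented field layout (the `births` value is made up):
var record = {
    'year': 2000,
    'month': 1,
    'date_of_month': 13,
    'day_of_week': 4,
    'births': 0
};

// Check that the stored day_of_week agrees with the calendar date:
var consistent = ( isoDayOfWeek( toDate( record ) ) === record.day_of_week );
console.log( consistent );
```

A check like this can help catch off-by-one mistakes when joining the data with other date-based sources.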
Usage
To use the dataset in a JavaScript environment:

    var dataset = require( '@stdlib/datasets-ssa-us-births-2000-2014' );
    var data = dataset();
Examples
The dataset can be used to analyze trends and patterns in US births. For instance, one can investigate if there are fewer births on the 13th of each month compared to other dates.
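One way to approach the question about the 13th is to compare the mean number of daily births on the 13th with the mean on all other dates. The sketch below assumes the dataset is an array of records with the fields listed under Data Structure; `compareThirteenth` is a hypothetical helper name, and the sample records are made-up values rather than real data.

```javascript
// Hypothetical helper: mean daily births on the 13th versus all other dates.
function compareThirteenth( records ) {
    var sum13 = 0;
    var n13 = 0;
    var sumOther = 0;
    var nOther = 0;
    var i;
    for ( i = 0; i < records.length; i++ ) {
        if ( records[ i ].date_of_month === 13 ) {
            sum13 += records[ i ].births;
            n13 += 1;
        } else {
            sumOther += records[ i ].births;
            nOther += 1;
        }
    }
    return {
        'thirteenth': ( n13 > 0 ) ? sum13 / n13 : NaN,
        'other': ( nOther > 0 ) ? sumOther / nOther : NaN
    };
}

// Made-up sample records (NOT real data) in the documented field layout:
var sample = [
    { 'year': 2000, 'month': 1, 'date_of_month': 12, 'day_of_week': 3, 'births': 100 },
    { 'year': 2000, 'month': 1, 'date_of_month': 13, 'day_of_week': 4, 'births': 90 },
    { 'year': 2000, 'month': 1, 'date_of_month': 14, 'day_of_week': 5, 'births': 110 }
];

var result = compareThirteenth( sample );
console.log( result );
```

To run this against the real data, pass the array returned by `dataset()` in place of `sample`. A raw comparison of means does not control for day-of-week effects, so restricting both groups to a single `day_of_week` value (for example, Fridays only) may give a fairer comparison.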
CLI Usage
The dataset can also be accessed via a command-line interface:

    npm install -g @stdlib/datasets-ssa-us-births-2000-2014-cli
    ssa-us-births-2000-2014

This command outputs the data in CSV format to stdout.
References
- Bialik, Carl. 2016. "Some People Are Too Superstitious To Have A Baby On Friday The 13th." https://fivethirtyeight.com/features/some-people-are-too-superstitious-to-have-a-baby-on-friday-the-13th/.
License
- Data Files: Licensed under [Open Data Commons Public Domain Dedication & License 1.0][pddl-1.0].
- Contents: Licensed under [Creative Commons Zero v1.0 Universal][cc0].
- Software: Licensed under [Apache License, Version 2.0][apache-license].
