
SAMPLE Job Postings Data|800M+ Deduplicated US Records Updated Hourly|Enriched Job Postings Data ...

Source: Databricks. Indexed 2026-01-19.
Download link:
https://marketplace.databricks.com/details/1bafc338-2625-4ea3-a37f-24c8a5307948/Canaria-Inc-_SAMPLE-Job-Postings-Data800M+-Deduplicated-US-Records-Updated-HourlyEnriched-Job-Postings-Data-
Resource description:
Most job posting datasets are a mess. The same role shows up on Indeed, LinkedIn, and the company's career site, inflating counts by 2-3x. Our Job Postings Data pipeline fixes that by collapsing billions of raw postings into 800M+ deduplicated records that reflect true hiring intent. Each record in this Job Postings Data comes with over 95 enriched fields: normalized job titles, extracted skills, predicted salaries, seniority levels, and matched company profiles. Updated every hour, this is the most current view of US employer demand you can get.

What This Job Postings Data Covers
This Job Postings Data is sourced from Indeed, LinkedIn, and 50,000+ employer career sites. Every record includes:
• Job ID, original title, and normalized title (from 50,000+ standardized categories)
• Company name, location (city, state, ZIP), and SOC code
• Seniority level (Entry, Mid, Senior, Lead, Executive)
• Work type (Remote, Hybrid, Onsite) and employment type
• Skills: 37,000+ hard skills, 3,000+ certifications, 400+ soft skills
• Original salary and AI-predicted salary
• Posting date, scrape date, source URL, and company ID
• Benefits, qualification requirements, and full job description text
Coverage is the United States, 2022 to present, updated hourly.

What You Can Do With This Job Postings Data

Labor Market Research
Government labor data arrives 12-18 months late. Job Postings Data shows what employers are posting right now. You can track demand by city, state, or ZIP, analyze skill trends in real time, and build econometric models using Job Postings Data without look-ahead bias.
• Replace stale government figures with real-time Job Postings Data to show current employer demand
• Analyze hiring by ZIP code, city, or metro area using Job Postings Data geography fields
• Track skill and certification demand over time using Job Postings Data historical snapshots
• Build econometric models using Job Postings Data microdata instead of aggregated summaries
• Identify emerging job hubs and declining markets through Job Postings Data posting concentrations

Talent Acquisition
HR and recruiting teams use Job Postings Data to benchmark compensation, identify competitor hiring patterns, and map talent availability by location. With hourly updates, you are working from what companies posted today, not last month.
• Benchmark compensation packages against current market rates using Job Postings Data salary fields
• Identify hiring spikes at target companies through Job Postings Data posting velocity
• Map talent availability by location, skill, and seniority using Job Postings Data enrichment fields
• Prioritize candidate sourcing by tracking in-demand skills extracted from Job Postings Data
• Analyze competitor hiring patterns across departments and geographies with Job Postings Data

HR Analytics
People analytics teams use Job Postings Data to track hiring trends, measure skills gaps, and benchmark internal workforce against external market demand. With 95+ enriched fields per record, Job Postings Data feeds HR dashboards, compensation models, and workforce planning tools.
• Track hiring trends across functions, geographies, and seniority levels using Job Postings Data
• Identify skills gaps by comparing internal workforce data against external demand in Job Postings Data
• Conduct competitor analysis using Job Postings Data to see which companies are hiring for similar roles
• Analyze remote, hybrid, and onsite trends by industry and region using Job Postings Data work type fields
• Build People Analytics dashboards fed by Job Postings Data with 95+ enriched fields per record

Workforce Planning and Site Selection
Site selection consultants and economic development organizations use Job Postings Data to prove labor availability and build workforce reports for RFIs. Investors also use Job Postings Data hiring velocity as a leading indicator of company health.
• Prove labor availability for site selection projects with current Job Postings Data posting counts
• Define functional labor sheds using drive-time analysis against Job Postings Data open positions
• Build workforce availability reports for RFI responses backed by Job Postings Data microdata
• Forecast hiring needs based on market trends observed in Job Postings Data
• Track hiring velocity of companies and sectors using Job Postings Data as alternative data

How We Build This Job Postings Data
The deduplication model collapses the same job appearing across Indeed, LinkedIn, and employer career sites into one canonical record using semantic matching on title, company, location, and description. Roughly 60% of raw records collapse into unique entries, producing 800M+ unique Job Postings Data records that reflect true hiring intent instead of inflated counts. The title taxonomy model normalizes raw Job Postings Data titles into 50,000+ standardized categories from 20M+ raw titles by stripping location noise, qualifiers, and formatting variations.
The skill taxonomy model extracts 37,000+ hard skills, 3,000+ certifications, and 400+ soft skills per Job Postings Data record using contextual analysis of job descriptions. Job category models classify every Job Postings Data record by seniority (Entry, Mid, Senior, Lead, Executive) and modality (Remote, Hybrid, Onsite) using LLM-based reading of the full job description. The salary estimation model predicts pay ranges for the 60-70% of Job Postings Data records without explicit salary, trained on 50M+ observations with a Mean Absolute Percentage Error under 15%. Our annotation team validates every Job Postings Data model output before delivery.

What Makes This Job Postings Data Different
• 800M+ deduplicated records: Job Postings Data collapses billions of raw postings into a clean signal of true hiring intent
• Hourly updates: Job Postings Data refreshes every hour so you are never working with stale information
• 95+ enriched fields: every Job Postings Data record includes normalized titles, extracted skills, seniority, and salary
• Full ML stack: four primary AI models power Job Postings Data: deduplication, title normalization, skill extraction, and seniority classification
• Company matching: Job Postings Data connects every posting to a verified company profile with firmographics
• Coverage: Job Postings Data sources include Indeed, LinkedIn, and 50,000+ employer career sites from 2022 to present

Who Uses This Job Postings Data
• HR and Talent Acquisition Teams use Job Postings Data to benchmark compensation, source candidates, and track competitor hiring in real time
• Labor Market Economists and Government Agencies use Job Postings Data as a leading indicator of workforce demand, replacing lagging survey data
• Site Selection Consultants use Job Postings Data to prove labor availability and build defensible workforce studies for corporate site decisions
• Economic Development Organizations use Job Postings Data to respond to RFIs with current employer demand data and workforce metrics
• Staffing and Recruiting Agencies use Job Postings Data to identify market expansion opportunities and benchmark pay rates for clients
• Investors and Financial Analysts use Job Postings Data hiring velocity as alternative data for due diligence and portfolio monitoring
• Market Research and BI Teams use Job Postings Data to analyze industry hiring trends, track skills demand, and build labor market dashboards
• HR Tech Companies and B2B Platforms use Job Postings Data to power ATS integrations, compensation tools, and talent intelligence products

Delivery and Format
Job Postings Data is delivered in CSV, JSON, or Parquet via AWS S3 or Google Cloud Storage. Custom filters are available by geography (ZIP, city, state), company, date range, title, skill, industry, seniority, and salary range. The data is compatible with Snowflake, Databricks, Power BI, Tableau, Salesforce, and most BI platforms.
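
As a rough illustration of working with a CSV delivery, the sketch below filters records by state and seniority using only the Python standard library. The column names (`job_id`, `normalized_title`, `state`, `seniority`, `work_type`) and the sample rows are hypothetical, since the listing does not publish the actual schema; only the seniority and work type values are taken from the description above.

```python
import csv
import io

# Hypothetical sample; the real column names and rows are not given in the listing.
SAMPLE_CSV = """job_id,normalized_title,state,seniority,work_type
1,Data Engineer,CA,Senior,Remote
2,Registered Nurse,TX,Mid,Onsite
3,Data Engineer,CA,Entry,Hybrid
4,Software Engineer,CA,Senior,Onsite
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_postings(rows, state=None, seniority=None):
    """Yield rows matching the given state and seniority (None means no filter)."""
    for row in rows:
        if state is not None and row["state"] != state:
            continue
        if seniority is not None and row["seniority"] != seniority:
            continue
        yield row


rows = load_rows(SAMPLE_CSV)
senior_ca = list(filter_postings(rows, state="CA", seniority="Senior"))
print([r["job_id"] for r in senior_ca])  # ['1', '4']
```

For a real delivery, the in-memory string would be replaced by a file downloaded from the S3 or Google Cloud Storage bucket, and the filter keys would need to match the delivered field names.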
Provider:
Canaria Inc.
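Dataset introduction
Background overview
This dataset contains 800M+ deduplicated US job posting records, updated hourly and covering 2022 to the present. Each record includes 95+ enriched fields, such as normalized job titles, extracted skills, predicted salaries, and matched company information. Data sources include Indeed, LinkedIn, and 50,000+ employer career sites.
The summary above was collected and generated by 遇见数据集.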