
DatamadeA1/kentucky-businesses

Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/DatamadeA1/kentucky-businesses
Resource description:
---
license: cc-by-4.0
task_categories:
- text-classification
- text-retrieval
- text-generation
language:
- en
tags:
- kentucky
- local-business
- directory
- tourism
- geospatial
- small-business
pretty_name: Kentucky Businesses (YouMeKY)
size_categories:
- 100K<n<1M
---

# Kentucky Businesses (YouMeKY)

> The living atlas of Kentucky: every business, every town, every trail. Free for readers, open for developers, and made for the people who live here.

**YouMeKY** is a Kentucky-first local business directory. This dataset is the public view of our business graph: ~360K Kentucky places with category, location, contact info, narrative summaries, attributes, and KY-specific signals (Kentucky Proud status, local-ingredient sourcing, local-gem scoring).

- Canonical site: https://youmeky.ai
- Developer API (machine access, pay-as-you-go): https://youmeky.ai/developers
- Refresh cadence: manually refreshed as the curated graph improves (see commit history)

## What's in it

One row per business. Each row ties back to a canonical page at `https://youmeky.ai/business/{place_id}`.
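
The canonical page pattern above can be turned into a small helper. This is a minimal sketch, not part of the dataset tooling: the function name `canonical_url` and the sample ID are hypothetical, and percent-encoding is a defensive assumption because the format of `place_id` is not documented here.

```python
from urllib.parse import quote

# Base of the canonical page pattern given in the card.
CANONICAL_BASE = "https://youmeky.ai/business/"


def canonical_url(place_id: str) -> str:
    """Build the canonical page URL for one row from its place_id."""
    if not place_id:
        raise ValueError("place_id must be a non-empty string")
    # Percent-encode defensively (assumption: the real ID format is unknown).
    return CANONICAL_BASE + quote(place_id, safe="")


# "example-place-123" is a made-up ID used only for illustration.
print(canonical_url("example-place-123"))
```

If the export's `canonical_url` column is populated, prefer it over rebuilding the URL.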
- **Identity & location**: `place_id`, `name`, `category`, `template_family`, `address`, `town_slug`, `latitude`, `longitude`, `canonical_url`
- **Kentucky signals**: `is_gem`, `gem_mode`, `local_gem_score`, `kentucky_proud_status`, `sources_local_ingredients`, `ky_sos_entity_name`, `ky_sos_status`, `ky_sos_formation_date`, `founding_year`
- **Narrative**: `seo_description`, `editorial_summary`, `community_contribution_summary`, `chef_name`, `founder_name`, `owner_name`, `signature_dishes`
- **Contact & links**: `phone`, `website`, `social_*`, `contact_page_url`, `menu_page_url`, `order_online_url`, `reservation_url`, `gift_cards_url`
- **Service attributes**: `takeout`, `delivery`, `dine_in`, `outdoor_seating`, `serves_*`, `has_wifi`, `wheelchair_accessible`, `allows_dogs`, and more
- **Our own ratings**: `rating`, `rating_count` (YouMeKY-native review system; **not imported from Google or Yelp**)
- **Provenance**: `data_sources`, `source`

## What is _not_ in it

By policy, the public dataset excludes:

- Google place IDs and Maps URIs (Google ToS)
- Email addresses (PII; often personal, not business)
- Third-party metrics under separate license (Yelp ratings/counts, Facebook likes)
- YouMeKY's internal monetization state (sponsorship flags, brand feature reasons, membership tiers)
- Internal enrichment pipeline state (curation status, confidence scores, version stamps)

If you need our scoring internals for a research collaboration, open a discussion.
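
To make the public columns listed under "What's in it" more concrete, here is a minimal sketch of filtering rows on the Kentucky signals. The column types are assumptions (the card does not state them): `is_gem` is treated as a boolean and `local_gem_score` as a number where higher is better. The helper name `top_local_gems`, the sample rows, and the slug spellings are all made up for illustration.

```python
from typing import Iterable


def top_local_gems(rows: Iterable[dict], town_slug: str, limit: int = 3) -> list[dict]:
    """Return up to `limit` gem rows for one town, highest local_gem_score first."""
    gems = [
        r for r in rows
        if r.get("is_gem")
        and r.get("town_slug") == town_slug
        and r.get("local_gem_score") is not None
    ]
    return sorted(gems, key=lambda r: r["local_gem_score"], reverse=True)[:limit]


# Made-up rows; only the column names come from the dataset card.
sample_rows = [
    {"place_id": "p1", "name": "Example Diner", "town_slug": "berea", "is_gem": True, "local_gem_score": 0.72},
    {"place_id": "p2", "name": "Example Cafe", "town_slug": "berea", "is_gem": True, "local_gem_score": 0.91},
    {"place_id": "p3", "name": "Example Garage", "town_slug": "berea", "is_gem": False, "local_gem_score": 0.40},
    {"place_id": "p4", "name": "Example Bakery", "town_slug": "paducah", "is_gem": True, "local_gem_score": 0.88},
]

for row in top_local_gems(sample_rows, "berea"):
    print(row["name"], row["local_gem_score"])
```

The same pattern should extend to the service attributes (for example `takeout` or `allows_dogs`), again assuming they are stored as booleans.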
## Example use cases

- **Local-commerce recommenders**: train on rich attribute + narrative signal for a single US state
- **Geospatial reasoning**: dense coverage across rural + urban Kentucky
- **Entity resolution benchmarks**: multi-source provenance (`data_sources`) with known duplicates to resolve
- **Small-business NLP**: `editorial_summary`, `community_contribution_summary`, and `seo_description` are written to describe real places in plain English
- **Tourism / travel planning agents**: attributes + categories + location are ready for retrieval

## License

**CC-BY-4.0**: use it freely for any purpose, commercial or otherwise. Attribution requested: link back to `https://youmeky.ai` or cite this dataset.

The underlying businesses are real. Please don't use this dataset to build harassment tools, shadow directories that strip attribution, or spam pipelines targeting Kentucky small businesses.

## How the data is built

- Ingested from Overture Maps, OpenStreetMap, CommonCrawl, KY Secretary of State, Census County Business Patterns, and targeted web crawls of business websites.
- Enriched by a local Nemotron 3 Super 120B pipeline running on our hardware, with no third-party LLM calls.
- Entity-resolved, deduped, and gated through a `public_places_v1` view that enforces publishability (has a URL, is in Kentucky, has survived our state machine).
- Each row in this export was publishable on `youmeky.ai` at export time.
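
The publishability gate described in the build steps can be illustrated with a rough sketch. This is not the actual `public_places_v1` view, whose definition is not published. It rests on three assumptions: "has a URL" means any non-empty URL field, "is in Kentucky" is approximated by a coarse bounding box (which also covers slivers of neighboring states), and "survived our state machine" is modeled by a hypothetical `state` value of `"approved"`.

```python
from dataclasses import dataclass
from typing import Optional

# Approximate bounding box for Kentucky (assumption; not an exact border test).
KY_LAT_RANGE = (36.49, 39.15)
KY_LON_RANGE = (-89.58, -81.96)


@dataclass
class Candidate:
    place_id: str
    url: Optional[str]
    latitude: Optional[float]
    longitude: Optional[float]
    state: str  # hypothetical pipeline state, e.g. "approved" or "rejected"


def in_kentucky_bbox(latitude: Optional[float], longitude: Optional[float]) -> bool:
    """Coarse location check: True if the point falls inside the bounding box."""
    if latitude is None or longitude is None:
        return False
    return (KY_LAT_RANGE[0] <= latitude <= KY_LAT_RANGE[1]
            and KY_LON_RANGE[0] <= longitude <= KY_LON_RANGE[1])


def is_publishable(c: Candidate) -> bool:
    """Apply all three gate conditions; every one must hold."""
    return (bool(c.url)
            and in_kentucky_bbox(c.latitude, c.longitude)
            and c.state == "approved")


# Made-up candidates for illustration only.
candidates = [
    Candidate("p1", "https://example.com", 38.04, -84.50, "approved"),  # passes all checks
    Candidate("p2", None, 38.25, -85.76, "approved"),                   # no URL
    Candidate("p3", "https://example.org", 36.16, -86.78, "approved"),  # south of the box
    Candidate("p4", "https://example.net", 37.08, -88.60, "rejected"),  # failed state check
]

print([c.place_id for c in candidates if is_publishable(c)])
```

A production gate would more likely test against the state boundary polygon or a parsed address than a bounding box; the box is used here only to keep the sketch self-contained.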
## Related resources

- **Live site**: https://youmeky.ai
- **Developer API**: https://youmeky.ai/developers (query-by-query access with daily dedup, $0.005/lookup)
- **Suggest a spot**: https://youmeky.ai/feature-your-spot (free enrichment queue)

## Citation

```
@misc{youmeky2026,
  title  = {Kentucky Businesses: The YouMeKY Living Atlas},
  author = {{YouMeKY}},
  year   = {2026},
  url    = {https://huggingface.co/datasets/DatamadeA1/kentucky-businesses},
  note   = {CC-BY-4.0}
}
```

## Contact

Questions, corrections, collaboration requests: open a discussion on this dataset or reach us via https://youmeky.ai.
Provider:
DatamadeA1