DatamadeA1/kentucky-businesses
Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/DatamadeA1/kentucky-businesses
Resource description:
---
license: cc-by-4.0
task_categories:
- text-classification
- text-retrieval
- text-generation
language:
- en
tags:
- kentucky
- local-business
- directory
- tourism
- geospatial
- small-business
pretty_name: Kentucky Businesses (YouMeKY)
size_categories:
- 100K<n<1M
---
# Kentucky Businesses (YouMeKY)
> The living atlas of Kentucky — every business, every town, every trail — free for readers, open for developers, and made for the people who live here.
**YouMeKY** is a Kentucky-first local business directory. This dataset is the public view of our business graph: ~360K Kentucky places with category, location, contact info, narrative summaries, attributes, and KY-specific signals (Kentucky Proud status, local-ingredient sourcing, local-gem scoring).
- Canonical site: https://youmeky.ai
- Developer API (machine access, pay-as-you-go): https://youmeky.ai/developers
- Refresh cadence: manually refreshed as the curated graph improves (see commit history)
## What's in it
One row per business. Each row ties back to a canonical page at `https://youmeky.ai/business/{place_id}`.
**Identity & location** — `place_id`, `name`, `category`, `template_family`, `address`, `town_slug`, `latitude`, `longitude`, `canonical_url`
**Kentucky signals** — `is_gem`, `gem_mode`, `local_gem_score`, `kentucky_proud_status`, `sources_local_ingredients`, `ky_sos_entity_name`, `ky_sos_status`, `ky_sos_formation_date`, `founding_year`
**Narrative** — `seo_description`, `editorial_summary`, `community_contribution_summary`, `chef_name`, `founder_name`, `owner_name`, `signature_dishes`
**Contact & links** — `phone`, `website`, `social_*`, `contact_page_url`, `menu_page_url`, `order_online_url`, `reservation_url`, `gift_cards_url`
**Service attributes** — `takeout`, `delivery`, `dine_in`, `outdoor_seating`, `serves_*`, `has_wifi`, `wheelchair_accessible`, `allows_dogs`, and more
**Our own ratings** — `rating`, `rating_count` (YouMeKY-native review system; **not imported from Google or Yelp**)
**Provenance** — `data_sources`, `source`
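As a minimal sketch of working with a single row, the snippet below rebuilds the canonical page URL from `place_id` using the URL pattern stated above, and extracts only the identity and location fields. The sample row and its values are invented for illustration, and `canonical_page_url` and `identity_view` are hypothetical helper names, not part of any YouMeKY tooling.

```python
from urllib.parse import quote

# URL pattern taken from this dataset card.
CANONICAL_URL_PATTERN = "https://youmeky.ai/business/{place_id}"

# Field names from the "Identity & location" group above.
IDENTITY_FIELDS = (
    "place_id", "name", "category", "template_family", "address",
    "town_slug", "latitude", "longitude", "canonical_url",
)


def canonical_page_url(place_id):
    """Build the canonical page URL for a place_id (hypothetical helper)."""
    # Percent-encode defensively; the real ID format is not documented here.
    return CANONICAL_URL_PATTERN.format(place_id=quote(str(place_id), safe=""))


def identity_view(row):
    """Return only the identity and location fields present in a row."""
    return {key: row[key] for key in IDENTITY_FIELDS if key in row}


# Invented sample row; values are illustrative, not real data.
sample_row = {
    "place_id": "example-place-123",
    "name": "Example Diner",
    "category": "restaurant",
    "town_slug": "example-town",
    "latitude": 38.0,
    "longitude": -84.5,
    "phone": "555-0100",
}

print(canonical_page_url(sample_row["place_id"]))
print(identity_view(sample_row))
```

Column types (for example, whether `latitude` is stored as a float or a string) are an assumption here; check the actual schema before relying on them.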
## What is _not_ in it
By policy, the public dataset excludes:
- Google place IDs and Maps URIs (Google ToS)
- Email addresses (PII — often personal, not business)
- Third-party metrics under separate license (Yelp ratings/counts, Facebook likes)
- YouMeKY's internal monetization state (sponsorship flags, brand feature reasons, membership tiers)
- Internal enrichment pipeline state (curation status, confidence scores, version stamps)
If you need our scoring internals for a research collaboration, open a discussion.
## Example use cases
- **Local-commerce recommenders** — train on rich attribute + narrative signal for a single US state
- **Geospatial reasoning** — dense coverage across rural + urban Kentucky
- **Entity resolution benchmarks** — multi-source provenance (`data_sources`) with known duplicates to resolve
- **Small-business NLP** — `editorial_summary`, `community_contribution_summary`, and `seo_description` are written to describe real places in plain English
- **Tourism / travel planning agents** — attributes + categories + location are ready for retrieval
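For the geospatial and travel-planning use cases above, a radius search over `latitude` and `longitude` is a common first step. The following is a minimal sketch using the haversine great-circle formula; it assumes coordinates are numeric decimal degrees, and `nearby` is a hypothetical helper. The sample rows are invented, not real records.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(rows, lat, lon, radius_km, category=None):
    """Return (distance_km, row) pairs within radius_km, nearest first."""
    hits = []
    for row in rows:
        if row.get("latitude") is None or row.get("longitude") is None:
            continue  # skip rows without usable coordinates
        if category is not None and row.get("category") != category:
            continue
        d = haversine_km(lat, lon, row["latitude"], row["longitude"])
        if d <= radius_km:
            hits.append((d, row))
    hits.sort(key=lambda pair: pair[0])  # sort on distance only
    return hits


# Invented sample rows for illustration.
rows = [
    {"name": "Place A", "category": "restaurant", "latitude": 38.05, "longitude": -84.50},
    {"name": "Place B", "category": "restaurant", "latitude": 38.25, "longitude": -85.76},
    {"name": "Place C", "category": "park", "latitude": None, "longitude": None},
]

for dist, row in nearby(rows, 38.04, -84.50, radius_km=50, category="restaurant"):
    print(f"{row['name']}: {dist:.1f} km")
```

A linear scan like this is adequate for small subsets; for the full dataset, a spatial index would likely be a better fit.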
## License
**CC-BY-4.0**: use it freely for any purpose, commercial or otherwise. Attribution is required by the license; link back to `https://youmeky.ai` or cite this dataset.
The underlying businesses are real. Please don't use this dataset to build harassment tools, shadow directories that strip attribution, or spam pipelines targeting Kentucky small businesses.
## How the data is built
- Ingested from Overture Maps, OpenStreetMap, CommonCrawl, KY Secretary of State, Census County Business Patterns, and targeted web crawls of business websites.
- Enriched by a local Nemotron 3 Super 120B pipeline running on our hardware — no third-party LLM calls.
- Entity-resolved, deduped, and gated through a `public_places_v1` view that enforces publishability (has a URL, is in Kentucky, has survived our state machine).
- Each row in this export was publishable on `youmeky.ai` at export time.
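The publishability conditions above (has a URL, is in Kentucky) suggest a simple consumer-side sanity check on downloaded rows. The sketch below is not the internal `public_places_v1` view, whose logic is not published. It assumes "has a URL" means a non-empty `canonical_url`, and it approximates "in Kentucky" with a rough bounding box that also covers slivers of neighboring states. `sanity_problems` is a hypothetical helper, and the sample rows are invented.

```python
# Rough, generous bounding box around Kentucky in decimal degrees.
# This is an approximation, not the state boundary.
KY_BOUNDS = {"lat_min": 36.4, "lat_max": 39.2, "lon_min": -89.7, "lon_max": -81.9}


def in_rough_ky_box(lat, lon):
    """True if the point falls inside the approximate Kentucky box."""
    return (KY_BOUNDS["lat_min"] <= lat <= KY_BOUNDS["lat_max"]
            and KY_BOUNDS["lon_min"] <= lon <= KY_BOUNDS["lon_max"])


def sanity_problems(row):
    """Return a list of human-readable problems found in one row."""
    problems = []
    if not row.get("canonical_url"):
        problems.append("missing canonical_url")
    lat, lon = row.get("latitude"), row.get("longitude")
    if lat is None or lon is None:
        problems.append("missing coordinates")
    elif not in_rough_ky_box(lat, lon):
        problems.append("coordinates outside rough Kentucky box")
    return problems


# Invented rows for illustration.
good_row = {
    "canonical_url": "https://youmeky.ai/business/example-place-123",
    "latitude": 38.0,
    "longitude": -84.5,
}
bad_row = {"canonical_url": "", "latitude": 40.7, "longitude": -74.0}

print(sanity_problems(good_row))
print(sanity_problems(bad_row))
```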
## Related resources
- **Live site** — https://youmeky.ai
- **Developer API** — https://youmeky.ai/developers (query-by-query access with daily dedup, $0.005/lookup)
- **Suggest a spot** — https://youmeky.ai/feature-your-spot (free enrichment queue)
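To budget for the Developer API, the listed price of $0.005 per lookup can be turned into a rough estimate. The sketch below assumes that "daily dedup" means repeated lookups of the same place on the same day are billed once; that reading is an assumption, so confirm it against the developer documentation. `estimate_cost` is a hypothetical helper and makes no API calls.

```python
from decimal import Decimal

PRICE_PER_LOOKUP = Decimal("0.005")  # USD, as listed on this card


def estimate_cost(lookups):
    """Estimate cost for an iterable of (day, place_id) lookup pairs.

    Assumes duplicates of the same pair are billed once (daily dedup).
    """
    billable = set(lookups)
    return len(billable) * PRICE_PER_LOOKUP


# Invented lookup log: four requests, three unique (day, place_id) pairs.
log = [
    ("2026-04-16", "example-a"),
    ("2026-04-16", "example-a"),  # same place, same day
    ("2026-04-16", "example-b"),
    ("2026-04-17", "example-a"),  # same place, next day
]

print(estimate_cost(log))
```

`Decimal` is used rather than `float` so that small per-lookup amounts add up without rounding drift.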
## Citation
```
@misc{youmeky2026,
title = {Kentucky Businesses: The YouMeKY Living Atlas},
author = {{YouMeKY}},
year = {2026},
url = {https://huggingface.co/datasets/DatamadeA1/kentucky-businesses},
note = {CC-BY-4.0}
}
```
## Contact
Questions, corrections, collaboration requests: open a discussion on this dataset or reach us via https://youmeky.ai.
Provider:
DatamadeA1



