yuiseki/text2geoql
Source: Hugging Face. Updated 2024-05-06; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/yuiseki/text2geoql
Description:
---
language:
- en
license: wtfpl
task_categories:
- text2text-generation
dataset_info:
  features:
  - name: input
    dtype: string
  - name: input_type
    dtype: string
  - name: output
    dtype: string
  - name: output_type
    dtype: string
  splits:
  - name: train
    num_bytes: 567658
    num_examples: 2058
  download_size: 69399
  dataset_size: 567658
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- geospatial
- synthetic
- overpassql
---
# `text2geoql`
- I have defined a natural language processing task named `text2geoql` and am in the process of building a dataset for it
- `text2geoql` is a task that translates arbitrary natural language into a reasonable `geoql` query that matches the intent of the text
- `geoql` is an abbreviation for "Geospatial data query languages"
- Of course, `geoql` includes `overpassql`
## `text2geoql-dataset`
- https://github.com/yuiseki/text2geoql-dataset
- This repository publishes over 1000 Overpass QL queries that are paired with the `TRIDENT intermediate language`
- Except for the original 100, these Overpass QL queries were automatically generated by TinyDolphin, a very small LLM fine-tuned from TinyLlama
- https://huggingface.co/cognitivecomputations/TinyDolphin-2.8-1.1b
- These Overpass QL queries have been verified by sending actual requests to the Overpass API and obtaining correct results
- This dataset may be the first `synthetic dataset` generated by an LLM in the field of GIS
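
The verification step mentioned above could look roughly like the following. This is only a minimal sketch, not the repository's actual script: the endpoint `https://overpass-api.de/api/interpreter` is the public Overpass API instance and is an assumption here (the source does not say which instance was used), the helper names are hypothetical, and "correct results" is approximated as "the JSON response contains at least one element", which requires the query to declare `[out:json]`.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Overpass API instance; the dataset card does not name one.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_overpass_request(query: str, url: str = OVERPASS_URL) -> urllib.request.Request:
    """Build a POST request that submits an Overpass QL query as form data."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def has_elements(response_text: str) -> bool:
    """Return True if a JSON Overpass response holds at least one element.

    Assumption: a non-empty "elements" list stands in for "correct results".
    """
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and bool(payload.get("elements"))


def check_query(query: str, timeout: float = 60.0) -> bool:
    """Send the query to the Overpass API and apply the check (network call)."""
    request = build_overpass_request(query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return has_elements(response.read().decode("utf-8"))
```

Only `check_query` touches the network; the other two helpers can be exercised offline, which keeps the acceptance rule easy to test and to tighten later.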
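Provider:
yuiseki
Summary of original information
Dataset overview
Dataset name
text2geoql-dataset
Dataset description
- This dataset supports the text2geoql task, which translates natural language into intent-based geoql (Geospatial data query languages). geoql includes overpassql.
Dataset features
- input: string
- input_type: string
- output: string
- output_type: string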
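
As an illustration of the four string features listed above, a row could be modelled and validated as shown below. This is a sketch only: the class name `Text2GeoqlRecord` and the helper `parse_record` are hypothetical and are not part of the dataset or its repository.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Text2GeoqlRecord:
    """One dataset row; field names follow the dataset card, all typed as strings."""
    input: str
    input_type: str
    output: str
    output_type: str


def parse_record(row: dict) -> Text2GeoqlRecord:
    """Validate that a row has the four string fields and wrap it in a record."""
    names = [f.name for f in fields(Text2GeoqlRecord)]
    missing = [name for name in names if name not in row]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name in names:
        if not isinstance(row[name], str):
            raise TypeError(f"field {name!r} must be a string")
    # Extra keys in the row are ignored; only the four known fields are kept.
    return Text2GeoqlRecord(**{name: row[name] for name in names})
```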
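Dataset splits
- train: 2058 examples, 567658 bytes in total.
Dataset source
- The dataset contains over 1000 Overpass QL queries paired with the TRIDENT intermediate language.
- Apart from the original 100 Overpass QL queries, all were generated automatically by TinyDolphin, a very small language model fine-tuned from TinyLlama.
- These Overpass QL queries have been verified by sending actual requests to the Overpass API and obtaining correct results.
Dataset tags
geospatial, synthetic, overpassql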



