
yuiseki/text2geoql

Source: Hugging Face. Updated: 2024-05-06. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/yuiseki/text2geoql
Resource overview:
---
language:
- en
license: wtfpl
task_categories:
- text2text-generation
dataset_info:
  features:
  - name: input
    dtype: string
  - name: input_type
    dtype: string
  - name: output
    dtype: string
  - name: output_type
    dtype: string
  splits:
  - name: train
    num_bytes: 567658
    num_examples: 2058
  download_size: 69399
  dataset_size: 567658
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
tags:
- geospatial
- synthetic
- overpassql
---

# `text2geoql`

- I have defined a natural language processing task named `text2geoql` and am in the process of building a dataset for it
- `text2geoql` is a task that translates arbitrary natural language into a reasonable `geoql` query based on the intent
- `geoql` is an abbreviation for "Geospatial data query languages"
- Of course, `geoql` includes `overpassql`

## `text2geoql-dataset`

- https://github.com/yuiseki/text2geoql-dataset
- This repository publishes over 1000 Overpass QL queries that are paired with the `TRIDENT intermediate language`
- These Overpass QL queries, except for the original 100, were automatically generated by TinyDolphin, a very tiny LLM fine-tuned from TinyLlama
  - https://huggingface.co/cognitivecomputations/TinyDolphin-2.8-1.1b
- These Overpass QL queries have been verified by sending actual requests to the Overpass API and obtaining correct results
- This dataset may be the first ever `synthetic dataset` generated by an LLM in the field of GIS
Provided by:
yuiseki
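Summary of original information

Dataset overview

Dataset name

  • text2geoql-dataset

Dataset description

  • The dataset supports the text2geoql task, which translates natural language into geoql (Geospatial data query languages) based on the intent of the text.
  • geoql includes overpassql

Dataset features

  • input: string
  • input_type: string
  • output: string
  • output_type: string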
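
The four string fields above can be modeled as a small record type. The following is a minimal sketch, not code from the dataset repository: the names `Text2GeoqlRow` and `validate_row` are hypothetical, and the sample values in the usage note are placeholders rather than actual rows.

```python
from typing import TypedDict

# The four feature names listed in the dataset card; all are strings.
FIELDS = ("input", "input_type", "output", "output_type")


class Text2GeoqlRow(TypedDict):
    """Hypothetical record type mirroring the card's feature list."""
    input: str
    input_type: str
    output: str
    output_type: str


def validate_row(row: dict) -> Text2GeoqlRow:
    """Check that all four fields are present and are strings.

    Extra keys are dropped; a missing field raises ValueError and a
    non-string field raises TypeError.
    """
    missing = [name for name in FIELDS if name not in row]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    wrong = [name for name in FIELDS if not isinstance(row[name], str)]
    if wrong:
        raise TypeError(f"non-string fields: {wrong}")
    return Text2GeoqlRow(**{name: row[name] for name in FIELDS})
```

For example, `validate_row({"input": "a", "input_type": "b", "output": "c", "output_type": "d"})` returns the same four key-value pairs, while a row lacking `output` raises `ValueError`.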

Dataset splits

  • train: 2058 examples, 567658 bytes in total.
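
As a rough sanity check on these numbers, the short calculation below derives the average example size and the ratio between the prepared dataset size and the download size. The `download_size` value of 69399 bytes comes from the card metadata; reading the ratio as an approximate compression factor is an assumption about how the two sizes are measured.

```python
# Figures taken from the dataset card metadata.
NUM_EXAMPLES = 2058
DATASET_SIZE_BYTES = 567_658   # num_bytes / dataset_size of the train split
DOWNLOAD_SIZE_BYTES = 69_399   # download_size

# Average size of one example in the prepared dataset.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Prepared size relative to the downloaded files.
size_ratio = DATASET_SIZE_BYTES / DOWNLOAD_SIZE_BYTES

print(f"average bytes per example: {avg_bytes_per_example:.1f}")  # 275.8
print(f"dataset size / download size: {size_ratio:.2f}")          # 8.18
```

Each input/output pair is therefore only a few hundred bytes, which fits short natural-language requests paired with compact queries.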

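Dataset origin

  • The dataset contains over 1000 Overpass QL queries paired with the TRIDENT intermediate language.
  • Except for the original 100 Overpass QL queries, all were generated automatically by TinyDolphin, a very small language model fine-tuned from TinyLlama.
  • These Overpass QL queries have been verified by sending actual requests to the Overpass API and obtaining correct results.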
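
The verification step described above could look roughly like the sketch below. It is not the author's verification script. It assumes the public endpoint `https://overpass-api.de/api/interpreter`, which the card does not name, and it treats "at least one element returned" as a stand-in for "correct results", since the card does not define that criterion. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a public Overpass API endpoint; the card does not say which one was used.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_request(query: str, url: str = OVERPASS_URL) -> urllib.request.Request:
    """Build a POST request carrying the Overpass QL query in the `data` form field."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def count_elements(response_text: str) -> int:
    """Count the entries in the top-level `elements` array of an Overpass JSON response."""
    payload = json.loads(response_text)
    return len(payload.get("elements", []))


def query_returns_results(query: str, timeout: float = 60.0) -> bool:
    """Send the query over the network and report whether any element came back.

    Assumption: a non-empty result is used here as a proxy for a correct query.
    The query must request JSON output with [out:json].
    """
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return count_elements(resp.read().decode("utf-8")) > 0
```

A hypothetical query to try with `query_returns_results` (not a row from the dataset) is `[out:json][timeout:30]; area["name"="Tokyo"]->.searchArea; nwr["amenity"="cafe"](area.searchArea); out geom;`. Keeping request construction and response parsing separate from the network call lets both be checked without contacting the server.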

Dataset tags

  • geospatial
  • synthetic
  • overpassql
