
lilacai/lilac-squad_v2

Source: Hugging Face · Updated: 2023-09-26 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/lilacai/lilac-squad_v2
Description:
This dataset is generated by [Lilac](http://lilacml.com) for a HuggingFace Space: [huggingface.co/spaces/lilacai/lilac](https://huggingface.co/spaces/lilacai/lilac).

Original dataset: [https://huggingface.co/datasets/squad_v2](https://huggingface.co/datasets/squad_v2)

Lilac dataset config:

```
embeddings:
- {embedding: gte-small, path: context}
name: squad_v2
namespace: local
settings:
  preferred_embedding: gte-small
  ui:
    media_paths:
    - context
    - question
    - [answers, text, '*']
signals:
- path: context
  signal: {signal_name: text_statistics}
- path: context
  signal: {signal_name: pii}
- path: context
  signal: {signal_name: near_dup}
- path: question
  signal: {signal_name: spacy_ner}
- path: question
  signal: {signal_name: pii}
- path: [answers, text, '*']
  signal: {signal_name: pii}
- path: [answers, text, '*']
  signal: {signal_name: spacy_ner}
- path: [answers, text, '*']
  signal: {signal_name: near_dup}
- path: context
  signal: {signal_name: lang_detection}
- path: [answers, text, '*']
  signal: {signal_name: lang_detection}
- path: question
  signal: {signal_name: near_dup}
- path: question
  signal: {signal_name: lang_detection}
- path: [answers, text, '*']
  signal: {signal_name: text_statistics}
- path: question
  signal: {signal_name: text_statistics}
- path: context
  signal: {signal_name: spacy_ner}
- path: context
  signal: {concept_name: question, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: non-english, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: positive-sentiment, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: negative-sentiment, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: legal-termination, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: source-code, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: toxicity, embedding: gte-small, namespace: lilac, signal_name: concept_score}
- path: context
  signal: {concept_name: profanity, embedding: gte-small, namespace: lilac, signal_name: concept_score}
source: {dataset_name: squad_v2, source_name: huggingface}
tags: [machine-learning]
```
Provider:
lilacai
Summary of original information

Dataset overview

Original dataset

Configuration details

  • Name: squad_v2
  • Namespace: local
  • Preferred embedding: gte-small
  • Media paths:
    • context
    • question
    • [answers, text, *]

Signal configuration

  • Path: context
    • Signal: text_statistics
    • Signal: pii
    • Signal: near_dup
    • Signal: lang_detection
    • Signal: spacy_ner
    • Concept scores:
      • question
      • non-english
      • positive-sentiment
      • negative-sentiment
      • legal-termination
      • source-code
      • toxicity
      • profanity
  • Path: question
    • Signal: spacy_ner
    • Signal: pii
    • Signal: near_dup
    • Signal: lang_detection
    • Signal: text_statistics
  • Path: [answers, text, *]
    • Signal: pii
    • Signal: spacy_ner
    • Signal: near_dup
    • Signal: lang_detection
    • Signal: text_statistics
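
The per-path grouping above can be derived mechanically from the flat `signals` list in the config. The following is a minimal sketch rather than Lilac's own code: it assumes the config has already been parsed into plain Python dictionaries, reproduces only four of the entries for brevity, and uses two hypothetical helpers, `path_key` and `group_signals`, that are not part of any library.

```python
from collections import defaultdict

# Excerpt of the `signals` list from the config, written as plain Python data.
# Only four of the original entries are reproduced here.
signals = [
    {"path": "context", "signal": {"signal_name": "text_statistics"}},
    {"path": "question", "signal": {"signal_name": "spacy_ner"}},
    {"path": ["answers", "text", "*"], "signal": {"signal_name": "pii"}},
    {
        "path": "context",
        "signal": {
            "signal_name": "concept_score",
            "concept_name": "toxicity",
            "embedding": "gte-small",
            "namespace": "lilac",
        },
    },
]


def path_key(path):
    """Hypothetical helper: render a string or list path as one dotted string."""
    return path if isinstance(path, str) else ".".join(path)


def group_signals(entries):
    """Hypothetical helper: map each path to the signal names configured on it."""
    grouped = defaultdict(list)
    for entry in entries:
        signal = entry["signal"]
        name = signal["signal_name"]
        # Concept scores share one signal name, so keep the concept visible.
        if name == "concept_score":
            name = f"concept_score:{signal['concept_name']}"
        grouped[path_key(entry["path"])].append(name)
    return dict(grouped)


if __name__ == "__main__":
    for path, names in group_signals(signals).items():
        print(path, "->", ", ".join(names))
```

Joining list paths with a dot is only a display choice made for this sketch; the config itself keeps them as YAML lists such as `[answers, text, '*']`.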

Source

  • Dataset name: squad_v2
  • Source name: huggingface

Tags

  • [machine-learning]