
JayHyeon/distilabel_shp_test

Hugging Face | Updated 2024-07-12 | Indexed 2024-07-13
Download link:
https://hf-mirror.com/datasets/JayHyeon/distilabel_shp_test
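Overview:
This dataset was created with the distilabel tool and includes a pipeline.yaml file that can be used to reproduce the pipeline that generated it. Its features are instruction, generations, ratings, and rationales. The dataset structure is shown in JSON format, there is a single default configuration, and the dataset can be loaded with the Hugging Face datasets library.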

This dataset, generated with distilabel, has four features: instruction (string), generations (sequence of strings), ratings (sequence of integers), and rationales (sequence of strings). It consists of a single train split with 4 examples, totaling 9175 bytes. The dataset is used for analyzing and evaluating generated text.
Provider:
JayHyeon
Summary of original information

Dataset Card for distilabel_shp_test

Dataset Summary

  • Size Categories: n<1K
  • Download Size: 17746 bytes
  • Dataset Size: 9175 bytes
  • Tags:
    • synthetic
    • distilabel
    • rlaif

Dataset Structure

Features

  • instruction: string
  • generations: sequence of string
  • ratings: sequence of int64
  • rationales: sequence of string
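
The feature list above can be read as a per-record schema. The following is a minimal sketch, assuming rows are plain Python dicts shaped like this dataset's records. `validate_record` is a hypothetical helper, not part of distilabel or the datasets library, and the equal-length check is an assumption drawn from the example record rather than something the card states.

```python
from typing import Any

# Expected element type for each sequence feature listed on the card.
SEQUENCE_FEATURES = {"generations": str, "ratings": int, "rationales": str}


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the record fits."""
    problems = []
    if not isinstance(record.get("instruction"), str):
        problems.append("instruction must be a string")
    for name, item_type in SEQUENCE_FEATURES.items():
        values = record.get(name)
        if not isinstance(values, list):
            problems.append(f"{name} must be a list")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if any(isinstance(v, bool) or not isinstance(v, item_type) for v in values):
            problems.append(f"{name} must contain only {item_type.__name__} values")
    # Assumption: one rating and one rationale per generation.
    lengths = {
        len(record[name])
        for name in SEQUENCE_FEATURES
        if isinstance(record.get(name), list)
    }
    if len(lengths) > 1:
        problems.append("generations, ratings and rationales differ in length")
    return problems
```

Returning a list of problems instead of raising keeps the helper usable for a quick scan over all rows.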

Splits

  • train:
    • num_bytes: 9175
    • num_examples: 4

Configurations

  • default:
    • data_files:
      • split: train
      • path: data/train-*

Example Structure

```json
{
  "generations": [
    "Currently wrapping up my PhD. There is a stark difference in work balance life between students in my lab who are focused on industry and those focused on academia. The ones in academia feel an immense stress to get high level publications (some staying 8+ years to try to push something into nature/science). The competition has become cut throat. This is a trend not just in America but in Europe, Asia and middle east. International graduate students tell me in China go back 20 years, having any ACS publication from american university is enough to get professorship. Now you better come stacked with publications and at least one nature/science. American universities are even more competitive. How many publications, how many conferences, how many patents...",
    "It’s ironic to me that research has shown that productivity isn’t all it’s cracked up to be yet here we are."
  ],
  "instruction": "In an interview right before receiving the 2013 Nobel prize in physics, Peter Higgs stated that he wouldn't be able to get an academic job today, because he wouldn't be regarded as productive enough. > By the time he retired in 1996, he was uncomfortable with the new academic culture. \"After I retired it was quite a long time before I went back to my department. I thought I was well out of it. It wasn't my way of doing things any more. Today I wouldn't get an academic job. It's as simple as that. I don't think I would be regarded as productive enough.\" Another interesting quote from the article is the following: > He doubts a similar breakthrough could be achieved in today's academic culture, because of the expectations on academics to collaborate and keep churning out papers. He said: \"It's difficult to imagine how I would ever have enough peace and quiet in the present sort of climate to do what I did in 1964.\"\nSource (the whole article is pretty interesting): http://theguardian.com/science/2013/dec/06/peter-higgs-boson-academic-system",
  "ratings": [
    4,
    2
  ],
  "rationales": [
    "The text provides accurate and relevant information about the academic culture, aligning with the given instruction. It highlights the stress and competition faced by students in academia to get high-level publications, which is a valid concern. The mention of international graduate students and the requirement for publications in China adds a useful perspective. The text is well-structured and easy to follow, with no apparent hallucinations or inaccuracies.",
    "The text seems to partially address the instruction, mentioning the"
  ]
}
```
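
Because each record pairs several generations with ratings, one possible use is to turn a record into a chosen/rejected preference pair. This is a sketch under assumptions, not the dataset author's stated method: `to_preference_pair` and the output keys `prompt`, `chosen`, and `rejected` are hypothetical names, and skipping `None` ratings is a defensive assumption about the data.

```python
from typing import Any, Optional


def to_preference_pair(record: dict[str, Any]) -> Optional[dict[str, str]]:
    """Pick the highest- and lowest-rated generations as chosen/rejected."""
    # Pair each generation with its rating, skipping missing ratings.
    rated = [
        (rating, text)
        for text, rating in zip(record["generations"], record["ratings"])
        if rating is not None
    ]
    if len(rated) < 2:
        return None  # not enough rated generations to compare
    best = max(rated, key=lambda pair: pair[0])
    worst = min(rated, key=lambda pair: pair[0])
    if best[0] == worst[0]:
        return None  # all ratings tie, so there is no preference signal
    return {
        "prompt": record["instruction"],
        "chosen": best[1],
        "rejected": worst[1],
    }
```

Applied to the example record, whose ratings are 4 and 2, this would select the first generation as `chosen` and the second as `rejected`.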

Loading the Dataset

```python
from datasets import load_dataset

ds = load_dataset("JayHyeon/distilabel_shp_test", "default")
```

Or simply:

```python
from datasets import load_dataset

ds = load_dataset("JayHyeon/distilabel_shp_test")
```
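
Once loaded, the train split can be iterated row by row. The sketch below is illustrative only: `summarize` and `load_train_rows` are hypothetical helpers, and filtering out `None` ratings is an assumption about the data. `summarize` accepts any iterable of dict rows, so it can be tried offline on hand-written records.

```python
from statistics import mean
from typing import Any, Iterable


def summarize(rows: Iterable[dict[str, Any]]) -> dict[str, float]:
    """Count rows and generations and average the available ratings."""
    n_rows = 0
    n_generations = 0
    ratings = []
    for row in rows:
        n_rows += 1
        n_generations += len(row["generations"])
        # Keep only ratings that are present.
        ratings.extend(r for r in row["ratings"] if r is not None)
    return {
        "rows": n_rows,
        "generations": n_generations,
        "mean_rating": mean(ratings) if ratings else float("nan"),
    }


def load_train_rows():
    """Load the train split; needs the datasets package and network access."""
    # Imported lazily so summarize() stays usable without the library.
    from datasets import load_dataset

    return load_dataset("JayHyeon/distilabel_shp_test", split="train")
```

Calling `summarize(load_train_rows())` would be expected to report 4 rows, matching the `num_examples` value in the split metadata.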
