
nataliaElv/textclass_descriptives_vectors

Source: Hugging Face. Updated 2023-11-24; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/nataliaElv/textclass_descriptives_vectors
Resource description:
---
size_categories: 1K<n<10K
tags:
- rlfh
- argilla
- human-feedback
---

# Dataset Card for textclass_descriptives_vectors

This dataset has been created with [Argilla](https://docs.argilla.io).

As shown in the sections below, this dataset can be loaded into Argilla as explained in [Load with Argilla](#load-with-argilla), or used directly with the `datasets` library in [Load with `datasets`](#load-with-datasets).

## Dataset Description

- **Homepage:** https://argilla.io
- **Repository:** https://github.com/argilla-io/argilla
- **Paper:**
- **Leaderboard:**
- **Point of Contact:**

### Dataset Summary

This dataset contains:

* A dataset configuration file conforming to the Argilla dataset format named `argilla.yaml`. This configuration file will be used to configure the dataset when using the `FeedbackDataset.from_huggingface` method in Argilla.

* Dataset records in a format compatible with HuggingFace `datasets`. These records will be loaded automatically when using `FeedbackDataset.from_huggingface` and can be loaded independently using the `datasets` library via `load_dataset`.

* The [annotation guidelines](#annotation-guidelines) that have been used for building and curating the dataset, if they've been defined in Argilla.
### Load with Argilla

To load with Argilla, you'll just need to install Argilla with `pip install argilla --upgrade` and then use the following code:

```python
import argilla as rg

# Pull the records and the Argilla configuration from the Hugging Face Hub
ds = rg.FeedbackDataset.from_huggingface("nataliaElv/textclass_descriptives_vectors")
```

### Load with `datasets`

To load this dataset with `datasets`, you'll just need to install `datasets` with `pip install datasets --upgrade` and then use the following code:

```python
from datasets import load_dataset

# Load the records only, without the Argilla configuration
ds = load_dataset("nataliaElv/textclass_descriptives_vectors")
```

### Supported Tasks and Leaderboards

This dataset can contain [multiple fields, questions and responses](https://docs.argilla.io/en/latest/conceptual_guides/data_model.html#feedback-dataset), so it can be used for different NLP tasks, depending on the configuration. The dataset structure is described in the [Dataset Structure section](#dataset-structure).

There are no leaderboards associated with this dataset.

### Languages

[More Information Needed]

## Dataset Structure

### Data in Argilla

The dataset is created in Argilla with: **fields**, **questions**, **suggestions**, **metadata**, **vectors**, and **guidelines**.

The **fields** are the dataset records themselves; for the moment, only text fields are supported. These are the ones that will be used to provide responses to the questions.

| Field Name | Title | Type | Required | Markdown |
| ---------- | ----- | ---- | -------- | -------- |
| prompt | Prompt | text | True | True |
| context | Context | text | False | True |

The **questions** are the questions that will be asked to the annotators. They can be of different types, such as rating, text, label_selection, multi_label_selection, or ranking.

| Question Name | Title | Type | Required | Description | Values/Labels |
| ------------- | ----- | ---- | -------- | ----------- | ------------- |
| class | Classify the instruction according to its class | label_selection | True | N/A | ['closed_qa', 'classification', 'open_qa', 'information_extraction', 'brainstorming', 'general_qa', 'summarization', 'creative_writing'] |
| response | Response | text | True | N/A | N/A |

The **suggestions** are human- or machine-generated recommendations for each question, meant to assist the annotator during the annotation process. They are always linked to the existing questions and are named by appending "-suggestion" and "-suggestion-metadata" to the question name; these hold the value(s) of the suggestion and its metadata, respectively. The possible values are therefore the same as in the table above, but the column name is appended with "-suggestion" and the metadata column with "-suggestion-metadata".

The **metadata** is a dictionary that can be used to provide additional information about the dataset record. This can be useful to give the annotators extra context, for example a link to the original source of the record, or details such as the author, the date, or the source. The metadata is always optional, and can potentially be linked to the `metadata_properties` defined in the dataset configuration file in `argilla.yaml`.

**✨ NEW** The **vectors** are different columns that each contain a vector of floating-point values. Each vector is constrained to the dimensions pre-defined in the **vectors_settings** when configuring the vectors within the dataset itself, and it is always 1-dimensional. The **vectors** are optional and identified by the pre-defined vector name in the dataset configuration file in `argilla.yaml`.
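
To make the dimension constraint on vectors concrete, here is a minimal sketch of checking a vector against its configured size and comparing two vectors with cosine similarity. This is not Argilla's own validation code: the helper names `check_dimensions` and `cosine_similarity`, the `EXPECTED_DIM` constant, and the toy 4-dimensional vectors are assumptions made for illustration, and only the Python standard library is used.

```python
import math

# Assumption: 768 is the size configured for the `prompt-similarity` vector
# in this dataset; adjust it for other vector settings.
EXPECTED_DIM = 768


def check_dimensions(vector, expected_dim=EXPECTED_DIM):
    """Return True if `vector` is a flat list of numbers with `expected_dim` entries."""
    return len(vector) == expected_dim and all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector
    )


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors with hypothetical values, kept short for readability.
v1 = [0.1, 0.2, 0.3, 0.4]
v2 = [0.1, 0.2, 0.3, 0.4]
v3 = [-0.4, 0.3, -0.2, 0.1]

print(check_dimensions(v1, expected_dim=4))  # the toy vector has 4 entries
print(cosine_similarity(v1, v2))             # identical vectors: close to 1
print(cosine_similarity(v1, v3))             # orthogonal vectors: close to 0
```

With real records, the same helpers could be applied to the list stored under a record's `prompt-similarity` key, for instance to find prompts that are close to each other.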

| Vector Name | Title | Dimensions |
|-------------|-------|------------|
| prompt-similarity | prompt-similarity | [1, 768] |

The metadata properties defined for this dataset are:

| Metadata Name | Title | Type | Values | Visible for Annotators |
| ------------- | ----- | ---- | ------ | ---------------------- |
| n_characters | No. of characters | integer | None - None | True |
| passed_quality_check | Passed quality check? | terms | - | True |
| flesch_reading_ease | Reading ease | float | None - None | True |
| entropy | Entropy | float | None - None | True |

The **guidelines** are optional as well, and are just a plain string that can be used to provide instructions to the annotators. Find those in the [annotation guidelines](#annotation-guidelines) section.

### Data Instances

An example of a dataset instance in Argilla looks as follows:

```json
{ "external_id": null, "fields": { "prompt": "Can brain cells move? By movement I mean long distance migration (preferably within the brain only)." }, "metadata": { "entropy": 0.4352176404374839, "flesch_reading_ease": 82.39000000000001, "n_characters": 85, "passed_quality_check": "True" }, "responses": [], "suggestions": [ { "agent": "setfit", "question_name": "class", "score": 0.2823514938354492, "type": null, "value": "brainstorming" }, { "agent": null, "question_name": "response", "score": null, "type": null, "value": "The question is relatively broad and one should take into account that the brain not only consists of neurons, but also glial cells (supportive cells) and pre-mitotic neuronal stem cells. Furthermore, as critical fellow-scientists have indicated, developmental stage is very important, as the developing embryonic brain is very different from the adult brain.\nHowever, after sifting through various publications, the answer to the question is actually remarkably simple: Yes, brain cells migrate.\nIn the adult brain glial cells migrate in the brain (Kl\u00e4mbt, 2009). Glial cells are involved in a myriad of functions, but a notable example of migrating glial cells are the oligodendrocytes that migrate relative long distances to find their target axons onto which they wrap themselves to form the insulating myelin sheath (Tsai and Miller, 2002).\nNeuronal stem cells migrate over long distances in response to injury (Imitola et al., 2004) and they migrate from specific stem-cell locations (e.g., hippocampus and subventricular zone) to other regions (Clarke, 2003).\nPost-mitotic, but non-differentiated neurons have been shown to migrate in the adult brain in fish (Scott et al., 2012), and in mammals and non-human primates as well (Sawada et al., 2011).\nNot surprisingly, glial cells, stem cells and neurons also migrate during embryonic development. Most notably, post-mitotic neurons destined to fulfill peripheral functions have to migrate over relatively long distances from the neural crest to their target locations (Neuroscience, 2nd ed, Neuronal Migration)." 
} ], "vectors": { "prompt-similarity": [ -0.013013245537877083, 0.01881960965692997, 0.018717532977461815, -0.014981311745941639, 0.03672853484749794, -0.015297300182282925, 0.031154541298747063, 0.009528533555567265, -0.031607501208782196, -0.039829764515161514, -0.019534926861524582, -0.019294919446110725, -0.047140125185251236, 0.03812485188245773, -0.018894944339990616, 0.039123568683862686, 0.03436238318681717, -0.007996739819645882, 0.013651853427290916, -0.016834214329719543, -0.02929615043103695, 0.002512674080207944, 0.008257705718278885, 0.03932825103402138, 0.031019780784845352, -0.028575727716088295, -0.022710563614964485, 0.0132012739777565, -0.048433348536491394, -0.02651829645037651, 0.01601981930434704, -0.006484998855739832, -0.07150214165449142, -0.010764969512820244, 0.00407565338537097, -0.007564086001366377, -0.015640858560800552, -0.012789258733391762, 0.00717244204133749, -0.051655009388923645, -0.030335327610373497, 0.007193537428975105, -0.020686019212007523, 0.016904372721910477, -0.057382386177778244, 0.020192697644233704, -0.0621950700879097, 0.0034242896363139153, -0.04375811666250229, -0.012516515329480171, -0.04787379130721092, 0.05757446959614754, 0.045590516179800034, -0.019442711025476456, 0.02614322304725647, 0.022066324949264526, -0.017174094915390015, -0.03904383257031441, -0.014966102316975594, -0.04261021316051483, 0.06123539060354233, 0.01483749970793724, -0.009737796150147915, -0.021765291690826416, -0.001423536567017436, -0.04854138195514679, 0.03245295211672783, 0.02051699534058571, -0.05414895340800285, -0.03563692420721054, -0.0506395623087883, -0.06071240082383156, -0.017511913552880287, 0.006278000771999359, 0.009547360241413116, -0.05603624880313873, -0.0038324843626469374, 0.012652688659727573, 0.06399084627628326, 0.01680467091500759, 0.030588308349251747, 0.023556867614388466, -0.04122614115476608, 0.06281794607639313, 0.002343484666198492, -0.03668874129652977, -0.01711929589509964, -4.190538675175048e-06, 
-0.05742541700601578, 0.04727115109562874, -0.04583971947431564, -0.01956474594771862, 0.02877974882721901, 0.05513108894228935, 0.015185099095106125, -0.006118557415902615, 0.0272984616458416, -0.02677239291369915, -0.009623365476727486, 0.05534995347261429, -0.02598058618605137, -0.04715755954384804, -0.022215673699975014, -0.009219354949891567, -0.05435849353671074, -0.03680011257529259, -0.008128424175083637, -0.029657825827598572, 0.022026637569069862, -0.012166539207100868, -0.025011586025357246, -0.02193683199584484, -0.00693196477368474, 0.006336281541734934, -0.043086495250463486, 0.05915242061018944, 0.02211538702249527, -0.023119445890188217, 0.007697188761085272, -0.0552712008357048, 0.03299417346715927, 0.05157257989048958, -0.03600669652223587, 0.044204846024513245, 0.025432858616113663, 0.007447212003171444, 0.006279517896473408, 0.03376108407974243, -0.040294621139764786, -0.058066226541996, 0.012761987745761871, 0.04904710873961449, -0.012213962152600288, -0.013692168518900871, 0.027355555444955826, -0.0023957074154168367, 0.028188826516270638, -0.027611739933490753, 0.029400011524558067, 0.0013150176964700222, 0.0362129732966423, 0.012163455598056316, 0.03474310413002968, -0.007054436486214399, 0.02536170184612274, -0.07868500053882599, -0.04395574703812599, -0.04243417829275131, 0.002584034577012062, -0.0005564193706959486, -0.019545502960681915, 0.05276765301823616, 0.0394630953669548, -0.057229649275541306, -0.01710808463394642, 0.05301479622721672, -0.03010011836886406, 0.03373352438211441, -0.04287588968873024, -0.006589761935174465, 0.02951083518564701, -0.019792240113019943, 0.012560124509036541, -0.022978615015745163, -0.01804402843117714, -0.01765276864171028, 0.050604935735464096, -0.031133880838751793, -0.03520930930972099, 0.06622219830751419, -0.04686705023050308, 0.01252678595483303, 0.06677322834730148, 0.0012780202087014914, -0.007755340542644262, -0.002916350495070219, 0.062082815915346146, -0.003067526500672102, 
0.006080616265535355, -0.036430295556783676, -0.06199180707335472, 0.02642948180437088, -0.00425749970600009, 0.025306515395641327, -0.0014685469213873148, -0.028660226613283157, 0.052989762276411057, -0.01557255256921053, 0.009855816140770912, -0.0121422428637743, -0.03747929632663727, -0.08137062191963196, 0.007190469186753035, 0.011331912130117416, 0.06765188276767731, -0.022611519321799278, -0.02787146158516407, 0.05748944729566574, 0.00487024150788784, 0.039478056132793427, 0.01931411400437355, 0.013803835026919842, 0.04888024553656578, -0.037333935499191284, -0.027693377807736397, 0.059805672615766525, 0.03614082559943199, 0.005785312503576279, 0.013619908131659031, 0.05161786451935768, -0.00884980708360672, 0.010016173124313354, 0.042678751051425934, -0.027733702212572098, 0.027968743816018105, -0.037427231669425964, -0.002935838419944048, -0.01202351227402687, 0.006725606042891741, -0.07508431375026703, -0.0060306512750685215, 0.008263292722404003, -0.025336965918540955, 0.04014277085661888, 0.008093785494565964, 0.08171582221984863, 0.07616759836673737, -0.0771564468741417, 0.022446291521191597, 0.008821032010018826, 0.013829128816723824, 0.02364560402929783, -0.0022572220768779516, 0.03746487572789192, -0.005879886448383331, 0.008362085558474064, -0.013305987231433392, -0.06773458421230316, 0.047247979789972305, -0.054940834641456604, 0.006651178002357483, 0.04406357184052467, 0.0032514971680939198, 0.06607890874147415, -0.023339349776506424, -0.015506909228861332, 0.056580446660518646, -0.013175010681152344, -0.009680991992354393, 0.003048372222110629, -0.02173807844519615, -0.03575072064995766, 0.0034152292646467686, 0.0023930943571031094, 0.032616451382637024, -0.08494752645492554, -0.04464119300246239, -0.008594084531068802, 0.07189679890871048, 0.039310749620199203, -0.0032280997838824987, 0.0571722686290741, 0.031821854412555695, -0.018074551597237587, -0.05658836290240288, -0.10419323295354843, -0.038979772478342056, -0.004710170906037092, 
0.06021471694111824, 0.02279377542436123, 0.06624987721443176, -0.0021200855262577534, 0.02761155366897583, 9.02639476407785e-06, -0.021869199350476265, 0.024204667657613754, 0.06580100208520889, 0.002844455884769559, -0.01991298981010914, -0.0200088731944561, 0.02950236387550831, 0.06952787935733795, -0.017109204083681107, -0.029190661385655403, 0.022067055106163025, -0.05215190351009369, -0.002498551970347762, -0.003893302520737052, -0.004048035945743322, 0.044902484863996506, 0.01182111818343401, 0.014091513119637966, 0.007183252368122339, 0.035346873104572296, -0.005363106727600098, 0.05331592261791229, 0.04623641446232796, -0.01476075779646635, -0.010740607045590878, -0.019701674580574036, 0.00595542136579752, 0.03692961856722832, 0.012378417886793613, -0.022257760167121887, 0.003160405671223998, -1.8131876231564092e-06, -0.017647042870521545, -0.03700786456465721, -0.24109095335006714, 0.006522865034639835, -0.0008469457970932126, -0.03644183278083801, 0.017320087179541588, 0.01328502781689167, 0.003192389849573374, -0.028336772695183754, -0.03504892438650131, -0.0014239358715713024, -0.03514610975980759, 0.022008158266544342, -0.011342125944793224, 0.05192045867443085, 0.03085877001285553, -0.025241609662771225, 0.0237770676612854, -0.05109399929642677, -0.010781534016132355, 0.0020606154575943947, -0.04335577413439751, -0.028212837874889374, 0.0002747350081335753, 0.046457286924123764, 0.010325346142053604, 0.08826259523630142, -0.043199118226766586, -0.010338421911001205, -0.06027568131685257, 0.009151126258075237, -0.01782579906284809, -0.027093859389424324, 0.007199855055660009, -0.019019782543182373, 0.022030359134078026, -0.010693224146962166, 0.0009507028153166175, -0.026087958365678787, 0.024485325440764427, -0.04338093847036362, -0.04680050536990166, -0.03561573103070259, -0.02055582031607628, 0.0038633362855762243, 0.06559355556964874, -0.023061249405145645, -0.017895730212330818, 0.0038954829797148705, 0.008263446390628815, 0.04940579831600189, 
-0.008470145985484123, -0.0014497878728434443, -0.0061887046322226524, 0.03428115323185921, -0.0007602313999086618, -0.009981812909245491, 0.027376258745789528, 0.026810050010681152, -0.03568948805332184, -0.0058975000865757465, 0.02460271678864956, -0.01275318767875433, -0.03641323372721672, -0.044666923582553864, 0.029698815196752548, -0.03262021392583847, -0.02356722205877304, -0.04117002710700035, 0.0848817452788353, -0.004286558832973242, -0.018582580611109734, 0.013618958182632923, -0.03509534150362015, -0.06519659608602524, 0.028257008641958237, 0.021286210045218468, -0.06835642457008362, -0.054849766194820404, -0.01941634714603424, 0.035323113203048706, -0.025973310694098473, 0.002146123442798853, 0.026771889999508858, 0.05470979958772659, -0.03781023249030113, -0.04531051591038704, 0.012180115096271038, 0.0009777187369763851, -0.0416688397526741, -0.013594291172921658, 0.09633821249008179, 0.00042126362677663565, 0.02082621492445469, -0.011436634697020054, 0.052587978541851044, 0.04485282301902771, -0.011207791976630688, -0.028182996436953545, 0.028562700375914574, -0.0452943854033947, 0.06573814153671265, -0.04766593873500824, 0.029138406738638878, -0.014932483434677124, 0.012515360489487648, -0.008935957215726376, -0.05353805422782898, 0.026841312646865845, 0.03796624764800072, 0.012656201608479023, 0.03330421447753906, 0.011739440262317657, 0.030942635610699654, -0.04102332144975662, 0.015347322449088097, -0.05560077726840973, 0.008390153758227825, 0.07054135203361511, 0.028721380978822708, 0.0028039051685482264, -0.020784109830856323, 0.009438532404601574, -0.0605308897793293, -0.01866653747856617, -0.06967351585626602, 0.03392767161130905, 0.006826978642493486, 0.025683172047138214, -0.0034906533546745777, 0.029044777154922485, -0.015162697061896324, 0.0038685882464051247, 0.0499376617372036, 0.02318284660577774, 0.010678326711058617, -0.014715512283146381, -0.042784977704286575, -0.002209000289440155, -0.014008396305143833, -0.028120383620262146, 
0.0026574472431093454, 0.030087493360042572, 0.03461616113781929, 0.03625616058707237, -0.011008461937308311, 0.043217092752456665, -0.045464660972356796, 0.022507434710860252, -0.02420778200030327, -0.002824041061103344, 0.028755616396665573, -0.04187369719147682, -0.015139559283852577, -0.053725019097328186, -0.025201475247740746, -0.012609651312232018, 0.04252387210726738, 0.02392260916531086, 0.016753822565078735, -0.03215314820408821, -0.01936139352619648, -0.046136122196912766, -0.005073823034763336, 0.008640735410153866, -0.009679833427071571, 0.07807573676109314, -0.012567133642733097, -0.031146127730607986, -0.026593416929244995, 0.026098934933543205, 0.024264968931674957, -0.0075249760411679745, -0.06842546164989471, 0.03510553762316704, -0.006868013646453619, 0.01947402022778988, -0.029724987223744392, -0.03539305925369263, 0.028799021616578102, 0.030593188479542732, 0.03373757004737854, -0.028323186561465263, -0.005245779640972614, 0.0025080086197704077, 0.06109020859003067, -0.0414900928735733, 0.05396903306245804, -0.047728512436151505, -0.017351394519209862, 0.02362070232629776, -0.007311966270208359, 0.028682058677077293, -0.014722640626132488, -0.007481182459741831, -0.035072099417448044, -0.021136067807674408, 0.019015248864889145, 0.008854486048221588, -0.0005861225072294474, -0.012599045410752296, 0.0175931416451931, -0.04479547217488289, -0.008386379107832909, 0.03618542104959488, 0.01628889888525009, -0.08031677454710007, 0.039770182222127914, 0.041299525648355484, -0.008586069568991661, 0.038849104195833206, -0.019013259559869766, 0.015810709446668625, -0.026148298755288124, 0.03409867733716965, 0.012881561182439327, 0.0007065649842843413, -0.010571092367172241, -0.04538531228899956, -0.005888957995921373, 0.010284706018865108, -0.00910396408289671, 0.0024551369715481997, -0.028111808001995087, -0.056267447769641876, -0.03570198640227318, 0.0007470435812138021, -0.03200932964682579, 3.1971394491847605e-05, 0.07073836773633957, 
-0.025731729343533516, 0.016087668016552925, -0.019969554618000984, -0.02380352094769478, 0.07783369719982147, -0.0077037508599460125, -0.026075275614857674, 0.03502178564667702, -0.005804023705422878, -0.015163084492087364, 0.06934002041816711, 0.0368470698595047, 0.017380570992827415, -0.03955657035112381, -0.028987567871809006, 0.027637561783194542, 0.04501322656869888, -0.026961492374539375, 0.00020521112310234457, -0.0452781617641449, 0.049811046570539474, 0.028363030403852463, 0.004181100055575371, 0.0021030332427471876, -0.015064270235598087, 0.05535869300365448, -0.029472526162862778, -0.04478950425982475, 0.0027753578033298254, -0.004514075815677643, -0.023607026785612106, 0.023749861866235733, 0.01957106776535511, -0.024119185283780098, -0.01694166287779808, 0.04224187880754471, 0.017501620575785637, -0.004305294249206781, 0.018400326371192932, 0.044329140335321426, -0.06549150496721268, 0.008912339806556702, -0.03948299214243889, -0.03004170022904873, 0.0032710819505155087, -0.019911974668502808, 0.02723447047173977, -0.022703979164361954, 0.034845732152462006, 0.05078149959445, -0.06074056029319763, -0.01075307372957468, 0.07076920568943024, 0.0021933179814368486, -0.03962651267647743, 0.024789808318018913, -0.07408491522073746, 0.0247175469994545, -0.03231014311313629, -0.02483881451189518, 0.002730102278292179, 0.037088677287101746, -0.0033236793242394924, 0.005284950602799654, 0.014846455305814743, 0.03255154564976692, 0.02706083469092846, 0.049154844135046005, 0.06594257056713104, -0.02415977232158184, 0.026963576674461365, -0.07380963861942291, 0.06781016290187836, 0.018511293455958366, -0.015869174152612686, -0.038478851318359375, 0.0335836261510849, 0.02612367272377014, -0.06550119817256927, 0.01825067587196827, 0.013035713694989681, -0.008435440249741077, -0.08638200908899307, 0.05963002145290375, 0.024324510246515274, -0.02895611710846424, -0.04167400300502777, 0.04319422319531441, -0.05413385480642319, 0.015215273015201092, 
0.03725837171077728, -0.004908927250653505, -0.002934563672170043, 0.041528936475515366, 0.012155082076787949, 0.04147651046514511, 0.05855671316385269, -0.0299361739307642, 0.02512580342590809, 0.020929407328367233, 0.06349261105060577, 0.053939227014780045, 0.05713503807783127, -0.0038927458226680756, 0.07881465554237366, -0.012467852793633938, -0.034171897917985916, 0.020261041820049286, -0.0021278418134897947, -0.002377619966864586, 0.004330282565206289, 0.012825283221900463, 0.04088682681322098, 0.008562165312469006, 0.0359053835272789, -0.053358469158411026, 0.011921711266040802, 0.020781131461262703, 0.036604978144168854, 0.03237057104706764, 0.027678076177835464, 0.025395873934030533, 0.024215875193476677, -0.02316826581954956, -0.049021363258361816, -0.005335877649486065, -0.04324529692530632, 0.033709343522787094, 0.009520786814391613, -0.06291788816452026, 0.016032546758651733, -0.017273124307394028, 0.03564963862299919, 0.06645374745130539, 0.0019759878050535917, 0.04844486713409424, -0.033923204988241196, 0.03365401178598404, -0.03546270355582237, 0.017526622861623764, 0.05221246927976608, 0.027283355593681335, 0.00947093591094017, -0.027012217789888382, -0.001877183560281992, 0.016856137663125992, 0.013093618676066399, 0.025977004319429398, -0.06342248618602753, -0.002382427453994751, 0.02860536240041256, 0.05974981561303139, -0.03283765912055969, -0.04812508821487427, -0.05995623767375946, -0.037662360817193985, -0.035185620188713074, -0.01508689671754837, 0.035811878740787506, -0.052011068910360336, -0.059904687106609344, -0.026118896901607513, -0.010637863539159298, -0.011021668091416359, -0.03290007635951042, -0.030089853331446648, -0.03142952546477318, 0.04359989985823631, 0.040401678532361984, 0.02362644672393799, 0.013705096207559109, 0.08372753113508224, -0.029495922848582268, -0.06889309734106064, 0.00678789708763361, -0.007068346720188856, 0.07379143685102463, -0.02387312427163124, -0.0024106407072395086, -0.08333039283752441, 
0.018529068678617477, 0.03415510058403015, 0.022234655916690826, -0.10251957923173904, 0.036007318645715714, -0.00660698814317584, 0.00572143355384469, 0.026509005576372147, -0.011688550002872944, -0.008342253975570202, -0.04845166578888893, -0.030434146523475647, 0.0014085661387071013, -0.03824504837393761, 0.06172807887196541, -0.03449011966586113, 0.07329946011304855, 0.029795274138450623, 0.026717940345406532, -0.045109957456588745, 0.024327795952558517, -0.008753367699682713, 0.01352944690734148, -0.023602385073900223, -0.036179229617118835, -0.008612464182078838, -0.12454637885093689, -0.016345543786883354, -0.012179647572338581, -0.02734498679637909, -0.05160606652498245, 0.019233766943216324, -0.027092240750789642, 0.016395756974816322, -0.012205400504171848, -0.014156125485897064, -0.04153557866811752, -0.020725106820464134, -0.03977225720882416, -0.05970294773578644, -0.0023274689447134733, -0.0164078027009964, -0.021304765716195107, 0.053715966641902924, -0.017753545194864273, 0.010519351810216904, 0.004593766760081053, -0.03116416372358799, -0.027580147609114647, 0.0033015876542776823, 0.033720631152391434 ] } }
```

While the same record in HuggingFace `datasets` looks as follows:

```json
{ "class": [], "class-suggestion": "brainstorming", "class-suggestion-metadata": { "agent": "setfit", "score": 0.2823514938354492, "type": null }, "context": null, "external_id": null, "metadata": "{\"n_characters\": 85, \"passed_quality_check\": \"True\", \"flesch_reading_ease\": 82.39000000000001, \"entropy\": 0.4352176404374839}", "prompt": "Can brain cells move? By movement I mean long distance migration (preferably within the brain only).", "response": [], "response-suggestion": "The question is relatively broad and one should take into account that the brain not only consists of neurons, but also glial cells (supportive cells) and pre-mitotic neuronal stem cells. Furthermore, as critical fellow-scientists have indicated, developmental stage is very important, as the developing embryonic brain is very different from the adult brain.\nHowever, after sifting through various publications, the answer to the question is actually remarkably simple: Yes, brain cells migrate.\nIn the adult brain glial cells migrate in the brain (Kl\u00e4mbt, 2009). Glial cells are involved in a myriad of functions, but a notable example of migrating glial cells are the oligodendrocytes that migrate relative long distances to find their target axons onto which they wrap themselves to form the insulating myelin sheath (Tsai and Miller, 2002).\nNeuronal stem cells migrate over long distances in response to injury (Imitola et al., 2004) and they migrate from specific stem-cell locations (e.g., hippocampus and subventricular zone) to other regions (Clarke, 2003).\nPost-mitotic, but non-differentiated neurons have been shown to migrate in the adult brain in fish (Scott et al., 2012), and in mammals and non-human primates as well (Sawada et al., 2011).\nNot surprisingly, glial cells, stem cells and neurons also migrate during embryonic development. Most notably, post-mitotic neurons destined to fulfill peripheral functions have to migrate over relatively long distances from the neural crest to their target locations (Neuroscience, 2nd ed, Neuronal Migration).", "response-suggestion-metadata": { "agent": null, "score": null, "type": null }, "vectors": { "prompt-similarity": [ -0.013013245537877083, 0.01881960965692997, 0.018717532977461815, -0.014981311745941639, 0.03672853484749794, -0.015297300182282925, 0.031154541298747063, 0.009528533555567265, -0.031607501208782196, -0.039829764515161514, -0.019534926861524582, -0.019294919446110725, -0.047140125185251236, 0.03812485188245773, -0.018894944339990616, 0.039123568683862686, 0.03436238318681717, -0.007996739819645882, 0.013651853427290916, -0.016834214329719543, -0.02929615043103695, 0.002512674080207944, 0.008257705718278885, 0.03932825103402138, 0.031019780784845352, -0.028575727716088295, -0.022710563614964485, 0.0132012739777565, -0.048433348536491394, -0.02651829645037651, 0.01601981930434704, -0.006484998855739832, -0.07150214165449142, -0.010764969512820244, 0.00407565338537097, -0.007564086001366377, -0.015640858560800552, -0.012789258733391762, 0.00717244204133749, -0.051655009388923645, -0.030335327610373497, 0.007193537428975105, -0.020686019212007523, 0.016904372721910477, -0.057382386177778244, 0.020192697644233704, -0.0621950700879097, 0.0034242896363139153, -0.04375811666250229, -0.012516515329480171, -0.04787379130721092, 0.05757446959614754, 0.045590516179800034, -0.019442711025476456, 0.02614322304725647, 0.022066324949264526, -0.017174094915390015, -0.03904383257031441, -0.014966102316975594, -0.04261021316051483, 0.06123539060354233, 0.01483749970793724, -0.009737796150147915, -0.021765291690826416, -0.001423536567017436, -0.04854138195514679, 0.03245295211672783, 0.02051699534058571, -0.05414895340800285, -0.03563692420721054, -0.0506395623087883, -0.06071240082383156, -0.017511913552880287, 0.006278000771999359, 0.009547360241413116, 
-0.05603624880313873, -0.0038324843626469374, 0.012652688659727573, 0.06399084627628326, 0.01680467091500759, 0.030588308349251747, 0.023556867614388466, -0.04122614115476608, 0.06281794607639313, 0.002343484666198492, -0.03668874129652977, -0.01711929589509964, -4.190538675175048e-06, -0.05742541700601578, 0.04727115109562874, -0.04583971947431564, -0.01956474594771862, 0.02877974882721901, 0.05513108894228935, 0.015185099095106125, -0.006118557415902615, 0.0272984616458416, -0.02677239291369915, -0.009623365476727486, 0.05534995347261429, -0.02598058618605137, -0.04715755954384804, -0.022215673699975014, -0.009219354949891567, -0.05435849353671074, -0.03680011257529259, -0.008128424175083637, -0.029657825827598572, 0.022026637569069862, -0.012166539207100868, -0.025011586025357246, -0.02193683199584484, -0.00693196477368474, 0.006336281541734934, -0.043086495250463486, 0.05915242061018944, 0.02211538702249527, -0.023119445890188217, 0.007697188761085272, -0.0552712008357048, 0.03299417346715927, 0.05157257989048958, -0.03600669652223587, 0.044204846024513245, 0.025432858616113663, 0.007447212003171444, 0.006279517896473408, 0.03376108407974243, -0.040294621139764786, -0.058066226541996, 0.012761987745761871, 0.04904710873961449, -0.012213962152600288, -0.013692168518900871, 0.027355555444955826, -0.0023957074154168367, 0.028188826516270638, -0.027611739933490753, 0.029400011524558067, 0.0013150176964700222, 0.0362129732966423, 0.012163455598056316, 0.03474310413002968, -0.007054436486214399, 0.02536170184612274, -0.07868500053882599, -0.04395574703812599, -0.04243417829275131, 0.002584034577012062, -0.0005564193706959486, -0.019545502960681915, 0.05276765301823616, 0.0394630953669548, -0.057229649275541306, -0.01710808463394642, 0.05301479622721672, -0.03010011836886406, 0.03373352438211441, -0.04287588968873024, -0.006589761935174465, 0.02951083518564701, -0.019792240113019943, 0.012560124509036541, -0.022978615015745163, -0.01804402843117714, 
-0.01765276864171028, 0.050604935735464096, -0.031133880838751793, -0.03520930930972099, 0.06622219830751419, -0.04686705023050308, 0.01252678595483303, 0.06677322834730148, 0.0012780202087014914, -0.007755340542644262, -0.002916350495070219, 0.062082815915346146, -0.003067526500672102, 0.006080616265535355, -0.036430295556783676, -0.06199180707335472, 0.02642948180437088, -0.00425749970600009, 0.025306515395641327, -0.0014685469213873148, -0.028660226613283157, 0.052989762276411057, -0.01557255256921053, 0.009855816140770912, -0.0121422428637743, -0.03747929632663727, -0.08137062191963196, 0.007190469186753035, 0.011331912130117416, 0.06765188276767731, -0.022611519321799278, -0.02787146158516407, 0.05748944729566574, 0.00487024150788784, 0.039478056132793427, 0.01931411400437355, 0.013803835026919842, 0.04888024553656578, -0.037333935499191284, -0.027693377807736397, 0.059805672615766525, 0.03614082559943199, 0.005785312503576279, 0.013619908131659031, 0.05161786451935768, -0.00884980708360672, 0.010016173124313354, 0.042678751051425934, -0.027733702212572098, 0.027968743816018105, -0.037427231669425964, -0.002935838419944048, -0.01202351227402687, 0.006725606042891741, -0.07508431375026703, -0.0060306512750685215, 0.008263292722404003, -0.025336965918540955, 0.04014277085661888, 0.008093785494565964, 0.08171582221984863, 0.07616759836673737, -0.0771564468741417, 0.022446291521191597, 0.008821032010018826, 0.013829128816723824, 0.02364560402929783, -0.0022572220768779516, 0.03746487572789192, -0.005879886448383331, 0.008362085558474064, -0.013305987231433392, -0.06773458421230316, 0.047247979789972305, -0.054940834641456604, 0.006651178002357483, 0.04406357184052467, 0.0032514971680939198, 0.06607890874147415, -0.023339349776506424, -0.015506909228861332, 0.056580446660518646, -0.013175010681152344, -0.009680991992354393, 0.003048372222110629, -0.02173807844519615, -0.03575072064995766, 0.0034152292646467686, 0.0023930943571031094, 0.032616451382637024, 
-0.08494752645492554, -0.04464119300246239, -0.008594084531068802, 0.07189679890871048, 0.039310749620199203, -0.0032280997838824987, 0.0571722686290741, 0.031821854412555695, -0.018074551597237587, -0.05658836290240288, -0.10419323295354843, -0.038979772478342056, -0.004710170906037092, 0.06021471694111824, 0.02279377542436123, 0.06624987721443176, -0.0021200855262577534, 0.02761155366897583, 9.02639476407785e-06, -0.021869199350476265, 0.024204667657613754, 0.06580100208520889, 0.002844455884769559, -0.01991298981010914, -0.0200088731944561, 0.02950236387550831, 0.06952787935733795, -0.017109204083681107, -0.029190661385655403, 0.022067055106163025, -0.05215190351009369, -0.002498551970347762, -0.003893302520737052, -0.004048035945743322, 0.044902484863996506, 0.01182111818343401, 0.014091513119637966, 0.007183252368122339, 0.035346873104572296, -0.005363106727600098, 0.05331592261791229, 0.04623641446232796, -0.01476075779646635, -0.010740607045590878, -0.019701674580574036, 0.00595542136579752, 0.03692961856722832, 0.012378417886793613, -0.022257760167121887, 0.003160405671223998, -1.8131876231564092e-06, -0.017647042870521545, -0.03700786456465721, -0.24109095335006714, 0.006522865034639835, -0.0008469457970932126, -0.03644183278083801, 0.017320087179541588, 0.01328502781689167, 0.003192389849573374, -0.028336772695183754, -0.03504892438650131, -0.0014239358715713024, -0.03514610975980759, 0.022008158266544342, -0.011342125944793224, 0.05192045867443085, 0.03085877001285553, -0.025241609662771225, 0.0237770676612854, -0.05109399929642677, -0.010781534016132355, 0.0020606154575943947, -0.04335577413439751, -0.028212837874889374, 0.0002747350081335753, 0.046457286924123764, 0.010325346142053604, 0.08826259523630142, -0.043199118226766586, -0.010338421911001205, -0.06027568131685257, 0.009151126258075237, -0.01782579906284809, -0.027093859389424324, 0.007199855055660009, -0.019019782543182373, 0.022030359134078026, -0.010693224146962166, 0.0009507028153166175, 
-0.026087958365678787, 0.024485325440764427, -0.04338093847036362, -0.04680050536990166, -0.03561573103070259, -0.02055582031607628, 0.0038633362855762243, 0.06559355556964874, -0.023061249405145645, -0.017895730212330818, 0.0038954829797148705, 0.008263446390628815, 0.04940579831600189, -0.008470145985484123, -0.0014497878728434443, -0.0061887046322226524, 0.03428115323185921, -0.0007602313999086618, -0.009981812909245491, 0.027376258745789528, 0.026810050010681152, -0.03568948805332184, -0.0058975000865757465, 0.02460271678864956, -0.01275318767875433, -0.03641323372721672, -0.044666923582553864, 0.029698815196752548, -0.03262021392583847, -0.02356722205877304, -0.04117002710700035, 0.0848817452788353, -0.004286558832973242, -0.018582580611109734, 0.013618958182632923, -0.03509534150362015, -0.06519659608602524, 0.028257008641958237, 0.021286210045218468, -0.06835642457008362, -0.054849766194820404, -0.01941634714603424, 0.035323113203048706, -0.025973310694098473, 0.002146123442798853, 0.026771889999508858, 0.05470979958772659, -0.03781023249030113, -0.04531051591038704, 0.012180115096271038, 0.0009777187369763851, -0.0416688397526741, -0.013594291172921658, 0.09633821249008179, 0.00042126362677663565, 0.02082621492445469, -0.011436634697020054, 0.052587978541851044, 0.04485282301902771, -0.011207791976630688, -0.028182996436953545, 0.028562700375914574, -0.0452943854033947, 0.06573814153671265, -0.04766593873500824, 0.029138406738638878, -0.014932483434677124, 0.012515360489487648, -0.008935957215726376, -0.05353805422782898, 0.026841312646865845, 0.03796624764800072, 0.012656201608479023, 0.03330421447753906, 0.011739440262317657, 0.030942635610699654, -0.04102332144975662, 0.015347322449088097, -0.05560077726840973, 0.008390153758227825, 0.07054135203361511, 0.028721380978822708, 0.0028039051685482264, -0.020784109830856323, 0.009438532404601574, -0.0605308897793293, -0.01866653747856617, -0.06967351585626602, 0.03392767161130905, 0.006826978642493486, 
0.025683172047138214, -0.0034906533546745777, 0.029044777154922485, -0.015162697061896324, 0.0038685882464051247, 0.0499376617372036, 0.02318284660577774, 0.010678326711058617, -0.014715512283146381, -0.042784977704286575, -0.002209000289440155, -0.014008396305143833, -0.028120383620262146, 0.0026574472431093454, 0.030087493360042572, 0.03461616113781929, 0.03625616058707237, -0.011008461937308311, 0.043217092752456665, -0.045464660972356796, 0.022507434710860252, -0.02420778200030327, -0.002824041061103344, 0.028755616396665573, -0.04187369719147682, -0.015139559283852577, -0.053725019097328186, -0.025201475247740746, -0.012609651312232018, 0.04252387210726738, 0.02392260916531086, 0.016753822565078735, -0.03215314820408821, -0.01936139352619648, -0.046136122196912766, -0.005073823034763336, 0.008640735410153866, -0.009679833427071571, 0.07807573676109314, -0.012567133642733097, -0.031146127730607986, -0.026593416929244995, 0.026098934933543205, 0.024264968931674957, -0.0075249760411679745, -0.06842546164989471, 0.03510553762316704, -0.006868013646453619, 0.01947402022778988, -0.029724987223744392, -0.03539305925369263, 0.028799021616578102, 0.030593188479542732, 0.03373757004737854, -0.028323186561465263, -0.005245779640972614, 0.0025080086197704077, 0.06109020859003067, -0.0414900928735733, 0.05396903306245804, -0.047728512436151505, -0.017351394519209862, 0.02362070232629776, -0.007311966270208359, 0.028682058677077293, -0.014722640626132488, -0.007481182459741831, -0.035072099417448044, -0.021136067807674408, 0.019015248864889145, 0.008854486048221588, -0.0005861225072294474, -0.012599045410752296, 0.0175931416451931, -0.04479547217488289, -0.008386379107832909, 0.03618542104959488, 0.01628889888525009, -0.08031677454710007, 0.039770182222127914, 0.041299525648355484, -0.008586069568991661, 0.038849104195833206, -0.019013259559869766, 0.015810709446668625, -0.026148298755288124, 0.03409867733716965, 0.012881561182439327, 0.0007065649842843413, 
-0.010571092367172241, -0.04538531228899956, -0.005888957995921373, 0.010284706018865108, -0.00910396408289671, 0.0024551369715481997, -0.028111808001995087, -0.056267447769641876, -0.03570198640227318, 0.0007470435812138021, -0.03200932964682579, 3.1971394491847605e-05, 0.07073836773633957, -0.025731729343533516, 0.016087668016552925, -0.019969554618000984, -0.02380352094769478, 0.07783369719982147, -0.0077037508599460125, -0.026075275614857674, 0.03502178564667702, -0.005804023705422878, -0.015163084492087364, 0.06934002041816711, 0.0368470698595047, 0.017380570992827415, -0.03955657035112381, -0.028987567871809006, 0.027637561783194542, 0.04501322656869888, -0.026961492374539375, 0.00020521112310234457, -0.0452781617641449, 0.049811046570539474, 0.028363030403852463, 0.004181100055575371, 0.0021030332427471876, -0.015064270235598087, 0.05535869300365448, -0.029472526162862778, -0.04478950425982475, 0.0027753578033298254, -0.004514075815677643, -0.023607026785612106, 0.023749861866235733, 0.01957106776535511, -0.024119185283780098, -0.01694166287779808, 0.04224187880754471, 0.017501620575785637, -0.004305294249206781, 0.018400326371192932, 0.044329140335321426, -0.06549150496721268, 0.008912339806556702, -0.03948299214243889, -0.03004170022904873, 0.0032710819505155087, -0.019911974668502808, 0.02723447047173977, -0.022703979164361954, 0.034845732152462006, 0.05078149959445, -0.06074056029319763, -0.01075307372957468, 0.07076920568943024, 0.0021933179814368486, -0.03962651267647743, 0.024789808318018913, -0.07408491522073746, 0.0247175469994545, -0.03231014311313629, -0.02483881451189518, 0.002730102278292179, 0.037088677287101746, -0.0033236793242394924, 0.005284950602799654, 0.014846455305814743, 0.03255154564976692, 0.02706083469092846, 0.049154844135046005, 0.06594257056713104, -0.02415977232158184, 0.026963576674461365, -0.07380963861942291, 0.06781016290187836, 0.018511293455958366, -0.015869174152612686, -0.038478851318359375, 0.0335836261510849, 
0.02612367272377014, -0.06550119817256927, 0.01825067587196827, 0.013035713694989681, -0.008435440249741077, -0.08638200908899307, 0.05963002145290375, 0.024324510246515274, -0.02895611710846424, -0.04167400300502777, 0.04319422319531441, -0.05413385480642319, 0.015215273015201092, 0.03725837171077728, -0.004908927250653505, -0.002934563672170043, 0.041528936475515366, 0.012155082076787949, 0.04147651046514511, 0.05855671316385269, -0.0299361739307642, 0.02512580342590809, 0.020929407328367233, 0.06349261105060577, 0.053939227014780045, 0.05713503807783127, -0.0038927458226680756, 0.07881465554237366, -0.012467852793633938, -0.034171897917985916, 0.020261041820049286, -0.0021278418134897947, -0.002377619966864586, 0.004330282565206289, 0.012825283221900463, 0.04088682681322098, 0.008562165312469006, 0.0359053835272789, -0.053358469158411026, 0.011921711266040802, 0.020781131461262703, 0.036604978144168854, 0.03237057104706764, 0.027678076177835464, 0.025395873934030533, 0.024215875193476677, -0.02316826581954956, -0.049021363258361816, -0.005335877649486065, -0.04324529692530632, 0.033709343522787094, 0.009520786814391613, -0.06291788816452026, 0.016032546758651733, -0.017273124307394028, 0.03564963862299919, 0.06645374745130539, 0.0019759878050535917, 0.04844486713409424, -0.033923204988241196, 0.03365401178598404, -0.03546270355582237, 0.017526622861623764, 0.05221246927976608, 0.027283355593681335, 0.00947093591094017, -0.027012217789888382, -0.001877183560281992, 0.016856137663125992, 0.013093618676066399, 0.025977004319429398, -0.06342248618602753, -0.002382427453994751, 0.02860536240041256, 0.05974981561303139, -0.03283765912055969, -0.04812508821487427, -0.05995623767375946, -0.037662360817193985, -0.035185620188713074, -0.01508689671754837, 0.035811878740787506, -0.052011068910360336, -0.059904687106609344, -0.026118896901607513, -0.010637863539159298, -0.011021668091416359, -0.03290007635951042, -0.030089853331446648, -0.03142952546477318, 
0.04359989985823631, 0.040401678532361984, 0.02362644672393799, 0.013705096207559109, 0.08372753113508224, -0.029495922848582268, -0.06889309734106064, 0.00678789708763361, -0.007068346720188856, 0.07379143685102463, -0.02387312427163124, -0.0024106407072395086, -0.08333039283752441, 0.018529068678617477, 0.03415510058403015, 0.022234655916690826, -0.10251957923173904, 0.036007318645715714, -0.00660698814317584, 0.00572143355384469, 0.026509005576372147, -0.011688550002872944, -0.008342253975570202, -0.04845166578888893, -0.030434146523475647, 0.0014085661387071013, -0.03824504837393761, 0.06172807887196541, -0.03449011966586113, 0.07329946011304855, 0.029795274138450623, 0.026717940345406532, -0.045109957456588745, 0.024327795952558517, -0.008753367699682713, 0.01352944690734148, -0.023602385073900223, -0.036179229617118835, -0.008612464182078838, -0.12454637885093689, -0.016345543786883354, -0.012179647572338581, -0.02734498679637909, -0.05160606652498245, 0.019233766943216324, -0.027092240750789642, 0.016395756974816322, -0.012205400504171848, -0.014156125485897064, -0.04153557866811752, -0.020725106820464134, -0.03977225720882416, -0.05970294773578644, -0.0023274689447134733, -0.0164078027009964, -0.021304765716195107, 0.053715966641902924, -0.017753545194864273, 0.010519351810216904, 0.004593766760081053, -0.03116416372358799, -0.027580147609114647, 0.0033015876542776823, 0.033720631152391434 ] } }
```

### Data Fields

Among the dataset fields, we differentiate between the following:

* **Fields:** These are the dataset records themselves, for the moment just text fields are supported. These are the ones that will be used to provide responses to the questions.
    * **prompt** is of type `text`.
    * (optional) **context** is of type `text`.

* **Questions:** These are the questions that will be asked to the annotators. They can be of different types, such as `RatingQuestion`, `TextQuestion`, `LabelQuestion`, `MultiLabelQuestion`, and `RankingQuestion`.
    * **class** is of type `label_selection` with the following allowed values ['closed_qa', 'classification', 'open_qa', 'information_extraction', 'brainstorming', 'general_qa', 'summarization', 'creative_writing'].
    * **response** is of type `text`.

* **Suggestions:** As of Argilla 1.13.0, suggestions have been included to assist the annotators during the annotation process. Suggestions are linked to the existing questions, are always optional, and contain not just the suggestion itself, but also the metadata linked to it, if applicable.
    * (optional) **class-suggestion** is of type `label_selection` with the following allowed values ['closed_qa', 'classification', 'open_qa', 'information_extraction', 'brainstorming', 'general_qa', 'summarization', 'creative_writing'].
    * (optional) **response-suggestion** is of type `text`.

* **✨ NEW** **Vectors**: As of Argilla 1.19.0, vectors have been included to support similarity search, so that similar records can be explored through vector search powered by the configured search engine. The vectors are optional and cannot be seen within the UI; they are uploaded and used internally. Vectors must match the dimensions previously defined in their settings.
    * (optional) **prompt-similarity** is of type `float32` and has a dimension of (1, `768`).

Additionally, we also have two more fields that are optional and are the following:

* **metadata:** This is an optional field that can be used to provide additional information about the dataset record, either as extra context for the annotators or as details about the record itself, such as a link to the original source, the author, or the date. The metadata is always optional, and can be potentially linked to the `metadata_properties` defined in the dataset configuration file in `argilla.yaml`.
* **external_id:** This is an optional field that can be used to provide an external ID for the dataset record. This can be useful if you want to link the dataset record to an external resource, such as a database or a file.

### Data Splits

The dataset contains a single split, which is `train`.

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation guidelines

This is a supervised fine-tuning dataset that contains instructions. Please write the response to the instruction in the response field. Take the context into account when writing the response.

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[More Information Needed]

### Citation Information

[More Information Needed]

### Contributions

[More Information Needed]
Provided by:
nataliaElv
Summary of original information

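Dataset Card for textclass_descriptives_vectors

Dataset Description

  • Homepage: https://argilla.io
  • Repository: https://github.com/argilla-io/argilla
  • Paper:
  • Leaderboard:
  • Point of Contact:

Dataset Summary

This dataset contains:

  • A dataset configuration file named argilla.yaml that conforms to the Argilla dataset format. It is used to configure the dataset when loading it with Argilla's FeedbackDataset.from_huggingface method.
  • Dataset records in a format compatible with HuggingFace datasets. These records are loaded automatically by FeedbackDataset.from_huggingface and can also be loaded independently with load_dataset from the datasets library.
  • The annotation guidelines used to build and curate the dataset, if they have been defined in Argilla.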

Loading

Load with Argilla

Install Argilla (pip install argilla --upgrade), then use the following code:

import argilla as rg

ds = rg.FeedbackDataset.from_huggingface("nataliaElv/textclass_descriptives_vectors")

Load with datasets

Install the datasets library (pip install datasets --upgrade), then use the following code:

from datasets import load_dataset

ds = load_dataset("nataliaElv/textclass_descriptives_vectors")

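Supported Tasks and Leaderboards

This dataset can contain multiple fields, questions and responses, so it can be used for different NLP tasks depending on the configuration. The dataset structure is described in the Dataset Structure section.

There are no leaderboards associated with this dataset.

Languages

[More Information Needed]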

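Dataset Structure

Data in Argilla

The dataset is created in Argilla with: fields, questions, suggestions, metadata, vectors, and guidelines.

Fields

The fields are the dataset records themselves; for the moment only text fields are supported. These are the ones that will be used to provide responses to the questions.

| Field Name | Title | Type | Required | Markdown |
| ---------- | ----- | ---- | -------- | -------- |
| prompt | Prompt | text | True | True |
| context | Context | text | False | True |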

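Questions

The questions are the questions that will be asked to the annotators. They can be of different types, such as rating, text, label_selection, multi_label_selection, or ranking.

| Question Name | Title | Type | Required | Description | Values/Labels |
| ------------- | ----- | ---- | -------- | ----------- | ------------- |
| class | Classify the instruction according to its class | label_selection | True | N/A | [closed_qa, classification, open_qa, information_extraction, brainstorming, general_qa, summarization, creative_writing] |
| response | Response | text | True | N/A | N/A |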
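
The allowed values of the class question can be checked programmatically before a label is stored or used for training. The following is a minimal sketch: the label list is copied from the questions table, while the helper name `validate_class_label` is hypothetical and not part of Argilla.

```python
# Allowed labels for the "class" question, copied from the questions table.
CLASS_LABELS = [
    "closed_qa", "classification", "open_qa", "information_extraction",
    "brainstorming", "general_qa", "summarization", "creative_writing",
]


def validate_class_label(value: str) -> str:
    """Return the label unchanged if it is allowed, otherwise raise ValueError.

    Hypothetical helper for illustration only; it is not an Argilla API.
    """
    if value not in CLASS_LABELS:
        raise ValueError(f"{value!r} is not one of the allowed labels: {CLASS_LABELS}")
    return value
```

Calling `validate_class_label("brainstorming")` returns the label, while an unknown string such as `"translation"` raises `ValueError`.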

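Suggestions

Suggestions are human- or machine-generated recommendations that assist the annotators during the annotation process. They are always linked to an existing question and are named by appending "-suggestion" and "-suggestion-metadata" to the question name; these hold the suggested value and its metadata, respectively.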
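
The naming convention for suggestions can be illustrated with a short sketch. It assumes a record has already been flattened into a plain dictionary keyed by column name; that row layout and the helper names `suggestion_columns` and `get_suggestion` are assumptions made for illustration, while the example values are taken from the example record in this card.

```python
from typing import Any, Dict, Optional, Tuple


def suggestion_columns(question_name: str) -> Tuple[str, str]:
    """Build the two column names derived from a question name."""
    return f"{question_name}-suggestion", f"{question_name}-suggestion-metadata"


def get_suggestion(row: Dict[str, Any], question_name: str) -> Tuple[Optional[Any], Optional[Any]]:
    """Return (suggested value, suggestion metadata); either may be None."""
    value_col, meta_col = suggestion_columns(question_name)
    return row.get(value_col), row.get(meta_col)


# Hypothetical flattened row; values taken from the example record in this card.
example_row = {
    "prompt": "Can brain cells move?",
    "class-suggestion": "brainstorming",
    "class-suggestion-metadata": {"agent": "setfit", "score": 0.2823514938354492, "type": None},
}

value, meta = get_suggestion(example_row, "class")
```

Because suggestions are optional, the lookup uses `dict.get` so that a missing suggestion yields `None` instead of raising `KeyError`.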

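Metadata

The metadata is a dictionary that can be used to provide additional information about the dataset record. This can give the annotators extra context or describe the record itself. Metadata is always optional and can be linked to the metadata_properties defined in the dataset configuration file argilla.yaml.

Vectors

The vectors are separate columns containing floating-point numbers, with dimensions predefined when the dataset is configured. Vectors are optional and are identified by the vector names predefined in the dataset configuration file argilla.yaml.

| Vector Name | Title | Dimensions |
| ----------- | ----- | ---------- |
| prompt-similarity | prompt-similarity | [1, 768] |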
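
To make the role of the prompt-similarity vector concrete, the sketch below compares two vectors with cosine similarity and checks the expected dimension of 768. The card does not state which metric the search engine uses, so cosine similarity is an assumption here, and the helper names are hypothetical.

```python
import math
from typing import Sequence

PROMPT_SIMILARITY_DIM = 768  # dimension listed for the prompt-similarity vector


def check_dimension(vector: Sequence[float], expected: int = PROMPT_SIMILARITY_DIM) -> None:
    """Raise ValueError if the vector does not have the configured dimension."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

In practice one would pass the `prompt-similarity` lists of two records to `cosine_similarity`; values close to 1 indicate prompts whose embeddings point in nearly the same direction.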

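Metadata

| Metadata Name | Title | Type | Values | Visible for Annotators |
| ------------- | ----- | ---- | ------ | ---------------------- |
| n_characters | No. of characters | integer | None - None | True |
| passed_quality_check | Passed quality check? | terms | - | True |
| flesch_reading_ease | Reading ease | float | None - None | True |
| entropy | Entropy | float | None - None | True |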
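
These metadata properties can be used to select a subset of records. The sketch below works on plain dictionaries shaped like the example record in this card, where passed_quality_check is stored as the string "True". The function name `filter_records`, the threshold of 60.0, and the second record are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


def filter_records(
    records: Iterable[Dict[str, Any]],
    min_reading_ease: float = 60.0,  # arbitrary threshold chosen for this sketch
) -> List[Dict[str, Any]]:
    """Keep records that passed the quality check and are easy enough to read."""
    kept = []
    for record in records:
        metadata = record.get("metadata") or {}
        # "terms" metadata is stored as a string in the example record.
        passed = metadata.get("passed_quality_check") == "True"
        ease = metadata.get("flesch_reading_ease")
        if passed and ease is not None and ease >= min_reading_ease:
            kept.append(record)
    return kept


records = [
    {  # metadata values taken from the example record in this card
        "fields": {"prompt": "Can brain cells move?"},
        "metadata": {
            "entropy": 0.4352176404374839,
            "flesch_reading_ease": 82.39000000000001,
            "n_characters": 85,
            "passed_quality_check": "True",
        },
    },
    {  # hypothetical record that fails the quality check
        "fields": {"prompt": "placeholder prompt"},
        "metadata": {"flesch_reading_ease": 90.0, "passed_quality_check": "False"},
    },
]

selected = filter_records(records)
```

Records without metadata are skipped rather than treated as errors, since metadata is optional.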

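Guidelines

The guidelines are optional and are just a plain string used to provide instructions to the annotators. They can be found in the annotation guidelines section.

Data Instances

An example of a dataset instance in Argilla looks as follows: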

json { "external_id": null, "fields": { "prompt": "Can brain cells move? By movement I mean long distance migration (preferably within the brain only)." }, "metadata": { "entropy": 0.4352176404374839, "flesch_reading_ease": 82.39000000000001, "n_characters": 85, "passed_quality_check": "True" }, "responses": [], "suggestions": [ { "agent": "setfit", "question_name": "class", "score": 0.2823514938354492, "type": null, "value": "brainstorming" }, { "agent": null, "question_name": "response", "score": null, "type": null, "value": "The question is relatively broad and one should take into account that the brain not only consists of neurons, but also glial cells (supportive cells) and pre-mitotic neuronal stem cells. Furthermore, as critical fellow-scientists have indicated, developmental stage is very important, as the developing embryonic brain is very different from the adult brain. However, after sifting through various publications, the answer to the question is actually remarkably simple: Yes, brain cells migrate. In the adult brain glial cells migrate in the brain (Klämbt, 2009). Glial cells are involved in a myriad of functions, but a notable example of migrating glial cells are the oligodendrocytes that migrate relative long distances to find their target axons onto which they wrap themselves to form the insulating myelin sheath (Tsai and Miller, 2002). Neuronal stem cells migrate over long distances in response to injury (Imitola et al., 2004) and they migrate from specific stem-cell locations (e.g., hippocampus and subventricular zone) to other regions (Clarke, 2003). Post-mitotic, but non-differentiated neurons have been shown to migrate in the adult brain in fish (Scott et al., 2012), and in mammals and non-human primates as well (Sawada et al., 2011). Not surprisingly, glial cells, stem cells and neurons also migrate during embryonic development. 
Most notably, post-mitotic neurons destined to fulfill peripheral functions have to migrate over relatively long distances from the neural crest to their target locations (Neuroscience, 2nd ed, Neuronal Migration)." } ], "vectors": { "prompt-similarity": [ -0.013013245537877083, 0.01881960965692997, 0.018717532977461815, -0.014981311745941639, 0.03672853484749794, -0.015297300182282925, 0.031154541298747063, 0.009528533555567265, -0.031607501208782196, -0.039829764515161514, -0.019534926861524582, -0.019294919446110725, -0.047140125185251236, 0.03812485188245773, -0.018894944339990616, 0.039123568683862686, 0.03436238318681717, -0.007996739819645882, 0.013651853427290916, -0.016834214329719543, -0.02929615043103695, 0.002512674080207944, 0.008257705718278885, 0.03932825103402138, 0.031019780784845352, -0.028575727716088295, -0.022710563614964485, 0.0132012739777565, -0.048433348536491394, -0.02651829645037651, 0.01601981930434704, -0.006484998855739832, -0.07150214165449142, -0.010764969512820244, 0.00407565338537097, -0.007564086001366377, -0.015640858560800552, -0.012789258733391762, 0.00717244204133749, -0.051655009388923645, -0.030335327610373497, 0.007193537428975105, -0.020686019212007523, 0.016904372721910477, -0.057382386177778244, 0.020192697644233704, -0.0621950700879097, 0.0034242896363139153, -0.04375811666250229, -0.012516515329480171, -0.04787379130721092, 0.05757446959614754, 0.045590516179800034, -0.019442711025476456, 0.02614322304725647, 0.022066324949264526, -0.017174094915390015, -0.03904383257031441, -0.014966102316975594, -0.04261021316051483, 0.06123539060354233, 0.01483749970793724, -0.009737796150147915, -0.021765291690826416, -0.001423536567017436, -0.04854138195514679, 0.03245295211672783, 0.02051699534058571, -0.05414895340800285, -0.03563692420721054, -0.0506395623087883, -0.06071240082383156, -0.017511913552880287, 0.006278000771999359, 0.009547360241413116, -0.05603624880313873, -0.0038324843626469374, 0.012652688659727573, 
0.06399084627628326, 0.01680467091500759, 0.030588308349251747, 0.023556867614388466, -0.04122614115476608, 0.06281794607639313, 0.002343484666198492, -0.03668874129652977, -0.01711929589509964, -4.190538675175048e-06, -0.05742541700601578, 0.04727115109562874, -0.04583971947431564, -0.01956474594771862, 0.02877974882721901, 0.05513108894228935, 0.015185099095106125, -0.006118557415902615, 0.0272984616458416,
