
KNoTE dataset

Source: DataCite Commons | Updated: 2026-05-02 | Indexed: 2026-05-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.19813475
Description:
Project Overview

KNoTE (Korean Novel TEI Encoded) dataset

Unlike simple text conversion, this dataset follows the TEI (Text Encoding Initiative) P5 guidelines. It includes detailed metadata, character descriptions, linguistic variations (Hanja/Hangul), and semantic tagging.

Key Features

- TEI Standard: Fully compliant with TEI P5 (<teiHeader>, <body>, <div>).
- Characters: Linked via xml:id and ref (e.g., <persName ref="#YB">).
- Linguistic Mapping: Original Hanja and modern Hangul mapped via <foreign xml:lang="zh">.
- Entities: Places (<placeName>), Dates (<date>), and Occupations (<occupation>).
- Scholarly Metadata: Includes source descriptions, publication history, and revision logs.

XML Structure Example (Snippet)

The dataset uses a hierarchical structure to capture both the content and the context of the literature:

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>낙오자</title>
        <author>이익상</author>
        <respStmt>
          <resp>TEI 인코딩</resp>
          <name>지해인<idno type="ISNI">0000 0005 2802 5223</idno></name>
          <email>cihayin [at] gmail.com</email>
        </respStmt>
        <respStmt>
          <resp>TEI 검수</resp>
          <name>박선영<idno type="ORCID">0009-0001-1340-0455</idno></name>
          <email>sun09125 [at] gmail.com</email>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <publisher>한국학중앙연구원 인문정보학과</publisher>
      </publicationStmt>
      <sourceDesc>
        <bibl type="digitalSource" xml:lang="ko">
          <title level="a">낙오자</title>
          <author>이익상</author>
          <publisher>Wikisource(한국어)</publisher>
          <idno type="wikisource">https://ko.wikisource.org/wiki/낙오자</idno>
          <idno type="wikisource-info">https://ko.wikisource.org/w/index.php?title=낙오자&amp;action=info</idno>
          <note type="acquisition">작업자 지해인이 위키문헌 항목에서 raw data를 취득함.</note>
        </bibl>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <projectDesc>
        <p>본 전자본은 TEI P5 지침(TEI Lite)에 따라 구조화함.</p>
      </projectDesc>
    </encodingDesc>
    <profileDesc>
      <langUsage>
        <language ident="ko">Korean</language>
      </langUsage>
      <textClass>
        <keywords scheme="local">
          <term>근현대 한국문학</term>
          <term>단편소설</term>
        </keywords>
      </textClass>
      <particDesc>
        <listPerson>
          <person xml:id="ZH">
            <persName xml:lang="ko">진화</persName>
            <persName xml:lang="zh">鎭華</persName>
          </person>
          <person xml:id="M">
            <persName xml:lang="ko">M</persName>
          </person>
          <personGrp xml:id="EP">
            <persName xml:lang="ko">모든 사람</persName>
            <note>진화가 본 길가에 지나가는 모든 사람</note>
          </personGrp>
        </listPerson>
      </particDesc>
    </profileDesc>
    <revisionDesc>
      <change when="2025-12-07" who="#지해인">작업자 지해인이 TEI 인코딩을 완료함.</change>
      <change when="2026-02-18" who="#박선영">작업자 박선영이 TEI 검수를 완료함.</change>
    </revisionDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <p>일 개월을 지나지 못하여 자기 수대(<foreign xml:lang="zh">數代</foreign>) 전래하는 주택을 훼철(<foreign xml:lang="zh">毁撤</foreign>)치 아니 못할 운명에 당한 <persName ref="#ZH">진화</persName>는 책보를 곁에 끼고 <orgName>C사</orgName> 정문을 나왔다.
          문 앞에서 한 번 주저하며 뒤에 있는 현관을 돌아다보며, <said aloud="true" direct="false" who="#ZH">이곳에 다시 발을 들여놓으면 <rs ref="#ZH">나</rs>는 사람이 아니</said>라고 중얼거리며 나왔다.
          <said aloud="false" direct="false" mode="thought" who="#ZH">위선자, 협잡배들이 가면을 쓰고 권력하에서 굽실굽실 아첨하는 것을 차마 볼 수 없다</said>고 <persName ref="#ZH">진화(<foreign xml:lang="zh">鎭華</foreign>)</persName>는 생각했다.
          <rs ref="#ZH">그</rs>는 머리를 들어 가로에 분주히 다니는 <persName ref="#EP">모든 사람</persName> 얼굴을 의미 있게 쳐다보았다.
          다 평화로운 듯하다.
          <said who="#EP" aloud="false" direct="true" mode="thought" agent="#ZH"><rs ref="#ZH">너</rs>는 <rs ref="#ZH" type="epithet">낙오자</rs>이다······.</said> 조소하는 것 같다.
          <rs ref="#ZH">그</rs>의 머리에서는 한 달 지나면 집을 헐어야 하는 것이 간단없이 울리어 온다.</p>
      </div>
    </body>
  </text>
</TEI>

List of Works

No.  Author           Title (English / Transliteration)               Date
1    Yi In-jik        Tears of Blood (Hyeol-ui Nu)                    1906
2    Yi Hae-jo        The Iron World (Cheol-segye)                    1908
3    Yi Kwang-su      The Heartless (Mujeong - Short Story)           1910
4    Yi Hae-jo        Blood of Flowers (Hwa-ui Hyeol)                 1911.04
5    Kim Myeong-sun   The Girl of Mystery (Uisim-ui Sonyeo)           1917.11
6    Na Hye-seok      Kyung-hee                                       1918.03
7    Na Hye-seok      To the Revived Granddaughter                    1918.09
8    Kim Dong-in      The Sorrows of the Weak                         1919.02~03
9    Yi Ik-sang       The Straggler (Nagoja)                          1919.07.14
10   Hyun Jin-geon    A Poor Wife (Bincheo)                           1921.01
11   Na Hye-seok      Gyu-won                                         1921.07
12   Hyun Jin-geon    A Society That Drives You to Drink              1921.11
13   Choi Seo-hae     Nostalgia (Hyangsu)                             1924.04
14   Hyun Jin-geon    A Lucky Day (Unsu Joeun Nal)                    1924.06
15   Kim Dong-in      Potato (Gamja)                                  1925.01
16   Hyun Jin-geon    Director B and the Love Letters                 1925.02
17   Na Do-hyang      The Watermill (Mullebang-a)                     1925.09
18   Bang Jeong-hwan  For Our Friends                                 1927.02
19   Bang Jeong-hwan  The Eternal Shirt (Mannyeon Shirt)              1927.03
20   Bang Jeong-hwan  The Gold Watch                                  1929.01~02
21   Kim Dong-in      Dr. K’s Research                                1929.12
22   Kim Nam-cheon    Water (Mul)                                     1933.06
23   Chae Man-sik     Ready-made Life                                 1934.05~07
24   Kang Kyeong-ae   Salt (Sogeum)                                   1934.05~10
25   Gye Yong-mook    Adada the Idiot (Baekchi Adada)                 1935
26   Kim Yu-jeong     The Camellias (Dongbaek-kkot)                   1936.05
27   Yi Sang          The Wings (Nalgae)                              1936.09
28   Yi Hyo-seok      When Buckwheat Flowers Bloom                    1936.10
29   Chae Man-sik     Uncle Chi-suk                                   1938
30   Jeong In-taek    Melancholy (Uuljeung)                           1940.09
31   Kim Sa-ryang     The Man Met in the Detention Center             1941
32   Ji Ha-ryeon      The Journey (Dojeong)                           1946.07
33   Kang So-cheon    The Photo Studio that Takes Pictures of Dreams  1954.03
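
Working with the character links (illustrative sketch)

The xml:id / ref linking and the <foreign xml:lang="zh"> glosses listed under Key Features can be read back with a few lines of Python. The following is a minimal sketch and not part of the dataset: it uses only the standard library's xml.etree.ElementTree, the embedded SAMPLE is an abbreviated stand-in modelled on the snippet rather than an actual dataset file, and the names load_participants and check_pointers are hypothetical. One pointer in SAMPLE is deliberately written without the leading "#" to show how such a slip would surface.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated stand-in modelled on the KNoTE snippet (assumption: not a real file).
# The second <persName> in the body lacks the leading "#" on purpose.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <particDesc>
        <listPerson>
          <person xml:id="ZH">
            <persName xml:lang="ko">진화</persName>
            <persName xml:lang="zh">鎭華</persName>
          </person>
          <personGrp xml:id="EP">
            <persName xml:lang="ko">모든 사람</persName>
          </personGrp>
        </listPerson>
      </particDesc>
    </profileDesc>
  </teiHeader>
  <text><body><div>
    <p>수대(<foreign xml:lang="zh">數代</foreign>) 전래하는 주택을
       훼철(<foreign xml:lang="zh">毁撤</foreign>)치 아니 못할
       <persName ref="#ZH">진화</persName>는
       <said who="#ZH"><rs ref="#ZH">나</rs>는 사람이 아니</said>라고 ...
       <persName ref="EP">모든 사람</persName></p>
  </div></body></text>
</TEI>
""".strip()


def load_participants(root):
    """Map each xml:id under <listPerson> to its names, keyed by xml:lang."""
    people = {}
    for entry in root.iterfind(".//tei:listPerson/*", NS):
        pid = entry.get(XML_ID)
        if pid is None:
            continue
        people[pid] = {
            name.get(XML_LANG): name.text
            for name in entry.findall("tei:persName", NS)
        }
    return people


def check_pointers(root, people):
    """Count ref/who pointers in <text> per participant; collect unresolved ones."""
    counts = {pid: 0 for pid in people}
    dangling = []
    for el in root.find("tei:text", NS).iter():
        for attr in ("ref", "who"):
            pointer = el.get(attr)
            if pointer is None:
                continue
            # Only local pointers of the form "#ID" are treated as resolvable.
            target = pointer[1:] if pointer.startswith("#") else None
            if target in people:
                counts[target] += 1
            else:
                dangling.append(pointer)
    return counts, dangling


root = ET.fromstring(SAMPLE)
people = load_participants(root)
counts, dangling = check_pointers(root, people)

# Hanja glosses: every <foreign xml:lang="zh"> inside <text>.
glosses = [
    f.text
    for f in root.find("tei:text", NS).iter(f"{{{TEI}}}foreign")
    if f.get(XML_LANG) == "zh"
]

print("participants:", people)
print("mentions:", counts)
print("dangling pointers:", dangling)
print("hanja glosses:", glosses)
```

On this sample the script finds three resolved pointers to ZH, none to EP, and reports the bare "EP" as dangling. Pointers to external resources would also land in the dangling list under this simple rule, so a real check would need to distinguish them.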
Provider:
Zenodo
Created:
2026-04-27