Claudinha Data Protection Law (LGPD) Corpus
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13371638
Description:
This dataset contains paragraphs from privacy policies written in Portuguese. Each paragraph was annotated by an expert annotator following a guideline (DOI: 10.5281/zenodo.13371432). Two types of annotation were made: the category of the Brazilian Data Protection Law (LGPD) that fits the text, and the level of compliance with the LGPD.
The categories are divided into three blocks: Omission of data required by law (block 1), Data processing (block 2), and Unclear language and others (block 3). There are three levels of compliance: level 1 indicates compliance with the law, level 2 partial potential non-compliance, and level 3 total potential non-compliance.
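As an illustration, the three compliance levels could be encoded as shown here. The names `ComplianceLevel` and `is_potentially_non_compliant` are hypothetical and not part of the dataset, and treating levels 2 and 3 together as "potentially non-compliant" is an assumption based on the level descriptions above.

```python
from enum import IntEnum


class ComplianceLevel(IntEnum):
    """Hypothetical encoding of the three annotation levels."""
    COMPLIANT = 1                         # in compliance with the law
    PARTIAL_POTENTIAL_NON_COMPLIANCE = 2  # partial potential non-compliance
    TOTAL_POTENTIAL_NON_COMPLIANCE = 3    # total potential non-compliance


def is_potentially_non_compliant(level: int) -> bool:
    # Assumption: any level above 1 counts as potentially non-compliant.
    return ComplianceLevel(level) > ComplianceLevel.COMPLIANT
```

Passing a value outside 1 to 3 raises `ValueError`, which makes unexpected labels visible instead of silently accepting them.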
There are 6341 distinct paragraphs. The corpus contains more records (8341 clauses) because a paragraph that belongs to more than one guideline category is duplicated, once per category. Potentially non-compliant clauses correspond to 1413 records (21.9%). The statistics below give the number of clauses in each category and the number of privacy policies in which each category appears (document frequency).
Category                                Number of clauses  Document frequency

Block 1: Omission of data required by law
  Access to data                        283                61
  Anonymization, blocking and deletion  204                46
  Automated decision                    45                 19
  Category of processed data            1427               63
  Controller identification             107                47
  Data correction                       154                52
  Duration of treatment                 234                52
  Existence of treatment                142                37
  Express consent                       176                41
  ID and contact DPO                    150                46
  Non-consent                           91                 33
  Personal data source                  471                55
  Portability                           97                 36
  Purpose of sharing                    119                16
  Purpose of treatment                  1620               71
  Revoke consent                        154                50
  Right of deletion                     173                40
  Third party sharing                   919                69

Block 2: Data processing
  Advertising                           215                38
  Children data                         118                37
  Cookies                               432                60
  Consent by use                        339                47
  Other consents                        73                 30
  Policy changes                        210                61
  "Take it or leave it"                 50                 21

Block 3: Unclear language and others
  Generic expressions                   244                37
  Other unclear clauses                 94                 26
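
The "Number of clauses" and "Document frequency" figures can be derived from the annotated records. The following is a minimal sketch, assuming each record carries a policy identifier, the paragraph text, and one category. The field layout, the function name `category_statistics`, and the toy rows are invented for illustration and are not taken from the corpus files.

```python
from collections import defaultdict

# Toy records (invented for illustration): (policy_id, paragraph_text, category).
# A paragraph annotated with two categories appears as two records.
records = [
    ("policy_a", "We use cookies to ...", "Cookies"),
    ("policy_a", "We use cookies to ...", "Advertising"),
    ("policy_a", "You may request deletion ...", "Right of deletion"),
    ("policy_b", "Cookies help us ...", "Cookies"),
]


def category_statistics(rows):
    """Return {category: (number_of_clauses, document_frequency)}."""
    clause_counts = defaultdict(int)
    policies = defaultdict(set)
    for policy_id, _paragraph, category in rows:
        clause_counts[category] += 1        # one clause per record
        policies[category].add(policy_id)   # distinct policies per category
    return {c: (clause_counts[c], len(policies[c])) for c in clause_counts}


stats = category_statistics(records)

# Distinct paragraphs ignore the per-category duplication.
distinct_paragraphs = {(policy_id, text) for policy_id, text, _ in records}

print(stats["Cookies"])                        # (2, 2)
print(len(records), len(distinct_paragraphs))  # 4 3
```

In the toy data, four records collapse to three distinct paragraphs, which mirrors the relationship between the 8341 clauses and the 6341 distinct paragraphs. Summing the "Number of clauses" column over all 27 categories gives 8341, which matches the record count stated in the description.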
Created: 2024-08-25



