
Clustering Based Technique for Outlier detection

DataCite Commons · Updated 2025-04-27 · Indexed 2025-05-18
Download link:
https://www.scidb.cn/detail?dataSetId=1c96c2371f80466f88ca0890fd17bada
Description:
The provided code consists of the following components:

1. **MicroCluster Class**: Represents a micro-cluster and its properties: `mD` (membership degree), `center` (center coordinates), `radius`, `edge_list` (list of neighboring micro-clusters), and `base_cluster_count` (number of base clusters within the micro-cluster).
2. **distance Function**: Calculates the Euclidean distance between two points in n-dimensional space.
3. **update_center_radius Function**: Updates the center and radius of a micro-cluster after it incorporates a new point.
4. **initialize_stage Function**: Implements Algorithm 1, Initializing Stage. It takes a new data point `s`, a list of micro-clusters `M_C`, a list of prospective micro-clusters `Pmc`, and the radius parameters `RadiusM_N` and `RadiusM`. It updates the membership degrees of the micro-clusters and adjusts their centers and radii accordingly.
5. **core_micro_cluster_formation Function**: Implements Algorithm 2, Core Micro-cluster Formation. It iteratively checks for intersecting micro-clusters and updates their edge lists to reflect their adjacency.
6. **buffer_deletion Function**: Implements Algorithm 3, Buffer Deletion. It clears the list of prospective micro-clusters if the list is not empty.
7. **save_prospective_micro_clusters Function**: Implements Algorithm 4, Saving Prospective Microclusters. It adds outlier micro-clusters from the data stream to the list of prospective micro-clusters.
8. **remove_outlier_micro_clusters Function**: Implements Algorithm 5, Removing Outlier Microclusters. It removes outlier micro-clusters from memory based on a specified threshold value and updates the edge lists of the related core micro-clusters.
9. **Usage Example**: Demonstrates the implemented algorithms with sample data and parameters. It initializes the data, runs the algorithms in sequence, and prints the resulting micro-cluster configuration after each algorithm.

Overall, the code supports the initialization, formation, and management of micro-clusters in a data stream mining context. It can be extended and customized for specific applications by adjusting the parameters and adding further algorithms or functionality.
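To make the description more concrete, the following is a minimal Python sketch of the first building blocks it names: the `MicroCluster` class, `distance`, `update_center_radius`, and an adjacency check in the spirit of the core micro-cluster formation step. It is not the dataset's actual code. The attribute names come from the description, but the update rule (running-mean center, radius grown to cover the new point), the `n_points` counter, and the helper `link_intersecting` are assumptions introduced here for illustration.

```python
import math


class MicroCluster:
    """Micro-cluster with the attributes named in the description."""

    def __init__(self, center, radius, mD=0.0):
        self.mD = mD                  # membership degree (semantics not specified here)
        self.center = list(center)    # center coordinates
        self.radius = radius
        self.edge_list = []           # neighboring micro-clusters
        self.base_cluster_count = 0   # number of base clusters inside
        self.n_points = 1             # hypothetical counter, not in the description


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def update_center_radius(mc, point):
    """Fold `point` into `mc` (assumed rule: running mean, covering radius)."""
    mc.n_points += 1
    mc.center = [c + (x - c) / mc.n_points for c, x in zip(mc.center, point)]
    mc.radius = max(mc.radius, distance(mc.center, point))


def link_intersecting(clusters):
    """Hypothetical helper: record adjacency when two spheres overlap."""
    for i, a in enumerate(clusters):
        for b in clusters[i + 1:]:
            overlaps = distance(a.center, b.center) <= a.radius + b.radius
            if overlaps and b not in a.edge_list:
                a.edge_list.append(b)
                b.edge_list.append(a)


# Usage with made-up sample values.
a = MicroCluster(center=[0.0, 0.0], radius=1.0)
b = MicroCluster(center=[1.5, 0.0], radius=1.0)
update_center_radius(a, [1.0, 0.0])   # center of `a` moves toward the point
link_intersecting([a, b])             # spheres overlap, so they become neighbors
```

Storing neighbors as object references keeps the sketch close to the description of `edge_list`; a real implementation might prefer indices or identifiers so that removing outlier micro-clusters does not leave dangling references.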
Provider:
Science Data Bank
Created:
2024-03-15