Splicing Detection and Localization for Speech Deepfakes using Audio Novelty
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14223053
Description:
With the rapid progress of artificial intelligence over the last few years, the ability to generate highly realistic multimedia content is within everyone's reach. This has paved the way for the creation of deepfakes, synthetic data produced using deep learning techniques that realistically depict people in behaviors that are not their own. In the audio case, speech deepfakes can be used alone or combined with splicing techniques. Splicing consists of replacing portions of authentic speech with synthetic elements, thereby altering the conveyed message and posing significant threats. In this work, we address the problem of splicing detection and localization in speech deepfakes. We consider spliced audio tracks created by substituting parts of a pristine speech recording with synthetically generated segments. We design a system that detects whether a manipulation has taken place and localizes it in time. The proposed method, trained exclusively on splicing-free audio tracks, extracts embeddings from the input signal through a sliding window. It then employs audio novelty to measure the similarity among consecutive signal sections and uses this measure to detect and localize splicing points. We evaluate our method on a state-of-the-art dataset as well as on a novel, bias-free corpus specifically developed and released in this paper. The proposed approach is benchmarked against multiple baselines for both the splicing detection and the splicing localization tasks.
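The description names the pipeline (sliding-window embeddings, then audio novelty) without giving its details. The following is a minimal sketch of one common formulation of audio novelty, in which a checkerboard kernel is slid along the diagonal of a cosine self-similarity matrix. It is an illustration under stated assumptions, not the authors' implementation: the embeddings are assumed to be already extracted (one row per window), and the function names, the `half_width` parameter, and the `threshold` value are all hypothetical.

```python
import numpy as np


def checkerboard_kernel(half_width):
    # +1 in the two within-segment quadrants, -1 in the two cross-segment quadrants.
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)


def novelty_curve(embeddings, half_width=4):
    """Novelty score per window; a high value at index i suggests a change
    between windows i-1 and i. Edges without full context are left at 0."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)
    ssm = unit @ unit.T  # cosine self-similarity matrix
    kernel = checkerboard_kernel(half_width)
    n = ssm.shape[0]
    novelty = np.zeros(n)
    for i in range(half_width, n - half_width + 1):
        block = ssm[i - half_width:i + half_width, i - half_width:i + half_width]
        novelty[i] = np.sum(block * kernel)
    return novelty


def find_splice_frames(novelty, threshold):
    """Indices of local maxima above `threshold` (hypothetical decision rule)."""
    peaks = []
    for i in range(1, len(novelty) - 1):
        if (novelty[i] > threshold
                and novelty[i] >= novelty[i - 1]
                and novelty[i] > novelty[i + 1]):
            peaks.append(i)
    return peaks
```

In this sketch, a track would be flagged as spliced when `find_splice_frames` returns at least one index, and each index can be converted to a time stamp by multiplying it by the hop size of the sliding window. The threshold would need to be tuned on splicing-free data.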
Created:
2024-11-29



