kanoyo/kaggle

Name: kanoyo/kaggle
Creator: kanoyo
Published: 2024-02-23 19:58:29
License: No description available

Source: Hugging Face. Updated 2024-02-23; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/kanoyo/kaggle

Description:

# 🍏 Applio-RVC-Fork

> [!NOTE]
> Applio-RVC-Fork is designed to complement existing repositories, and as such, certain features may be in experimental stages, potentially containing bugs. Additionally, there might be instances of coding practices that could be improved or refined. It is not intended to replace any other repository.

[![Discord](https://img.shields.io/badge/SUPPORT_DISCORD-37a779?style=for-the-badge)](https://discord.gg/IAHispano)
[![Discord Bot](https://img.shields.io/badge/DISCORD_BOT-37a779?style=for-the-badge)](https://bot.applio.org)
[![Docs](https://img.shields.io/badge/DOCS-37a779?style=for-the-badge)](https://docs.applio.org)

## 📚 Table of Contents

_This README has been enhanced by incorporating the features introduced in Applio-RVC-Fork to the original [Mangio-RVC-Fork README](https://github.com/Mangio621/Mangio-RVC-Fork/blob/main/README.md), along with additional details and explanations._

1. [Improvements of Applio Over RVC](#-improvements-of-applio-rvc-fork-over-rvc)
2. [Additional Features of This Repository](#️-additional-features-of-this-repository)
3. [Todo Tasks](#-todo-tasks)
4. [Installation](#-installation)
5. [Running the Web GUI (Inference & Train)](#-running-the-web-gui-inference--train)
6. [Running the CLI (Inference & Train)](#-running-the-cli-inference--train)
7. [Credits](#-credits)
8. [Thanks to all RVC, Mangio and Applio contributors](#-thanks-to-all-rvc-mangio-and-applio-contributors)

## 🎯 Improvements of Applio-RVC-Fork Over RVC

_The comparisons are with respect to the original [Retrieval-based-Voice-Conversion-WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) repository._

### f0 Inference Algorithm Overhaul

- Applio features a comprehensive overhaul of the f0 inference algorithm, including:
  - Addition of the pyworld dio f0 method.
  - Alternative method for calculating crepe f0.
  - Introduction of the torchcrepe crepe-tiny model.
  - Customizable crepe_hop_length for the crepe algorithm via both the web GUI and CLI.

### f0 Crepe Pitch Extraction for Training

- Works on paperspace machines but not local MacOS/Windows machines (Potential memory leak).

### Paperspace Integration (Under maintenance, so it cannot be used for the moment.)

- Applio seamlessly integrates with Paperspace, providing the following features:
  - Paperspace argument on infer-web.py (--paperspace) for sharing a Gradio link.
  - A dedicated make file tailored for Paperspace users.

### Access to Tensorboard

- Applio grants easy access to Tensorboard via a Makefile and a Python script.

### CLI Functionality

- Applio introduces command-line interface (CLI) functionality, with the addition of the --cli flag in infer-web.py for CLI system usage.

### f0 Hybrid Estimation Method

- Applio offers a novel f0 hybrid estimation method by calculating nanmedian for a specified array of f0 methods, ensuring the best results from multiple methods (CLI exclusive).
- This hybrid estimation method is also available for f0 feature extraction during training.

### UI Changes

#### Inference:

- A complete interface redesign enhances user experience, with notable features such as:
  - Audio recording directly from the interface.
  - Convenient drop-down menus for audio and .index file selection.
  - An advanced settings section with new features like autotune and formant shifting.

#### Training:

- Improved training features include:
  - A total epoch slider now limited to 10,000.
  - Increased save frequency limit to 100.
  - Default recommended options for smoother setup.
  - Better adaptation to high-resolution screens.
  - A drop-down menu for dataset selection.
  - Enhanced saving system options, including Save all files, Save G and D files, and Save model for inference.

#### UVR:

- Applio ensures compatibility with all VR/MDX models for an extended range of possibilities.

#### TTS (Text-to-Speech, New):

- Introducing a new Text-to-Speech (TTS) feature using RVC models.
- Support for multiple languages and Edge-tts/Google-tts.

#### Resources (New):

- Users can now upload models, backups, datasets, and audios from various storage services like Drive, Huggingface, Discord, and more.
- Download audios from YouTube with the ability to automatically separate instrumental and vocals, offering advanced options and UVR support.

#### Extra (New):

- Combine instrumental and vocals with ease, including independent volume control for each track and the option to add effects like reverb, compressor, and noise gate.
- Significant improvements in the processing interface, allowing tasks such as merging models, modifying information, obtaining information, or extracting models effortlessly.

## ⚙️ Additional Features of This Repository

In addition to the aforementioned improvements, this repository offers the following features:

### Enhanced Tone Leakage Reduction

- Implements tone leakage reduction by replacing source features with training-set features using top1 retrieval. This helps in achieving cleaner audio results.

### Efficient Training

- Provides a seamless and speedy training experience, even on relatively modest graphics cards. The system is optimized for efficient resource utilization.

### Data Efficiency

- Supports training with a small dataset, yielding commendable results, especially with audio clips of at least 10 minutes of low-noise speech.

### Overtraining Detection

- This feature keeps track of the current progress trend and stops the training if no improvement is found after 100 epochs.
- During the 100 epochs with no improvement, no progress is saved. This allows you to continue training from the best-found epoch.
- A `.pth` file of the best epoch is saved in the logs folder under `name_[epoch].pth`, and in the weights folder as `name_fittest.pth`. These files are the same.
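
The overtraining rule described above resembles patience-based early stopping. The following is a minimal sketch of that idea, not Applio's actual implementation: the class name `OvertrainingDetector`, the use of a scalar loss (lower is better) as the progress signal, and the `update` method are all assumptions made for illustration.

```python
class OvertrainingDetector:
    """Hypothetical patience-based stopper; names and metric are assumptions."""

    def __init__(self, patience=100):
        self.patience = patience              # epochs allowed without improvement
        self.best_loss = float("inf")         # best (lowest) loss seen so far
        self.best_epoch = None                # epoch at which best_loss was seen
        self.epochs_without_improvement = 0

    def update(self, epoch, loss):
        """Record one epoch's loss; return True when training should stop."""
        if loss < self.best_loss:
            # Improvement: remember this epoch as the one to save/resume from.
            self.best_loss = loss
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
            return False
        # No improvement: count it and stop once patience is exhausted.
        self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


if __name__ == "__main__":
    detector = OvertrainingDetector(patience=3)  # small patience for the demo
    for epoch, loss in enumerate([1.0, 0.8, 0.9, 0.85, 0.81], start=1):
        if detector.update(epoch, loss):
            print(f"stop at epoch {epoch}, best epoch was {detector.best_epoch}")
            break
```

With the default `patience=100`, the stop signal would fire after 100 consecutive epochs without a new best value, and `best_epoch` identifies the checkpoint worth keeping.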

### Mode Collapse Detection

- This feature restarts training before a mode collapse by lowering the batch size until it can progress past the mode collapse.
- If a mode collapse is overcome but another one occurs later, it will reset the batch size to its initial setting. This helps maintain training speed when dealing with multiple collapses.

## 📝 Todo Tasks

- [ ] **Investigate GPU Detection Issue:** Address the GPU detection problem and ensure proper utilization of Nvidia GPU.
- [ ] **Fix Mode Collapse Prevention Feature:** Refine the mode collapse prevention feature to maintain graph consistency during retraining.
- [ ] **Resolve CUDA Compatibility Issue:** Investigate and resolve the cuFFT error related to CUDA compatibility.
- [ ] **Refactor infer-web.py:** Organize the code of infer-web.py into different files for each tab, enhancing modularity.
- [ ] **Expand UVR Model Options:** Integrate additional UVR models to provide users with more options and flexibility.
- [ ] **Enhance Installation Process:** Improve the system installation process for better user experience and clarity.
- [ ] **Implement Automatic Updates:** Add automatic update functionality to keep the application current with the latest features.
- [ ] **Multilingual Support:** Include more translations for various languages.
- [ ] **Diversify TTS Methods:** Introduce new TTS methods and enhance customization options for a richer user experience.
- [ ] **CLI Improvement:** Enhance the CLI functionality and introduce a pipeline for a more streamlined user experience.
- [ ] **Dependency Updates:** Keep dependencies up-to-date by regularly updating to the latest versions.
- [ ] **Dataset Creation Assistant:** Develop an assistant for creating datasets to simplify and guide users through the process.

## ✨ Installation

### Automatic installation (Windows):

To quickly and effortlessly install Applio along with all the necessary models and configurations on Windows, you can use the [install_Applio.bat](https://github.com/IAHispano/Applio-RVC-Fork/releases) script available in the releases section.

### Manual installation (Windows/MacOS):

**Note for MacOS Users**: When using `faiss 1.7.2` under MacOS, you may encounter a Segmentation Fault: 11 error. To resolve this issue, install `faiss-cpu 1.7.0` using the following command if you're installing it manually with pip:

```bash
pip install faiss-cpu==1.7.0
```

Additionally, you can install Swig on MacOS using brew:

```bash
brew install swig
```

Install requirements:

_Before this, install ffmpeg, wget, git, and Python (on Linux, this fork only works with Python 3.9.X)._

```bash
wget https://github.com/IAHispano/Applio-RVC-Fork/releases/download/v2.0.0/install_Applio-linux.sh
chmod +x install_Applio-linux.sh && ./install_Applio-linux.sh
```

### Manual installation (Paperspace):

```bash
cd Applio-RVC-Fork
make install # Do this every time you start your Paperspace machine
```

## 🪄 Running the Web GUI (Inference & Train)

_Use --paperspace or --colab if on cloud system._

```bash
python infer-web.py --pycmd python --port 3000
```

## 💻 Running the CLI (Inference & Train)

```bash
python infer-web.py --pycmd python --cli
```

```bash
Applio-RVC-Fork CLI

Welcome to the CLI version of RVC. Please read the documentation on README.MD to understand how to use this app.

You are currently in 'HOME':
    go home            : Takes you back to home with a navigation list.
    go infer           : Takes you to inference command execution.
    go pre-process     : Takes you to training step.1) pre-process command execution.
    go extract-feature : Takes you to training step.2) extract-feature command execution.
    go train           : Takes you to training step.3) being or continue training command execution.
    go train-feature   : Takes you to the train feature index command execution.
    go extract-model   : Takes you to the extract small model command execution.

HOME:
```

Typing `go infer`, for example, takes you to the infer page, where you can enter the arguments for that page:

```bash
HOME: go infer
You are currently in 'INFER':
    arg 1) model name with .pth in ./weights: mi-test.pth
    arg 2) source audio path: myFolder\MySource.wav
    arg 3) output file name to be placed in './audio-outputs': MyTest.wav
    arg 4) feature index file path: logs/mi-test/added_IVF3042_Flat_nprobe_1.index
    arg 5) speaker id: 0
    arg 6) transposition: 0
    arg 7) f0 method: harvest (pm, harvest, crepe, crepe-tiny)
    arg 8) crepe hop length: 160
    arg 9) harvest median filter radius: 3 (0-7)
    arg 10) post resample rate: 0
    arg 11) mix volume envelope: 1
    arg 12) feature index ratio: 0.78 (0-1)
    arg 13) Voiceless Consonant Protection (Less Artifact): 0.33 (Smaller number = more protection. 0.50 means Dont Use.)

Example: mi-test.pth saudio/Sidney.wav myTest.wav logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33

INFER: <INSERT ARGUMENTS HERE OR COPY AND PASTE THE EXAMPLE>
```

## 🏆 Credits

Applio owes its existence to the collaborative efforts of various repositories, including Mangio-RVC-Fork, and all the other credited contributors. Without their contributions, Applio would not have been possible. Therefore, we kindly request that if you appreciate the work we've accomplished, you consider exploring the projects mentioned in our credits.

Our goal is not to supplant RVC or Mangio; rather, we aim to provide a contemporary and up-to-date alternative for the entire community.

- [VITS](https://github.com/jaywalnut310/vits) by jaywalnut310
- [Retrieval-based-Voice-Conversion-WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) by RVC-Project
- [Mangio-RVC-Fork](https://github.com/Mangio621/Mangio-RVC-Fork) by Mangio621
- [Mangio-RVC-Tweaks](https://github.com/alexlnkp/Mangio-RVC-Tweaks) by alexlnkp
- [RVG_tts](https://github.com/Foxify52/RVG_tts) by Foxify52
- [RMVPE](https://github.com/Dream-High/RMVPE) by Dream-High
- [ContentVec](https://github.com/auspicious3000/contentvec/) by auspicious3000
- [HIFIGAN](https://github.com/jik876/hifi-gan) by jik876
- [Gradio](https://github.com/gradio-app/gradio) by gradio-app
- [FFmpeg](https://github.com/FFmpeg/FFmpeg) by FFmpeg
- [Ultimate Vocal Remover](https://github.com/Anjok07/ultimatevocalremovergui) by Anjok07
- [audio-slicer](https://github.com/openvpi/audio-slicer) by openvpi
- [Ilaria-Audio-Analyzer](https://github.com/TheStingerX/Ilaria-Audio-Analyzer) by Ilaria

> [!WARNING]
> If you believe you've made contributions to the code utilized in Applio and should be acknowledged in the credits, please feel free to open a pull request (PR). It's possible that we may have unintentionally overlooked your contributions, and we appreciate your proactive approach in ensuring proper recognition.

## 🙏 Thanks to all RVC, Mangio and Applio contributors

### RVC:

<a href="https://github.com/liujing04/Retrieval-based-Voice-Conversion-WebUI/graphs/contributors" target="_blank">
  <img src="https://contrib.rocks/image?repo=liujing04/Retrieval-based-Voice-Conversion-WebUI" />
</a>

### Applio & Mangio:

<a href="https://github.com/IAHispano/Applio-RVC-Fork/graphs/contributors" target="_blank">
  <img src="https://contrib.rocks/image?repo=IAHispano/Applio-RVC-Fork" />
</a>

Provider:

kanoyo

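Summary of original information

Applio-RVC-Fork dataset overview

Improvements

f0 inference algorithm overhaul

Adds the pyworld dio f0 method.
Provides an alternative method for calculating crepe f0.
Introduces the torchcrepe crepe-tiny model.
Makes crepe_hop_length customizable through both the web GUI and the CLI.

f0 crepe pitch extraction for training

Works on Paperspace machines but not on local MacOS/Windows machines (potential memory leak).

Paperspace integration (under maintenance, currently unavailable)

Adds a Paperspace argument to infer-web.py (--paperspace) for sharing a Gradio link.
Provides a dedicated make file tailored for Paperspace users.

Tensorboard access

Provides easy access to Tensorboard via a Makefile and a Python script.

CLI functionality

Introduces command-line interface (CLI) functionality, enabled with the --cli flag in infer-web.py.

f0 hybrid estimation method

Computes the nanmedian over a specified array of f0 methods to get the best results from multiple methods (CLI exclusive).
Also available for f0 feature extraction during training.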
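
To make the nanmedian idea concrete, here is a minimal sketch using NumPy. It is an illustration under assumptions, not the project's code: the function name `hybrid_f0` is hypothetical, each input track is assumed to hold one f0 value (in Hz) per frame at the same frame rate, and a method that fails on a frame is assumed to report NaN there.

```python
import numpy as np


def hybrid_f0(f0_tracks):
    """Combine per-frame f0 estimates from several methods with a NaN-aware median.

    f0_tracks: sequence of equal-length 1-D sequences, one per f0 method
    (hypothetical layout; NaN marks frames where a method gave no estimate).
    """
    stacked = np.vstack([np.asarray(track, dtype=float) for track in f0_tracks])
    # Frames where every method returned NaN are treated as unvoiced (0 Hz),
    # which also avoids NumPy's "All-NaN slice" warning.
    all_nan = np.isnan(stacked).all(axis=0)
    stacked[:, all_nan] = 0.0
    # Median across methods for each frame, ignoring NaN entries.
    return np.nanmedian(stacked, axis=0)


if __name__ == "__main__":
    nan = float("nan")
    track_a = [100.0, 102.0, nan, 0.0]    # e.g. output of one f0 method
    track_b = [101.0, 110.0, 200.0, 0.0]  # e.g. output of a second method
    track_c = [99.0, 104.0, 202.0, nan]   # e.g. output of a third method
    print(hybrid_f0([track_a, track_b, track_c]).tolist())
    # → [100.0, 104.0, 201.0, 0.0]
```

The median is less sensitive than the mean to a single method producing an outlier on a frame, which is one plausible reason for choosing it when merging estimators.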

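User interface changes

Completely redesigns the interface for a better user experience.
Adds audio recording directly from the interface and drop-down menus for selecting audio and .index files.
Adds autotune and formant shifting to the advanced settings section.

Training improvements

The total epoch slider is now limited to 10,000.
The save frequency limit is raised to 100.
Adds default recommended options and better adaptation to high-resolution screens.
Adds a drop-down menu for dataset selection and enhanced saving system options.

UVR compatibility

Ensures compatibility with all VR/MDX models.

TTS (text-to-speech, new)

Introduces a new text-to-speech (TTS) feature using RVC models.
Supports multiple languages and Edge-tts/Google-tts.

Resources (new)

Users can upload models, backups, datasets, and audio from various storage services.
Downloads audio from YouTube and automatically separates instrumental and vocals, with UVR support.

Extra (new)

Combines instrumental and vocals with independent volume control and optional effects such as reverb, compressor, and noise gate.
Significantly improves the processing interface for tasks such as merging models, modifying information, obtaining information, or extracting models.

Additional features

Enhanced tone leakage reduction

Replaces source features with training-set features to achieve cleaner audio results.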
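
The top-1 retrieval step can be pictured as a nearest-neighbor lookup over training-set feature vectors. The sketch below is an assumption-laden illustration: it uses a brute-force NumPy search rather than a faiss index, the function name `retrieve_top1` is hypothetical, and the linear blend controlled by `index_ratio` is an assumed interpretation of the "feature index ratio" setting, not a confirmed formula.

```python
import numpy as np


def retrieve_top1(source_feats, train_feats, index_ratio=1.0):
    """Blend each source frame with its nearest training-set frame.

    source_feats: (n_src, dim) array of source features.
    train_feats:  (n_train, dim) array of training-set features.
    index_ratio:  1.0 fully replaces source features, 0.0 keeps them (assumed).
    """
    source_feats = np.asarray(source_feats, dtype=float)
    train_feats = np.asarray(train_feats, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_src, n_train).
    diffs = source_feats[:, None, :] - train_feats[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    # Top-1 retrieval: the closest training frame for every source frame.
    nearest = train_feats[np.argmin(dists, axis=1)]
    return index_ratio * nearest + (1.0 - index_ratio) * source_feats


if __name__ == "__main__":
    source = [[0.9, 0.1], [0.0, 1.2]]
    train = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    print(retrieve_top1(source, train, index_ratio=1.0))
```

Brute-force search costs memory proportional to `n_src * n_train * dim`, so it only suits small examples; an approximate-nearest-neighbor index would be the practical choice at scale.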

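Efficient training

Provides a fast, seamless training experience even on relatively modest graphics cards.

Data efficiency

Supports training with a small dataset, especially audio clips of at least 10 minutes of low-noise speech.

Overtraining detection

Tracks the current progress trend and stops training if there is no improvement within 100 epochs.
No progress is saved during the 100 epochs without improvement, so training can continue from the best-found epoch.

Mode collapse detection

Restarts training before a mode collapse, lowering the batch size until training can progress past the collapse.
If a mode collapse is overcome but another occurs later, the batch size is reset to maintain training speed.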

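Todo tasks

[ ] Investigate the GPU detection issue and ensure proper utilization of Nvidia GPUs.
[ ] Fix the mode collapse prevention feature to maintain graph consistency.
[ ] Resolve the cuFFT error related to CUDA compatibility.
[ ] Refactor infer-web.py, organizing the code into a separate file for each tab.
[ ] Expand UVR model options to provide more choice and flexibility.
[ ] Improve the system installation process for better user experience and clarity.
[ ] Implement automatic updates to keep the application current.
[ ] Add translations for more languages.
[ ] Introduce new TTS methods and enhance customization options.
[ ] Improve CLI functionality and introduce a pipeline for a more streamlined user experience.
[ ] Regularly update dependencies to the latest versions.
[ ] Develop a dataset creation assistant to simplify dataset creation for users.

Installation

Automatic installation (Windows)

Use the install_Applio.bat script for a quick installation.

Manual installation (Windows/MacOS)

MacOS users should install faiss-cpu 1.7.0 to resolve the Segmentation Fault: 11 error.
Before installing the requirements, make sure ffmpeg, wget, git, and Python are installed (only Python 3.9.X is supported on Linux).

Manual installation (Paperspace)

Install with the make install command.

Running the Web GUI (inference and training)

Run with the command python infer-web.py --pycmd python --port 3000.

Running the CLI (inference and training)

Run with the command python infer-web.py --pycmd python --cli.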
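
The CLI's INFER page takes its 13 arguments as one whitespace-separated line, in the order listed in the README. The sketch below shows one way to turn such a line into named, typed fields before pasting or scripting it. The field names, the chosen types, and the `parse_infer_args` helper are hypothetical and are not part of infer-web.py; paths containing spaces are not handled.

```python
# Hypothetical field names and types, following the order of "arg 1" to "arg 13".
INFER_FIELDS = [
    ("model_name", str),
    ("source_audio_path", str),
    ("output_file_name", str),
    ("feature_index_path", str),
    ("speaker_id", int),
    ("transposition", int),
    ("f0_method", str),
    ("crepe_hop_length", int),
    ("harvest_median_filter_radius", int),
    ("post_resample_rate", int),
    ("mix_volume_envelope", float),
    ("feature_index_ratio", float),
    ("voiceless_consonant_protection", float),
]


def parse_infer_args(line):
    """Split an INFER argument line on whitespace and convert each field."""
    tokens = line.split()
    if len(tokens) != len(INFER_FIELDS):
        raise ValueError(
            f"expected {len(INFER_FIELDS)} arguments, got {len(tokens)}"
        )
    return {name: cast(token) for (name, cast), token in zip(INFER_FIELDS, tokens)}


if __name__ == "__main__":
    # The example line printed by the CLI itself.
    example = (
        "mi-test.pth saudio/Sidney.wav myTest.wav "
        "logs/mi-test/added_index.index 0 -2 harvest 160 3 0 1 0.95 0.33"
    )
    for name, value in parse_infer_args(example).items():
        print(f"{name}: {value}")
```

Validating the argument count up front catches the most common mistake with positional interfaces like this one, namely a missing or extra value that silently shifts every later field.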

Credits

Applio's development owes much to the joint efforts of Mangio-RVC-Fork and other contributors.
Thanks to all RVC, Mangio, and Applio contributors.
