
sKT-Ai-Labs/HIN

Hugging Face | Updated 2026-03-28 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/sKT-Ai-Labs/HIN
Description:
---
license: apache-2.0
task_categories:
- text-generation
tags:
- skt-ai
- st-x
- hinglish
- large-scale
- shrijan
- ѕктαι
- ∂αтαѕєт
- мσ∂єℓ тяαιηιηg нєℓρƒυℓℓ
language:
- en
- hi
pretty_name: sʜᴏʀᴛ ᴅᴀᴛᴀsᴇᴛ 𝟷𝟶ᴍ
size_categories:
- 1M<n<10M
---

# 🚀 **SKT-HIN** 🇮🇳

# SKT AI LABS

<svg width="100%" height="150" viewBox="0 0 600 150" xmlns="http://www.w3.org/2000/svg">
  <defs>
    <linearGradient id="rainbow" x1="0%" y1="0%" x2="100%" y2="0%">
      <stop offset="0%" stop-color="#FF0000">
        <animate attributeName="stop-color" values="#FF0000;#FF7F00;#FFFF00;#00FF00;#0000FF;#4B0082;#9400D3;#FF0000" dur="5s" repeatCount="indefinite" />
      </stop>
      <stop offset="100%" stop-color="#9400D3">
        <animate attributeName="stop-color" values="#9400D3;#FF0000;#FF7F00;#FFFF00;#00FF00;#0000FF;#4B0082;#9400D3" dur="5s" repeatCount="indefinite" />
      </stop>
    </linearGradient>
  </defs>
  <text x="50%" y="50%" dy=".35em" text-anchor="middle" font-family="Arial, Helvetica, sans-serif" font-size="28" font-weight="bold" fill="url(#rainbow)" opacity="1">
    SKT AI LABS
    <animate attributeName="opacity" values="0;1;1;0" dur="6s" repeatCount="indefinite" begin="0s" />
  </text>
  <text x="50%" y="80%" dy=".35em" text-anchor="middle" font-family="Courier New, monospace" font-size="18" fill="#888" opacity="1">
    The Sovereign AI for India
    <animate attributeName="opacity" values="0;1;1;0" dur="6s" repeatCount="indefinite" begin="0.5s" />
  </text>
</svg>

### The Sovereign LLM Development For India (Project Surya)

<p align="center">
  <img src="https://huggingface.co/spaces/sKT-Ai-Labs/README/resolve/main/-p7u27d.jpg" width="350" style="border-radius: 25px; border: 3px solid #3b82f6; box-shadow: 0 10px 30px rgba(0,0,0,0.5);">
  <br>
</p>

### ✨ **Overview**

This dataset is a monumental collection of **320k high-quality conversation pairs** crafted in **Hinglish** (Hindi + English). It is meticulously engineered to empower Large Language Models (LLMs) with a deep understanding of Indian linguistic nuances and conversational contexts.

---

### 📊 **Key Features**

* **✅ High Accuracy:** Every question is strictly mapped to a verified and factually correct response.
* **🌈 Diversity:** Built with multiple response templates to prevent repetitive patterns and mode collapse.
* **💎 Clean Data:** Each record features a unique UUID and rich metadata for seamless tracking and filtering.
* **🇮🇳 Localization:** A balanced mix of Hindi and English for native-feeling AI interactions.

---

### 📋 **Use Cases & Usage**

1. **🤖 Model Training:** Ideal for fine-tuning or building new LLMs from scratch.
2. **🎓 Education & Tech:** Well suited to training specialized chatbots in academic and technical domains.
3. **🔍 Linguistic Research:** Useful for studying code-switching patterns in South Asian languages.

---

### 📜 **Licensing & Authorship**

* **⚖️ License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
* **✍️ Authors:** Created by the **SKT AI TEAM** under the leadership of **SKT TEAM**.

---

> **Note:** For future conversations, contact us at **SKTai@europe.com**.
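
### 🧪 **Example: Checking Record UUIDs**

The Key Features section says each record carries a unique UUID. The sketch below shows one way a user might verify that before training. The card does not document the schema, so the field names `id`, `question`, and `answer` and the sample rows are hypothetical placeholders, not the dataset's actual columns or content.

```python
import uuid
from typing import Iterable, Iterator


def unique_valid_records(records: Iterable[dict], id_field: str = "id") -> Iterator[dict]:
    """Yield records whose `id_field` parses as a UUID and has not been seen yet.

    `id_field` is an assumed column name; adjust it to the real schema.
    """
    seen = set()
    for record in records:
        try:
            key = uuid.UUID(str(record.get(id_field)))
        except ValueError:
            continue  # missing or malformed ID: skip the record
        if key in seen:
            continue  # duplicate ID: keep only the first occurrence
        seen.add(key)
        yield record


if __name__ == "__main__":
    # Invented rows for illustration only; they are not taken from the dataset.
    sample = [
        {"id": "123e4567-e89b-12d3-a456-426614174000", "question": "q1", "answer": "a1"},
        {"id": "123e4567-e89b-12d3-a456-426614174000", "question": "q1 again", "answer": "a1"},
        {"id": "not-a-uuid", "question": "q2", "answer": "a2"},
    ]
    kept = list(unique_valid_records(sample))
    print(f"kept {len(kept)} of {len(sample)} records")
```

The same function should work on rows loaded with the Hugging Face `datasets` library, since iterating a loaded split yields one dict per row, provided `id_field` is set to whatever column actually holds the UUID.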
Provider:
sKT-Ai-Labs