
Two Tower Recommendation Model Training

Databricks, indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/cc45e324-1523-4d8d-a2a0-b59eb7858e04/Databricks_Two-Tower-Recommendation-Model-Training
Description:
**Overview**

This is a sample implementation of the [Two Tower Recommendation Model](https://cloud.google.com/blog/products/ai-machine-learning/scaling-deep-retrieval-tensorflow-two-towers-architecture) on Databricks with the following features:

1. [TorchRec](https://pytorch.org/torchrec/): handles large datasets with many categorical features (where the embedding tables can't fit inside one GPU)
2. [TorchDistributor](https://docs.databricks.com/en/machine-learning/train-model/distributed-training/spark-pytorch-distributor.html): runs distributed training on Databricks
3. [Mosaic StreamingDataset](https://docs.mosaicml.com/projects/streaming/en/stable/): loads data efficiently in a distributed environment

**Use case**

The Two Tower model is an effective collaborative filtering architecture that places disconnected categories (like users and products) in the same vector space for recommendations. The example uses the [Learning from Sets of Items dataset](https://grouplens.org/datasets/learning-from-sets-of-items-2019/) to show this in action. The example notebook can be repurposed for your own recommendation use case.

**Product details**

For more specific details, refer to the embedded notebook, which contains a guide on how to train the Two Tower model on Databricks.

**Licenses**

- The "Learning from Sets of Items" dataset is licensed under Creative Commons 4.0.
- The implementation of Two Tower on Databricks is based on this [implementation](https://github.com/pytorch/torchrec/blob/main/examples/retrieval/two_tower_train.py#L75) within TorchRec, which is released under the BSD-3-Clause License.
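
**Illustrative sketch**

To make the idea of placing users and products in the same vector space more concrete, here is a minimal sketch in plain Python (standard library only). It is not the notebook's implementation, which relies on TorchRec, TorchDistributor, and StreamingDataset. As simplifying assumptions, each tower is reduced to a single embedding table with no further layers, the score is a dot product, and labels are binary. The class name `TwoTowerSketch`, its parameters, and the toy interaction data are all hypothetical.

```python
import math
import random


class TwoTowerSketch:
    """Toy two-tower model: one embedding table per tower, dot-product score."""

    def __init__(self, num_users, num_items, dim=8, seed=0):
        rng = random.Random(seed)

        def table(rows):
            # One small random vector per categorical ID.
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]

        self.user_table = table(num_users)  # user ("query") tower
        self.item_table = table(num_items)  # item ("candidate") tower

    def score(self, user_id, item_id):
        # Both towers map into the same space, so a dot product compares them.
        u = self.user_table[user_id]
        v = self.item_table[item_id]
        return sum(a * b for a, b in zip(u, v))

    def predict(self, user_id, item_id):
        # Sigmoid turns the raw score into an interaction probability.
        return 1.0 / (1.0 + math.exp(-self.score(user_id, item_id)))

    def train_step(self, user_id, item_id, label, lr=0.1):
        # One SGD step on binary cross-entropy for a single (user, item) pair.
        u = self.user_table[user_id]
        v = self.item_table[item_id]
        p = self.predict(user_id, item_id)
        grad = p - label  # derivative of the loss with respect to the score
        for k in range(len(u)):
            u_k, v_k = u[k], v[k]
            u[k] -= lr * grad * v_k
            v[k] -= lr * grad * u_k
        likelihood = p if label == 1 else 1.0 - p
        return -math.log(likelihood)  # loss measured before the update

    def recommend(self, user_id, top_k=2):
        # Rank every item by its score against the user's embedding.
        ranked = sorted(
            range(len(self.item_table)),
            key=lambda item_id: self.score(user_id, item_id),
            reverse=True,
        )
        return ranked[:top_k]


# Hypothetical toy interactions: (user_id, item_id, label).
interactions = [
    (0, 0, 1), (0, 1, 1), (0, 2, 0), (0, 3, 0),
    (1, 0, 0), (1, 1, 0), (1, 2, 1), (1, 3, 1),
]

model = TwoTowerSketch(num_users=2, num_items=4)
for epoch in range(300):
    epoch_loss = sum(model.train_step(u, i, y) for u, i, y in interactions)

print("final epoch loss:", round(epoch_loss, 4))
print("top items for user 0:", model.recommend(0))
```

In a realistic setting the embedding tables can be too large for one device, which is the situation the description cites as the reason for using TorchRec. The brute-force ranking in `recommend` would also typically be replaced by an approximate nearest-neighbor index.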
Provider:
Databricks