Two Tower Recommendation Model Training
Databricks, indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/cc45e324-1523-4d8d-a2a0-b59eb7858e04/Databricks_Two-Tower-Recommendation-Model-Training
Resource overview:
**Overview**
This is a sample implementation of the [Two Tower Recommendation Model](https://cloud.google.com/blog/products/ai-machine-learning/scaling-deep-retrieval-tensorflow-two-towers-architecture) on Databricks with the following features:
1. [TorchRec](https://pytorch.org/torchrec/): for handling large datasets with many categorical features (where the embedding tables can't fit inside one GPU)
2. [TorchDistributor](https://docs.databricks.com/en/machine-learning/train-model/distributed-training/spark-pytorch-distributor.html): for doing distributed training on Databricks
3. [Mosaic StreamingDataset](https://docs.mosaicml.com/projects/streaming/en/stable/): for efficient data loading in a distributed environment
**Use case**
The Two Tower model is an effective collaborative filtering architecture that embeds two otherwise unrelated kinds of entities (such as users and products) in the same vector space, so that recommendations can be made by comparing their embeddings. The example uses the [Learning from Sets of Items dataset](https://grouplens.org/datasets/learning-from-sets-of-items-2019/) to show this in action, and the example notebook can be repurposed for your own recommendation use case.
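To make the shared vector space idea concrete, here is a minimal sketch in plain Python. It is not the notebook's TorchRec implementation: each "tower" is reduced to a bare embedding lookup, the toy interactions are invented for illustration, and every name (`TwoTower`, `train_step`, `recommend`) is hypothetical. Real towers typically add MLP layers on top of the embeddings and train on GPUs.

```python
import math
import random

random.seed(0)
DIM = 8  # embedding dimension shared by both towers (illustrative value)


def make_table(num_rows, dim):
    """Create a small, randomly initialised embedding table."""
    return [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TwoTower:
    """Toy two-tower model: one embedding table per tower, dot-product scoring."""

    def __init__(self, num_users, num_items, dim=DIM):
        self.user_table = make_table(num_users, dim)  # user tower
        self.item_table = make_table(num_items, dim)  # item tower

    def score(self, user_id, item_id):
        # Both embeddings live in the same space, so a dot product compares them.
        return dot(self.user_table[user_id], self.item_table[item_id])

    def train_step(self, user_id, item_id, label, lr=0.1):
        """One SGD step on binary cross-entropy for a single (user, item) pair."""
        u = self.user_table[user_id]
        v = self.item_table[item_id]
        grad = sigmoid(dot(u, v)) - label  # d(loss) / d(logit)
        # Compute both updates from the old values before writing them back.
        self.user_table[user_id] = [a - lr * grad * b for a, b in zip(u, v)]
        self.item_table[item_id] = [b - lr * grad * a for a, b in zip(u, v)]

    def recommend(self, user_id, k=1):
        """Return the ids of the k highest-scoring items for a user."""
        ranked = sorted(
            range(len(self.item_table)),
            key=lambda i: self.score(user_id, i),
            reverse=True,
        )
        return ranked[:k]


# Invented interactions: (user_id, item_id, label); 1 = liked, 0 = not liked.
interactions = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (1, 0, 0)]

model = TwoTower(num_users=2, num_items=2)
for _ in range(200):
    for user_id, item_id, label in interactions:
        model.train_step(user_id, item_id, label)

top_for_user_0 = model.recommend(0, k=1)
top_for_user_1 = model.recommend(1, k=1)
```

After training, each user's embedding sits closer to the item they liked than to the one they did not, which is what lets retrieval work as a nearest-neighbour lookup over item embeddings. At production scale the embedding tables are what grow too large for a single GPU, which is the problem TorchRec's sharding addresses.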
**Product details**
For more specific details, refer to the embedded notebook, which contains a guide to training the Two Tower model on Databricks.
**Licenses**
- The "Learning from Sets of Items" dataset is distributed under a Creative Commons 4.0 license.
- The implementation of Two Tower on Databricks is based on this [implementation](https://github.com/pytorch/torchrec/blob/main/examples/retrieval/two_tower_train.py#L75) within TorchRec, which is released under the BSD-3-Clause License.
Provider:
Databricks



