Recommender systems • 33 implied HN points • 06 Jan 24
- Training an early ranker to mimic the final ranker can improve top-line metrics and reduce costs
- Knowledge distillation trains a student model (the early ranker) to reproduce the outputs of a teacher model (the final ranker)
- Implementing knowledge distillation through shared or auxiliary tasks can increase alignment between the early and final rankers
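The core distillation idea above can be sketched with toy linear rankers: the "final ranker" (teacher) scores items using a rich feature set, and the "early ranker" (student) is fit by gradient descent to mimic those scores from a cheaper feature subset. This is a minimal illustration with made-up data and names (`teacher_w`, `X_cheap`, etc.), not any production setup from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: the teacher (final ranker) sees 20 features,
# the student (early ranker) only a cheap subset of 5.
n_items, n_rich, n_cheap = 500, 20, 5
X_rich = rng.normal(size=(n_items, n_rich))
X_cheap = X_rich[:, :n_cheap]          # cheap features are a subset

teacher_w = rng.normal(size=n_rich)
teacher_scores = X_rich @ teacher_w    # soft targets from the final ranker

# Distillation: fit the student to the teacher's scores (MSE loss)
# with full-batch gradient descent.
student_w = np.zeros(n_cheap)
lr = 0.01
for _ in range(500):
    preds = X_cheap @ student_w
    grad = X_cheap.T @ (preds - teacher_scores) / n_items
    student_w -= lr * grad

mse_before = np.mean(teacher_scores ** 2)             # error of the zero student
mse_after = np.mean((X_cheap @ student_w - teacher_scores) ** 2)
```

The student cannot match the teacher exactly (it lacks 15 of the features), but distillation still pulls its scores much closer to the teacher's than training on sparse engagement labels alone would, which is the alignment benefit the bullets describe.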