Hands-On Deep Learning: Building a Multi-Task, Multi-Label Model Step by Step
Multi-task, multi-label models are a foundational architecture in modern machine learning. The task is conceptually simple: train one model to predict multiple outputs across multiple tasks at the same time.
In this article, we will build a multi-task, multi-label model on the popular MovieLens dataset using sparse features, walking through the whole process step by step. The article covers data preparation, model building, the training loop, model diagnostics, and finally deploying the model with Ray Serve.
1. Setting Up the Environment
Before diving into the code, make sure the necessary libraries are installed (this is not an exhaustive list):
pip install pandas scikit-learn torch ray[serve] matplotlib requests tensorboard
The dataset we use here is small enough that training on a CPU works fine.
2. Preparing the Dataset
We will start by creating a class that handles downloading and preprocessing the MovieLens dataset, then splits the data into training and test sets.
The MovieLens dataset contains information about users, movies, and their ratings. We will use it to predict both the rating (a regression task) and whether the user liked the movie (a binary classification task).
import os
import zipfile

import pandas as pd
import requests
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import DataLoader, TensorDataset


class MovieLensDataset:
    def __init__(self, dataset_version="small", data_dir="./data"):
        print("Initializing MovieLensDataset...")
        if not os.path.exists(data_dir):
            os.makedirs(data_dir)

        # Download and extract the archive if it is not already cached.
        # The "small" version is the official ml-latest-small release from GroupLens.
        if dataset_version == "small":
            url = "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"
            folder = "ml-latest-small"
        else:
            raise ValueError("Invalid dataset version. Choose 'small'.")
        if not os.path.exists(os.path.join(data_dir, folder)):
            archive_path = os.path.join(data_dir, f"{folder}.zip")
            with open(archive_path, "wb") as f:
                f.write(requests.get(url).content)
            with zipfile.ZipFile(archive_path, "r") as z:
                z.extractall(data_dir)

        ratings = pd.read_csv(os.path.join(data_dir, folder, "ratings.csv"))

        # Encode the sparse user and movie IDs as contiguous integer indices
        # so they can be fed to embedding tables.
        self.user_encoder = LabelEncoder()
        self.movie_encoder = LabelEncoder()
        ratings["user"] = self.user_encoder.fit_transform(ratings["userId"])
        ratings["movie"] = self.movie_encoder.fit_transform(ratings["movieId"])

        # Binary label for the classification task
        # (the 4-star threshold is an assumption made for this tutorial)
        ratings["liked"] = (ratings["rating"] >= 4.0).astype(float)

        # 80/20 train/test split (split ratio chosen arbitrarily)
        self.train_df, self.test_df = train_test_split(
            ratings, test_size=0.2, random_state=42
        )

    def get_dataset(self, split="train"):
        if split == "train":
            data = self.train_df
        elif split == "test":
            data = self.test_df
        else:
            raise ValueError("Invalid split. Choose 'train' or 'test'.")
        dense_features = torch.tensor(data[['user', 'movie']].values, dtype=torch.long)
        labels = torch.tensor(data[['rating', 'liked']].values, dtype=torch.float32)
        return dense_features, labels

    def get_encoders(self):
        return self.user_encoder, self.movie_encoder
With MovieLensDataset defined, we can load the training and evaluation splits into memory:
# Example usage; a larger batch size makes sense if you are using a GPU
dataset = MovieLensDataset(dataset_version="small")

print("Getting training data...")
train_features, train_labels = dataset.get_dataset(split="train")
print("Getting testing data...")
test_features, test_labels = dataset.get_dataset(split="test")

# Create DataLoaders for the training and evaluation loops
# (batch size of 64 is an arbitrary default)
train_loader = DataLoader(TensorDataset(train_features, train_labels), batch_size=64, shuffle=True)
test_loader = DataLoader(TensorDataset(test_features, test_labels), batch_size=64, shuffle=False)
3. Defining the Multi-Task, Multi-Label Model
We will define a basic PyTorch model that handles both tasks: predicting the rating (regression) and whether the user liked the movie (binary classification).
The model uses embeddings to represent the sparse user and movie features, plus a shared layer whose output feeds into two separate output heads.
By sharing some layers across tasks while keeping a separate output layer per task, the model exploits a shared representation while still tailoring its predictions to each task.
from torch import nn

class MultiTaskMovieLensModel(nn.Module):
    def __init__(self, n_users, n_movies, embedding_size, hidden_size):
        super(MultiTaskMovieLensModel, self).__init__()
        self.user_embedding = nn.Embedding(n_users, embedding_size)
        self.movie_embedding = nn.Embedding(n_movies, embedding_size)
        self.shared_layer = nn.Linear(embedding_size * 2, hidden_size)
        self.shared_activation = nn.ReLU()
        self.task1_fc = nn.Linear(hidden_size, 1)
        self.task2_fc = nn.Linear(hidden_size, 1)
        self.task2_activation = nn.Sigmoid()

    def forward(self, x):
        user = x[:, 0]
        movie = x[:, 1]
        user_embed = self.user_embedding(user)
        movie_embed = self.movie_embedding(movie)
        combined = torch.cat((user_embed, movie_embed), dim=1)
        shared_out = self.shared_activation(self.shared_layer(combined))
        rating_out = self.task1_fc(shared_out)
        liked_out = self.task2_fc(shared_out)
        liked_out = self.task2_activation(liked_out)
        return rating_out, liked_out
Input (x): a batch of index pairs, where column 0 holds the encoded user ID and column 1 the encoded movie ID.
User and movie embeddings: each index is looked up in its embedding table, yielding a dense vector for the user and another for the movie.
Concatenation: the two embedding vectors are concatenated into a single feature vector per example.
Shared layer: the concatenated vector passes through a shared linear layer followed by a ReLU, producing the representation used by both tasks.
Task-specific outputs: the shared representation feeds two separate linear heads, one per task; the liked head additionally applies a sigmoid.
Returns: the model returns two outputs: rating_out, the unbounded predicted rating (regression), and liked_out, the probability between 0 and 1 that the user liked the movie (classification). The quick check below makes the output shapes concrete.
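A quick sanity check, using tiny made-up dimensions (10 users, 10 movies), confirms the two outputs and their shapes:

toy_model = MultiTaskMovieLensModel(n_users=10, n_movies=10, embedding_size=8, hidden_size=16)
dummy_batch = torch.randint(0, 10, (4, 2))  # 4 examples of (user index, movie index)
rating_out, liked_out = toy_model(dummy_batch)
print(rating_out.shape, liked_out.shape)  # torch.Size([4, 1]) torch.Size([4, 1])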
4. The Training Loop
First, we instantiate the model with some arbitrarily chosen hyperparameters (the embedding dimension and the number of neurons in the hidden layer). For the regression task we will use mean squared error loss; for the classification task, binary cross-entropy.
We can normalize the two losses by their initial values to ensure both stay at roughly similar scales (uncertainty weighting could also be used here to balance the losses).
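For reference, here is a minimal sketch of the uncertainty-weighting alternative (Kendall et al., 2018); the two log-variance scalars are hypothetical extra parameters, not part of this tutorial's training loop:

# Learnable per-task log-variances (hypothetical additions to the setup below)
log_var_rating = torch.zeros(1, requires_grad=True)
log_var_liked = torch.zeros(1, requires_grad=True)
optimizer = optim.Adam(list(model.parameters()) + [log_var_rating, log_var_liked], lr=0.001)

# Inside the training loop, the combined loss would become (simplified form):
loss = (torch.exp(-log_var_rating) * loss_rating + log_var_rating
        + torch.exp(-log_var_liked) * loss_liked + log_var_liked)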
We will then train the model using the data loaders, tracking the losses of both tasks. The losses are plotted to visualize how the model learns and generalizes on the evaluation set over time.
import torch.optim as optim
import matplotlib.pyplot as plt

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

embedding_size = 16
hidden_size = 32
n_users = len(dataset.get_encoders()[0].classes_)
n_movies = len(dataset.get_encoders()[1].classes_)

model = MultiTaskMovieLensModel(n_users, n_movies, embedding_size, hidden_size).to(device)

criterion_rating = nn.MSELoss()
criterion_liked = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

train_rating_losses, train_liked_losses = [], []
eval_rating_losses, eval_liked_losses = [], []
epochs = 10

# used for loss normalization
initial_loss_rating = None
initial_loss_liked = None

for epoch in range(epochs):
    model.train()
    running_loss_rating = 0.0
    running_loss_liked = 0.0
    for dense_features, labels in train_loader:
        optimizer.zero_grad()
        dense_features = dense_features.to(device)
        labels = labels.to(device)
        rating_pred, liked_pred = model(dense_features)
        rating_target = labels[:, 0].unsqueeze(1)
        liked_target = labels[:, 1].unsqueeze(1)
        loss_rating = criterion_rating(rating_pred, rating_target)
        loss_liked = criterion_liked(liked_pred, liked_target)
        # Set initial losses
        if initial_loss_rating is None:
            initial_loss_rating = loss_rating.item()
        if initial_loss_liked is None:
            initial_loss_liked = loss_liked.item()
        # Normalize losses
        loss = (loss_rating / initial_loss_rating) + (loss_liked / initial_loss_liked)
        loss.backward()
        optimizer.step()
        running_loss_rating += loss_rating.item()
        running_loss_liked += loss_liked.item()
    train_rating_losses.append(running_loss_rating / len(train_loader))
    train_liked_losses.append(running_loss_liked / len(train_loader))

    model.eval()
    eval_loss_rating = 0.0
    eval_loss_liked = 0.0
    with torch.no_grad():
        for dense_features, labels in test_loader:
            dense_features = dense_features.to(device)
            labels = labels.to(device)
            rating_pred, liked_pred = model(dense_features)
            rating_target = labels[:, 0].unsqueeze(1)
            liked_target = labels[:, 1].unsqueeze(1)
            loss_rating = criterion_rating(rating_pred, rating_target)
            loss_liked = criterion_liked(liked_pred, liked_target)
            eval_loss_rating += loss_rating.item()
            eval_loss_liked += loss_liked.item()
    eval_rating_losses.append(eval_loss_rating / len(test_loader))
    eval_liked_losses.append(eval_loss_liked / len(test_loader))

    print(f'Epoch {epoch+1}, Train Rating Loss: {train_rating_losses[-1]}, Train Liked Loss: {train_liked_losses[-1]}, Eval Rating Loss: {eval_rating_losses[-1]}, Eval Liked Loss: {eval_liked_losses[-1]}')

# Plotting losses
plt.figure(figsize=(14, 6))

plt.subplot(1, 2, 1)
plt.plot(train_rating_losses, label='Train Rating Loss')
plt.plot(eval_rating_losses, label='Eval Rating Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Rating Loss')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(train_liked_losses, label='Train Liked Loss')
plt.plot(eval_liked_losses, label='Eval Liked Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Liked Loss')
plt.legend()

plt.tight_layout()
plt.show()
The training process can also be monitored with TensorBoard:
from torch.utils.tensorboard import SummaryWriter

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Model and Training Setup
embedding_size = 16
hidden_size = 32
user_encoder, movie_encoder = dataset.get_encoders()
n_users = len(user_encoder.classes_)
n_movies = len(movie_encoder.classes_)
model = MultiTaskMovieLensModel(n_users, n_movies, embedding_size, hidden_size).to(device)
criterion_rating = nn.MSELoss()
criterion_liked = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
epochs = 10

# used for loss normalization
initial_loss_rating = None
initial_loss_liked = None

# TensorBoard setup
writer = SummaryWriter(log_dir='runs/multitask_movie_lens')

# Training Loop with TensorBoard Logging
# (this loop mirrors the training loop above, with scalar logging added)
for epoch in range(epochs):
    model.train()
    running_loss_rating = 0.0
    running_loss_liked = 0.0
    for batch_idx, (dense_features, labels) in enumerate(train_loader):
        # Move data to the training device
        dense_features = dense_features.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        rating_pred, liked_pred = model(dense_features)
        rating_target = labels[:, 0].unsqueeze(1)
        liked_target = labels[:, 1].unsqueeze(1)
        loss_rating = criterion_rating(rating_pred, rating_target)
        loss_liked = criterion_liked(liked_pred, liked_target)
        if initial_loss_rating is None:
            initial_loss_rating = loss_rating.item()
        if initial_loss_liked is None:
            initial_loss_liked = loss_liked.item()
        loss = (loss_rating / initial_loss_rating) + (loss_liked / initial_loss_liked)
        loss.backward()
        optimizer.step()
        running_loss_rating += loss_rating.item()
        running_loss_liked += loss_liked.item()

    # Log average per-epoch training losses
    writer.add_scalar('Loss/train_rating', running_loss_rating / len(train_loader), epoch)
    writer.add_scalar('Loss/train_liked', running_loss_liked / len(train_loader), epoch)

    # Evaluation pass
    model.eval()
    eval_loss_rating = 0.0
    eval_loss_liked = 0.0
    with torch.no_grad():
        for dense_features, labels in test_loader:
            dense_features = dense_features.to(device)
            labels = labels.to(device)
            rating_pred, liked_pred = model(dense_features)
            eval_loss_rating += criterion_rating(rating_pred, labels[:, 0].unsqueeze(1)).item()
            eval_loss_liked += criterion_liked(liked_pred, labels[:, 1].unsqueeze(1)).item()
    writer.add_scalar('Loss/eval_rating', eval_loss_rating / len(test_loader), epoch)
    writer.add_scalar('Loss/eval_liked', eval_loss_liked / len(test_loader), epoch)

writer.close()
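With the run logged, launch TensorBoard and point it at the log directory to inspect the loss curves in real time:

tensorboard --logdir runs/multitask_movie_lens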