Week 7: Potato Disease Recognition

Published: 2026/7/4 3:23:30
This post is a study log from the 365-Day Deep Learning Training Camp (365天深度学习训练营).
Original author: K同学啊

Language: Python 3.13
Editor: Jupyter Notebook
Deep learning environment: PyTorch (torch 2.12, CUDA 12.8, torchvision 0.27)

We use a potato disease dataset. It contains high-resolution images of potato plants in three classes: early blight, late blight, and healthy leaves. It is intended to help develop and test image recognition models for accurate disease detection and classification, and so to advance agricultural diagnostics.

I. Preparation

1. Configuring the GPU

Omitted; see the earlier posts in this series.

2. Importing the data

    import os, PIL, random, pathlib
    import torch
    import torch.nn as nn
    from torchvision import transforms, datasets

    data_dir = './data/PotatoPlants/'
    data_dir = pathlib.Path(data_dir)

    data_paths = list(data_dir.glob('*'))
    # path.name is the class folder name; unlike splitting on "\\", it is not tied to Windows paths
    classeNames = [path.name for path in data_paths]
    classeNames

Output:

    ['Early_blight', 'healthy', 'Late_blight']

    # For more on transforms.Compose, see https://blog.csdn.net/qq_38251616/article/details/124878863
    train_transforms = transforms.Compose([
        transforms.Resize([224, 224]),        # resize every input image to the same size
        # transforms.RandomHorizontalFlip(),  # random horizontal flip
        transforms.ToTensor(),                # convert a PIL Image or numpy.ndarray to a tensor scaled to [0, 1]
        transforms.Normalize(                 # standardize each channel, which helps the model converge
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])        # mean and std were computed from a random sample of a dataset (these are the widely used ImageNet values)
    ])

    test_transform = transforms.Compose([
        transforms.Resize([224, 224]),        # resize every input image to the same size
        transforms.ToTensor(),                # convert a PIL Image or numpy.ndarray to a tensor scaled to [0, 1]
        transforms.Normalize(                 # standardize each channel, which helps the model converge
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225])        # same statistics as in train_transforms
    ])

    total_data = datasets.ImageFolder("./data/PotatoPlants/", transform=train_transforms)
    total_data

Output:

    Dataset ImageFolder
        Number of datapoints: 2152
        Root location: ./data/PotatoPlants/
        StandardTransform
    Transform: Compose(
                   Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=True)
                   ToTensor()
                   Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
               )

    total_data.class_to_idx

Output:

    {'Early_blight': 0, 'Late_blight': 1, 'healthy': 2}

3. Splitting the dataset

    train_size = int(0.8 * len(total_data))
    test_size = len(total_data) - train_size
    train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
    train_dataset, test_dataset

Output:

    (<torch.utils.data.dataset.Subset at 0x21fca73acf0>,
     <torch.utils.data.dataset.Subset at 0x21dc5aa9090>)

    batch_size = 32

    train_dl = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True,
                                           num_workers=1)
    test_dl = torch.utils.data.DataLoader(test_dataset,
                                          batch_size=batch_size,
                                          shuffle=True,
                                          num_workers=1)

    for X, y in test_dl:
        print("Shape of X [N, C, H, W]: ", X.shape)
        print("Shape of y: ", y.shape, y.dtype)
        break

Output:

    Shape of X [N, C, H, W]:  torch.Size([32, 3, 224, 224])
    Shape of y:  torch.Size([32]) torch.int64

II. Building the VGG-16 model by hand

VGG-16 structure:

13 convolutional layers, named blockX_convX
3 fully connected layers, named fcX and predictions
5 pooling layers, named blockX_pool

VGG-16 has 16 layers with trainable weights (13 convolutional layers and 3 fully connected layers), hence the name VGG-16.

(Video in the original post: the VGG-16 forward pass.)

1. Building the model

    import torch.nn.functional as F

    class vgg16(nn.Module):
        def __init__(self):
            super(vgg16, self).__init__()
            # Conv block 1
            self.block1 = nn.Sequential(
                nn.Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
            )
            # Conv block 2
            self.block2 = nn.Sequential(
                nn.Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
            )
            # Conv block 3
            self.block3 = nn.Sequential(
                nn.Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
            )
            # Conv block 4
            self.block4 = nn.Sequential(
                nn.Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(512, 512, kernel_size=(3, 3),
                          stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
            )
            # Conv block 5
            self.block5 = nn.Sequential(
                nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
                nn.ReLU(),
                nn.MaxPool2d(kernel_size=(2, 2), stride=(2, 2))
            )

            # Fully connected layers, used for classification
            self.classifier = nn.Sequential(
                nn.Linear(in_features=512*7*7, out_features=4096),
                nn.ReLU(),
                nn.Linear(in_features=4096, out_features=4096),
                nn.ReLU(),
                nn.Linear(in_features=4096, out_features=3)
            )

        def forward(self, x):
            x = self.block1(x)
            x = self.block2(x)
            x = self.block3(x)
            x = self.block4(x)
            x = self.block5(x)
            x = torch.flatten(x, start_dim=1)
            x = self.classifier(x)
            return x

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print("Using {} device".format(device))

    model = vgg16().to(device)
    model

Output:

    Using cuda device
    vgg16(
      (block1): Sequential(
        (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      )
      (block2): Sequential(
        (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      )
      (block3): Sequential(
        (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5): ReLU()
        (6): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      )
      (block4): Sequential(
        (0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5): ReLU()
        (6): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      )
      (block5): Sequential(
        (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5): ReLU()
        (6): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
      )
      (classifier): Sequential(
        (0): Linear(in_features=25088, out_features=4096, bias=True)
        (1): ReLU()
        (2): Linear(in_features=4096, out_features=4096, bias=True)
        (3): ReLU()
        (4): Linear(in_features=4096, out_features=3, bias=True)
      )
    )

2. Inspecting the model

    # Count the model's parameters and report other statistics
    import torchsummary as summary
    summary.summary(model, (3, 224, 224))

Output:

    ----------------------------------------------------------------
            Layer (type)               Output Shape         Param #
    ================================================================
                Conv2d-1         [-1, 64, 224, 224]           1,792
                  ReLU-2         [-1, 64, 224, 224]               0
                Conv2d-3         [-1, 64, 224, 224]          36,928
                  ReLU-4         [-1, 64, 224, 224]               0
             MaxPool2d-5         [-1, 64, 112, 112]               0
                Conv2d-6        [-1, 128, 112, 112]          73,856
                  ReLU-7        [-1, 128, 112, 112]               0
                Conv2d-8        [-1, 128, 112, 112]         147,584
                  ReLU-9        [-1, 128, 112, 112]               0
            MaxPool2d-10          [-1, 128, 56, 56]               0
               Conv2d-11          [-1, 256, 56, 56]         295,168
                 ReLU-12          [-1, 256, 56, 56]               0
               Conv2d-13          [-1, 256, 56, 56]         590,080
                 ReLU-14          [-1, 256, 56, 56]               0
               Conv2d-15          [-1, 256, 56, 56]         590,080
                 ReLU-16          [-1, 256, 56, 56]               0
            MaxPool2d-17          [-1, 256, 28, 28]               0
               Conv2d-18          [-1, 512, 28, 28]       1,180,160
                 ReLU-19          [-1, 512, 28, 28]               0
               Conv2d-20          [-1, 512, 28, 28]       2,359,808
                 ReLU-21          [-1, 512, 28, 28]               0
               Conv2d-22          [-1, 512, 28, 28]       2,359,808
                 ReLU-23          [-1, 512, 28, 28]               0
            MaxPool2d-24          [-1, 512, 14, 14]               0
               Conv2d-25          [-1, 512, 14, 14]       2,359,808
                 ReLU-26          [-1, 512, 14, 14]               0
               Conv2d-27          [-1, 512, 14, 14]       2,359,808
                 ReLU-28          [-1, 512, 14, 14]               0
               Conv2d-29          [-1, 512, 14, 14]       2,359,808
                 ReLU-30          [-1, 512, 14, 14]               0
            MaxPool2d-31            [-1, 512, 7, 7]               0
               Linear-32                 [-1, 4096]     102,764,544
                 ReLU-33                 [-1, 4096]               0
               Linear-34                 [-1, 4096]      16,781,312
                 ReLU-35                 [-1, 4096]               0
               Linear-36                    [-1, 3]          12,291
    ================================================================
    Total params: 134,272,835
    Trainable params: 134,272,835
    Non-trainable params: 0
    ----------------------------------------------------------------
    Input size (MB): 0.57
    Forward/backward pass size (MB): 218.52
    Params size (MB): 512.21
    Estimated Total Size (MB): 731.30

III. Training the model

1. Writing the training function

    # Training loop
    def train(dataloader, model, loss_fn, optimizer):
        size = len(dataloader.dataset)  # size of the training set
        num_batches = len(dataloader)   # number of batches (size / batch_size, rounded up)

        train_loss, train_acc = 0, 0    # initialize training loss and accuracy

        for X, y in dataloader:         # fetch images and their labels
            X, y = X.to(device), y.to(device)

            # Compute the prediction error
            pred = model(X)             # network output
            loss = loss_fn(pred, y)     # loss between the network output and the ground-truth labels

            # Backpropagation
            optimizer.zero_grad()       # reset the gradients
            loss.backward()             # backpropagate
            optimizer.step()            # update the parameters

            # Record accuracy and loss
            train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
            train_loss += loss.item()

        train_acc /= size
        train_loss /= num_batches

        return train_acc, train_loss

2. Writing the test function

The test function is largely the same as the training function. Because it does not run gradient descent to update the network weights, it does not need the optimizer.

    def test(dataloader, model, loss_fn):
        size = len(dataloader.dataset)  # size of the test set
        num_batches = len(dataloader)   # number of batches (size / batch_size, rounded up)
        test_loss, test_acc = 0, 0

        # Disable gradient tracking when not training, to save memory and computation
        with torch.no_grad():
            for imgs, target in dataloader:
                imgs, target = imgs.to(device), target.to(device)

                # Compute the loss
                target_pred = model(imgs)
                loss = loss_fn(target_pred, target)

                test_loss += loss.item()
                test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()

        test_acc /= size
        test_loss /= num_batches

        return test_acc, test_loss

3. Running the training

model.train() and model.eval() are covered in detail in earlier posts from the training camp. What happens if you switch the optimizer to SGD? Try it, and work out for yourself the cause of the strange behavior that follows.

    import copy

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()  # create the loss function

    epochs = 40

    train_loss = []
    train_acc = []
    test_loss = []
    test_acc = []

    best_acc = 0  # best test accuracy so far, used to select the best model

    for epoch in range(epochs):

        model.train()
        epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)

        model.eval()
        epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

        # Keep a copy of the best model in best_model
        if epoch_test_acc > best_acc:
            best_acc = epoch_test_acc
            best_model = copy.deepcopy(model)

        train_acc.append(epoch_train_acc)
        train_loss.append(epoch_train_loss)
        test_acc.append(epoch_test_acc)
        test_loss.append(epoch_test_loss)

        # Read the current learning rate
        lr = optimizer.state_dict()['param_groups'][0]['lr']

        template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
        print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss,
                              epoch_test_acc*100, epoch_test_loss, lr))

    # Save the best model to a file
    PATH = './best_model.pth'  # name of the saved parameter file
    torch.save(best_model.state_dict(), PATH)  # save best_model's weights rather than the last epoch's

    print('Done')

Output:

    Epoch: 1, Train_acc:46.3%, Train_loss:0.927, Test_acc:43.4%, Test_loss:0.910, Lr:1.00E-04
    Epoch: 2, Train_acc:46.9%, Train_loss:0.901, Test_acc:49.0%, Test_loss:0.915, Lr:1.00E-04
    Epoch: 3, Train_acc:47.5%, Train_loss:0.907, Test_acc:49.0%, Test_loss:0.924, Lr:1.00E-04
    Epoch: 4, Train_acc:46.1%, Train_loss:0.904, Test_acc:43.4%, Test_loss:0.926, Lr:1.00E-04
    Epoch: 5, Train_acc:47.4%, Train_loss:0.905, Test_acc:58.7%, Test_loss:0.875, Lr:1.00E-04
    Epoch: 6, Train_acc:69.0%, Train_loss:0.680, Test_acc:76.1%, Test_loss:0.530, Lr:1.00E-04
    Epoch: 7, Train_acc:83.9%, Train_loss:0.460, Test_acc:85.2%, Test_loss:0.410, Lr:1.00E-04
    Epoch: 8, Train_acc:85.9%, Train_loss:0.421, Test_acc:86.8%, Test_loss:0.360, Lr:1.00E-04
    Epoch: 9, Train_acc:82.0%, Train_loss:0.489, Test_acc:83.5%, Test_loss:0.395, Lr:1.00E-04
    Epoch:10, Train_acc:87.8%, Train_loss:0.325, Test_acc:74.7%, Test_loss:0.833, Lr:1.00E-04
    Epoch:11, Train_acc:89.9%, Train_loss:0.279, Test_acc:89.1%, Test_loss:0.260, Lr:1.00E-04
    Epoch:12, Train_acc:91.5%, Train_loss:0.230, Test_acc:91.2%, Test_loss:0.213, Lr:1.00E-04
    Epoch:13, Train_acc:92.5%, Train_loss:0.218, Test_acc:88.9%, Test_loss:0.243, Lr:1.00E-04
    Epoch:14, Train_acc:94.0%, Train_loss:0.170, Test_acc:92.8%, Test_loss:0.155, Lr:1.00E-04
    Epoch:15, Train_acc:95.0%, Train_loss:0.132, Test_acc:94.0%, Test_loss:0.205, Lr:1.00E-04
    Epoch:16, Train_acc:94.5%, Train_loss:0.164, Test_acc:95.4%, Test_loss:0.130, Lr:1.00E-04
    Epoch:17, Train_acc:96.8%, Train_loss:0.095, Test_acc:95.8%, Test_loss:0.099, Lr:1.00E-04
    Epoch:18, Train_acc:97.7%, Train_loss:0.063, Test_acc:94.9%, Test_loss:0.189, Lr:1.00E-04
    Epoch:19, Train_acc:97.3%, Train_loss:0.070, Test_acc:96.3%, Test_loss:0.090, Lr:1.00E-04
    Epoch:20, Train_acc:98.4%, Train_loss:0.045, Test_acc:96.1%, Test_loss:0.117, Lr:1.00E-04
    Epoch:21, Train_acc:98.6%, Train_loss:0.043, Test_acc:96.1%, Test_loss:0.092, Lr:1.00E-04
    Epoch:22, Train_acc:98.1%, Train_loss:0.050, Test_acc:92.6%, Test_loss:0.221, Lr:1.00E-04
    Epoch:23, Train_acc:96.7%, Train_loss:0.084, Test_acc:95.6%, Test_loss:0.098, Lr:1.00E-04
    Epoch:24, Train_acc:98.6%, Train_loss:0.036, Test_acc:97.4%, Test_loss:0.093, Lr:1.00E-04
    Epoch:25, Train_acc:99.4%, Train_loss:0.020, Test_acc:97.4%, Test_loss:0.090, Lr:1.00E-04
    Epoch:26, Train_acc:98.6%, Train_loss:0.048, Test_acc:95.1%, Test_loss:0.141, Lr:1.00E-04
    Epoch:27, Train_acc:98.3%, Train_loss:0.054, Test_acc:95.6%, Test_loss:0.127, Lr:1.00E-04
    Epoch:28, Train_acc:98.4%, Train_loss:0.039, Test_acc:97.9%, Test_loss:0.062, Lr:1.00E-04
    Epoch:29, Train_acc:99.7%, Train_loss:0.011, Test_acc:94.4%, Test_loss:0.209, Lr:1.00E-04
    Epoch:30, Train_acc:99.0%, Train_loss:0.022, Test_acc:96.8%, Test_loss:0.112, Lr:1.00E-04
    Epoch:31, Train_acc:99.7%, Train_loss:0.008, Test_acc:96.3%, Test_loss:0.139, Lr:1.00E-04
    Epoch:32, Train_acc:98.5%, Train_loss:0.047, Test_acc:97.4%, Test_loss:0.059, Lr:1.00E-04
    Epoch:33, Train_acc:99.8%, Train_loss:0.009, Test_acc:97.0%, Test_loss:0.078, Lr:1.00E-04
    Epoch:34, Train_acc:98.0%, Train_loss:0.061, Test_acc:97.2%, Test_loss:0.087, Lr:1.00E-04
    Epoch:35, Train_acc:99.2%, Train_loss:0.020, Test_acc:96.1%, Test_loss:0.108, Lr:1.00E-04
    Epoch:36, Train_acc:99.7%, Train_loss:0.015, Test_acc:95.8%, Test_loss:0.190, Lr:1.00E-04
    Epoch:37, Train_acc:99.8%, Train_loss:0.008, Test_acc:97.0%, Test_loss:0.157, Lr:1.00E-04
    Epoch:38, Train_acc:97.4%, Train_loss:0.064, Test_acc:94.2%, Test_loss:0.218, Lr:1.00E-04
    Epoch:39, Train_acc:99.3%, Train_loss:0.023, Test_acc:98.1%, Test_loss:0.075, Lr:1.00E-04
    Epoch:40, Train_acc:99.8%, Train_loss:0.009, Test_acc:96.8%, Test_loss:0.083, Lr:1.00E-04
    Done

IV. Visualizing the results

1. Loss and accuracy curves

    import matplotlib.pyplot as plt
    # Hide warnings
    import warnings
    warnings.filterwarnings("ignore")             # ignore warning messages
    plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
    plt.rcParams['axes.unicode_minus'] = False    # display the minus sign correctly
    plt.rcParams['figure.dpi'] = 100              # resolution

    from datetime import datetime
    current_time = datetime.now()  # current time

    epochs_range = range(epochs)

    plt.figure(figsize=(12, 3))
    plt.subplot(1, 2, 1)

    plt.plot(epochs_range, train_acc, label='Training Accuracy')
    plt.plot(epochs_range, test_acc, label='Test Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')
    plt.xlabel(current_time)  # include the timestamp when checking in; screenshots without it are not accepted

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, train_loss, label='Training Loss')
    plt.plot(epochs_range, test_loss, label='Test Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.show()

2. Predicting a specified image

    from PIL import Image

    classes = list(total_data.class_to_idx)

    def predict_one_image(image_path, model, transform, classes):

        test_img = Image.open(image_path).convert('RGB')
        plt.imshow(test_img)  # show the image being predicted

        test_img = transform(test_img)
        img = test_img.to(device).unsqueeze(0)

        model.eval()
        output = model(img)

        _, pred = torch.max(output, 1)
        pred_class = classes[pred]
        print(f'Predicted class: {pred_class}')

    # Predict one image from the training set
    predict_one_image(image_path='./data/PotatoPlants/Early_blight/042135e2-e126-4900-9212-d42d900b8125___RS_Early.B 8791.JPG',
                      model=model,
                      transform=train_transforms,
                      classes=classes)

Output:

    Predicted class: Early_blight

3. Evaluating the model

    best_model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, best_model, loss_fn)
    epoch_test_acc, epoch_test_loss

Output:

    (0.9814385150812065, 0.07529437092099604)
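
As a cross-check on the model summary in section II, the parameter counts can be re-derived by hand from the layer shapes. The following is a minimal sketch in plain Python, assuming what the vgg16 class defines: 3x3 kernels, a bias on every layer, and a 512*7*7 input to the classifier. The helper names conv_params and linear_params are made up for this illustration and are not PyTorch functions.

```python
# Re-derive the VGG-16 parameter counts reported by the model summary.
# conv_params and linear_params are illustrative helpers, not PyTorch APIs.

def conv_params(in_ch, out_ch, k=3):
    """Weights (in_ch * out_ch * k * k) plus one bias per output channel."""
    return in_ch * out_ch * k * k + out_ch

def linear_params(in_features, out_features):
    """Weights (in_features * out_features) plus one bias per output unit."""
    return in_features * out_features + out_features

# (in_channels, out_channels) of the 13 conv layers, block by block
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]
# (in_features, out_features) of the 3 fully connected layers
fc_layers = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 3)]

conv_total = sum(conv_params(i, o) for i, o in conv_layers)
fc_total = sum(linear_params(i, o) for i, o in fc_layers)
total = conv_total + fc_total

# Assumption: parameters are stored as float32, i.e. 4 bytes each
params_mb = total * 4 / 1024 ** 2

print(f"conv layers:  {conv_total:,}")
print(f"fc layers:    {fc_total:,}")
print(f"total params: {total:,}")
print(f"params size:  {params_mb:.2f} MB")
```

The total agrees with the 134,272,835 parameters and 512.21 MB in the summary. Roughly 89% of the parameters sit in the three fully connected layers, with the first one alone (Linear-32) holding 102,764,544.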