Image Classification with PaddlePaddle: From LeNet to ResNet
In an era of rapidly advancing medical AI, using deep learning to automatically identify fundus lesions has become a problem of real practical significance. This is especially true for early screening of pathologic myopia (PM), a blinding eye disease: an accurate and efficient image classification model not only lightens the workload of physicians, it also significantly broadens diagnostic coverage. The key to achieving this lies in the classic convolutional neural networks that have repeatedly pushed computer vision forward.

This article uses Baidu's open-source deep learning platform PaddlePaddle to reproduce, by hand, five mainstream image classification architectures from LeNet to ResNet, and to train and validate them on a real ocular disease dataset. The walkthrough covers data preprocessing, model construction, and training-loop design, and uses code-level implementations to reveal the design philosophy and engineering trade-offs behind each network.

✅ Environment: all code can be run on the AI Studio platform. The official PaddlePaddle image environment is recommended; it ships with industrial-grade libraries such as PaddleCV and PaddleDetection out of the box and is well suited to rapid development of Chinese NLP and CV projects.

Image Classification in Brief

The essence of image classification: given an input image $ I \in \mathbb{R}^{H \times W \times C} $, output its class label $ y \in \{1,2,\dots,K\} $. Behind this seemingly simple task lie substantial challenges: lighting changes, viewpoint differences, and occlusion can all mislead a model. Traditional methods rely on hand-crafted features such as SIFT and HOG but generalize poorly. Since AlexNet's breakthrough on ImageNet in 2012, convolutional neural networks (CNNs) have dominated the field. They learn hierarchical features directly from raw pixels: low layers capture edges and textures, middle layers recognize part shapes, and high layers compose them into full semantic concepts.

This experiment uses the iChallenge-PM dataset, jointly released by Baidu Brain and the Zhongshan Ophthalmic Center of Sun Yat-sen University for automatic screening of pathologic myopia. The dataset contains 1,200 fundus images: 400 each for training, validation, and testing. Since the test labels are not public, we focus on modeling and evaluation with the training and validation sets.

A Quick Look at PaddlePaddle

PaddlePaddle is Baidu's end-to-end open-source deep learning framework, supporting both dynamic-graph and static-graph programming. Compared with other frameworks, its strengths are: concise high-level APIs that lower the entry barrier; built-in industrial-grade model libraries such as PaddleCV and PaddleNLP; dedicated optimizations for Chinese NLP tasks; and a complete toolchain for production deployment. We develop in dynamic-graph mode to enjoy eager execution and flexible debugging; the `paddle.nn` module provides rich layer components that make building complex networks straightforward.

Data Preprocessing and Loading

Good data handling is the foundation of a successful model. For fundus images we need to standardize size, normalize intensity, and organize the reading logic sensibly.

```python
import cv2
import os
import random
import numpy as np
import paddle

def transform_img(img):
    # Resize to the standard input size
    img = cv2.resize(img, (224, 224))
    # Reorder channels: HWC -> CHW
    img = np.transpose(img, (2, 0, 1))
    img = img.astype('float32')
    # Normalize to [-1, 1]
    img = img / 255.
    img = img * 2.0 - 1.0
    return img
```

Next we define the training-set reader. Note the generator style: it avoids loading all the data at once and running out of memory.

```python
def data_loader(datadir, batch_size=10, mode='train'):
    filenames = os.listdir(datadir)
    def reader():
        if mode == 'train':
            random.shuffle(filenames)  # shuffle during training
        batch_imgs, batch_labels = [], []
        for name in filenames:
            filepath = os.path.join(datadir, name)
            img = cv2.imread(filepath)
            if img is None:
                continue
            img = transform_img(img)
            # The first letter of the filename encodes the label:
            # H/N -> non-PM (negative), P -> PM (positive)
            label = 0 if name[0] in ['H', 'N'] else 1
            batch_imgs.append(img)
            batch_labels.append(label)
            if len(batch_imgs) == batch_size:
                yield _batch_to_array(batch_imgs, batch_labels)
                batch_imgs, batch_labels = [], []
        # Handle the final incomplete batch
        if batch_imgs:
            yield _batch_to_array(batch_imgs, batch_labels)
    return reader

# Helper: turn lists into NumPy arrays
def _batch_to_array(imgs, labels):
    imgs_array = np.array(imgs).astype('float32')
    labels_array = np.array(labels).astype('float32').reshape(-1, 1)
    return imgs_array, labels_array
```

The validation labels are stored in a CSV file, so they need separate handling:

```python
def valid_data_loader(datadir, csvfile, batch_size=10):
    lines = open(csvfile).readlines()[1:]  # skip the header row
    filelists = [line.strip().split(',') for line in lines]
    def reader():
        batch_imgs, batch_labels = [], []
        for _, filename, label, _, _ in filelists:
            filepath = os.path.join(datadir, filename)
            img = cv2.imread(filepath)
            if img is None:
                continue
            img = transform_img(img)
            label = int(label)
            batch_imgs.append(img)
            batch_labels.append(label)
            if len(batch_imgs) == batch_size:
                yield _batch_to_array(batch_imgs, batch_labels)
                batch_imgs, batch_labels = [], []
        if batch_imgs:
            yield _batch_to_array(batch_imgs, batch_labels)
    return reader
```

Check that the data shapes are correct:

```python
DATADIR_TRAIN = '/home/aistudio/work/palm/PALM-Training400'
DATADIR_VALID = '/home/aistudio/work/palm/PALM-Validation400'
CSVFILE = '/home/aistudio/labels.csv'

train_loader = data_loader(DATADIR_TRAIN, batch_size=10, mode='train')
data_reader = train_loader()
data = next(data_reader)
print('train batch shape:', data[0].shape, data[1].shape)  # (10, 3, 224, 224), (10, 1)

valid_loader = valid_data_loader(DATADIR_VALID, CSVFILE, batch_size=10)
data_reader = valid_loader()
data = next(data_reader)
print('valid batch shape:', data[0].shape, data[1].shape)  # (10, 3, 224, 224), (10, 1)
```

With everything in place, we can move on to model construction.

A Unified Training Loop

To compare the models side by side, we wrap a generic training function that handles the training loop, validation logic, and model saving.

```python
def train(model, save_path='model'):
    print('Starting training...')
    model.train()
    epoch_num = 5
    learning_rate = 0.001
    optimizer = paddle.optimizer.Momentum(
        learning_rate=learning_rate,
        momentum=0.9,
        parameters=model.parameters()
    )
    train_loader_func = data_loader(DATADIR_TRAIN, batch_size=10, mode='train')
    valid_loader_func = valid_data_loader(DATADIR_VALID, CSVFILE, batch_size=10)
    for epoch in range(epoch_num):
        for batch_id, (x_data, y_data) in enumerate(train_loader_func()):
            x_tensor = paddle.to_tensor(x_data)
            y_tensor = paddle.to_tensor(y_data)
            logits = model(x_tensor)
            loss = paddle.nn.functional.binary_cross_entropy_with_logits(logits, y_tensor)
            avg_loss = paddle.mean(loss)
            if batch_id % 10 == 0:
                print(f'epoch: {epoch}, batch: {batch_id}, loss: {float(avg_loss):.6f}')
            avg_loss.backward()
            optimizer.step()
            optimizer.clear_grad()
        # Validation phase
        model.eval()
        accuracies = []
        losses = []
        with paddle.no_grad():
            for batch_id, (x_data, y_data) in enumerate(valid_loader_func()):
                x_tensor = paddle.to_tensor(x_data)
                y_tensor = paddle.to_tensor(y_data)
                logits = model(x_tensor)
                pred = paddle.nn.functional.sigmoid(logits)
                loss = paddle.nn.functional.binary_cross_entropy_with_logits(logits, y_tensor)
                # Build two-class probabilities: [P(negative), P(positive)]
                pred_neg = 1.0 - pred
                pred_all = paddle.concat([pred_neg, pred], axis=1)
                acc = paddle.metric.accuracy(pred_all, paddle.cast(y_tensor, dtype='int64'))
                accuracies.append(float(acc))
                losses.append(float(paddle.mean(loss)))
        print(f'[valid] epoch: {epoch}, accuracy: {np.mean(accuracies):.4f}, '
              f'loss: {np.mean(losses):.6f}')
        model.train()  # back to training mode
    # Save the final model
    paddle.save(model.state_dict(), f'{save_path}.pdparams')
    print(f'Model saved to {save_path}.pdparams')
```

The function uses a Momentum optimizer with binary cross-entropy loss, matching the binary classification task at hand. A full validation pass runs after every epoch to monitor generalization.

Model Evaluation

After training, the following function loads the weights and evaluates the model:

```python
def evaluate(model, params_path):
    print('Starting evaluation...')
    state_dict = paddle.load(params_path)
    model.set_state_dict(state_dict)
    model.eval()
    eval_loader = data_loader(DATADIR_TRAIN, batch_size=10, mode='eval')
    acc_list = []
    loss_list = []
    with paddle.no_grad():
        for batch_id, (x_data, y_data) in enumerate(eval_loader()):
            x_tensor = paddle.to_tensor(x_data)
            y_tensor = paddle.to_tensor(y_data.astype('int64'))
            logits = model(x_tensor)
            pred = paddle.nn.functional.sigmoid(logits)
            loss = paddle.nn.functional.binary_cross_entropy_with_logits(
                logits, paddle.to_tensor(y_data))
            pred_all = paddle.concat([1 - pred, pred], axis=1)
            acc = paddle.metric.accuracy(pred_all, y_tensor)
            acc_list.append(float(acc))
            loss_list.append(float(paddle.mean(loss)))
    avg_acc = np.mean(acc_list)
    avg_loss = np.mean(loss_list)
    print(f'Evaluation - accuracy: {avg_acc:.4f}, loss: {avg_loss:.6f}')
```

This still evaluates on the training set for demonstration; a real project should hold out an independent test set to get an unbiased estimate.

LeNet: Where Convolutional Networks Began

Although it dates back to 1998, LeNet established the basic paradigm of the modern CNN: convolution layers extract local features, pooling layers reduce dimensionality, and fully connected layers classify.

```python
import paddle.nn as nn
import paddle.nn.functional as F

class LeNet(nn.Layer):
    def __init__(self, num_classes=1):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2D(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2D(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2D(in_channels=6, out_channels=16, kernel_size=5)
        self.pool2 = nn.MaxPool2D(kernel_size=2, stride=2)
        self.conv3 = nn.Conv2D(in_channels=16, out_channels=120, kernel_size=4)
        # With 224x224 inputs the feature map before flattening is 120 x 50 x 50
        self.fc1 = nn.Linear(in_features=120*50*50, out_features=64)
        self.fc2 = nn.Linear(in_features=64, out_features=num_classes)

    def forward(self, x):
        x = self.pool1(F.sigmoid(self.conv1(x)))
        x = self.pool2(F.sigmoid(self.conv2(x)))
        x = F.sigmoid(self.conv3(x))
        x = paddle.reshape(x, [x.shape[0], -1])  # flatten
        x = F.sigmoid(self.fc1(x))
        x = self.fc2(x)
        return x

# Start training
paddle.set_device('gpu:0')
model = LeNet()
train(model, save_path='lenet')
```

⚠️ Note: the original LeNet takes 32×32 single-channel inputs; we adapt it to 224×224 three-channel color fundus images, which is why `fc1` expects `120*50*50` features rather than the classic flattened size. The sigmoid activations are kept to stay faithful to the original design, though ReLU is far more common in modern practice.

AlexNet: The Opening of the Deep Learning Era

In 2012, AlexNet won the ImageNet competition by an overwhelming margin, marking the arrival of the deep learning era. It was the first to apply large-scale GPU-accelerated training and introduced several key techniques: ReLU activations to alleviate vanishing gradients and speed up convergence; Dropout to prevent overfitting in the fully connected layers; data augmentation to improve generalization; and local response normalization (LRN), which later research found to contribute little.

```python
class AlexNet(nn.Layer):
    def __init__(self, num_classes=1):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2D(3, 96, kernel_size=11, stride=4, padding=5), nn.ReLU(),
            nn.MaxPool2D(kernel_size=2, stride=2),
            nn.Conv2D(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2D(kernel_size=2, stride=2),
            nn.Conv2D(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2D(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2D(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2D(kernel_size=2, stride=2),
        )
        # The 2x2 pooling schedule above maps 224x224 inputs to 7x7 feature maps
        self.classifier = nn.Sequential(
            nn.Linear(256*7*7, 4096), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.classifier(x)
        return x

paddle.set_device('gpu:0')
model = AlexNet()
train(model, save_path='alexnet')
```

Experiments show that after 5 epochs on iChallenge-PM, AlexNet reaches roughly 94% validation accuracy, clearly outperforming LeNet. Deeper networks really do learn stronger representations.

VGG: The Art of Stacking Small Kernels

VGG further demonstrated that network depth is crucial to performance. Its core idea is to replace large kernels with multiple 3×3 ones: two stacked 3×3 convolutions have the same receptive field as a single 5×5 convolution, yet use fewer parameters, add more non-linearity, and are more expressive.

```python
class VGGBlock(nn.Layer):
    def __init__(self, num_convs, in_channels, out_channels):
        super(VGGBlock, self).__init__()
        layers = []
        for _ in range(num_convs):
            layers.append(nn.Conv2D(in_channels, out_channels, 3, padding=1))
            layers.append(nn.ReLU())
            in_channels = out_channels
        layers.append(nn.MaxPool2D(kernel_size=2, stride=2))
        self.sequential = nn.Sequential(*layers)

    def forward(self, x):
        return self.sequential(x)

class VGG(nn.Layer):
    def __init__(self, conv_arch=((2, 64), (2, 128), (3, 256), (3, 512), (3, 512))):
        super(VGG, self).__init__()
        self.vgg_blocks = nn.LayerList()
        in_channels = 3
        for num_convs, out_channels in conv_arch:
            block = VGGBlock(num_convs, in_channels, out_channels)
            self.vgg_blocks.append(block)
            in_channels = out_channels
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(4096, 1)
        )

    def forward(self, x):
        for block in self.vgg_blocks:
            x = block(x)
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.classifier(x)
        return x
```
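The parameter claim behind VGG's design is easy to verify with a little arithmetic. The sketch below is plain Python of our own (the helper names are not part of the article's code, and biases are ignored): it compares the weight counts and receptive fields of two stacked 3×3 convolutions against a single 5×5 convolution with the same channel width.

```python
def conv_weight_count(kernel_size, channels):
    """Weights of one conv layer with `channels` in and out (biases ignored)."""
    return kernel_size * kernel_size * channels * channels

def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions stacked in sequence."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

C = 64
two_3x3 = 2 * conv_weight_count(3, C)   # 18 * C^2
one_5x5 = conv_weight_count(5, C)       # 25 * C^2
print(two_3x3, one_5x5)                 # 73728 102400
print(stacked_receptive_field([3, 3]))  # 5, same as a single 5x5
```

For 64 channels, the stacked option needs 73,728 weights against 102,400 for a single 5×5 layer, roughly a 28% saving, while covering the same 5×5 region and passing through two non-linearities instead of one.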
```python
paddle.set_device('gpu:0')
model = VGG()
train(model, save_path='vgg')
```

VGG's regular, easy-to-reproduce structure made it the backbone of much later work. But its heavy use of 3×3 convolutions and huge fully connected layers make it parameter-hungry and slow at inference.

GoogLeNet: A Pioneer of Multi-Scale Feature Fusion

GoogLeNet won the 2014 ImageNet competition. Its biggest innovation is the Inception module: 1×1, 3×3, and 5×5 convolutions and a pooling branch run in parallel within the same layer, extracting features at multiple scales.

```python
class Inception(nn.Layer):
    def __init__(self, c0, c1, c2, c3, c4):
        super(Inception, self).__init__()
        self.p1_1 = nn.Conv2D(c0, c1, 1)                   # 1x1 branch
        self.p2_1 = nn.Conv2D(c0, c2[0], 1)
        self.p2_2 = nn.Conv2D(c2[0], c2[1], 3, padding=1)  # 3x3 branch
        self.p3_1 = nn.Conv2D(c0, c3[0], 1)
        self.p3_2 = nn.Conv2D(c3[0], c3[1], 5, padding=2)  # 5x5 branch
        self.p4_1 = nn.MaxPool2D(3, stride=1, padding=1)
        self.p4_2 = nn.Conv2D(c0, c4, 1)                   # pool + 1x1 branch
        self.relu = nn.ReLU()

    def forward(self, x):
        p1 = self.relu(self.p1_1(x))
        p2 = self.relu(self.p2_2(self.relu(self.p2_1(x))))
        p3 = self.relu(self.p3_2(self.relu(self.p3_1(x))))
        p4 = self.relu(self.p4_2(self.p4_1(x)))
        return paddle.concat([p1, p2, p3, p4], axis=1)

class GoogLeNet(nn.Layer):
    def __init__(self):
        super(GoogLeNet, self).__init__()
        self.b1 = nn.Sequential(
            nn.Conv2D(3, 64, 7, padding=3, stride=2), nn.ReLU(),
            nn.MaxPool2D(3, stride=2, padding=1)
        )
        self.b2 = nn.Sequential(
            nn.Conv2D(64, 64, 1), nn.ReLU(),
            nn.Conv2D(64, 192, 3, padding=1), nn.ReLU(),
            nn.MaxPool2D(3, stride=2, padding=1)
        )
        self.b3 = nn.Sequential(
            Inception(192, 64, (96, 128), (16, 32), 32),
            Inception(256, 128, (128, 192), (32, 96), 64),
            nn.MaxPool2D(3, stride=2, padding=1)
        )
        self.b4 = nn.Sequential(
            Inception(480, 192, (96, 208), (16, 48), 64),
            Inception(512, 160, (112, 224), (24, 64), 64),
            Inception(512, 128, (128, 256), (24, 64), 64),
            Inception(512, 112, (144, 288), (32, 64), 64),
            Inception(528, 256, (160, 320), (32, 128), 128),
            nn.MaxPool2D(3, stride=2, padding=1)
        )
        self.b5 = nn.Sequential(
            Inception(832, 256, (160, 320), (32, 128), 128),
            Inception(832, 384, (192, 384), (48, 128), 128),
            nn.AdaptiveAvgPool2D((1, 1))
        )
        self.fc = nn.Linear(1024, 1)

    def forward(self, x):
        x = self.b5(self.b4(self.b3(self.b2(self.b1(x)))))
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.fc(x)
        return x

paddle.set_device('gpu:0')
model = GoogLeNet()
train(model, save_path='googlenet')
```

Note that each Inception module's input channel count must equal the previous module's concatenated output; for example, the first module of `b4` outputs 192 + 208 + 48 + 64 = 512 channels, which is exactly what the next module consumes.

GoogLeNet keeps high performance while sharply cutting the parameter count, embodying a design philosophy of "width over depth." Experiments show its validation accuracy reaches roughly 95%.

ResNet: Residual Learning Breaks the Depth Barrier

As networks deepen, vanishing and exploding gradients become increasingly severe, and deeper models paradoxically become harder to train. ResNet's solution is remarkably elegant: residual (skip) connections that let information propagate directly across layers.

```python
class ConvBNLayer(nn.Layer):
    def __init__(self, ch_in, ch_out, kernel_size, stride=1, groups=1, act=True):
        super(ConvBNLayer, self).__init__()
        self.conv = nn.Conv2D(ch_in, ch_out, kernel_size, stride=stride,
                              padding=(kernel_size - 1) // 2, groups=groups,
                              bias_attr=False)
        self.bn = nn.BatchNorm2D(ch_out)
        self.relu = nn.ReLU() if act else None

    def forward(self, x):
        x = self.bn(self.conv(x))
        return self.relu(x) if self.relu is not None else x

class BottleneckBlock(nn.Layer):
    def __init__(self, ch_in, ch_out, stride=1, shortcut=True):
        super(BottleneckBlock, self).__init__()
        self.conv0 = ConvBNLayer(ch_in, ch_out, 1)
        self.conv1 = ConvBNLayer(ch_out, ch_out, 3, stride=stride)
        # No ReLU on the expansion conv: the activation comes after the addition
        self.conv2 = ConvBNLayer(ch_out, ch_out * 4, 1, act=False)
        if not shortcut or stride != 1 or ch_in != ch_out * 4:
            # Projection shortcut to match the residual's shape
            self.short = ConvBNLayer(ch_in, ch_out * 4, 1, stride=stride, act=False)
        else:
            self.short = None
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = x
        y = self.conv2(self.conv1(self.conv0(x)))
        if self.short is not None:
            residual = self.short(x)
        return self.relu(paddle.add(residual, y))

class ResNet(nn.Layer):
    def __init__(self, layers=50, class_dim=1):
        super(ResNet, self).__init__()
        depth = {50: [3, 4, 6, 3], 101: [3, 4, 23, 3], 152: [3, 8, 36, 3]}[layers]
        num_filters = [64, 128, 256, 512]
        self.conv = ConvBNLayer(3, 64, 7, stride=2)
        self.pool = nn.MaxPool2D(3, stride=2, padding=1)
        self.blocks = nn.LayerList()
        in_ch = 64
        for i, layer_count in enumerate(depth):
            block = nn.Sequential()
            for j in range(layer_count):
                # Downsample at the first block of every stage except the first
                stride = 2 if j == 0 and i != 0 else 1
                shortcut = False if j == 0 else True
                bottleneck = BottleneckBlock(in_ch, num_filters[i], stride, shortcut)
                block.add_sublayer(f'bb_{j}', bottleneck)
                in_ch = num_filters[i] * 4
            self.blocks.append(block)
        self.avgpool = nn.AdaptiveAvgPool2D(1)
        self.fc = nn.Linear(2048, class_dim)

    def forward(self, x):
        x = self.pool(self.conv(x))
        for block in self.blocks:
            x = block(x)
        x = self.avgpool(x)
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.fc(x)
        return x

paddle.set_device('gpu:0')
model = ResNet(layers=50)
train(model, save_path='resnet50')
```

ResNet successfully trained networks as deep as 152 layers and also reaches roughly 95% accuracy on iChallenge-PM. More importantly, it changed how people think about deep networks: with the right design, depth is no longer an obstacle.

From the prototype of LeNet to the extreme depth of ResNet, these classic models are both a record of technical evolution and a distillation of engineering wisdom. A mature platform like PaddlePaddle lets us reproduce and compare these milestone architectures in a short time, truly standing on the shoulders of giants. Whether for academic exploration or industrial deployment, mastering the principles and implementations of these foundational models is a necessary step toward becoming an advanced computer vision engineer, and PaddlePaddle's concise APIs and strong ecosystem pave that road.

Authoring note: parts of this article were produced with AI assistance (AIGC) and are for reference only.
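Appendix: the channel bookkeeping in the ResNet-50 configuration above can be double-checked with plain Python. The sketch below is our own helper (not part of the original code); it walks the depth table `[3, 4, 6, 3]` and confirms the 4x bottleneck expansion that ends in the 2048-wide classifier input.

```python
def resnet_stage_summary(depth=(3, 4, 6, 3), num_filters=(64, 128, 256, 512), expansion=4):
    """Return (total bottleneck blocks, per-stage output channels, fc input width)."""
    out_channels = [f * expansion for f in num_filters]  # bottleneck expands 4x
    blocks_total = sum(depth)
    return blocks_total, out_channels, out_channels[-1]

blocks, channels, fc_in = resnet_stage_summary()
print(blocks)               # 16 bottleneck blocks in ResNet-50
print(channels)             # [256, 512, 1024, 2048]
print(fc_in)                # 2048, matching nn.Linear(2048, class_dim)
print(blocks * 3 + 2)       # 50: 3 convs per block, plus the stem conv and the fc
```

The 16 blocks each contain three convolutions; adding the 7×7 stem and the final fully connected layer gives the 50 weighted layers that give ResNet-50 its name.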