Proposed in 2014.

## Inception block

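Four parallel paths extract information at different scales; their outputs are then concatenated along the channel dimension.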
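Concatenating along the channel dimension only works if all four paths produce the same height and width. The sketch below checks this with the standard output-size formula, assuming stride 1 and the usual padding choices (0 for 1×1, 1 for 3×3, 2 for 5×5, 1 for the 3×3 max pool). The helper names and the 28×28 input size are hypothetical and chosen only for illustration.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer applied to an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1


def inception_out_shape(n, c1, c2, c3, c4):
    """Return (channels, size) of an Inception block's output for an n x n input.

    c1 and c4 are ints; c2 and c3 are (hidden, output) channel pairs.
    Padding values are assumptions, picked so every path preserves n.
    """
    sizes = [
        conv_out_size(n, kernel=1),             # path 1: 1x1 conv
        conv_out_size(n, kernel=3, padding=1),  # path 2: ends in a 3x3 conv
        conv_out_size(n, kernel=5, padding=2),  # path 3: ends in a 5x5 conv
        conv_out_size(n, kernel=3, padding=1),  # path 4: 3x3 max pool (1x1 conv keeps size)
    ]
    # All paths must agree on height/width before they can be concatenated
    assert len(set(sizes)) == 1
    # Channel counts simply add up under concatenation
    return c1 + c2[1] + c3[1] + c4, sizes[0]


print(inception_out_shape(28, 64, (96, 128), (16, 32), 32))  # (256, 28)
```

Only the channel count changes across the block; the spatial size is untouched.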

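For the same input and output channel counts, an Inception block has fewer parameters and lower computational cost than a single 3×3 or 5×5 convolutional layer.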
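A quick count makes the saving concrete. This sketch compares an Inception block mapping 192 channels to 256 channels against single 3×3 and 5×5 convolutions with the same input and output widths. It assumes every convolution has a bias term; the helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def inception_params(c_in, c1, c2, c3, c4):
    """Total parameters of the four paths (the max pool has none)."""
    return (
        conv_params(c_in, c1, 1)                                       # path 1
        + conv_params(c_in, c2[0], 1) + conv_params(c2[0], c2[1], 3)   # path 2
        + conv_params(c_in, c3[0], 1) + conv_params(c3[0], c3[1], 5)   # path 3
        + conv_params(c_in, c4, 1)                                     # path 4
    )


inception = inception_params(192, 64, (96, 128), (16, 32), 32)
conv3 = conv_params(192, 256, 3)
conv5 = conv_params(192, 256, 5)
print(inception, conv3, conv5)  # 163696 442624 1229056
```

Most of the saving comes from the 1×1 convolutions, which shrink the channel count before the expensive 3×3 and 5×5 kernels are applied.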

Many Inception variants followed.

The Inception block:

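```python
import torch
from torch import nn
from torch.nn import functional as F
from d2l import torch as d2l


class Inception(nn.Module):
    # `c1`--`c4` are the output channel counts of the four paths
    def __init__(self, in_channels, c1, c2, c3, c4, **kwargs):
        super().__init__(**kwargs)
        # Path 1: a single 1x1 convolution
        self.p1_1 = nn.Conv2d(in_channels, c1, kernel_size=1)
        # Path 2: 1x1 convolution followed by a 3x3 convolution
        self.p2_1 = nn.Conv2d(in_channels, c2[0], kernel_size=1)
        self.p2_2 = nn.Conv2d(c2[0], c2[1], kernel_size=3, padding=1)
        # Path 3: 1x1 convolution followed by a 5x5 convolution
        self.p3_1 = nn.Conv2d(in_channels, c3[0], kernel_size=1)
        self.p3_2 = nn.Conv2d(c3[0], c3[1], kernel_size=5, padding=2)
        # Path 4: 3x3 max pooling followed by a 1x1 convolution
        self.p4_1 = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        self.p4_2 = nn.Conv2d(in_channels, c4, kernel_size=1)

    def forward(self, x):
        p1_1 = F.relu(self.p1_1(x))
        p2_1 = F.relu(self.p2_1(x))
        p2_2 = F.relu(self.p2_2(p2_1))
        p3_1 = F.relu(self.p3_1(x))
        p3_2 = F.relu(self.p3_2(p3_1))
        p4_1 = F.relu(self.p4_1(x))
        p4_2 = F.relu(self.p4_2(p4_1))
        p1, p2, p3, p4 = p1_1, p2_2, p3_2, p4_2
        # Concatenate the outputs along the channel dimension
        return torch.cat((p1, p2, p3, p4), dim=1)
```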

## Model

Max-pooling layers placed between the stacks of Inception blocks reduce the spatial dimensions (height and width).
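To see how quickly the resolution shrinks, the sketch below traces the feature-map size after each of the first four stages. It assumes the usual GoogLeNet settings: a 7×7 stride-2 convolution with padding 3 in stage 1, and a 3×3 stride-2 max pool with padding 1 closing each stage. The function names are hypothetical.

```python
def out_size(n, kernel, stride, padding):
    """Output height/width of a conv or pooling layer applied to an n x n input."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_sizes(n):
    """Feature-map size after each of stages 1-4 for an n x n input."""
    sizes = []
    n = out_size(n, kernel=7, stride=2, padding=3)  # stage 1: 7x7 conv (assumed stride 2)
    n = out_size(n, kernel=3, stride=2, padding=1)  # stage 1: max pool
    sizes.append(n)
    for _ in range(3):  # max pools closing stages 2, 3 and 4
        n = out_size(n, kernel=3, stride=2, padding=1)
        sizes.append(n)
    return sizes


print(trace_sizes(96))   # [24, 12, 6, 3]
print(trace_sizes(224))  # [56, 28, 14, 7]
```

Each pool halves the height and width, so the Inception blocks in later stages can afford more channels on much smaller feature maps.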

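```python
# Next, we implement the network ourselves
# Stage 1: a 7x7 convolution with 64 output channels, then max pooling
b1 = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
# Stage 2: 1x1 convolution, then a 3x3 convolution that widens to 192 channels
b2 = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=1),
    nn.ReLU(),
    nn.Conv2d(64, 192, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
# Stage 3: the Inception blocks start here
b3 = nn.Sequential(
    # Input 192, output 64 + 128 + 32 + 32 = 256
    Inception(in_channels=192, c1=64, c2=(96, 128), c3=(16, 32), c4=32),
    # Output 128 + 192 + 96 + 64 = 480
    Inception(256, 128, (128, 192), (32, 96), 64),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
# Stage 4: same pattern as stage 3, with five Inception blocks
b4 = nn.Sequential(
    Inception(480, 192, (96, 208), (16, 48), 64),
    Inception(512, 160, (112, 224), (24, 64), 64),
    Inception(512, 128, (128, 256), (24, 64), 64),
    Inception(512, 112, (144, 288), (32, 64), 64),
    Inception(528, 256, (160, 320), (32, 128), 128),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
)
# Stage 5: feeds the output layer
b5 = nn.Sequential(
    Inception(832, 256, (160, 320), (32, 128), 128),
    Inception(832, 384, (192, 384), (48, 128), 128),
    # Adaptive average pooling derives the pooling window from the requested
    # output size (here 1x1), so no kernel size or stride has to be chosen by hand
    nn.AdaptiveAvgPool2d((1, 1)),
    nn.Flatten(),
)
# b1-b5 are nn.Sequential modules themselves; since nn.Sequential is an
# nn.Module, they can be nested directly inside another nn.Sequential
net = nn.Sequential(b1, b2, b3, b4, b5, nn.Linear(1024, 10))
# Use 96x96 inputs instead of 224x224 to speed up training
X = torch.rand(size=(1, 1, 96, 96))
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:\t', X.shape)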

lr, num_epochs, batch_size = 0.1, 10, 128