The `expand` operator

Usage

```
oneflow.expand(input, *sizes)
```

The `expand` operator expands singleton dimensions (dimensions of size 1) of the input tensor to a larger size, and can also add new dimensions (only at the front). In `expand_size`, an entry of -1 for an existing dimension means that dimension is left unchanged; adding a new leading dimension is equivalent to replicating the whole input tensor along it.

Examples

```
input_shape = [4, 3, 1, 2]
expand_size = [4, 3, 5, 2]
# The following expand_size settings are all valid as well:
#             [-1, 3, 5, 2]
#             [-1, -1, 5, 2]
#             [-1, -1, 5, -1]
#             [4, -1, 5, 2]
#             [4, -1, 5, -1]
#             [4, 3, 5, -1]
out_shape   = [4, 3, 5, 2]
```

```
input_shape =       [1, 4, 3, 5]
expand_size = [2, 1, 2, 4, 3, 5]
# The following expand_size settings are all valid as well:
#             [2, 1, 2, -1, 3, 5]
#             [2, 1, 2, -1, -1, 5]
#             [2, 1, 2, -1, -1, -1]
#             [2, 1, 2, 4, -1, 5]
#             [2, 1, 2, 4, -1, -1]
#             [2, 1, 2, 4, 3, -1]
out_shape   = [2, 1, 2, 4, 3, 5]
```
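The legality rules above match NumPy's broadcasting semantics, so they can be sanity-checked with `np.broadcast_to`. The helper below is an illustrative sketch (not the OneFlow implementation): it resolves `-1` entries by right-aligning the input shape against `expand_size`.

```python
import numpy as np

def expand_shape(input_shape, expand_size):
    """Resolve -1 entries in expand_size against a right-aligned input_shape."""
    diff = len(expand_size) - len(input_shape)
    out = []
    for i, s in enumerate(expand_size):
        if s == -1:
            # -1 is only legal for existing (non-new) dimensions
            assert i >= diff, "new leading dimensions cannot be -1"
            out.append(input_shape[i - diff])
        else:
            out.append(s)
    return out

x = np.zeros([1, 4, 3, 5])
size = expand_shape(list(x.shape), [2, 1, 2, -1, -1, 5])
# np.broadcast_to follows the same rules: size-1 dims stretch, others must match
y = np.broadcast_to(x, size)
```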

Implementation from the single-device view

A multi-dimensional index maps to an offset in the flattened tensor through a `stride` array. For example:

```
input_shape = [6, 3, 4, 5]
stride      = [60, 20, 5, 1] # how stride is computed is described below
input[x, y, z, k] == input_flatten[x * 60 + y * 20 + z * 5 + k * 1]
```

The `stride` array is computed from back to front: the last dimension's stride is 1, and each earlier stride is the next stride multiplied by the next dimension's size.

Example code:

```
# The last dimension's stride is initialized to 1
stride = [1]
# Generate the strides from back to front
for i in range(len(input_shape) - 2, -1, -1):
    # Insert each new element at the front of the stride array
    stride.insert(0, stride[0] * input_shape[i + 1])
```
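Wrapped as a function (a sketch using plain Python lists), the computation can be checked against the `[6, 3, 4, 5]` example above:

```python
def compute_stride(input_shape):
    """Row-major (contiguous) strides: the stride of the last dim is 1."""
    stride = [1]
    for i in range(len(input_shape) - 2, -1, -1):
        stride.insert(0, stride[0] * input_shape[i + 1])
    return stride

# For shape [6, 3, 4, 5] this yields [60, 20, 5, 1],
# so input[x, y, z, k] lives at flat offset x*60 + y*20 + z*5 + k.
```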

Given `stride`, `expand` can be implemented by computing an `output_stride` for the output tensor, with one entry per dimension of `expand_size` (let `diff` be the number of newly added leading dimensions):

- If `expand_size[i]` is -1 (or equals the corresponding input dimension), the dimension is kept as-is: `output_stride[i] = stride[i - diff]`.
- If `expand_size[i]` is greater than 1 while the corresponding input dimension is 1, the dimension is broadcast: `output_stride[i] = 0`.
- If dimension `i` is a newly added leading dimension, `output_stride[i] = 0`.

A stride of 0 means that moving along that output dimension does not advance through the input, which produces the copies without materializing them.

Example code for computing `output_stride`:

```
output_stride = []
# diff = number of newly added leading dimensions
diff = len(expand_size) - len(input_shape)
for i in range(len(expand_size) - 1, -1, -1):
    if i >= diff:
        if expand_size[i] == -1 or expand_size[i] == input_shape[i - diff]:
            # Dimension kept as-is: reuse the input stride
            output_stride.insert(0, input_stride[i - diff])
        else:
            # Broadcasting a size-1 input dimension: stride 0
            assert expand_size[i] >= 1 and input_shape[i - diff] == 1
            output_stride.insert(0, 0)
    else:
        # Newly added leading dimension: stride 0
        assert expand_size[i] >= 1
        output_stride.insert(0, 0)
```

For example:

```
input_shape   =       [4, 1, 3, 5]
stride        =     [15, 15, 5, 1]
expand_size   = [2, 1, 4, 4, 3, 5]
output_stride = [0, 0, 15, 0, 5, 1]
# For an arbitrary output index (x, y, z, k, v, w):
output[x, y, z, k, v, w] = input_flatten[x * 0 + y * 0 + z * 15 + k * 0 + v * 5 + w * 1]
# Backward pass:
input_grad_flatten[x * 0 + y * 0 + z * 15 + k * 0 + v * 5 + w * 1] += output_grad[x, y, z, k, v, w]
```
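Putting the pieces together, the forward pass can be sketched as a gather driven by `output_stride`, and checked against NumPy's broadcasting (an illustrative sketch, not the OneFlow kernel):

```python
import numpy as np

def compute_output_stride(input_shape, input_stride, expand_size):
    output_stride = []
    diff = len(expand_size) - len(input_shape)
    for i in range(len(expand_size) - 1, -1, -1):
        if i >= diff:
            if expand_size[i] == -1 or expand_size[i] == input_shape[i - diff]:
                output_stride.insert(0, input_stride[i - diff])
            else:
                assert expand_size[i] >= 1 and input_shape[i - diff] == 1
                output_stride.insert(0, 0)
        else:
            assert expand_size[i] >= 1
            output_stride.insert(0, 0)
    return output_stride

input_shape = [4, 1, 3, 5]
x = np.arange(np.prod(input_shape)).reshape(input_shape)
stride = [15, 15, 5, 1]
expand_size = [2, 1, 4, 4, 3, 5]
out_stride = compute_output_stride(input_shape, stride, expand_size)

# Forward: each output element gathers from the flat input at the
# dot product of its multi-index with output_stride.
flat = x.reshape(-1)
out = np.empty(expand_size, dtype=x.dtype)
for idx in np.ndindex(*expand_size):
    out[idx] = flat[sum(i * s for i, s in zip(idx, out_stride))]
```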

The multi-device consistent view

> OneFlow introduces the concept of the consistent view to simplify distributed training. Briefly, under OneFlow's consistent view the cluster is abstracted as one "super computing device": users need not care about the details of computation and communication within the cluster, only about the logical data and computation, and can think and program as if on a single device while still training distributedly.

> sbp is a concept invented by OneFlow that describes the mapping between data under the consistent view and the data on the real physical devices in the cluster. Its name combines the initials of split, broadcast, and partial.

- `split`: the physical tensors are obtained by slicing the logical tensor along a chosen axis; each device holds one slice.
- `broadcast`: every device holds a full copy of the logical tensor.
- `partial`: the logical tensor is the element-wise reduction (e.g. sum) of the physical tensors across all devices.

Under the consistent view, `expand` needs special handling. The main reason is that `expand_size` is specified against the logical shape, while the physical shape of the input on each device may differ from it (e.g. under a split sbp), so `expand_size` and `output_stride` must be recomputed for each device's physical view.

A concrete example:

```
logical input_shape   =    [4, 3, 1, 2]
logical stride        =    [6, 2, 2, 1]
logical expand_size   = [2, 4, 3, 4, 2]
logical output_stride = [0, 6, 2, 0, 1]
```

Suppose the input tensor's sbp is `split(3)`, i.e. it is split along the last dimension, and the logical tensor is placed on two devices. Then the physical shape on each device is:

```
physical input_shape = [4, 3, 1, 1]
physical stride      = [3, 1, 1, 1]
```

Accordingly, the `expand_size` and `output_stride` actually used on each device must be:

```
physical expand_size   = [2, 4, 3, 4, 1]
physical output_stride = [0, 3, 1, 0, 1]
```
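These physical values can be rederived mechanically: divide the split dimension by the number of devices, recompute the stride from the physical shape, set the corresponding `expand_size` entry to the physical size, and recompute `output_stride`. A sketch reusing the formulas from the single-device section:

```python
def compute_stride(shape):
    stride = [1]
    for i in range(len(shape) - 2, -1, -1):
        stride.insert(0, stride[0] * shape[i + 1])
    return stride

def compute_output_stride(input_shape, input_stride, expand_size):
    output_stride = []
    diff = len(expand_size) - len(input_shape)
    for i in range(len(expand_size) - 1, -1, -1):
        if i >= diff and (expand_size[i] == -1
                          or expand_size[i] == input_shape[i - diff]):
            output_stride.insert(0, input_stride[i - diff])
        else:
            output_stride.insert(0, 0)  # broadcast or new dim
    return output_stride

logical_shape  = [4, 3, 1, 2]
logical_expand = [2, 4, 3, 4, 2]
world_size = 2
split_axis = 3  # sbp = split(3)

# Each device holds one slice of the split dimension.
phys_shape = list(logical_shape)
phys_shape[split_axis] //= world_size
phys_stride = compute_stride(phys_shape)

# The expand entry for the split dimension must match the physical size.
diff = len(logical_expand) - len(logical_shape)
phys_expand = list(logical_expand)
phys_expand[split_axis + diff] = phys_shape[split_axis]
phys_out_stride = compute_output_stride(phys_shape, phys_stride, phys_expand)
```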

Why does `expand_size` need to be modified?

Because after the split, the last dimension of the physical input on each device has size 1. If the logical `expand_size` (with 2 in the last position) were used directly, each device would produce an output of shape `[2, 4, 3, 4, 2]`; with the output still split along the last dimension, the corresponding logical shape would be `[2, 4, 3, 4, 4]`, so the last dimension of the result would be larger than intended.

`output_stride` likewise has to be recomputed from the physical shape and the physical `expand_size`.

The `repeat` operator

Usage

```
oneflow.repeat(input, *sizes)
```

The `repeat` operator tiles the input tensor along each dimension a given number of times, and can also add new leading dimensions. Every entry of `repeat_size` must be a positive integer. For an existing dimension set to `n`, the operator appends `n - 1` extra copies along that dimension (i.e. the dimension's size is multiplied by `n`); for a newly added leading dimension set to `n`, the whole input tensor is replicated `n` times along it. (An entry of `repeat_size` may in principle also be 0, producing an empty tensor, but that case is not considered here.)

Example

```
input_shape   =       [4, 1, 3, 5]
repeat_size   = [2, 1, 2, 4, 1, 1]
output_shape  = [2, 1, 8, 4, 3, 5]
```
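The output shape of `repeat` is just an element-wise product over the right-aligned shapes; the small helper below (a sketch, not the OneFlow implementation) verifies the example:

```python
def repeat_output_shape(input_shape, repeat_size):
    """Each entry of repeat_size multiplies the aligned input dimension
    (newly added leading dimensions count as size 1)."""
    diff = len(repeat_size) - len(input_shape)
    assert diff >= 0 and all(n >= 1 for n in repeat_size)
    return [
        n * (input_shape[i - diff] if i >= diff else 1)
        for i, n in enumerate(repeat_size)
    ]
```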

Relation to the `expand` operator

Some examples:

```
input_shape   = [5]
repeat_size   = [3]
output_shape  = [15]
# Equivalent to the following operations:
input_shape            =    [5]
reshaped_input_shape   = [1, 5]
expand_size            = [3, 5]
output_shape           = [3, 5]
reshaped_output_shape  =   [15]
```

```
input_shape   = [3, 1, 5]
repeat_size   = [5, 3, 1]
output_shape  = [15, 3, 5]
# Equivalent to the following operations:
input_shape           =    [3, 1, 5]
reshaped_input_shape  = [1, 3, 1, 5]
expand_size           = [5, 3, 3, 5]
output_shape          = [5, 3, 3, 5]
reshaped_output_shape =   [15, 3, 5]
```

```
input_shape   =    [3, 1, 5]
repeat_size   = [2, 5, 3, 1]
output_shape  = [2, 15, 3, 5]
# Equivalent to the following operations:
input_shape            =       [3, 1, 5]
reshaped_input_shape   =    [1, 3, 1, 5]
expand_size            = [2, 5, 3, 3, 5]
output_shape           = [2, 5, 3, 3, 5]
reshaped_output_shape  =   [2, 15, 3, 5]
```

As the examples show, `repeat` can be implemented as `reshape` + `expand` + `reshape`: insert a size-1 dimension in front of every dimension that is repeated, `expand` along the inserted (and newly added) dimensions, then reshape the result to the final output shape. Given `repeat_size`, the two reshape shapes and the `expand_size` can be computed as follows.

Example code:

```
input_reshape = []
output_reshape = []
expand_size = []
diff = len(repeat_size) - len(input_shape)
for i in range(len(repeat_size) - 1, -1, -1):
    if i >= diff:
        if repeat_size[i] > 1:
            if input_shape[i - diff] > 1:
                # Insert a size-1 dim in front, expand it to repeat_size[i],
                # then fold the pair back into a single dimension
                input_reshape.insert(0, input_shape[i - diff])
                input_reshape.insert(0, 1)
                expand_size.insert(0, input_shape[i - diff])
                expand_size.insert(0, repeat_size[i])
                output_reshape.insert(0, input_shape[i - diff] * repeat_size[i])
            else:
                # Size-1 dimension: expand it in place
                input_reshape.insert(0, input_shape[i - diff])
                expand_size.insert(0, repeat_size[i])
                output_reshape.insert(0, repeat_size[i])
        else:
            # repeat_size is 1: keep the dimension unchanged
            input_reshape.insert(0, input_shape[i - diff])
            expand_size.insert(0, input_shape[i - diff])
            output_reshape.insert(0, input_shape[i - diff])
    else:  # newly added dimension
        expand_size.insert(0, repeat_size[i])
        output_reshape.insert(0, repeat_size[i])
new_tensor = flow.reshape(input, input_reshape)
tmp_tensor = new_tensor.expand(*expand_size)
out = flow.reshape(tmp_tensor, output_reshape)
```
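The decomposition can be checked against `np.tile`, which has the same tiling semantics as `repeat`. The sketch below mirrors the code above in NumPy, with `flow.reshape`/`expand` replaced by `reshape`/`broadcast_to`:

```python
import numpy as np

def repeat_via_expand(x, repeat_size):
    input_shape = list(x.shape)
    input_reshape, output_reshape, expand_size = [], [], []
    diff = len(repeat_size) - len(input_shape)
    for i in range(len(repeat_size) - 1, -1, -1):
        if i >= diff:
            d = input_shape[i - diff]
            if repeat_size[i] > 1 and d > 1:
                # Insert a size-1 dim in front, expand it to repeat_size[i],
                # then fold the pair back into one dimension
                input_reshape[:0] = [1, d]
                expand_size[:0] = [repeat_size[i], d]
                output_reshape.insert(0, d * repeat_size[i])
            elif repeat_size[i] > 1:  # d == 1: expand in place
                input_reshape.insert(0, d)
                expand_size.insert(0, repeat_size[i])
                output_reshape.insert(0, repeat_size[i])
            else:  # repeat_size is 1: dimension unchanged
                input_reshape.insert(0, d)
                expand_size.insert(0, d)
                output_reshape.insert(0, d)
        else:  # newly added leading dimension
            expand_size.insert(0, repeat_size[i])
            output_reshape.insert(0, repeat_size[i])
    y = np.broadcast_to(x.reshape(input_reshape), expand_size)
    return y.reshape(output_reshape)

x = np.arange(15).reshape(3, 1, 5)
out = repeat_via_expand(x, [2, 5, 3, 1])
```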

With this decomposition, the consistent-view implementation of `repeat` can directly reuse that of `expand`.

References

https://discuss.pytorch.org/t/contigious-vs-non-contigious-tensor/30107/2

https://docs.oneflow.org/v0.5.0/parallelism/02_sbp.html

https://docs.oneflow.org/v0.5.0/parallelism/01_introduction.html#_3