
How to Get the Output of an Intermediate Layer in a Deep Learning Model


1. Introduction

 

Deep learning models are mostly multi-layer networks, and the layers can be of different types (for example Flatten, Activation, BatchNormalization, GlobalAveragePooling2D, Conv2D, MaxPooling2D, ZeroPadding2D, LSTM).

 

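Sometimes we need the output of one particular layer in a multi-layer network, for example to visualize it or to use it as an embedding.

The example below shows how to get the output of a given layer of a neural network.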

 

2. Build the network and give each layer a name

The multi-layer network used in this article is built as follows:

 

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras_self_attention import SeqSelfAttention
import keras
import numpy as np
num_classes = 6 # classification number
x_train = np.random.randn(100, 15, 20, 3) # x_train.shape = (100, 15, 20, 3)
y_train = keras.utils.to_categorical(np.random.randint(num_classes, size=100), num_classes) # random one-hot labels, y_train.shape = (100, 6)
input_shape = x_train.shape[-3:]# (15, 20, 3)
model = Sequential()
model.add(Conv2D(32, (2,11), activation='relu', padding='same', input_shape=input_shape, name='layer_conv_1'))
model.add(Conv2D(32, (2,11), activation='relu', padding='same', name='layer_conv_2'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 1), name='layer_mp_1'))
model.add(Conv2D(128, (2,7), activation='relu', padding='same', name='layer_conv_3'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 1), name='layer_mp_2'))
model.add( keras.layers.Reshape((48,128), name='layer_rsp_1') )
model.add( SeqSelfAttention( attention_type=SeqSelfAttention.ATTENTION_TYPE_MUL, name='layer_attention_1') )
model.add(Flatten(name='layer_flatten_1'))
model.add(Dense(440, activation='relu', name='layer_dense_1'))
model.add(Dropout(0.5, name='layer_dropout_1'))
model.add(Dense(num_classes, activation='softmax', name='layer_dense_2'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()

 

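A few points to note:

    1. Give every layer of the model a name attribute, for example name='layer_conv_1'.

    2. The training data is random numbers generated with numpy; see the code comments for details.

    3. Versions used: tensorflow==2.4.0, keras==2.4.3, python==3.7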
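Since the model is compiled with categorical_crossentropy, labels of shape (100, 6) are normally one-hot rows. The following is a minimal NumPy-only sketch of generating random one-hot labels; class_ids and y_onehot are illustrative names, not taken from the original code.

```python
import numpy as np

num_classes = 6
num_samples = 100

# Random integer class ids in [0, num_classes).
class_ids = np.random.randint(num_classes, size=num_samples)

# Row i of the identity matrix is the one-hot vector for class i.
y_onehot = np.eye(num_classes, dtype=np.float32)[class_ids]

print(y_onehot.shape)                       # (100, 6)
print(np.all(y_onehot.sum(axis=1) == 1))    # True
```

Each row contains exactly one 1, at the position of its class id.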

 

 

Once the model is built, model.summary() prints the following layers and parameter counts:

 

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
layer_conv_1 (Conv2D)        (None, 15, 20, 32)        2144      
_________________________________________________________________
layer_conv_2 (Conv2D)        (None, 15, 20, 32)        22560     
_________________________________________________________________
layer_mp_1 (MaxPooling2D)    (None, 7, 18, 32)         0         
_________________________________________________________________
layer_conv_3 (Conv2D)        (None, 7, 18, 128)        57472     
_________________________________________________________________
layer_mp_2 (MaxPooling2D)    (None, 3, 16, 128)        0         
_________________________________________________________________
layer_rsp_1 (Reshape)        (None, 48, 128)           0         
_________________________________________________________________
layer_attention_1 (SeqSelfAt (None, 48, 128)           16385     
_________________________________________________________________
layer_flatten_1 (Flatten)    (None, 6144)              0         
_________________________________________________________________
layer_dense_1 (Dense)        (None, 440)               2703800   
_________________________________________________________________
layer_dropout_1 (Dropout)    (None, 440)               0         
_________________________________________________________________
layer_dense_2 (Dense)        (None, 6)                 2646      
=================================================================
Total params: 2,805,007
Trainable params: 2,805,007
Non-trainable params: 0
_________________________________________________________________
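Most of the shapes and parameter counts in the summary can be checked by hand. The following is a minimal sketch in plain Python, assuming the usual formulas for convolution layers, for 'valid' max pooling (the Keras default padding for MaxPooling2D), and for dense layers. The helpers conv2d_params, pool_out, and dense_params are made up for illustration and are not Keras functions. The attention layer's 16385 parameters are not derived here.

```python
# Hypothetical helpers (not Keras APIs) that reproduce numbers from model.summary().

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel_h x kernel_w x in_channels kernel plus one bias per filter.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def pool_out(size, pool, stride):
    # Output length of 'valid' pooling along one axis.
    return (size - pool) // stride + 1

def dense_params(in_features, units):
    # Weight matrix plus one bias per unit.
    return in_features * units + units

conv_1 = conv2d_params(2, 11, 3, 32)
conv_2 = conv2d_params(2, 11, 32, 32)
conv_3 = conv2d_params(2, 7, 32, 128)

# 'same' convolutions keep the 15 x 20 spatial size; only pooling shrinks it.
mp_1 = (pool_out(15, 3, 2), pool_out(20, 3, 1))
mp_2 = (pool_out(mp_1[0], 3, 2), pool_out(mp_1[1], 3, 1))
seq_len = mp_2[0] * mp_2[1]   # first dimension of the Reshape target (48, 128)
flat = seq_len * 128          # Flatten output size

dense_1 = dense_params(flat, 440)
dense_2 = dense_params(440, 6)

print(conv_1, conv_2, conv_3)      # 2144 22560 57472
print(mp_1, mp_2, seq_len, flat)   # (7, 18) (3, 16) 48 6144
print(dense_1, dense_2)            # 2703800 2646
```

Adding the attention layer's 16385 parameters to these six numbers gives the reported total of 2,805,007.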

 

3. Get the output of an intermediate layer

 

 

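    1. After the model is built, train it first so that each layer has trained weights (before training, the weights are only randomly initialized):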

 

 

model.fit(x_train, y_train,batch_size=64,epochs=2,verbose=1)

 

 

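    2. After training, specify the name of the intermediate layer whose output you want, and build a function object that maps the model input to that layer's output: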

 

 

import keras.backend as k
layer_name = 'layer_conv_3'
layer_output = model.get_layer(layer_name).output # get output by layer name
layer_input = model.input
output_func = k.function([layer_input], [layer_output]) # construct function

 

Here layer_name is set to layer_conv_3, meaning that we want the output of the layer whose name is layer_conv_3.

 

 

    3. Get the shapes of the input and output data

 

 

(1) The raw input data has 100 samples, each of which is three-dimensional: x_train.shape = (100, 15, 20, 3)

 

x_preproc = x_train

 

(2) The output shape of the intermediate layer layer_conv_3 for a single sample

 

output_shape = output_func([x_preproc[0][None, ...]])[0].shape
print(output_shape)# (1, 7, 18, 128)

 

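The output shape obtained for the intermediate layer is (1, 7, 18, 128). This matches the (None, 7, 18, 128) listed for layer_conv_3 in the model summary in Section 2, with None replaced by a batch size of 1; the difference is that here the shape is obtained programmatically at run time.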
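The expression x_preproc[0][None, ...] used above takes one sample and adds a leading batch axis, because the function object expects batched input. A small NumPy-only sketch of that indexing follows; the array x is random stand-in data with the same shape as x_train.

```python
import numpy as np

x = np.random.randn(100, 15, 20, 3)   # stand-in with the same shape as x_train

sample = x[0]                  # a single sample, no batch axis
batched = sample[None, ...]    # None inserts a new axis of length 1 at the front

print(sample.shape)    # (15, 20, 3)
print(batched.shape)   # (1, 15, 20, 3)

# Slicing with x[0:1] is an equivalent way to keep the batch axis.
same = np.array_equal(batched, x[0:1])
print(same)            # True
```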

 

(3) When all 100 samples are fed in, the output of layer_conv_3 for those 100 samples has the following shape:

 

activations = np.zeros((x_preproc.shape[0],) + output_shape[1:], dtype=np.float32) # float32 must be qualified as np.float32
print(activations.shape) # (100, 7, 18, 128)

 

At this point, activations is an all-zero array; it serves as the buffer that will hold the intermediate layer's output values.

 

 

    4. Feed x_train into the model to get the intermediate layer's output values

 

 

batch_size = 8
for batch_index in range(int(np.ceil(x_preproc.shape[0] / float(batch_size)))):
    begin, end = batch_index * batch_size, min((batch_index + 1) * batch_size, x_preproc.shape[0])
    activations[begin:end] = output_func([x_preproc[begin:end]])[0]

 

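The loop iterates over batches of size batch_size, feeding one slice of x_train at a time into the function object and storing the corresponding intermediate-layer output. After all iterations, activations holds the complete intermediate-layer output.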
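The batching logic can be tried without a trained model. In the sketch below, fake_output_func is a hypothetical stand-in that only mimics the list-in, list-out calling convention of the function object; it is an assumption for illustration and does not compute real layer activations.

```python
import numpy as np

def fake_output_func(inputs):
    # Hypothetical stand-in for output_func: takes [batch], returns [result].
    # The "layer" here simply averages over the last axis.
    batch = inputs[0]
    return [batch.mean(axis=-1).astype(np.float32)]

x = np.random.randn(100, 15, 20, 3)
batch_size = 8

# Probe with one sample to learn the per-sample output shape.
out_shape = fake_output_func([x[:1]])[0].shape
activations = np.zeros((x.shape[0],) + out_shape[1:], dtype=np.float32)

# 100 samples in batches of 8 gives 13 batches; the last one has 4 samples.
num_batches = int(np.ceil(x.shape[0] / batch_size))
for batch_index in range(num_batches):
    begin = batch_index * batch_size
    end = min(begin + batch_size, x.shape[0])
    activations[begin:end] = fake_output_func([x[begin:end]])[0]

# The batched result matches a single pass over all samples.
full = fake_output_func([x])[0]
print(num_batches)                       # 13
print(activations.shape)                 # (100, 15, 20)
print(np.allclose(activations, full))    # True
```

The min(...) on the end index is what keeps the final, shorter batch from running past the end of the array.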

 

 

    5. The resulting intermediate-layer output

 

 

The final intermediate-layer output is stored in activations, whose shape is (100, 7, 18, 128). A sample of the data looks like this:

 

array([[[[0.26599178, 0.73428845, 0.16254057, ..., 0.06111438,
          0.22640981, 0.35272944],
         [0.283226  , 0.6876849 , 0.03790125, ..., 0.34651148,
          0.10112678, 0.29799798],
         [0.54103273, 0.9894899 , 0.10496318, ..., 0.7219487 ,
          0.06900553, 0.38379622],
         ...,
         [0.277767  , 0.47766227, 0.19746281, ..., 0.91875714,
          0.028616  , 0.41216236],
         [0.17200978, 0.47316927, 0.12632905, ..., 0.82960546,
          0.12838002, 0.15124908],
         [0.2677489 , 0.29046324, 0.16919036, ..., 0.86634046,
          0.14427625, 0.09604399]],
         ...,
         [0.17622252, 0.2056456 , 0.13514128, ..., 0.52697086,
          0.05512847, 0.45330787],
         [0.13311246, 0.13437134, 0.17722939, ..., 0.41641268,
          0.        , 0.35246032],
         [0.03272216, 0.07479057, 0.04990054, ..., 0.29089817,
          0.0585143 , 0.17479342]]]], dtype=float32)
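As one possible way to use these activations as an embedding (the use case mentioned in the introduction), each sample's feature map can be reduced to a single vector. The sketch below averages over the two spatial axes; this pooling choice is an assumption for illustration and not a step from the original procedure, and the array is random stand-in data with the same shape as activations.

```python
import numpy as np

# Stand-in with the same shape as activations.
activations = np.random.rand(100, 7, 18, 128).astype(np.float32)

# Average over the height and width axes: one 128-dimensional vector per sample.
embeddings = activations.mean(axis=(1, 2))
print(embeddings.shape)   # (100, 128)

# Alternatively, flatten each sample to keep the spatial layout.
flat = activations.reshape(activations.shape[0], -1)
print(flat.shape)         # (100, 16128)
```

Averaging gives a compact vector per sample, while flattening keeps every value at the cost of a much larger dimension.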

 

4. Summary

 

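This article walked through the steps for getting the output of an intermediate layer of a Keras model. The approach is based on the get_activations function in reference 1; see reference 1 for the complete function.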

 

5. References

 

stackoverflow.com/questions/4…
