
TensorFlow 2 Image Classification: The Flowers Dataset and Classification Code Explained



1. Basic Steps

(1) Examine and get familiar with the data
(2) Build an input pipeline
(3) Build the model
(4) Train the model
(5) Test the model
(6) Improve the model and repeat the process

2. Data Download

 

2.1 Environment

The following libraries are required:

 

import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

 

2.2 Data Description

Download the flowers sample dataset, which contains the following classes:

 

flower_photos/
  daisy/
  dandelion/
  roses/
  sunflowers/
  tulips/

 

The download code is as follows:

 

import pathlib
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

 

The full signature of the function used in the code above is as follows.

Reference: https://blog.csdn.net/qq_39507748/article/details/104997553

 

tf.keras.utils.get_file(
    fname, origin, untar=False, md5_hash=None,
    file_hash=None, cache_subdir='datasets',
    hash_algorithm='auto', extract=False,
    archive_format='auto', cache_dir=None
)

 

Parameter descriptions:
fname: name of the file; if an absolute path such as "/path/to/file.txt" is specified, the file is saved to that location
origin: URL of the file
untar: boolean, whether the downloaded file should be decompressed
md5_hash: MD5 hash of the file, used for verification (deprecated in favor of file_hash)
file_hash: expected hash of the file after download; both sha256 and md5 hashes are supported
cache_subdir: subdirectory under the cache directory where the file is stored; if an absolute path such as "/path/to/folder" is specified, the file is stored there
hash_algorithm: hash algorithm used to verify the file; options are 'md5', 'sha256', and 'auto'; the default 'auto' detects the algorithm in use
extract: if True, tries to extract the file as an archive, e.g. tar or zip
archive_format: archive format to try when extracting; options are 'auto', 'tar', 'zip', and None; 'tar' includes tar, tar.gz, and tar.bz files; the default 'auto' is ['tar', 'zip']; None or an empty list will return no matches found
cache_dir: location to store the cached files; if None, defaults to the .keras folder in the home directory
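The hash verification that get_file performs via file_hash can be sketched with the standard hashlib module. This is only an illustration of the idea, not TensorFlow's actual implementation, and the file name and contents below are made up:

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks, the way download verifiers typically do."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Write a small stand-in file and verify it against its expected hash.
path = os.path.join(tempfile.mkdtemp(), 'flower_photos.tgz')
with open(path, 'wb') as f:
    f.write(b'dummy archive contents')

expected = hashlib.sha256(b'dummy archive contents').hexdigest()
# A mismatch here would indicate a corrupt or tampered download.
assert sha256_of(path) == expected
```

With get_file, passing a wrong file_hash causes the cached file to be re-downloaded rather than reused.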

 


How pathlib.Path(data_dir) converts the path:

 

import pathlib
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True)
print(data_dir)
print(type(data_dir))
data_dir = pathlib.Path(data_dir)
print(data_dir)
print(type(data_dir))

 

C:\Users\Administrator\.keras\datasets\flower_photos
<class 'str'>
C:\Users\Administrator\.keras\datasets\flower_photos
<class 'pathlib.WindowsPath'>

 

Inspecting that path, we can see the data has already been extracted:

 

 

 

daisy/        633 images
dandelion/    898 images
roses/        641 images
sunflowers/   699 images
tulips/       799 images

Total: 3670
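Per-class counts like these can be reproduced with pathlib alone. A small sketch, using a made-up toy directory tree in place of the real flower_photos folder:

```python
import pathlib
import tempfile

# Build a toy directory tree shaped like flower_photos/ (hypothetical data).
root = pathlib.Path(tempfile.mkdtemp()) / 'flower_photos'
for cls, n in {'daisy': 3, 'roses': 2}.items():
    d = root / cls
    d.mkdir(parents=True)
    for i in range(n):
        (d / f'{i}.jpg').touch()  # create an empty stand-in image file

# Count images per class directory, and in total via the '*/*.jpg' pattern.
per_class = {d.name: len(list(d.glob('*.jpg'))) for d in sorted(root.iterdir())}
total = len(list(root.glob('*/*.jpg')))
print(per_class)  # {'daisy': 3, 'roses': 2}
print(total)      # 5
```

The same `root.glob('*/*.jpg')` pattern is what the inspection code below uses to count the full dataset.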

 

LICENSE.txt contains the image authors and source URLs.

 

 

Opening a few of the images shows that their pixel dimensions and shapes vary from file to file.

 

 

2.3 Data Inspection Code

 

Source code (note: a str-typed data_dir has no glob method, which is why it was converted above to the "pathlib.WindowsPath" type):

 

image_count = len(list(data_dir.glob('*/*.jpg')))  # '*/*.jpg' matches the .jpg files inside each class subdirectory
print(image_count)
roses = list(data_dir.glob('roses/*'))
img0 = PIL.Image.open(str(roses[0]))  # preview the first rose image
plt.imshow(img0)
plt.show()
batch_size = 32
img_height = 180
img_width = 180
#It's good practice to use a validation split when developing your model.
# Let's use 80% of the images for training, and 20% for validation.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
class_names = train_ds.class_names
print(class_names)
# Visualize sample images
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(30):
    ax = plt.subplot(3, 10, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")
plt.show()
for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixels values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))
# The baseline model (no augmentation, no dropout) is kept commented out for comparison:
"""
num_classes = 5
model = Sequential([
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
"""
data_augmentation = keras.Sequential(
  [
    layers.experimental.preprocessing.RandomFlip("horizontal",
                                                 input_shape=(img_height,
                                                              img_width,
                                                              3)),
    layers.experimental.preprocessing.RandomRotation(0.1),
    layers.experimental.preprocessing.RandomZoom(0.1),
  ]
)
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")
num_classes = 5
model = Sequential([
  data_augmentation,
  layers.experimental.preprocessing.Rescaling(1./255),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
epochs=15
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs_range = range(epochs)
plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)
img = keras.preprocessing.image.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch
predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])
print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
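The final step converts the model's raw logits into class probabilities with tf.nn.softmax. Numerically this is just exponentiation followed by normalization, which can be checked with NumPy alone; the logit values below are made up for illustration:

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Hypothetical logits for the five flower classes.
logits = np.array([1.2, 0.3, 4.0, -0.5, 0.1])
score = softmax(logits)

print(np.argmax(score))  # index of the predicted class: 2
print(score.sum())       # probabilities sum to 1
```

This is why the model's last Dense layer has no activation and the loss is constructed with from_logits=True: softmax is applied only at prediction time.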
