Trying out PyTorch (12)

1. The environment was Windows 10 Home (64-bit).

2. I ran the checks in Spyder under Anaconda3 (64-bit).

3. The Python version is 3.7.0.

4. The PyTorch version is 0.4.1.

5. The GPU is an NVIDIA GeForce GTX 1050.

6. The CPU is an Intel Core(TM) i7-7700HQ.

This time I worked through section 4.4, image super-resolution with a CNN regression model (pp. 088-092), of 現場で使える! PyTorch開発入門深層学習モデルの作成とアプリケーションへの実装 (AI & TECHNOLOGY).

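※1. For the details of the program, see the book (pp. 088-092).
※2. The book does not cover saving or loading the trained CNN model, so I added that code while testing, based on reference sites ① to ③.
-> Training for 10 epochs took more than 800 seconds, which is why I wanted to save the trained CNN model.
※3. For the super-resolution check I loaded the trained CNN model instead of retraining it, so the image file was written out in a very short time (under 1 second).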

■CNN training (excerpted from the book's CNN training code, with additions).

# -*- coding: utf-8 -*-
# 1. library import.
from __future__ import print_function
import torch
from torch import nn, optim
from torchvision.datasets import ImageFolder
from torchvision import transforms
from torch.utils.data import DataLoader
import time, math, os
from tqdm import tqdm
import shutil

# 2. to save torch model.
# Best way to save a trained model in PyTorch?
# https://stackoverflow.com/questions/42703500/best-way-to-save-a-trained-model-in-pytorch
# 
# pytorch/examples
# https://github.com/pytorch/examples/blob/0984955bb8525452d1c0e4d14499756eae76755b/imagenet/main.py#L139-L145
def save_checkpoint(state, is_best, filename='checkpoint.pth.tar'):
    torch.save(state, filename)
    if is_best:
        shutil.copyfile(filename, 'model_best.pth.tar')

~(omitted)~

# 4. declare a training helper function.
def train_net(net, train_loader, test_loader, save_model_file_name,
        optimizer_cls = optim.Adam,
        loss_fn = nn.MSELoss(),
        n_iter = 10, device = "cpu"):
    train_losses, val_acc = [], []
    optimizer = optimizer_cls(net.parameters())
    best_prec1 = 0
~(omitted)~

        # Best way to save a trained model in PyTorch?
        # https://stackoverflow.com/questions/42703500/best-way-to-save-a-trained-model-in-pytorch
        # 
        # pytorch/examples
        # https://github.com/pytorch/examples/blob/0984955bb8525452d1c0e4d14499756eae76755b/imagenet/main.py#L139-L145
        prec1 = val_acc[-1]
        is_best = prec1 > best_prec1
        best_prec1 = max(prec1, best_prec1)
        save_checkpoint({
            'epoch': epoch + 1,
            'state_dict': net.state_dict(),
            'best_prec1': best_prec1,
            'optimizer' : optimizer.state_dict(),
        }, is_best, filename=save_model_file_name)

~(omitted)~

# 6. get training data and test data.
start = time.time()
folder_path = os.path.expanduser('~')
folder_path = folder_path + '\\.spyder-py3\\pytorch\\lfw-deepfunneled'
train_data = DownSizedPairImageFolder(folder_path + "\\train", 
            transform=transforms.ToTensor())
test_data = DownSizedPairImageFolder(folder_path + "\\test", 
            transform=transforms.ToTensor())

# 7. Create DataLoader with a batch size of 32 respectively.
batch_size = 32
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)

# 8. declare network.
# torch.nn
# https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d
# ex.
# Input:  torch.Size([1, 3, 128, 128])
# Output: torch.Size([1, 3, 512, 512])
net = nn.Sequential(
    # The dimension calculations follow the official documentation.
    # ex. 
    # kernel_size = (2, 3) means kernel_size[0] = 2, kernel_size[1] = 3
    # kernel_size = 5      means kernel_size[0] = 5, kernel_size[1] = 5
    # padding = (3, 5)     means padding[0] = 3, padding[1] = 5
    # padding = 2          means padding[0] = 2, padding[1] = 2
    # stride = (4, 5)      means stride[0] = 4, stride[1] = 5
    # stride = 3           means stride[0] = 3, stride[1] = 3
    # dilation is left at its default here, so dilation[0] = 1, dilation[1] = 1
    # output_padding is left at its default 0 here, so output_padding[0] = 0, output_padding[1] = 0
    
    # Conv2d
    # Input:  (N, Cin, Hin, Win)
    # Output: (N, Cout, Hout, Wout)
    # Hout = floor((Hin + 2 × padding[0] − dilation[0] × (kernel_size[0] − 1) − 1) / stride[0] + 1)
    # Wout = floor((Win + 2 × padding[1] − dilation[1] × (kernel_size[1] − 1) − 1) / stride[1] + 1)

    # ConvTranspose2d
    # Input:  (N, Cin, Hin, Win)
    # Output: (N, Cout, Hout, Wout)
    # Hout = (Hin − 1) × stride[0] − 2 × padding[0] + kernel_size[0] + output_padding[0]
    # Wout = (Win − 1) × stride[1] − 2 × padding[1] + kernel_size[1] + output_padding[1]
    
~(omitted)~

    # 4. ConvTranspose2d
    # Input:  (1, 64, 256, 256)
    # Hout =  (256 - 1) × 2 - 2 × 1 + 4 + 0 = 512
    # Wout =  (256 - 1) × 2 - 2 × 1 + 4 + 0 = 512
    # Output: (1, 3, 512, 512)
    nn.ConvTranspose2d(64, 3, 4, stride = 2, padding = 1),
    
)
 
# 9. execute training
net.to("cuda:0")
save_model_file_name = folder_path + '\\checkpoint.pth.tar'
train_net(net, train_loader, test_loader, save_model_file_name, device = "cuda:0")

# 10. display processing time.
end = time.time()
print('--------------------------------------------------')
print('Elapsed Time: ' + str(end - start) + "[sec]")
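
The shape arithmetic written in the comments of step 8 can be checked without PyTorch. The following is a minimal sketch in plain Python. The helper names conv2d_out and conv_transpose2d_out are hypothetical and not part of torch. The transposed-convolution formula is written in the form that includes dilation, which reduces to the one in the comments when dilation is 1.

```python
def conv2d_out(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output height or width of nn.Conv2d along one axis (floor division)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose2d_out(size, kernel_size, stride=1, padding=0,
                         output_padding=0, dilation=1):
    """Output height or width of nn.ConvTranspose2d along one axis."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# The last layer in the listing: ConvTranspose2d(64, 3, 4, stride=2, padding=1)
# applied to a 256 x 256 feature map.
last_layer = conv_transpose2d_out(256, kernel_size=4, stride=2, padding=1)
print(last_layer)
```

Because height and width use the same kernel, stride, and padding here, one call covers both axes.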

■Execution results.

100%|██████████| 409/409 [01:21<00:00,  6.13it/s]
0 0.014687953655173077 18.33038706467884 24.7860811423676
100%|██████████| 409/409 [01:18<00:00,  5.31it/s]
1 0.003026380704050943 25.190764407316067 26.22962871283715
100%|██████████| 409/409 [01:19<00:00,  6.08it/s]
2 0.0028060169180076397 25.519097148560505 26.402582314008022
100%|██████████| 409/409 [01:19<00:00,  5.14it/s]
3 0.0025529012736957746 25.929659799933184 26.431970727737788
100%|██████████| 409/409 [01:19<00:00,  5.91it/s]
4 0.0024290909298438313 26.14556227644325 25.60002797809263
100%|██████████| 409/409 [01:19<00:00,  5.13it/s]
5 0.002380632517590645 26.233076385723564 26.9437039437746
100%|██████████| 409/409 [01:19<00:00,  6.03it/s]
6 0.002286262329767998 26.4087393936757 27.172655655185363
100%|██████████| 409/409 [01:19<00:00,  5.36it/s]
7 0.0022002258360804875 26.575327399369503 27.21442603141393
100%|██████████| 409/409 [01:19<00:00,  5.68it/s]
8 0.002206177676715922 26.563595141487912 26.89212311452963
100%|██████████| 409/409 [01:20<00:00,  5.75it/s]
9 0.002144348673001023 26.68704596568539 27.371439209672104
--------------------------------------------------
Elapsed Time: 804.4375095367432[sec]
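
The columns in the log above are not labeled. The third column is numerically consistent with PSNR = 10 * log10(1 / MSE) applied to the second column, so the sketch below assumes the columns are epoch, training loss (MSE), training PSNR, and the validation score that the listing stores in val_acc. The function name psnr and the max_value parameter are assumptions. The loop at the end replays the is_best bookkeeping from the training listing to see at which epochs model_best.pth.tar would have been refreshed.

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10 * math.log10(max_value ** 2 / mse)


# (epoch, training loss, logged third column, logged fourth column),
# copied from the log above.
log = [
    (0, 0.014687953655173077, 18.33038706467884, 24.7860811423676),
    (1, 0.003026380704050943, 25.190764407316067, 26.22962871283715),
    (2, 0.0028060169180076397, 25.519097148560505, 26.402582314008022),
    (3, 0.0025529012736957746, 25.929659799933184, 26.431970727737788),
    (4, 0.0024290909298438313, 26.14556227644325, 25.60002797809263),
    (5, 0.002380632517590645, 26.233076385723564, 26.9437039437746),
    (6, 0.002286262329767998, 26.4087393936757, 27.172655655185363),
    (7, 0.0022002258360804875, 26.575327399369503, 27.21442603141393),
    (8, 0.002206177676715922, 26.563595141487912, 26.89212311452963),
    (9, 0.002144348673001023, 26.68704596568539, 27.371439209672104),
]

# Largest gap between the recomputed PSNR and the logged third column.
max_diff = max(abs(psnr(loss) - logged) for _, loss, logged, _ in log)

# Replay the best-score tracking used by save_checkpoint.
best_prec1 = 0
best_epochs = []
for epoch, _, _, prec1 in log:
    if prec1 > best_prec1:
        best_epochs.append(epoch)
    best_prec1 = max(prec1, best_prec1)

print(max_diff)
print(best_epochs)
```

Under these assumptions the validation score improved at every epoch except 4 and 8.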

■Checking super-resolution with the CNN model (excerpted from the book's super-resolution code, with additions).

# -*- coding: utf-8 -*-
# 1. library import.
from __future__ import print_function
import torch
from torchvision.utils import save_image
from torch import nn, optim
from torchvision.datasets import ImageFolder
from torchvision import transforms
from torch.utils.data import DataLoader
import time, os

# 2. declare network.
# torch.nn
# https://pytorch.org/docs/stable/nn.html#torch.nn.ConvTranspose2d
# ex.
# Input:  torch.Size([1, 3, 128, 128])
# Output: torch.Size([1, 3, 512, 512])
net = nn.Sequential(
    # The dimension calculations follow the official documentation.
    # ex. 
    # kernel_size = (2, 3) means kernel_size[0] = 2, kernel_size[1] = 3
    # kernel_size = 5      means kernel_size[0] = 5, kernel_size[1] = 5
    # padding = (3, 5)     means padding[0] = 3, padding[1] = 5
    # padding = 2          means padding[0] = 2, padding[1] = 2
    # stride = (4, 5)      means stride[0] = 4, stride[1] = 5
    # stride = 3           means stride[0] = 3, stride[1] = 3
    # dilation is left at its default here, so dilation[0] = 1, dilation[1] = 1
    # output_padding is left at its default 0 here, so output_padding[0] = 0, output_padding[1] = 0
    
    # Conv2d
    # Input:  (N, Cin, Hin, Win)
    # Output: (N, Cout, Hout, Wout)
    # Hout = floor((Hin + 2 × padding[0] − dilation[0] × (kernel_size[0] − 1) − 1) / stride[0] + 1)
    # Wout = floor((Win + 2 × padding[1] − dilation[1] × (kernel_size[1] − 1) − 1) / stride[1] + 1)

    # ConvTranspose2d
    # Input:  (N, Cin, Hin, Win)
    # Output: (N, Cout, Hout, Wout)
    # Hout = (Hin − 1) × stride[0] − 2 × padding[0] + kernel_size[0] + output_padding[0]
    # Wout = (Win − 1) × stride[1] − 2 × padding[1] + kernel_size[1] + output_padding[1]
    
    
~(omitted)~

    # 4. ConvTranspose2d
    # Input:  (1, 64, 256, 256)
    # Hout =  (256 - 1) × 2 - 2 × 1 + 4 + 0 = 512
    # Wout =  (256 - 1) × 2 - 2 × 1 + 4 + 0 = 512
    # Output: (1, 3, 512, 512)
    nn.ConvTranspose2d(64, 3, 4, stride = 2, padding = 1),
    
)

~(omitted)~

# 6. enlarge with Bilinear.
# warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
# bl_recon = torch.nn.functional.upsample(x, 128, mode = "bilinear", align_corners = True)
bl_recon = torch.nn.functional.interpolate(x, 128, mode = "bilinear", align_corners = True)

# 7. enlarge with CNN.
# RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
# -> Added net.to("cuda:0") to avoid the error above.
net.to("cuda:0")
# Saving and loading a model in Pytorch?
# https://discuss.pytorch.org/t/saving-and-loading-a-model-in-pytorch/2610/49
# -> After reading the page above, I load the saved CNN model as follows.
checkpoint = torch.load(folder_path + '\\checkpoint.pth.tar')
net.load_state_dict(checkpoint['state_dict'])
optimizer = optim.Adam(net.parameters())
optimizer.load_state_dict(checkpoint['optimizer']) 
cnn_recon = net(x.to("cuda:0")).to("cpu")

# 8. combine the image of original, and of Bilinear, and of CNN by torch.cat and 
# write it out to the image file with save_image.
save_image(torch.cat([y, bl_recon, cnn_recon], 0), folder_path + "\\cnn_upscale.jpg", nrow = 5)

# 9. display processing time.
end = time.time()
print('--------------------------------------------------')
print('Elapsed Time: ' + str(end - start) + "[sec]")

■Execution results.

--------------------------------------------------
Elapsed Time: 0.3804628849029541[sec]

■From the results above, I found the following.

1. Loading the CNN model.
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Because this error was raised, I found that net.to("cuda:0") has to be called first, so that the model weights are on the same device as the input.

2. Where the CNN model is saved.
Here it was saved as the file [checkpoint.pth.tar] in the folder [.spyder-py3\pytorch\lfw-deepfunneled].

3. Where the super-resolution output image is saved.
Here it was saved as the file [cnn_upscale.jpg] in the folder [.spyder-py3\pytorch\lfw-deepfunneled].

4. Warning.
torch.nn.functional.upsample emitted a deprecation warning, so I ran the check with torch.nn.functional.interpolate instead.
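
Regarding item 1, one way to avoid the device mismatch is to pick the device once, pass it to torch.load as map_location, and move both the model and the input to it. The following is a minimal sketch under stated assumptions: the one-layer network is a hypothetical stand-in (the real layers are omitted in this article), and the checkpoint is written to a temporary folder instead of the lfw-deepfunneled folder. It also switches the model to eval mode and disables gradient tracking for inference.

```python
import os
import tempfile

import torch
from torch import nn, optim

# Hypothetical stand-in network, not the model from the book.
net = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
optimizer = optim.Adam(net.parameters())

# Save a checkpoint with the same keys as the training listing.
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth.tar")
torch.save({
    "epoch": 1,
    "state_dict": net.state_dict(),
    "optimizer": optimizer.state_dict(),
}, path)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# map_location remaps the stored tensors onto the chosen device.
checkpoint = torch.load(path, map_location=device)
restored = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))
restored.load_state_dict(checkpoint["state_dict"])
restored.to(device)
restored.eval()

# Input and weights now live on the same device.
x = torch.rand(1, 3, 8, 8)
with torch.no_grad():
    out = restored(x.to(device)).to("cpu")
print(out.shape)
```

The optimizer state is only needed to resume training, so this sketch does not restore it for inference.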

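■Reference sites
[Reference URL ①] Saving and loading a model in Pytorch?
https://discuss.pytorch.org/t/saving-and-loading-a-model-in-pytorch/2610/49
[Reference URL ②] pytorch/examples
https://github.com/pytorch/examples/blob/0984955bb8525452d1c0e4d14499756eae76755b/imagenet/main.py#L139-L145
[Reference URL ③] Best way to save a trained model in PyTorch?
https://stackoverflow.com/questions/42703500/best-way-to-save-a-trained-model-in-pytorch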

■Reference book
現場で使える! PyTorch開発入門深層学習モデルの作成とアプリケーションへの実装 (AI & TECHNOLOGY)
