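A small deep learning practice demo: CIFAR-10
(I added a background image, looks great~ They're all my waifus; draw your swords, gentlemen!)
Originally I planned to validate an idea on ImageNet. Awkwardly, I ran out of GPU memory and had to set the batch size very small (running out of GPU memory has plagued me, broke as I am, many times). So I switched to CIFAR-10 and CIFAR-100, only to find that the images are too small to survive much pooling~ Later still, I read an ICLR 2017 paper and realized my idea was pointless. So I decided to turn the CIFAR-10 program I had written into a practice demo for getting started with deep learning, as a bit of consolation, so the work wasn't entirely wasted.
Environment: Python 3.6, PyTorch 1.0.
Hardware: i5-6300, GTX 1060 (3 GB)
I won't describe the CIFAR datasets in detail; a quick search on Baidu turns up plenty of blog posts about them. The demo below uses CIFAR-10. You can run CIFAR-100 yourself, since its data has already been converted to the same format. The netdisk link is:
https://pan.baidu.com/s/1pE4kMrAp9UEpBXv7pm7zMg (extraction code: nm39)
The CIFAR folder contains three subfolders: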

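cifar-10 and cifar-100, plus demo-code, which holds the code (don't ask why I didn't upload it to GitHub; I'm probably just lazy). cifar-10 and cifar-100 each contain:

cifar_train, cifar_test, and cifar-10(100)-origin.
cifar-10(100)-origin holds the raw, unprocessed dataset. cifar_train and cifar_test hold the data I have already processed; more on that later. Taking CIFAR-10 as the example, the raw dataset consists of five pre-split training batches plus test_batch:

Each batch is one very large dict containing the image labels, the pixel data, the image file names (not needed here), and so on. Here are a few excerpts from the dict:


We mainly use dict[b'labels'] and dict[b'data']. The loading code used above is the officially provided one: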
import pickle

def unpickle(file):
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict
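To see why the keys are written as b'labels' and b'data', here is a minimal sketch that runs without the dataset. It assumes a tiny fake batch written by a hypothetical helper, make_fake_batch, which is not part of the original code; the real batch files contain more keys and 10000 images each.

```python
import os
import pickle
import tempfile

import numpy as np


def unpickle(file):
    """Load one CIFAR batch file (same logic as the official snippet)."""
    with open(file, 'rb') as fo:
        return pickle.load(fo, encoding='bytes')


def make_fake_batch(path, n=4):
    """Hypothetical helper: write a tiny stand-in for data_batch_1."""
    fake = {
        b'labels': [i % 10 for i in range(n)],
        b'data': np.zeros((n, 3072), dtype=np.uint8),
        b'filenames': [b'img_%d.png' % i for i in range(n)],
    }
    with open(path, 'wb') as fo:
        pickle.dump(fake, fo)


tmp_dir = tempfile.mkdtemp()
fake_path = os.path.join(tmp_dir, 'fake_batch')
make_fake_batch(fake_path)

batch = unpickle(fake_path)
print(sorted(batch.keys()))                       # the keys are bytes, not str
print(len(batch[b'labels']), batch[b'data'].shape)
```

The real files were pickled under Python 2, and encoding='bytes' makes Python 3 load their string keys as bytes objects, which is why batch['labels'] would raise a KeyError while batch[b'labels'] works.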
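Each batch's dict[b'labels'] returns a list of length 10000 holding the labels (0-9) of that batch's 10000 images, and dict[b'data'] returns an array of shape (10000, 3072), where 10000 is the number of images and 3072 = 32x32x3 is the number of pixel values per image. In the demo-code folder, cifar_tools.py processes this data: it merges the image data and labels of the five batches into a (50000, 3072) image-data array and a label array of length 50000. The same file also contains the function that reads the stored data back. The code is as follows: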
import pickle
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import h5py

def unpickle(file):
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

def get_train_data(path, model=None):
    # Used at training/test time to read the already processed data
    # Load the image data
    file = h5py.File(path + '/cifar_' + model + '.h5', 'r')
    img_datas = file['data'][:]
    file.close()
    # Load the label data
    label_datas = np.loadtxt(path + '/cifar_' + model + '.txt')
    return img_datas, label_datas

if __name__ == "__main__":
    file = 'CIFAR/cifar-10/cifar-10-origin/'
    data_batch = ['data_batch_1', 'data_batch_2', 'data_batch_3', 'data_batch_4', 'data_batch_5']
    img_flattens = []
    label_list = []
    for filename in data_batch:
        path = file + filename
        batch = unpickle(path)
        # Uncomment the next two lines to dump the dict and stop
        # (this is how the excerpts shown earlier were produced)
        #print(batch)
        #exit()
        # Labels of the current batch
        labels = batch[b'labels']
        label_list.append(labels)
        # Image data of the current batch
        img_datas = batch[b'data']
        # Image file names of the current batch
        img_names = batch[b'filenames']
        #print(img_data.shape)
        # Restore the images: each row stores R, then G, then B
        for img_data in img_datas:
            img_r = np.expand_dims(img_data[:1024].reshape(32,32),axis=2)
            img_g = np.expand_dims(img_data[1024:2048].reshape(32,32),axis=2)
            img_b = np.expand_dims(img_data[2048:].reshape(32,32),axis=2)
            img = np.concatenate([img_r, img_g, img_b], axis=2)
            # Store the image data, flattened in (32, 32, 3) order
            img_flattens.append(img.reshape(1,-1))
            #img = Image.fromarray(img)
            #img.show()
    # Merge the image data of all batches, shape = 50000 x 3072
    img_flattens = np.concatenate(img_flattens)
    print(img_flattens.shape)
    file = h5py.File('CIFAR/cifar-10/cifar_train/cifar_train.h5', 'w')
    file['data'] = img_flattens
    file.close()
    # Merge the labels of all batches, shape = 50000
    label_list = np.array(label_list).reshape(1,-1)[0]
    print(label_list)
    np.savetxt('CIFAR/cifar-10/cifar_train/cifar_train.txt', label_list)
    #img = img_flattens[100].reshape(32,32,3)
    #img = Image.fromarray(img)
    #img.show()
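The merging step on its own can be sketched as follows. This is only an illustration on made-up data: the batches list is a hypothetical stand-in for the five unpickled batch dicts, each holding 8 random "images" instead of 10000, and the image restoration step is left out.

```python
import numpy as np

# Hypothetical stand-ins for the five unpickled batches; the real ones hold
# 10000 images each, 8 are used here to keep the example fast.
IMAGES_PER_BATCH = 8
rng = np.random.default_rng(0)
batches = [
    {
        b'data': rng.integers(0, 256, size=(IMAGES_PER_BATCH, 3072), dtype=np.uint8),
        b'labels': [int(v) for v in rng.integers(0, 10, size=IMAGES_PER_BATCH)],
    }
    for _ in range(5)
]

# Stack image rows and label lists batch after batch, so row i of `images`
# still belongs to entry i of `labels`.
images = np.concatenate([b[b'data'] for b in batches], axis=0)
labels = np.concatenate([np.asarray(b[b'labels']) for b in batches], axis=0)

print(images.shape)  # (40, 3072) here; (50000, 3072) for the real data
print(labels.shape)  # (40,) here; (50000,) for the real data
```

What matters is that images and labels are concatenated in the same batch order; otherwise the rows and their labels no longer line up.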
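I saved the processed image data with Python's h5py, so make sure the h5py library is installed (Anaconda3 already ships with it). One thing worth pointing out is the image restoration step:

At first I simply reshaped the raw (10000, 3072) data to (10000, 32, 32, 3), and the resulting images looked like this:

Reshaping a single (1, 3072) row to (32, 32, 3) gave the same result. The reason is that each row stores the 1024 red values first, then the 1024 green values, then the 1024 blue values, so the channels have to be split apart before they can be stacked into an image. That is why I ended up with the somewhat clumsy-looking code above to restore the original image:

This is a frog... a frog... why does my minute only have 59 seconds now? The image is 32x32, so enlarged it has this mosaic quality. Speaking of mosaics... I was thinking... stop, stop!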
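The channel-by-channel restoration from cifar_tools.py can probably also be done for a whole batch at once with a reshape and a transpose. The sketch below checks this on random data; restore_vectorized is a hypothetical helper name, not something from the original code.

```python
import numpy as np


def restore_loop(row):
    """Channel-by-channel restoration, as in cifar_tools.py."""
    img_r = np.expand_dims(row[:1024].reshape(32, 32), axis=2)
    img_g = np.expand_dims(row[1024:2048].reshape(32, 32), axis=2)
    img_b = np.expand_dims(row[2048:].reshape(32, 32), axis=2)
    return np.concatenate([img_r, img_g, img_b], axis=2)


def restore_vectorized(rows):
    """Hypothetical helper: (N, 3072) channel-first rows -> (N, 32, 32, 3)."""
    return rows.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)


rng = np.random.default_rng(0)
rows = rng.integers(0, 256, size=(5, 3072), dtype=np.uint8)

images = restore_vectorized(rows)
same = all(np.array_equal(images[i], restore_loop(rows[i])) for i in range(len(rows)))
naive = rows.reshape(-1, 32, 32, 3)  # the direct reshape that scrambles the image

print(images.shape)
print("matches the loop version:", same)
print("matches the naive reshape:", np.array_equal(naive, images))
```

The reshape to (N, 3, 32, 32) reads each row as three 32x32 channel planes, and the transpose moves the channel axis to the end, which is the layout PIL expects.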
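Also, remember to change the paths (though since I have already given you the processed data, this code doesn't matter much~). In the end we get the processed data:


Then we can train on it with a simple CNN built in PyTorch: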
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import cifar_tools as tools
import adabound  # open-source AdaBound optimizer (ICLR 2019); delete this line if it is not installed
import os

PATH_to_SAVE = 'model/'  # 'model/cifar-10-model.pt'

class Net(nn.Module):
    def __init__(self, model=None):
        super(Net, self).__init__()
        # input_size : 3 x H x W
        self.model = model
        # The ...... lines mark layers elided in this post; see the code file
        if self.model == 'simple':
            ......
        if self.model == 'plain':
            ......
        if self.model == 'residual':
            ......
        # output :
        self.fc1 = nn.Linear(64*8*8, 384)
        self.fc2 = nn.Linear(384, 192)
        self.fc3 = nn.Linear(192, 10)

    def num_flat_features(self, x):
        # Number of features per sample (all dimensions except the batch one)
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

    def forward(self, net_0):
        if self.model == 'simple':
            ......
        if self.model == 'plain':
            ......
        if self.model == 'residual':
            ......
        return output
if __name__ == "__main__":
    net = Net('simple')
    #print(net)
    print(torch.cuda.device_count())
    device = torch.device("cuda")
    net.to(device)
    criterion = nn.CrossEntropyLoss()
    lr = 0.01
    optimizer = optim.SGD(net.parameters(), lr=lr)
    #optimizer = adabound.AdaBound(net.parameters())
    print('-------------- get train data ----------------')
    path_to_data = 'CIFAR/cifar-10/cifar_train/'
    X, Y = tools.get_train_data(path_to_data, 'train')
    batch = 100
    x_image_total = X.reshape(-1,batch,32*32*3)
    y_label_total = Y.reshape(-1,batch,1)
    print('-- image train data size : ', x_image_total.shape)
    print('-- label train data size : ', y_label_total.shape)
    running_loss = 0.0
    loss_all = []
    accuracy = 0.0
    accuracy_all = []
    train_epochs = 70
    print("-------------- start training ----------------")
    for epoch in range(train_epochs):
        # Divide the learning rate by 10 at 50% and 75% of training
        if epoch == int(train_epochs * 0.5) or epoch == int(train_epochs * 0.75):
            lr = lr*0.1
            for param_group in optimizer.param_groups:
                param_group['lr'] = lr
        print("------------- ",epoch," epoch -------------")
        for i in range(y_label_total.shape[0]):
            # cifar_tools.py stores each image flattened in (32, 32, 3) order,
            # so reshape to N x H x W x C first, then permute to N x C x H x W
            imgbatch = torch.tensor(x_image_total[i], dtype=torch.float).to(device).view(-1, 32, 32, 3).permute(0, 3, 1, 2)
            labelbatch = torch.tensor(y_label_total[i], dtype=torch.float).view(-1).to(device)
            optimizer.zero_grad()
            pred = net(imgbatch)
            # Accuracy on the current batch
            accuracy_now = (torch.argmax(torch.softmax(pred.data,1), 1) == labelbatch.long()).sum().item() / batch
            accuracy += accuracy_now
            accuracy_all.append(accuracy_now)
            # Loss on the current batch
            loss = criterion(pred, labelbatch.long())
            loss_all.append(loss.item())
            # Backpropagation
            loss.backward()
            # Parameter update
            optimizer.step()
            running_loss += loss.item()
            if (i+1) % 100 == 0:
                print(i+1, " steps, the loss : ", running_loss / 100.0, " the accuracy : ", accuracy / 100.0)
                running_loss = 0.0
                accuracy = 0.0
        if epoch % 5 == 0:
            torch.save(net.state_dict(), PATH_to_SAVE + '/simple/' + 'cifar-10-model-' + str(epoch) + '-epoch.pt')
    # save training curves
    loss_all = np.array(loss_all)
    accuracy_all = np.array(accuracy_all)
    np.savetxt('train_results/simple/train_error.txt', loss_all)
    np.savetxt('train_results/simple/train_accuracy.txt', accuracy_all)
    #torch.save(net.state_dict(), PATH_to_SAVE)
    # load trained models and test each saved checkpoint
    print("-------------- start testing ----------------")
    print('-------------- get test data ----------------')
    path_to_data = 'CIFAR/cifar-10/cifar_test/'
    X, Y = tools.get_train_data(path_to_data, 'test')
    batch = 1000
    x_image_total = X.reshape(-1,batch,32*32*3)
    y_label_total = Y.reshape(-1,batch,1)
    print('-- image test data size : ', x_image_total.shape)
    print('-- label test data size : ', y_label_total.shape)
    accuracy = 0.0
    model_name_list = os.listdir(PATH_to_SAVE + '/simple/')
    #print(model_name_list)
    for model in model_name_list:
        net.load_state_dict(torch.load(PATH_to_SAVE + '/simple/' + model), strict=False)
        net.eval()
        with torch.no_grad():
            for i in range(y_label_total.shape[0]):
                # Same N x H x W x C -> N x C x H x W conversion as in training
                imgbatch = torch.tensor(x_image_total[i], dtype=torch.float).to(device).view(-1, 32, 32, 3).permute(0, 3, 1, 2)
                labelbatch = torch.tensor(y_label_total[i], dtype=torch.long).view(-1).to(device)
                pred = net(imgbatch)
                accuracy += (torch.argmax(torch.softmax(pred.data,1), 1) == labelbatch).sum().item() / batch
        print("The test accuracy of " + model + " : ", accuracy / y_label_total.shape[0])
        accuracy = 0
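The learning-rate handling in the training loop is a plain step schedule: the rate is multiplied by 0.1 once training reaches 50% of the epochs and again at 75%. As a rough sketch, the rate in effect at any epoch can be written as a small function. step_lr and its parameter names are hypothetical; the defaults mirror the values used in the script.

```python
def step_lr(epoch, total_epochs=70, base_lr=0.01, factor=0.1):
    """Hypothetical helper: learning rate in effect during `epoch` (0-based)."""
    # A set, so that two coinciding milestones only count once,
    # as with the `or` condition in the training loop.
    milestones = {int(total_epochs * 0.5), int(total_epochs * 0.75)}
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


for epoch in (0, 34, 35, 51, 52, 69):
    print(epoch, step_lr(epoch))
```

With 70 epochs the milestones fall at epochs 35 and 52, so epochs 0-34 run at 0.01, epochs 35-51 at roughly 0.001, and epochs 52-69 at roughly 0.0001.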
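I replaced the body of the network with ...... because it is too long and would take up too much space; please open the code file to see the details. There are three structures: simple, plain, and residual. simple is just a few CNN layers in series, residual is a deeper residual network, and plain is residual with all the skip connections removed. Why did I write all of these? Probably because I had too much time on my hands (actually, it was originally to validate an idea). One more note: the import adabound at the top is the open-source AdaBound optimizer from ICLR 2019. I wanted to see how well it works, but on CIFAR it doesn't seem to show its strengths. Also, it is easy to overfit on CIFAR, which may make hyperparameter tuning quite a challenge, hehe~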
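Since the network bodies are elided here, the following is only a guess at what the plain/residual distinction looks like in code, not the layers from the code file. Block and use_skip are hypothetical names, and the 64 x 8 x 8 feature map is an assumption chosen to match the 64*8*8 input of fc1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Block(nn.Module):
    """Hypothetical two-conv block; `use_skip` switches residual vs plain."""

    def __init__(self, channels, use_skip=True):
        super().__init__()
        self.use_skip = use_skip
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        if self.use_skip:
            out = out + x  # residual: add the block input back
        return F.relu(out)


x = torch.randn(2, 64, 8, 8)
residual_block = Block(64, use_skip=True)
plain_block = Block(64, use_skip=False)
plain_block.load_state_dict(residual_block.state_dict())  # same weights in both

with torch.no_grad():
    y_res = residual_block(x)
    y_plain = plain_block(x)

print(y_res.shape, y_plain.shape)
print("outputs identical:", torch.allclose(y_res, y_plain))
```

Both blocks have exactly the same parameters; the only difference is whether the input is added back before the final ReLU, which is what "removing the skip connections" amounts to in this sketch.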
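That's it for this little demo. Once PyTorch 1.0 is set up you can run this code. Happy alchemy, everyone, and if you have any questions, leave a comment below~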
