PyTorch Study Notes (2): Building Dataset Classes, Image Preprocessing, and Saving/Loading Models

05-11

2. Reading data in PyTorch

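You can read data with numpy and then convert it to torch tensors with torch.from_numpy. PyTorch also provides the torchvision package, which can load common image datasets such as CIFAR10 and MNIST and includes simple transforms for these images.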

    import torchvision.datasets
    # DataLoader is a class, not a module, so it has to be imported with "from ... import".
    from torch.utils.data import DataLoader
    import torchvision.transforms as transforms

Common packages for reading images include opencv, scikit-image, and matplotlib.

    import cv2
    import matplotlib.pyplot as plt
    import skimage

A common package for reading files is pandas (for parsing CSV files)...

    import pandas as pd
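As a minimal sketch of the numpy-to-torch conversion mentioned at the start of this section (the array contents here are made up for illustration), torch.from_numpy wraps an existing array without copying it:

```python
import numpy as np
import torch

# A made-up 2 x 3 array standing in for data that was read with numpy.
array = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy keeps the dtype and shares memory with the source array.
tensor = torch.from_numpy(array)
print(tensor.dtype, tuple(tensor.shape))

# Because the memory is shared, an in-place change to the array
# is visible through the tensor as well.
array[0, 0] = 100.0
print(tensor[0, 0].item())

# .numpy() converts back in the other direction (for CPU tensors).
back = tensor.numpy()
```

If an independent copy is needed, torch.tensor(array) copies the data instead of sharing it.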

2.1 The dataset class

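In PyTorch, torch.utils.data.Dataset is the class that describes a dataset. To use your own data, subclass it and override two methods, __len__ and __getitem__. Samples are usually wrapped in a dict. Taking the face landmarks dataset as an example: the constructor __init__ reads the CSV file that indexes the images, __len__ returns the size of the dataset, and __getitem__ loads one particular image and packs it, together with its landmarks, into a dict.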

    import os

    import pandas as pd
    from skimage import io
    from torch.utils.data import Dataset


    class FaceLandmarksDataset(Dataset):
        """Face Landmarks dataset."""

        def __init__(self, csv_file, root_dir, transform=None):
            """
            Args:
                csv_file (string): Path to the csv file with annotations.
                root_dir (string): Directory with all the images.
                transform (callable, optional): Optional transform to be applied
                    on a sample.
            """
            self.landmarks_frame = pd.read_csv(csv_file)
            self.root_dir = root_dir
            self.transform = transform

        def __len__(self):
            return len(self.landmarks_frame)

        def __getitem__(self, idx):
            # Column 0 holds the image file name, the remaining columns the landmarks.
            img_name = os.path.join(self.root_dir,
                                    self.landmarks_frame.iloc[idx, 0])
            image = io.imread(img_name)
            # to_numpy() replaces as_matrix(), which was removed in pandas 1.0.
            landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()
            landmarks = landmarks.astype(float).reshape(-1, 2)
            sample = {'image': image, 'landmarks': landmarks}

            if self.transform:
                sample = self.transform(sample)

            return sample

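Once the dataset class is defined, reading data is just a matter of instantiating it. The following instantiates the class and displays the first four images: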

    face_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                        root_dir='faces/')

    fig = plt.figure()

    for i in range(len(face_dataset)):
        sample = face_dataset[i]

        print(i, sample['image'].shape, sample['landmarks'].shape)

        ax = plt.subplot(1, 4, i + 1)
        plt.tight_layout()
        ax.set_title('Sample #{}'.format(i))
        ax.axis('off')
        # show_landmarks is a plotting helper that is not defined in these notes.
        show_landmarks(**sample)

        if i == 3:
            plt.show()
            break
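To try the __len__ / __getitem__ pattern without the faces data on disk, here is a rough sketch that uses random arrays instead. The class name RandomLandmarksDataset, the 32 x 32 image size, and the number of landmarks are all made up for illustration; only the structure mirrors the class above.

```python
import numpy as np
from torch.utils.data import Dataset


class RandomLandmarksDataset(Dataset):
    """Hypothetical in-memory stand-in for a landmarks dataset."""

    def __init__(self, num_samples=8, num_landmarks=5, transform=None):
        rng = np.random.default_rng(0)
        # Fake 32 x 32 RGB images (H x W x C) and (x, y) landmark coordinates.
        self.images = rng.random((num_samples, 32, 32, 3))
        self.landmarks = rng.random((num_samples, num_landmarks, 2)) * 32
        self.transform = transform

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.images)

    def __getitem__(self, idx):
        # One sample, packed into a dict just like the face landmarks example.
        sample = {'image': self.images[idx], 'landmarks': self.landmarks[idx]}
        if self.transform:
            sample = self.transform(sample)
        return sample


dataset = RandomLandmarksDataset()
sample = dataset[0]
print(len(dataset), sample['image'].shape, sample['landmarks'].shape)
```

Implementing these two methods is what lets len(dataset) and dataset[i] work, and that is all a DataLoader needs from a map-style dataset.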

2.2 Transforming image data

Resizing an image is mainly done with the transform module of the skimage package:

    img = transform.resize(old_image, (new_h, new_w))

Here we can use transform classes that others have already written:

    import numpy as np
    import torch
    from skimage import transform


    class Rescale(object):
        """Rescale the image in a sample to a given size.

        Args:
            output_size (tuple or int): Desired output size. If tuple, output is
                matched to output_size. If int, smaller of image edges is matched
                to output_size keeping aspect ratio the same.
        """

        def __init__(self, output_size):
            assert isinstance(output_size, (int, tuple))
            self.output_size = output_size

        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']

            h, w = image.shape[:2]
            if isinstance(self.output_size, int):
                if h > w:
                    new_h, new_w = self.output_size * h / w, self.output_size
                else:
                    new_h, new_w = self.output_size, self.output_size * w / h
            else:
                new_h, new_w = self.output_size

            new_h, new_w = int(new_h), int(new_w)

            img = transform.resize(image, (new_h, new_w))

            # h and w are swapped for landmarks because for images,
            # x and y axes are axis 1 and 0 respectively
            landmarks = landmarks * [new_w / w, new_h / h]

            return {'image': img, 'landmarks': landmarks}


    class RandomCrop(object):
        """Crop randomly the image in a sample.

        Args:
            output_size (tuple or int): Desired output size. If int, square crop
                is made.
        """

        def __init__(self, output_size):
            assert isinstance(output_size, (int, tuple))
            if isinstance(output_size, int):
                self.output_size = (output_size, output_size)
            else:
                assert len(output_size) == 2
                self.output_size = output_size

        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']

            h, w = image.shape[:2]
            new_h, new_w = self.output_size

            # randint excludes the upper bound; the + 1 keeps the call valid
            # when the image is exactly as large as the crop.
            top = np.random.randint(0, h - new_h + 1)
            left = np.random.randint(0, w - new_w + 1)

            image = image[top: top + new_h,
                          left: left + new_w]

            landmarks = landmarks - [left, top]

            return {'image': image, 'landmarks': landmarks}


    class ToTensor(object):
        """Convert ndarrays in sample to Tensors."""

        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']

            # swap color axis because
            # numpy image: H x W x C
            # torch image: C x H x W
            image = image.transpose((2, 0, 1))
            return {'image': torch.from_numpy(image),
                    'landmarks': torch.from_numpy(landmarks)}

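Changing the layout of the image data is mainly done with numpy's transpose. Note that numpy stores images as H x W x C, whereas torch stores them as C x H x W. In addition, transforms.Compose from torchvision chains a series of transforms together, so the transformed dataset can be instantiated as: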

    transformed_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                               root_dir='faces/',
                                               transform=transforms.Compose([
                                                   Rescale(256),
                                                   RandomCrop(224),
                                                   ToTensor()
                                               ]))

    for i in range(len(transformed_dataset)):
        sample = transformed_dataset[i]

        print(i, sample['image'].size(), sample['landmarks'].size())

        if i == 3:
            break

Of course, torchvision also provides ready-made transforms, as follows:

    import torch
    from torchvision import transforms, datasets

    data_transform = transforms.Compose([
        # RandomResizedCrop is the current name of the old RandomSizedCrop.
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5],
                             std=[0.5, 0.5, 0.5])
    ])
    hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                               transform=data_transform)
    dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                                 batch_size=4, shuffle=True,
                                                 num_workers=4)

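This style of transform pipeline is the most common one. transforms.Compose chains the individual transforms together. transforms.RandomResizedCrop(224) (called RandomSizedCrop in older torchvision releases) takes a random crop of the image and resizes it to 224 x 224.

Most important of all is transforms.ToTensor(), which converts a numpy.ndarray of shape [H, W, C] with values in [0, 255] into a torch.FloatTensor of shape [C, H, W] with values in [0, 1.0].

When converting a tensor back to an image, clamp its values to [0, 1], for example with tensor.clamp(0, 1).

Finally, transforms.Normalize is the most common normalization step. With a mean and std of 0.5 per channel, it maps the image from [0, 1] to [-1, 1].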

2.3 Loading data

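We could write a loop that pulls one sample at a time out of the dataset and feeds it to the network, but that ignores batching, shuffling, and multiprocessing, all of which are standard in deep learning. PyTorch's DataLoader takes care of these. Import the data loader: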

    from torch.utils.data import DataLoader

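DataLoader takes a Dataset instance and lets you set the batch size, whether to shuffle, and the number of worker processes used for loading (num_workers):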

    dataloader = DataLoader(transformed_dataset, batch_size=4,
                            shuffle=True, num_workers=4)
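To see what a batch looks like, here is a small sketch with a made-up ToyDataset of ten all-zero samples. The class name and tensor sizes are assumptions for illustration, and num_workers is set to 0 (load in the main process) only so the snippet runs anywhere without multiprocessing setup.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset of dict samples, already converted to tensors."""

    def __init__(self, n=10):
        self.images = torch.zeros(n, 3, 8, 8)   # C x H x W per sample
        self.landmarks = torch.zeros(n, 5, 2)   # 5 (x, y) points per sample

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return {'image': self.images[idx], 'landmarks': self.landmarks[idx]}


loader = DataLoader(ToyDataset(), batch_size=4, shuffle=True, num_workers=0)

batch_sizes = []
for batch in loader:
    # The default collate function stacks each dict entry along a new
    # leading batch dimension.
    batch_sizes.append(batch['image'].shape[0])
    print(tuple(batch['image'].shape), tuple(batch['landmarks'].shape))

print(batch_sizes)  # [4, 4, 2]: the last batch holds the leftover samples
```

Passing drop_last=True would discard the incomplete final batch instead.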

2.4 Saving and loading models

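For a trained model you can save either just the parameters or the entire model. Let model be an instance of some model; its parameters can be saved with: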

    torch.save(model.state_dict(), './pretrained/model.pth')

To load them back:

    model.load_state_dict(torch.load('./pretrained/model.pth'))
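A round trip through a temporary file shows the two calls working together. This is a sketch under simple assumptions: nn.Linear(4, 2) stands in for a real trained model, and a temporary directory is used in place of ./pretrained/ so nothing is left on disk.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for a trained model.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pth')

    # Save only the parameters (the state dict), not the model object.
    torch.save(model.state_dict(), path)

    # Loading requires a model with the same architecture to copy into.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path))

# Every parameter tensor should match after the round trip.
same = all(torch.equal(a, b)
           for a, b in zip(model.state_dict().values(),
                           restored.state_dict().values()))
print(same)

# Switch to evaluation mode before running inference with the loaded model.
restored.eval()
```

Because only the parameters are stored, the loading side has to build the model first; the file itself says nothing about the architecture.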