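PyTorch Study Notes (2): Dataset Classes, Image Preprocessing, and Saving/Loading Models
2. Reading data in PyTorch
You can read data with numpy and then convert it to a torch tensor with torch.from_numpy. PyTorch's torchvision package can load common image datasets such as CIFAR10 and MNIST, and it also provides simple transforms for these images.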
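As a minimal sketch of the numpy route mentioned above (the array contents are made up for illustration), torch.from_numpy wraps an existing array without copying it, so the tensor and the array share memory:

```python
import numpy as np
import torch

# Hypothetical data: a 2 x 3 array standing in for data read with numpy
arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy does not copy; the tensor is a view on the array's memory
tensor = torch.from_numpy(arr)
print(tensor.shape, tensor.dtype)

# Because memory is shared, an in-place change to the array
# is visible through the tensor as well
arr[0, 0] = 100.0
print(tensor[0, 0].item())

# .numpy() goes back the other way (for CPU tensors)
back = tensor.numpy()
```

If an independent copy is wanted instead, torch.tensor(arr) copies the data.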
import torchvision.datasets
# DataLoader is a class, not a module, so it has to be imported with "from ... import"
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
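Commonly used packages for reading images include OpenCV, scikit-image, and matplotlib.
import cv2
import matplotlib.pyplot as plt
import skimage
Commonly used packages for reading files include pandas (for parsing CSV files), among others.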
import pandas as pd
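2.1 The dataset class
In PyTorch, torch.utils.data.Dataset describes a dataset class. To use your own data, you override two methods, __len__ and __getitem__. Samples are usually wrapped in a dict. Taking the face landmarks dataset as an example: the constructor __init__ reads the index of the images (the CSV annotation file), __len__ returns the size of the dataset, and __getitem__ reads one specific image and packs it, together with its landmarks, into a dict.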
import os

import pandas as pd
from skimage import io
from torch.utils.data import Dataset


class FaceLandmarksDataset(Dataset):
    """Face Landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        # Column 0 holds the image file name, the remaining columns the landmarks
        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        # as_matrix() was removed in pandas 1.0; to_numpy() replaces it
        landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()
        landmarks = landmarks.astype(float).reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)

        return sample
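Once the dataset class is defined, reading data only requires instantiating it. The following instantiates the class and displays the first four images: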
face_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                    root_dir='faces/')

fig = plt.figure()

for i in range(len(face_dataset)):
    sample = face_dataset[i]

    print(i, sample['image'].shape, sample['landmarks'].shape)

    ax = plt.subplot(1, 4, i + 1)
    plt.tight_layout()
    ax.set_title('Sample #{}'.format(i))
    ax.axis('off')
    # show_landmarks is a plotting helper that is not defined in these notes
    show_landmarks(**sample)

    if i == 3:
        plt.show()
        break
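To try the __len__ / __getitem__ contract without the faces data on disk, here is a minimal sketch with random in-memory data. ToyLandmarksDataset, its sizes, and its contents are invented for illustration; it only mirrors the dict layout of FaceLandmarksDataset.

```python
import numpy as np
from torch.utils.data import Dataset


class ToyLandmarksDataset(Dataset):
    """Hypothetical in-memory stand-in for FaceLandmarksDataset."""

    def __init__(self, num_samples=5, transform=None):
        rng = np.random.default_rng(0)
        # Fake "images" in H x W x C layout and 4 fake (x, y) landmarks each
        self.images = rng.random((num_samples, 8, 8, 3))
        self.landmarks = rng.random((num_samples, 4, 2)) * 8
        self.transform = transform

    def __len__(self):
        # Number of samples in the dataset
        return len(self.images)

    def __getitem__(self, idx):
        # Pack one sample into a dict, then apply the optional transform
        sample = {'image': self.images[idx], 'landmarks': self.landmarks[idx]}
        if self.transform:
            sample = self.transform(sample)
        return sample


toy = ToyLandmarksDataset()
first = toy[0]
print(len(toy), first['image'].shape, first['landmarks'].shape)
```

Indexing with toy[i] calls __getitem__, and len(toy) calls __len__; these are the two hooks a DataLoader relies on for a map-style dataset.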
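2.2 Image transforms
Resizing an image relies mainly on the transform module of the skimage package:
img = transform.resize(old_image, (new_h, new_w))
Here we can use transform classes that others have already written: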
import numpy as np
import torch
from skimage import transform


class Rescale(object):
    """Rescale the image in a sample to a given size.

    Args:
        output_size (tuple or int): Desired output size. If tuple, output is
            matched to output_size. If int, smaller of image edges is matched
            to output_size keeping aspect ratio the same.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        if isinstance(self.output_size, int):
            if h > w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        else:
            new_h, new_w = self.output_size

        new_h, new_w = int(new_h), int(new_w)

        img = transform.resize(image, (new_h, new_w))

        # h and w are swapped for landmarks because for images,
        # x and y axes are axis 1 and 0 respectively
        landmarks = landmarks * [new_w / w, new_h / h]

        return {'image': img, 'landmarks': landmarks}


class RandomCrop(object):
    """Crop randomly the image in a sample.

    Args:
        output_size (tuple or int): Desired output size. If int, square crop
            is made.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        new_h, new_w = self.output_size

        # randint's upper bound is exclusive; the +1 also allows a crop
        # that is exactly as large as the image
        top = np.random.randint(0, h - new_h + 1)
        left = np.random.randint(0, w - new_w + 1)

        image = image[top: top + new_h,
                      left: left + new_w]

        landmarks = landmarks - [left, top]

        return {'image': image, 'landmarks': landmarks}


class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        # swap color axis because
        # numpy image: H x W x C
        # torch image: C x H x W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image),
                'landmarks': torch.from_numpy(landmarks)}
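Changing the layout of the image data is done mainly with numpy's transpose. Note that numpy stores images as H x W x C, while torch stores them as C x H x W. In addition, transforms.Compose from torchvision chains a series of transforms together. A dataset with transforms applied can be instantiated as: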
transformed_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                           root_dir='faces/',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               ToTensor()
                                           ]))

for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]

    print(i, sample['image'].size(), sample['landmarks'].size())

    if i == 3:
        break
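The landmark arithmetic in Rescale and the axis swap in ToTensor can be checked by hand on made-up numbers. The sketch below assumes a 100 x 200 image and two invented landmark points; rescaled_size is a hypothetical helper that repeats the size logic of Rescale for an int output_size, and no real image is resized.

```python
import numpy as np


def rescaled_size(h, w, output_size):
    """Return (new_h, new_w) so that the shorter edge equals output_size."""
    if h > w:
        new_h, new_w = output_size * h / w, output_size
    else:
        new_h, new_w = output_size, output_size * w / h
    return int(new_h), int(new_w)


h, w = 100, 200                          # assumed original image size (H, W)
new_h, new_w = rescaled_size(h, w, 50)   # shorter edge (h) becomes 50
print(new_h, new_w)                      # 50 100

# Landmarks are stored as (x, y), so x scales with width and y with height
landmarks = np.array([[100.0, 50.0], [20.0, 80.0]])
scaled = landmarks * [new_w / w, new_h / h]
print(scaled)                            # [[50. 25.] [10. 40.]]

# Axis swap used by ToTensor: H x W x C (numpy) -> C x H x W (torch)
hwc = np.zeros((new_h, new_w, 3))
chw = hwc.transpose((2, 0, 1))
print(chw.shape)                         # (3, 50, 100)
```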
Of course, torchvision also provides ready-made transforms, as follows:
import torch
from torchvision import transforms, datasets

data_transform = transforms.Compose([
    # RandomResizedCrop was named RandomSizedCrop in older torchvision releases
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5],
                         std=[0.5, 0.5, 0.5])
])
hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                           transform=data_transform)
dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                             batch_size=4, shuffle=True,
                                             num_workers=4)
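This is the most common way to apply transforms. transforms.Compose chains the individual transforms together. transforms.RandomSizedCrop(224) (renamed RandomResizedCrop in newer torchvision releases) takes a random crop and resizes it to 224 x 224. Most important of all, transforms.ToTensor() converts a PIL image or a uint8 numpy.ndarray of shape [H, W, C] with values in [0, 255] into a torch.FloatTensor of shape [C, H, W] with values in [0, 1.0]. When converting a tensor back to an image, clamp it to that range first with tensor.clamp(0, 1). Finally, transforms.Normalize is the most common normalization step. With a mean and std of 0.5 for every channel, it maps the image from [0, 1] to [-1, 1].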
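Normalize computes (x - mean) / std per channel, which is why a mean and std of 0.5 turn [0, 1] into [-1, 1]. The sketch below reproduces that arithmetic with plain tensor operations on a random tensor standing in for a real image (it does not call torchvision), then undoes it and clamps the result:

```python
import torch

# Hypothetical "image" in C x H x W layout with values in [0, 1],
# i.e. the kind of tensor transforms.ToTensor() returns
img = torch.rand(3, 4, 4)

# Per-channel statistics, reshaped so they broadcast over H and W
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# Same arithmetic as Normalize: values now lie in [-1, 1]
normalized = (img - mean) / std

# Undo the normalization and clamp to [0, 1] before displaying or saving
restored = (normalized * std + mean).clamp(0, 1)

print(normalized.min().item(), normalized.max().item())
```

The clamp guards against values that drift slightly outside [0, 1], for example when the tensor comes from a network output and not from a dataset.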
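2.3 Loading data
We could write a loop that fetches one sample from the dataset at a time and feeds it to the network, but that ignores batching, shuffling, and multiprocessing, all of which are standard in deep learning. PyTorch's DataLoader takes care of them. Import the data loader:
from torch.utils.data import DataLoader
DataLoader takes a Dataset instance and lets you set the batch size, whether to shuffle, and the number of worker processes:
dataloader = DataLoader(transformed_dataset, batch_size=4, shuffle=True, num_workers=4)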
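To see what a DataLoader yields for dict-style samples, here is a minimal sketch on random tensors. RandomDictDataset and its sizes are invented for illustration, and num_workers is set to 0 so the snippet runs as a plain script without worker processes.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomDictDataset(Dataset):
    """Hypothetical dataset returning dict samples like the ones above."""

    def __init__(self, n=10):
        gen = torch.Generator().manual_seed(0)
        self.images = torch.rand(n, 3, 8, 8, generator=gen)    # C x H x W
        self.landmarks = torch.rand(n, 4, 2, generator=gen)    # 4 (x, y) points

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return {'image': self.images[idx], 'landmarks': self.landmarks[idx]}


loader = DataLoader(RandomDictDataset(), batch_size=4, shuffle=True,
                    num_workers=0)

batch_shapes = []
for batch in loader:
    # The default collate function stacks each dict entry along a new batch axis
    batch_shapes.append(tuple(batch['image'].shape))
    print(batch['image'].shape, batch['landmarks'].shape)
```

With 10 samples and batch_size=4 the last batch holds only 2 samples; passing drop_last=True to DataLoader would discard it.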
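2.4 Saving and loading models
For a trained model you can save either its parameters or the whole model. Let model be an instance of some model. Its parameters can be saved with:
torch.save(model.state_dict(), './pretrained/model.pth')
To load them back:
model.load_state_dict(torch.load('./pretrained/model.pth'))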
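As a minimal sketch of the full round trip, the example below saves the parameters of a small nn.Linear layer (a stand-in for a trained model) to a temporary directory and loads them into a second instance. The layer sizes and the file name are assumptions made for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for a trained model
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pth')

    # Save only the parameters (the state dict), not the Python object
    torch.save(model.state_dict(), path)

    # Loading requires building a model with the same architecture first
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path))

# Every tensor in the two state dicts should now be identical
same = all(torch.equal(a, b)
           for a, b in zip(model.state_dict().values(),
                           restored.state_dict().values()))
print(same)

# Switch to evaluation mode before running inference
restored.eval()
```

load_state_dict copies the saved tensors into an existing module, so the architecture has to be constructed in code before the parameters are loaded.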
