    1. Introduction

    PyTorch ships with a very important and useful package, torchvision, which consists of three sub-packages: torchvision.datasets, torchvision.models, and torchvision.transforms. For a detailed introduction to these three sub-packages, see the official documentation: http://pytorch.org/docs/master/torchvision/index.html. The source code is on GitHub: https://github.com/pytorch/vision/tree/master/torchvision

    This post covers torchvision.transforms, which contains the common data augmentation operations such as resize and crop; essentially all of PyTorch's data augmentation can be done through this interface. The package consists of two main scripts: transforms.py and functional.py. The former defines a class for each augmentation, and each class calls the corresponding function in functional.py to perform the actual operation.
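    To make that split concrete, here is a simplified sketch (my illustration, not the actual torchvision source) of how such a class stores its parameters and delegates the pixel work to torchvision.transforms.functional:

    import random

    import torchvision.transforms.functional as F

    class MyRandomHorizontalFlip:
        """Simplified stand-in for transforms.RandomHorizontalFlip."""

        def __init__(self, p=0.5):
            self.p = p  # the class only stores the configuration

        def __call__(self, img):
            # functional.py does the actual work on the image
            if random.random() < self.p:
                return F.hflip(img)
            return img

    Because each transform is a callable object, instances can be chained with transforms.Compose.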

    2. Usage example

    import os

    import torch
    import torchvision
    from torchvision import datasets, transforms

    data_transforms = {
        'train': transforms.Compose([
            transforms.ToPILImage(),
            transforms.Resize(256),
            # transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        ]),
        'val': transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ]),
    }

    # data_dir is the dataset root, with one sub-folder per split
    image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), transform=data_transforms[x])
                      for x in ['train', 'val']}
    

    The code above is a typical image-preprocessing pipeline. Data that took real effort to handle in the TensorFlow framework can be dealt with here in a few lines, which is very pleasant. A brief walkthrough of the code:

    transforms.Compose chains the individual transforms together, and each transform performs one operation. Once defined, the composed transform (data_transforms['train'] above) applies its operations in order. In the code above:

    **transforms.ToPILImage()** converts the input (a tensor or ndarray) to a PIL Image; the PIL-based transforms that follow require PIL input.

    **transforms.Resize(256)** rescales the image so that its shorter edge becomes 256 pixels, with the other edge scaled by the same factor.

    **transforms.RandomResizedCrop(224, scale=(0.5, 1.0))** crops a random region of the image (here covering 50% to 100% of the original area) with a random aspect ratio and resizes the crop to 224x224.

    transforms.ToTensor() converts the image to a tensor, which can be fed directly into a neural network.

    **transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])** normalizes the pixel values channel by channel.

    3. The transforms methods

    3.1 Cropping (Crop)

    • Random crop: transforms.RandomCrop
      class torchvision.transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant')
      Purpose: randomly crop a patch of the given size.
      Parameters:
      size - (sequence or int): if a sequence, the output is (h, w); if an int, (size, size).
      padding - (sequence or int, optional): how many pixels to pad.
      An int pads all four sides by that amount; e.g. padding=4 turns a 32x32 image into 40x40.
      A sequence of 2 gives the left/right and top/bottom padding; a sequence of 4 gives left, top, right, bottom.
      fill - (int or tuple): the fill value, used only when padding_mode is 'constant'. An int fills every channel with that value; a tuple of length 3 gives the per-channel RGB values.
      padding_mode - one of 4 padding modes: 1. constant: pad with a constant value. 2. edge: pad with the pixel values at the image edge. 3. reflect: mirror the image at the edge, without repeating the edge pixel. 4. symmetric: mirror the image at the edge, repeating the edge pixel.
    • Center crop: transforms.CenterCrop class torchvision.transforms.CenterCrop(size) Purpose: crop a patch of the given size from the center. Parameter: size -
      (sequence or int): if a sequence, (h, w); if an int, (size, size).
    • Random resized crop: transforms.RandomResizedCrop class torchvision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=2)
      Purpose: crop a patch of random size and random aspect ratio, then resize it to the given size. Parameters: size - the output resolution. scale - the range of the crop's area relative to the original image; e.g. scale=(0.08, 1.0) means the crop covers between 0.08x and 1x of the original area. ratio - the range of random aspect ratios. interpolation - interpolation method, bilinear by default (PIL.Image.BILINEAR).
    • Five-way crop: transforms.FiveCrop class torchvision.transforms.FiveCrop(size)
      Purpose: crop the four corners and the center, yielding 5 images, returned as a tuple (apply ToTensor to each crop and stack them to get a 4D tensor, as in the sketch after this list). Parameter: size - (sequence or int): if a sequence, (h, w); if an int, (size, size).
    • Five-way crop plus flips: transforms.TenCrop class torchvision.transforms.TenCrop(size, vertical_flip=False)
      Purpose: crop the four corners and the center, then flip all of the crops (horizontally or vertically), yielding 10 images returned as a tuple. Parameters: size - (sequence or int): if a sequence, (h, w); if an int, (size, size). vertical_flip (bool) - whether to flip vertically; default False, i.e. horizontal flips.
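    As a small sketch of how FiveCrop/TenCrop are typically consumed (my own example, following the pattern in the official documentation; 'demo.jpg' is a placeholder path), the returned tuple of crops is usually stacked into one batched tensor:

    import torch
    from PIL import Image
    from torchvision import transforms

    img = Image.open('demo.jpg')  # placeholder path
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.FiveCrop(224),  # -> tuple of 5 PIL Images
        transforms.Lambda(lambda crops: torch.stack(
            [transforms.ToTensor()(crop) for crop in crops])),
    ])
    batch = transform(img)
    print(batch.size())  # torch.Size([5, 3, 224, 224])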

    3.2 Flipping and rotation (Flip and Rotation)

    • Horizontal flip with probability p: transforms.RandomHorizontalFlip class torchvision.transforms.RandomHorizontalFlip(p=0.5)
      Purpose: flip the PIL image horizontally with probability p. Parameter: p - the probability, 0.5 by default.
    • Vertical flip with probability p: transforms.RandomVerticalFlip class torchvision.transforms.RandomVerticalFlip(p=0.5)
      Purpose: flip the PIL image vertically with probability p. Parameter: p - the probability, 0.5 by default.
    • Random rotation: transforms.RandomRotation class torchvision.transforms.RandomRotation(degrees, resample=False, expand=False, center=None)
      Purpose: rotate by a random angle drawn from degrees (see the example after this list). Parameters: degrees - (sequence or float or int): a single number such as 30 means a random angle in (-30, +30); a sequence such as (30, 60) means a random angle between 30 and 60 degrees. resample - the resampling filter, one of PIL.Image.NEAREST, PIL.Image.BILINEAR, PIL.Image.BICUBIC; nearest-neighbor by default. expand - if True, expand the output so the whole rotated image fits; if False (the default), keep the input size. center - optional center of rotation; defaults to the center of the image (the origin is the upper-left corner).
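    For example (a sketch I added; 'demo.jpg' is a placeholder path), expand=True grows the output canvas so the rotated corners are not cut off:

    from PIL import Image
    from torchvision import transforms

    img = Image.open('demo.jpg')
    rotate = transforms.RandomRotation(degrees=(30, 60), expand=True)
    rotated = rotate(img)
    print(img.size, '->', rotated.size)  # the expanded output is larger than the input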

    3.3 Image transforms

    • Resize: transforms.Resize class torchvision.transforms.Resize(size, interpolation=2) Purpose: resize the input image. Parameters: size - if size is an int, the smaller edge is matched to it: if height > width, the image is rescaled to (size * height / width, size). To get an exact shape, pass size as an (h, w) sequence instead. interpolation - interpolation method, PIL.Image.BILINEAR by default.
    • Standardization: transforms.Normalize class torchvision.transforms.Normalize(mean, std)
      Purpose: standardize a tensor image channel by channel: subtract the mean, then divide by the standard deviation. Note that the input is a tensor of shape (C, H, W).
    • Conversion to tensor: transforms.ToTensor class torchvision.transforms.ToTensor Purpose: convert a PIL Image or ndarray to a tensor, scaled to [0, 1].
      Note: the scaling to [0, 1] simply divides by 255; if your ndarray uses a different value range, you must adjust it yourself.
    • Padding: transforms.Pad class torchvision.transforms.Pad(padding, fill=0, padding_mode='constant') Purpose: pad the image. Parameters: padding, fill, and padding_mode have the same semantics as in transforms.RandomCrop above: padding - (sequence or int, optional) how many pixels to pad (an int pads all four sides, e.g. padding=4 turns 32x32 into 40x40; a sequence of 2 gives left/right and top/bottom; a sequence of 4 gives left, top, right, bottom); fill - (int or tuple) the constant fill value; padding_mode - constant, edge, reflect (mirror without repeating the edge pixel), or symmetric (mirror repeating the edge pixel).
    • Brightness, contrast, and saturation: transforms.ColorJitter class torchvision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0) Purpose: randomly change the brightness, contrast, and saturation.
    • Grayscale conversion: transforms.Grayscale class torchvision.transforms.Grayscale(num_output_channels=1) Purpose: convert the image to grayscale.
      Parameter: num_output_channels - (int): 1 gives an ordinary grayscale image; 3 gives 3 channels with r == g == b.
    • Linear transformation: transforms.LinearTransformation() class torchvision.transforms.LinearTransformation(transformation_matrix)
      Purpose: apply a linear transformation to the image tensor; can be used for whitening (zero-center the data, then decorrelate it using the data covariance matrix). Parameter: transformation_matrix (Tensor) - a tensor of shape [D x D], with D = C x H x W.
    • Affine transformation: transforms.RandomAffine class torchvision.transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=False, fillcolor=0) Purpose: random affine transformation.
    • Grayscale with probability p: transforms.RandomGrayscale class torchvision.transforms.RandomGrayscale(p=0.1)
      Purpose: convert the image to grayscale with probability p; with 3 input channels, the output has 3 channels with r == g == b.
    • Conversion to PIL Image: transforms.ToPILImage class torchvision.transforms.ToPILImage(mode=None) Purpose: convert a tensor or ndarray to a PIL Image. Parameter: mode - if None, the mode is inferred from the input: 1 channel gives a grayscale image, 3 channels default to RGB, 4 channels default to RGBA.
    • transforms.Lambda
      Apply a user-defined lambda as a transform; see the sketch after this list.
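    A small example of Lambda (my addition; the original entry was left blank): any callable that maps an image to an image can be wrapped as a transform, for instance cropping away a 10-pixel border:

    from torchvision import transforms

    crop_border = transforms.Lambda(
        lambda img: img.crop((10, 10, img.width - 10, img.height - 10)))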

    3.4 Operating on transforms themselves: more flexible augmentation

    PyTorch can not only configure operations on images, it can also randomly select and combine those operations (a combined sketch follows this list):

    • transforms.RandomChoice(transforms) Purpose: apply a single transform randomly picked from the given list.
    • transforms.RandomApply(transforms, p=0.5) Purpose: apply the given list of transforms with probability p (all of them, or none).
    • **transforms.RandomOrder Purpose:** apply the given transforms in a random order.
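    A sketch combining the three (my own example; it assumes the input image is at least 224x224):

    from torchvision import transforms

    augment = transforms.Compose([
        # apply the color jitter only half of the time
        transforms.RandomApply([transforms.ColorJitter(brightness=0.3)], p=0.5),
        # pick exactly one of the two crops at random
        transforms.RandomChoice([transforms.RandomCrop(224),
                                 transforms.CenterCrop(224)]),
        # flip and rotate, in a random order
        transforms.RandomOrder([transforms.RandomHorizontalFlip(),
                                transforms.RandomRotation(15)]),
        transforms.ToTensor(),
    ])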

    4. Closing remarks

    How to use transforms

    I. Introduction to transforms


    In PyTorch, image preprocessing often requires a series of changes to an image's format, size, and so on, and this is what transforms is for.

    __all__ = ["Compose", "ToTensor", "PILToTensor", "ConvertImageDtype", "ToPILImage", "Normalize", "Resize", "Scale",
               "CenterCrop", "Pad", "Lambda", "RandomApply", "RandomChoice", "RandomOrder", "RandomCrop",
               "RandomHorizontalFlip", "RandomVerticalFlip", "RandomResizedCrop", "RandomSizedCrop", "FiveCrop", "TenCrop",
               "LinearTransformation", "ColorJitter", "RandomRotation", "RandomAffine", "Grayscale", "RandomGrayscale",
               "RandomPerspective", "RandomErasing", "GaussianBlur", "InterpolationMode", "RandomInvert", "RandomPosterize",
               "RandomSolarize", "RandomAdjustSharpness", "RandomAutocontrast", "RandomEqualize"]

    This is the full list of operations under transforms in the official source. Below, a few of the most commonly used methods are introduced, based on the official documentation and my own understanding.

    1. transforms.ToTensor()

    ToTensor converts a "PIL Image" or "numpy.ndarray" to the tensor format; tensors can be fed directly into a network as input.

    from PIL import Image
    from torchvision import transforms
    
    
    img = Image.open("../**.png")
    trans_totensor = transforms.ToTensor()
    img_tensor = trans_totensor(img)

    Running the example above clearly shows the change before and after transforms.ToTensor.

    # Original image info
    <PIL.PngImagePlugin.PngImageFile image mode=RGB size=477x362 at 0x1E58CEB8FD0>
    
    # Image info after transforms.ToTensor
    tensor([[[0.2510, 0.2431, 0.2392,  ..., 0.2392, 0.2275, 0.2314],
             [0.2471, 0.2392, 0.2353,  ..., 0.2471, 0.2627, 0.2627],
             [0.2549, 0.2549, 0.2549,  ..., 0.2549, 0.2471, 0.2471],
             ...,
             [0.2784, 0.2902, 0.2784,  ..., 0.9843, 0.9961, 1.0000],
             [0.3020, 0.3059, 0.2824,  ..., 0.9608, 0.9843, 0.9647],
             [0.3176, 0.2902, 0.2863,  ..., 1.0000, 0.9686, 0.9608]],
    
            [[0.2510, 0.2431, 0.2392,  ..., 0.2392, 0.2275, 0.2314],
             [0.2471, 0.2392, 0.2353,  ..., 0.2471, 0.2627, 0.2627],
             [0.2549, 0.2549, 0.2549,  ..., 0.2549, 0.2471, 0.2471],
             ...,
             [0.2784, 0.2902, 0.2784,  ..., 0.9843, 0.9961, 1.0000],
             [0.3020, 0.3059, 0.2824,  ..., 0.9608, 0.9843, 0.9647],
             [0.3176, 0.2902, 0.2863,  ..., 1.0000, 0.9686, 0.9608]],
    
            [[0.2510, 0.2431, 0.2392,  ..., 0.2392, 0.2275, 0.2314],
             [0.2471, 0.2392, 0.2353,  ..., 0.2471, 0.2627, 0.2627],
             [0.2549, 0.2549, 0.2549,  ..., 0.2549, 0.2471, 0.2471],
             ...,
             [0.2784, 0.2902, 0.2784,  ..., 0.9843, 0.9961, 1.0000],
             [0.3020, 0.3059, 0.2824,  ..., 0.9608, 0.9843, 0.9647],
             [0.3176, 0.2902, 0.2863,  ..., 1.0000, 0.9686, 0.9608]]])
    

    2. transforms.Normalize

    Normalize normalizes a tensor with a channel-wise mean and standard deviation; each output channel is computed as:

    output[channel] = (input[channel] - mean[channel]) / std[channel]

    As an example, set both the mean and the standard deviation of every channel to 0.5:

    trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    img_norm = trans_norm(img_tensor)

    Then output = (input - 0.5) / 0.5 = 2 * input - 1. Applying this formula to the tensor above gives:

    tensor([[[-0.4980, -0.5137, -0.5216,  ..., -0.5216, -0.5451, -0.5373],
             [-0.5059, -0.5216, -0.5294,  ..., -0.5059, -0.4745, -0.4745],
             [-0.4902, -0.4902, -0.4902,  ..., -0.4902, -0.5059, -0.5059],
             ...,
             [-0.4431, -0.4196, -0.4431,  ...,  0.9686,  0.9922,  1.0000],
             [-0.3961, -0.3882, -0.4353,  ...,  0.9216,  0.9686,  0.9294],
             [-0.3647, -0.4196, -0.4275,  ...,  1.0000,  0.9373,  0.9216]],
    
            [[-0.4980, -0.5137, -0.5216,  ..., -0.5216, -0.5451, -0.5373],
             [-0.5059, -0.5216, -0.5294,  ..., -0.5059, -0.4745, -0.4745],
             [-0.4902, -0.4902, -0.4902,  ..., -0.4902, -0.5059, -0.5059],
             ...,
             [-0.4431, -0.4196, -0.4431,  ...,  0.9686,  0.9922,  1.0000],
             [-0.3961, -0.3882, -0.4353,  ...,  0.9216,  0.9686,  0.9294],
             [-0.3647, -0.4196, -0.4275,  ...,  1.0000,  0.9373,  0.9216]],
    
            [[-0.4980, -0.5137, -0.5216,  ..., -0.5216, -0.5451, -0.5373],
             [-0.5059, -0.5216, -0.5294,  ..., -0.5059, -0.4745, -0.4745],
             [-0.4902, -0.4902, -0.4902,  ..., -0.4902, -0.5059, -0.5059],
             ...,
             [-0.4431, -0.4196, -0.4431,  ...,  0.9686,  0.9922,  1.0000],
             [-0.3961, -0.3882, -0.4353,  ...,  0.9216,  0.9686,  0.9294],
             [-0.3647, -0.4196, -0.4275,  ...,  1.0000,  0.9373,  0.9216]]])
    

    As a quick check, take the first value, 0.2510: (0.2510 - 0.5) / 0.5 = -0.4980, which matches the output. The other entries can be verified the same way.

    3. transforms.Resize

    Resize changes the size of the image; it does not change the type of the object (a PIL Image stays a PIL Image).

    trans_resize = transforms.Resize((224, 224))
    img_resize = trans_resize(img)

    The output is:

    <PIL.Image.Image image mode=RGB size=224x224 at 0x2129684C130>

    4. transforms.Compose

    Compose combines multiple transforms into one. As an example, resize the image and then convert it to a tensor:

    trans_resize_2 = transforms.Resize((224, 224))
    trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
    img_resize_2 = trans_compose(img)

    The output is:

    tensor([[[0.2471, 0.2588, 0.2510,  ..., 0.2392, 0.2431, 0.2431],
             [0.2510, 0.2471, 0.2471,  ..., 0.2471, 0.2471, 0.2471],
             [0.2510, 0.2510, 0.2510,  ..., 0.2510, 0.2392, 0.2510],
             ...,
             [0.2824, 0.2902, 0.3020,  ..., 0.3255, 0.7608, 0.9882],
             [0.2863, 0.2784, 0.2784,  ..., 0.5686, 0.9412, 0.9922],
             [0.3020, 0.2863, 0.2863,  ..., 0.8471, 0.9804, 0.9725]],
    
            [[0.2471, 0.2588, 0.2510,  ..., 0.2392, 0.2431, 0.2431],
             [0.2510, 0.2471, 0.2471,  ..., 0.2471, 0.2471, 0.2471],
             [0.2510, 0.2510, 0.2510,  ..., 0.2510, 0.2392, 0.2510],
             ...,
             [0.2824, 0.2902, 0.3020,  ..., 0.3255, 0.7608, 0.9882],
             [0.2863, 0.2784, 0.2784,  ..., 0.5686, 0.9412, 0.9922],
             [0.3020, 0.2863, 0.2863,  ..., 0.8471, 0.9804, 0.9725]],
    
            [[0.2471, 0.2588, 0.2510,  ..., 0.2392, 0.2431, 0.2431],
             [0.2510, 0.2471, 0.2471,  ..., 0.2471, 0.2471, 0.2471],
             [0.2510, 0.2510, 0.2510,  ..., 0.2510, 0.2392, 0.2510],
             ...,
             [0.2824, 0.2902, 0.3020,  ..., 0.3255, 0.7608, 0.9882],
             [0.2863, 0.2784, 0.2784,  ..., 0.5686, 0.9412, 0.9922],
             [0.3020, 0.2863, 0.2863,  ..., 0.8471, 0.9804, 0.9725]]])

    5. transforms.RandomCrop

    RandomCrop crops the image at a random location; it can be used for data augmentation, as in the example below.

    trans_random = transforms.RandomCrop(224)
    trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
    for i in range(10):
        img_crop = trans_compose_2(img)

    These are some of the most common transforms operations. To inspect the preprocessing more intuitively, you can use TensorBoard, e.g. as sketched below.
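    A minimal sketch (my addition, not from the original post) that logs each random crop to TensorBoard; "logs" is a hypothetical log directory:

    from PIL import Image
    from torch.utils.tensorboard import SummaryWriter
    from torchvision import transforms

    writer = SummaryWriter("logs")  # hypothetical log directory
    img = Image.open("../**.png")   # the same image as above

    trans_compose_2 = transforms.Compose([transforms.RandomCrop(224),
                                          transforms.ToTensor()])
    for i in range(10):
        img_crop = trans_compose_2(img)
        writer.add_image("RandomCrop", img_crop, i)  # one entry per crop
    writer.close()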


    That is my introduction to some common operations in transforms. The write-up is not very polished; criticism and discussion are welcome, and we can improve together!

    Usage of transforms.Resize()

    Resizing a PIL Image object.
    Note: this does not work on images read with io.imread or cv2.imread; both of those return an ndarray rather than a PIL Image.

    The feature maps fed into a deep network usually have equal height and width, so proportional scaling is not enough; both dimensions must be specified:

    transforms.Resize([h, w])

    To scale the shorter edge of the image to x while keeping the aspect ratio unchanged:

    transforms.Resize(x)

    For example, transforms.Resize([224, 224]) turns the input image into a 224x224 input feature map.

    Note that the size attribute of a PIL Image returns (w, h), while Resize takes its arguments in (h, w) order.
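    A quick check of this convention (my own sketch; 'demo.jpg' is a placeholder path, assumed here to be a 640x480 image):

    from PIL import Image
    from torchvision import transforms

    img = Image.open('demo.jpg')
    print(img.size)                               # -> (640, 480), i.e. (w, h)
    resized = transforms.Resize([224, 320])(img)  # Resize takes (h, w)
    print(resized.size)                           # -> (320, 224), again reported as (w, h)

    The rest of this post collects Resize usage examples from various open-source projects: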

    def load_data(root_path, dir, batch_size, phase):
        transform_dict = {
            'src': transforms.Compose(
                [transforms.RandomResizedCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ]),
            'tar': transforms.Compose(
                [transforms.Resize(224),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ])}
        data = datasets.ImageFolder(root=root_path + dir, transform=transform_dict[phase])
        data_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True,
                                                  drop_last=False, num_workers=4)
        return data_loader
    def get_screen(self, env):
        screen = env.render(mode='rgb_array').transpose((2, 0, 1))  # transpose into torch order (CHW)
        # Strip off the top and bottom of the screen
        screen = screen[:, 160:320]
        view_width = 320
        cart_location = self.get_cart_location(env)
        if cart_location < view_width // 2:
            slice_range = slice(view_width)
        elif cart_location > (self.screen_width - view_width // 2):
            slice_range = slice(-view_width, None)
        else:
            slice_range = slice(cart_location - view_width // 2,
                                cart_location + view_width // 2)
        # Strip off the edges, so that we have a square image centered on a cart
        screen = screen[:, :, slice_range]
        # Convert to float, rescale, convert to torch tensor
        screen = np.ascontiguousarray(screen, dtype=np.float32) / 255
        screen = torch.from_numpy(screen)
        # Resize, and add a batch dimension (BCHW)
        return resize(screen).unsqueeze(0)
    def load_data(data_folder, batch_size, phase='train', train_val_split=True, train_ratio=.8):
        transform_dict = {
            'train': transforms.Compose(
                [transforms.Resize(256),
                 transforms.RandomCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ]),
            'test': transforms.Compose(
                [transforms.Resize(224),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ])}
        data = datasets.ImageFolder(root=data_folder, transform=transform_dict[phase])
        if phase == 'train':
            if train_val_split:
                train_size = int(train_ratio * len(data))
                test_size = len(data) - train_size
                data_train, data_val = torch.utils.data.random_split(data, [train_size, test_size])
                train_loader = torch.utils.data.DataLoader(data_train, batch_size=batch_size, shuffle=True,
                                                           drop_last=True, num_workers=4)
                val_loader = torch.utils.data.DataLoader(data_val, batch_size=batch_size, shuffle=False,
                                                         drop_last=False, num_workers=4)
                return [train_loader, val_loader]
            else:
                train_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True,
                                                           drop_last=True, num_workers=4)
                return train_loader
        else:
            test_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=False,
                                                      drop_last=False, num_workers=4)
            return test_loader

    ## Below are for ImageCLEF datasets
    def load_imageclef_train(root_path, domain, batch_size, phase):
        transform_dict = {
            'src': transforms.Compose(
                [transforms.Resize((256, 256)),
                 transforms.RandomCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ]),
            'tar': transforms.Compose(
                [transforms.Resize((224, 224)),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ])}
        data = ImageCLEF(root_dir=root_path, domain=domain, transform=transform_dict[phase])
        train_size = int(0.8 * len(data))
        test_size = len(data) - train_size
        data_train, data_val = torch.utils.data.random_split(data, [train_size, test_size])
        train_loader = torch.utils.data.DataLoader(data_train, batch_size=batch_size, shuffle=True,
                                                   drop_last=False, num_workers=4)
        val_loader = torch.utils.data.DataLoader(data_val, batch_size=batch_size, shuffle=True,
                                                 drop_last=False, num_workers=4)
        return train_loader, val_loader
    def load_imageclef_test(root_path, domain, batch_size, phase):
        transform_dict = {
            'src': transforms.Compose(
                [transforms.Resize((256, 256)),
                 transforms.RandomCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ]),
            'tar': transforms.Compose(
                [transforms.Resize((224, 224)),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ])}
        data = ImageCLEF(root_dir=root_path, domain=domain, transform=transform_dict[phase])
        data_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True,
                                                  drop_last=False, num_workers=4)
        return data_loader
    def load_training(root_path, dir, batch_size, kwargs):
        transform = transforms.Compose(
            [transforms.Resize([256, 256]),
             transforms.RandomCrop(224),
             transforms.RandomHorizontalFlip(),
             transforms.ToTensor()])
        data = datasets.ImageFolder(root=root_path + dir, transform=transform)
        train_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True, drop_last=True, **kwargs)
        return train_loader
    def load_data(data_folder, batch_size, train, kwargs):
        transform = {
            'train': transforms.Compose(
                [transforms.Resize([256, 256]),
                 transforms.RandomCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225])]),
            'test': transforms.Compose(
                [transforms.Resize([224, 224]),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225])])
        }
        data = datasets.ImageFolder(root=data_folder, transform=transform['train' if train else 'test'])
        data_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True, **kwargs,
                                                  drop_last=True if train else False)
        return data_loader
    def load_train(root_path, dir, batch_size, phase):
        transform_dict = {
            'src': transforms.Compose(
                [transforms.RandomResizedCrop(224),
                 transforms.RandomHorizontalFlip(),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ]),
            'tar': transforms.Compose(
                [transforms.Resize(224),
                 transforms.ToTensor(),
                 transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                      std=[0.229, 0.224, 0.225]),
                 ])}
        data = datasets.ImageFolder(root=root_path + dir, transform=transform_dict[phase])
        train_size = int(0.8 * len(data))
        test_size = len(data) - train_size
        data_train, data_val = torch.utils.data.random_split(data, [train_size, test_size])
        train_loader = torch.utils.data.DataLoader(data_train, batch_size=batch_size, shuffle=True, drop_last=False, num_workers=4)
        val_loader = torch.utils.data.DataLoader(data_val, batch_size=batch_size, shuffle=True, drop_last=False, num_workers=4)
        return train_loader, val_loader
    def __init__(self, train_mode, loader_params, dataset_params, augmentation_params):
        super().__init__(train_mode, loader_params, dataset_params, augmentation_params)
        self.image_transform = transforms.Compose([transforms.Resize((self.dataset_params.h, self.dataset_params.w)),
                                                   transforms.Grayscale(num_output_channels=3),
                                                   transforms.ToTensor(),
                                                   transforms.Normalize(mean=self.dataset_params.MEAN,
                                                                        std=self.dataset_params.STD),
                                                   ])
        self.mask_transform = transforms.Compose([transforms.Resize((self.dataset_params.h, self.dataset_params.w),
                                                                    interpolation=0),
                                                  transforms.Lambda(to_array),
                                                  transforms.Lambda(to_tensor),
                                                  ])
        self.image_augment_train = ImgAug(self.augmentation_params['image_augment_train'])
        self.image_augment_with_target_train = ImgAug(self.augmentation_params['image_augment_with_target_train'])
        if self.dataset_params.target_format == 'png':
            self.dataset = ImageSegmentationPngDataset
        elif self.dataset_params.target_format == 'json':
            self.dataset = ImageSegmentationJsonDataset
        else:
            raise Exception('files must be png or json')
    def get_transform2(dataset_name, net_transform, downscale):
        "Returns image and label transform to downscale, crop and prepare for net."
        orig_size = get_orig_size(dataset_name)
        transform = []
        target_transform = []
        if downscale is not None:
            transform.append(transforms.Resize(orig_size // downscale))
            target_transform.append(
                transforms.Resize(orig_size // downscale,
                                  interpolation=Image.NEAREST))
        transform.extend([transforms.Resize(orig_size), net_transform])
        target_transform.extend([transforms.Resize(orig_size, interpolation=Image.NEAREST),
                                 to_tensor_raw])
        transform = transforms.Compose(transform)
        target_transform = transforms.Compose(target_transform)
        return transform, target_transform
    def get_transform(params, image_size, num_channels):
        # Transforms for PIL Images: Gray <-> RGB
        Gray2RGB = transforms.Lambda(lambda x: x.convert('RGB'))
        RGB2Gray = transforms.Lambda(lambda x: x.convert('L'))
        transform = []
        # Does size request match original size?
        if not image_size == params.image_size:
            transform.append(transforms.Resize(image_size))
        # Does number of channels requested match original?
        if not num_channels == params.num_channels:
            if num_channels == 1:
                transform.append(RGB2Gray)
            elif num_channels == 3:
                transform.append(Gray2RGB)
            else:
                print('NumChannels should be 1 or 3', num_channels)
                raise Exception
        transform += [transforms.ToTensor(),
                      transforms.Normalize((params.mean,), (params.std,))]
        return transforms.Compose(transform)
    def get_mnist_dataloaders(batch_size=128):
        """MNIST dataloader with (32, 32) sized images."""
        # Resize images so they are a power of 2
        all_transforms = transforms.Compose([
            transforms.Resize(32),
            transforms.ToTensor()
        ])
        # Get train and test data
        train_data = datasets.MNIST('../data', train=True, download=True,
                                    transform=all_transforms)
        test_data = datasets.MNIST('../data', train=False,
                                   transform=all_transforms)
        # Create dataloaders
        train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
        test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)
        return train_loader, test_loader
    def get_fashion_mnist_dataloaders(batch_size=128):
        """Fashion MNIST dataloader with (32, 32) sized images."""
        # Resize images so they are a power of 2
        all_transforms = transforms.Compose([
            transforms.Resize(32),
            transforms.ToTensor()
        ])
        # Get train and test data
        train_data = datasets.FashionMNIST('../fashion_data', train=True, download=True,
                                           transform=all_transforms)
        test_data = datasets.FashionMNIST('../fashion_data', train=False,
                                          transform=all_transforms)
        # Create dataloaders
        train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
        test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=True)
        return train_loader, test_loader
    def get_lsun_dataloader(path_to_data='../lsun', dataset='bedroom_train',
                            batch_size=64):
        """LSUN dataloader with (128, 128) sized images.

        path_to_data : str
            One of 'bedroom_val' or 'bedroom_train'
        """
        # Compose transforms
        transform = transforms.Compose([
            transforms.Resize(128),
            transforms.CenterCrop(128),
            transforms.ToTensor()
        ])
        # Get dataset
        lsun_dset = datasets.LSUN(db_path=path_to_data, classes=[dataset],
                                  transform=transform)
        # Create dataloader
        return DataLoader(lsun_dset, batch_size=batch_size, shuffle=True)
    def save_distorted(method=gaussian_noise):
        for severity in range(1, 6):
            print(method.__name__, severity)
            distorted_dataset = DistortImageFolder(
                root="/share/data/vision-greg/ImageNet/clsloc/images/val",
                method=method, severity=severity,
                transform=trn.Compose([trn.Resize(256), trn.CenterCrop(224)]))
            distorted_dataset_loader = torch.utils.data.DataLoader(
                distorted_dataset, batch_size=100, shuffle=False, num_workers=4)
            for _ in distorted_dataset_loader:
                continue

    # /// End Further Setup ///
    # /// Display Results ///

    def save_distorted(method=gaussian_noise):
        for severity in range(1, 6):
            print(method.__name__, severity)
            distorted_dataset = DistortImageFolder(
                root="./imagenet_val_bbox_crop/",
                method=method, severity=severity,
                transform=trn.Compose([trn.Resize((64, 64))]))
            distorted_dataset_loader = torch.utils.data.DataLoader(
                distorted_dataset, batch_size=100, shuffle=False, num_workers=6)
            for _ in distorted_dataset_loader:
                continue

    # /// End Further Setup ///
    # /// Display Results ///

    def save_distorted(method=gaussian_noise):
        for severity in range(1, 6):
            print(method.__name__, severity)
            distorted_dataset = DistortImageFolder(
                root="/share/data/vision-greg/ImageNet/clsloc/images/val",
                method=method, severity=severity,
                transform=trn.Compose([trn.Resize((64, 64))]))
            distorted_dataset_loader = torch.utils.data.DataLoader(
                distorted_dataset, batch_size=100, shuffle=False, num_workers=6)
            for _ in distorted_dataset_loader:
                continue

    # /// End Further Setup ///
    # /// Display Results ///
    def get_transform():
        transform_image_list = [
            transforms.Resize((256, 256), 3),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
        ]
        transform_gt_list = [
            transforms.Resize((256, 256), 0),
            transforms.Lambda(lambda img: np.asarray(img, dtype=np.uint8)),
        ]
        data_transforms = {
            'img': transforms.Compose(transform_image_list),
            'gt': transforms.Compose(transform_gt_list),
        }
        return data_transforms
    def get_data(train):
        data_raw = datasets.CIFAR10('../data/dl/', train=train, download=True,
                                    transform=transforms.Compose([
                                        transforms.Grayscale(),
                                        transforms.Resize((20, 20)),
                                        transforms.ToTensor(),
                                        lambda x: x.numpy().flatten()]))
        data_x, data_y = zip(*data_raw)
        data_x = np.array(data_x)
        data_y = np.array(data_y, dtype='int32').reshape(-1, 1)
        # binarize
        label_0 = data_y < 5
        label_1 = ~label_0
        data_y[label_0] = 0
        data_y[label_1] = 1
        data = pd.DataFrame(data_x)
        data[COLUMN_LABEL] = data_y
        return data, data_x.mean(), data_x.std()

    #---

    def get_data(train):
        data_raw = datasets.CIFAR10('../data/dl/', train=train, download=True,
                                    transform=transforms.Compose([
                                        transforms.Grayscale(),
                                        transforms.Resize((20, 20)),
                                        transforms.ToTensor(),
                                        lambda x: x.numpy().flatten()]))
        data_x, data_y = zip(*data_raw)
        data_x = np.array(data_x)
        data_y = np.array(data_y, dtype='int32').reshape(-1, 1)
        data = pd.DataFrame(data_x)
        data[COLUMN_LABEL] = data_y
        return data, data_x.mean(), data_x.std()

    #---
    def initialize_dataset(clevr_dir, dictionaries, state_description=True):
        if not state_description:
            train_transforms = transforms.Compose([transforms.Resize((128, 128)),
                                                   transforms.Pad(8),
                                                   transforms.RandomCrop((128, 128)),
                                                   transforms.RandomRotation(2.8),  # .05 rad
                                                   transforms.ToTensor()])
            test_transforms = transforms.Compose([transforms.Resize((128, 128)),
                                                  transforms.ToTensor()])
            clevr_dataset_train = ClevrDataset(clevr_dir, True, dictionaries, train_transforms)
            clevr_dataset_test = ClevrDataset(clevr_dir, False, dictionaries, test_transforms)
        else:
            clevr_dataset_train = ClevrDatasetStateDescription(clevr_dir, True, dictionaries)
            clevr_dataset_test = ClevrDatasetStateDescription(clevr_dir, False, dictionaries)
        return clevr_dataset_train, clevr_dataset_test

    Most commonly used:

    def build_image_transforms(self):
        self.image_transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ])
    def load_data(domain, root_dir, batch_size):
        src_train_img, src_train_label, src_test_img, src_test_label = load_dataset(domain['src'], root_dir)
        tar_train_img, tar_train_label, tar_test_img, tar_test_label = load_dataset(domain['tar'], root_dir)
        transform = transforms.Compose([
            transforms.Resize(32),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
        ])
        data_src_train = GetDataset(src_train_img, src_train_label, transform)
        data_src_test = GetDataset(src_test_img, src_test_label, transform)
        data_tar_train = GetDataset(tar_train_img, tar_train_label, transform)
        data_tar_test = GetDataset(tar_test_img, tar_test_label, transform)
        dataloaders = {}
        dataloaders['src'] = torch.utils.data.DataLoader(data_src_train, batch_size=batch_size, shuffle=True,
                                                         drop_last=False, num_workers=4)
        dataloaders['val'] = torch.utils.data.DataLoader(data_src_test, batch_size=batch_size, shuffle=True,
                                                         drop_last=False, num_workers=4)
        dataloaders['tar'] = torch.utils.data.DataLoader(data_tar_train, batch_size=batch_size, shuffle=True,
                                                         drop_last=False, num_workers=4)
        return dataloaders
    def loader(path, batch_size=16, num_workers=1, pin_memory=True):
        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        return data.DataLoader(
            datasets.ImageFolder(path,
                                 transforms.Compose([
                                     transforms.Resize(256),
                                     transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     normalize,
                                 ])),
            batch_size=batch_size,
            shuffle=True,
            num_workers=num_workers,
            pin_memory=pin_memory)

    def test_loader(path, batch_size=16, num_workers=1, pin_memory=True):
        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        return data.DataLoader(
            datasets.ImageFolder(path,
                                 transforms.Compose([
                                     transforms.Resize(256),
                                     transforms.CenterCrop(224),
                                     transforms.ToTensor(),
                                     normalize,
                                 ])),
            batch_size=batch_size,
            shuffle=False,
            num_workers=num_workers,
            pin_memory=pin_memory)
    def load_testing(root_path, dir, batch_size, kwargs):
        transform = transforms.Compose(
            [transforms.Resize([224, 224]),
             transforms.ToTensor()])
        data = datasets.ImageFolder(root=root_path + dir, transform=transform)
        test_loader = torch.utils.data.DataLoader(data, batch_size=batch_size, shuffle=True, **kwargs)
        return test_loader
    Mastering torchvision.transforms in PyTorch

    Author: Tyan
    Blog: noahsnail.com  |  CSDN  |  简书

    0. Environment

    python 3.6.8, pytorch 1.5.0

    1. torchvision.transforms

    In deep learning, computer vision (CV) is one of the major fields, and in CV tasks image transformation is usually an indispensable step: it is used for preprocessing and for data augmentation. This post organizes the functionality provided by torchvision.transforms in PyTorch, with code and examples. For precise definitions and parameters, see the PyTorch documentation.

    1.1 torchvision.transforms.Compose

    The main job of Compose is to combine multiple transforms into one; see 2.5 for a concrete use. In the example results below, the left side is the original image and the right side is the saved result.

    2. Transforms on PIL Image

    This part covers transforms applied to Image objects from Pillow, Python's most widely used image-processing library. The basic setup and image are as follows:

    import torchvision.transforms as transforms
    
    from PIL import Image
    
    img = Image.open('tina.jpg')
    
    ...
    
    # Save image
    img.save('image.jpg')
    

    Demo

    2.1 torchvision.transforms.CenterCrop(size)

    CenterCrop crops an image of the given size from the center of the input. For example, some networks take 224*224 inputs while the training images are 256*256, in which case the training images must be cropped. Example code and result:

    size = (224, 224)
    transform = transforms.CenterCrop(size)
    center_crop = transform(img)
    center_crop.save('center_crop.jpg')
    

    CenterCrop

    2.2 torchvision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)

    ColorJitter randomly changes the brightness, contrast, and saturation of an image. It is commonly used for data augmentation, especially when the training classes are imbalanced or the images are few. Example code and result:

    brightness = (1, 10)
    contrast = (1, 10)
    saturation = (1, 10)
    hue = (0.2, 0.4)
    transform = transforms.ColorJitter(brightness, contrast, saturation, hue)
    color_jitter = transform(img)
    color_jitter.save('color_jitter.jpg')
    

    ColorJitter

    2.3 torchvision.transforms.FiveCrop(size)

    FiveCrop crops the image five times: once at each of the four corners and once at the center. Image-classification evaluation is commonly divided into Single Crop Evaluation/Test and Multi Crop Evaluation/Test; FiveCrop can be used for Multi Crop Evaluation/Test. Example code and result:

    size = (224, 224)
    transform = transforms.FiveCrop(size)
    five_crop = transform(img)
    

    FiveCrop

    2.4 torchvision.transforms.Grayscale(num_output_channels=1)

    Grayscale converts the image to grayscale. The default number of output channels is 1; with 3 output channels, the R, G, and B channels have equal values. Example code and result:

    transform = transforms.Grayscale()
    grayscale = transform(img)
    grayscale.save('grayscale.jpg')
    

    Grayscale

    2.5 torchvision.transforms.Pad(padding, fill=0, padding_mode='constant')

    Pad pads the image; the fill value and the padding size can be configured, and by default all four sides are padded. Example code and result:

    size = (224, 224)
    padding = 16
    fill = (0, 0, 255)
    transform = transforms.Compose([
            transforms.CenterCrop(size),
            transforms.Pad(padding, fill)
    ])
    pad = transform(img)
    pad.save('pad.jpg')
    

    Pad

    2.6 torchvision.transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=False, fillcolor=0)

    RandomAffine applies a random affine transformation while keeping the image center fixed. Example code and result:

    degrees = (15, 30)
    translate=(0, 0.2)
    scale=(0.8, 1)
    fillcolor = (0, 0, 255)
    transform = transforms.RandomAffine(degrees=degrees, translate=translate, scale=scale, fillcolor=fillcolor)
    random_affine = transform(img)
    random_affine.save('random_affine.jpg')
    

    RandomAffine

    2.7 torchvision.transforms.RandomApply(transforms, p=0.5)

    RandomApply executes the given list of transforms with some probability, i.e. they may or may not be applied. The list can contain one transform or several. Example code and result:

    size = (224, 224)
    padding = 16
    fill = (0, 0, 255)
    transform = transforms.RandomApply([transforms.CenterCrop(size), transforms.Pad(padding, fill)])
    for i in range(3):
        random_apply = transform(img)
    

    RandomApply

    2.8 torchvision.transforms.RandomChoice(transforms)

    RandomChoice randomly selects one of the given transforms to execute. Example code and result:

    size = (224, 224)
    padding = 16
    fill = (0, 0, 255)
    degrees = (15, 30)
    transform = transforms.RandomChoice([transforms.RandomAffine(degrees), transforms.CenterCrop(size), transforms.Pad(padding, fill)])
    for i in range(3):
        random_choice = transform(img)
    

    RandomChoice

    2.9 torchvision.transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant')

    RandomCrop crops the image at a random location. Example code and result:

    size = (224, 224)
    transform = transforms.RandomCrop(size)
    random_crop = transform(img)
    

    RandomCrop

    2.10 torchvision.transforms.RandomGrayscale(p=0.1)

    RandomGrayscale converts the image to grayscale with some probability. Example code and result:

    p = 0.5
    transform = transforms.RandomGrayscale(p)
    for i in range(3):
        random_grayscale = transform(img)
    

    RandomGrayscale

    2.11 torchvision.transforms.RandomHorizontalFlip(p=0.5)

    RandomHorizontalFlip flips the image horizontally with some probability. Example code and result:

    p = 0.5
    transform = transforms.RandomHorizontalFlip(p)
    for i in range(3):
        random_horizontal_flip = transform(img)
    

    RandomHorizontalFlip

    2.12 torchvision.transforms.RandomOrder(transforms)

    RandomOrder executes the given transforms in a random order. Example code and result:

    size = (224, 224)
    padding = 16
    fill = (0, 0, 255)
    degrees = (15, 30)
    transform = transforms.RandomOrder([transforms.RandomAffine(degrees), transforms.CenterCrop(size), transforms.Pad(padding, fill)])
    for i in range(3):
        random_order = transform(img)
    

    RandomOrder

    2.13 torchvision.transforms.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=3, fill=0)

    RandomPerspective applies a random perspective transformation to the image with some probability. Example code and result:

    distortion_scale = 0.5
    p = 1
    fill = (0, 0, 255)
    transform = transforms.RandomPerspective(distortion_scale=distortion_scale, p=p, fill=fill)
    random_perspective = transform(img)
    random_perspective.save('random_perspective.jpg')
    

    RandomPerspective

    2.14 torchvision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=2)

    RandomResizedCrop crops the image with a random size and a random aspect ratio, then rescales the crop to the given size. Example code and result:

    size = (256, 256)
    scale=(0.8, 1.0)
    ratio=(0.75, 1.0)
    transform = transforms.RandomResizedCrop(size=size, scale=scale, ratio=ratio)
    random_resized_crop = transform(img)
    random_resized_crop.save('random_resized_crop.jpg')
    

    RandomResizedCrop

    2.15 torchvision.transforms.RandomRotation(degrees, resample=False, expand=False, center=None, fill=None)

    RandomRotation rotates the image by a random angle. Example code and result:

    degrees = (15, 30)
    fill = (0, 0, 255)
    transform = transforms.RandomRotation(degrees=degrees, fill=fill)
    random_rotation = transform(img)
    random_rotation.save('random_rotation.jpg')
    

    RandomRotation

    2.16 torchvision.transforms.RandomSizedCrop(*args, **kwargs)

    Deprecated; see RandomResizedCrop.

    2.17 torchvision.transforms.RandomVerticalFlip(p=0.5)

    RandomVerticalFlip flips the image vertically with some probability. Example code and result:

    p = 1
    transform = transforms.RandomVerticalFlip(p)
    random_vertical_flip = transform(img)
    random_vertical_flip.save('random_vertical_flip.jpg')
    

    RandomVerticalFlip

    2.18 torchvision.transforms.Resize(size, interpolation=2)

    Resize rescales the image to the given size. Example code and result:

    size = (224, 224)
    transform = transforms.Resize(size)
    resize_img = transform(img)
    resize_img.save('resize_img.jpg')
    

    Resize

    2.19 torchvision.transforms.Scale(*args, **kwargs)

    Deprecated; see Resize.

    2.20 torchvision.transforms.TenCrop(size, vertical_flip=False)

    TenCrop is similar to FiveCrop in 2.3, except that besides the 5 crops of the original image it also takes 5 crops of its flipped version, as sketched below.
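    A minimal sketch (my addition, not in the original post): TenCrop returns a tuple of 10 PIL Images; stacking them after ToTensor gives a 4D (10, C, H, W) tensor:

    import torch
    import torchvision.transforms as transforms
    from PIL import Image

    img = Image.open('tina.jpg')  # the same image as above
    crops = transforms.TenCrop((224, 224))(img)  # tuple of 10 PIL Images
    print(len(crops))                            # 10
    batch = torch.stack([transforms.ToTensor()(c) for c in crops])
    print(batch.size())                          # torch.Size([10, 3, 224, 224])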

    3. Transforms on torch.*Tensor

    3.1 torchvision.transforms.LinearTransformation(transformation_matrix, mean_vector)

    LinearTransformation transforms an image tensor with a square transformation matrix and a mean vector computed offline. It can be used for whitening, which removes redundancy (correlations) from the input data, and is typically applied during data preprocessing.
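    A minimal sketch (my addition, not from the original post): an identity "whitening" that leaves a 3x8x8 tensor unchanged, just to show the expected shapes:

    import torch
    import torchvision.transforms as transforms

    D = 3 * 8 * 8  # D = C x H x W
    transform = transforms.LinearTransformation(torch.eye(D), torch.zeros(D))
    img_tensor = torch.rand(3, 8, 8)
    out = transform(img_tensor)
    print(torch.allclose(out, img_tensor))  # True: the identity matrix changes nothing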

    3.2 torchvision.transforms.Normalize(mean, std, inplace=False)

    Normalize normalizes a tensor with a mean and a standard deviation. It is commonly used when preprocessing input images; for example, many of the classification networks from the ImageNet competition normalize their inputs.
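    A short example (my addition) using the widely used ImageNet statistics; 'tina.jpg' is the image used throughout this post:

    import torchvision.transforms as transforms
    from PIL import Image

    img = Image.open('tina.jpg')
    transform = transforms.Compose([
        transforms.ToTensor(),  # [0, 255] -> [0.0, 1.0], shape (C, H, W)
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    normalized = transform(img)
    print(normalized.mean(dim=(1, 2)))  # per-channel means after normalization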

    3.3 torchvision.transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)

    RandomErasing randomly selects a rectangular region in the image and erases its pixels; it is mainly used for data augmentation. Example code and result:

    p = 1.0
    scale = (0.2, 0.3)
    ratio = (0.5, 1.0)
    value = (0, 0, 255)
    
    transform = transforms.Compose([
                    transforms.ToTensor(),
                    transforms.RandomErasing(p=p, scale=scale, ratio=ratio, value=value),
                    transforms.ToPILImage()
                ])
    random_erasing = transform(img)
    random_erasing.save('random_erasing.jpg')
    

    RandomErasing

    4. Conversion Transforms

    4.1 torchvision.transforms.ToPILImage(mode=None)

    ToPILImage converts a PyTorch Tensor or a numpy.ndarray to a PIL Image. Example code and result:

    img = Image.open('tina.jpg')
    transform = transforms.ToTensor()
    img = transform(img)
    print(img.size())
    img_r = img[0, :, :]
    img_g = img[1, :, :]
    img_b = img[2, :, :]
    print(type(img_r))
    print(img_r.size())
    transform = transforms.ToPILImage()
    img_r = transform(img_r)
    img_g = transform(img_g)
    img_b = transform(img_b)
    print(type(img_r))
    img_r.save('img_r.jpg')
    img_g.save('img_g.jpg')
    img_b.save('img_b.jpg')
    
    # output
    torch.Size([3, 256, 256])
    <class 'torch.Tensor'>
    torch.Size([256, 256])
    <class 'PIL.Image.Image'>
    

    ToPILImage

    4.2 torchvision.transforms.ToTensor

    ToTensor converts a PIL Image or a numpy.ndarray to a PyTorch Tensor and rescales the pixel values from [0, 255] to [0, 1]. It is usually applied right after reading the input image in neural-network training. Example code:

    img = Image.open('tina.jpg')
    print(type(img))
    print(img.size)
    transform = transforms.ToTensor()
    img = transform(img)
    print(type(img))
    print(img.size())
    
    # output
    <class 'PIL.JpegImagePlugin.JpegImageFile'>
    (256, 256)
    <class 'torch.Tensor'>
    torch.Size([3, 256, 256])
    

    5. Code

    The code is available at https://github.com/SnailTyan/deep-learning-tools/blob/master/transforms.py

    References

    1. https://pytorch.org/docs/stable/torchvision/transforms.html