  • A data-augmentation tool for object detection. It reads VOC-format data and applies scaling, translation, mirroring, rotation, resizing, and colour-space changes to images and their bounding boxes. By randomly combining these operations, one annotated image can be expanded into 100. If you have questions about usage, please...
  • Data augmentation / dataset expansion — A Creative Solution to Imbalanced Class Distribution. Imbalanced class distribution is a common problem in Machine Learning. I was recently confronted...

    Data Augmentation and Dataset Expansion

    A Creative Solution to Imbalanced Class Distribution

    Imbalanced class distribution is a common problem in Machine Learning. I was recently confronted with this issue when training a sentiment classification model. Certain categories were far more prevalent than others and the predictive quality of the model suffered. The first technique I used to address this was random under-sampling, wherein I randomly sampled a subset of rows from each category up to a ceiling threshold. I selected a ceiling that reasonably balanced the upper 3 classes. Although a small improvement was observed, the model was still far from optimal.


    I needed a way to deal with the under-represented classes. I could not rely on traditional techniques used in multi-class classification such as sample and class weighting, as I was working with a multi-label dataset. It became evident that I would need to leverage oversampling in this situation.


    A technique such as SMOTE (Synthetic Minority Over-sampling Technique) can be effective for oversampling, although the problem again becomes a bit more difficult with multi-label datasets. MLSMOTE (Multi-Label Synthetic Minority Over-sampling Technique) has been proposed [1], but the high dimensional nature of the numerical vectors created from text can sometimes make other forms of data augmentation more appealing.

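As a rough illustration (not from the article or any library), the core of SMOTE is interpolating between a minority-class point and one of its nearest neighbours. The following is a minimal NumPy sketch; the helper name smote_like_samples and the toy points are my own:

```python
import numpy as np

def smote_like_samples(minority, n_new, k=2, seed=0):
    """Generate synthetic minority samples by interpolating between
    a random point and one of its k nearest neighbours (the SMOTE idea)."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    new_points = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        # distances from point i to all points (including itself)
        d = np.linalg.norm(minority - minority[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        new_points.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(new_points)

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_like_samples(minority, n_new=4)
print(synthetic.shape)  # (4, 2)
```

Each synthetic point lies on a segment between two real minority points, which is exactly why high-dimensional sparse text vectors make this interpolation less meaningful.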

    Photo by Christian Wagner on Unsplash

    Transformers to the Rescue!

    If you decided to read this article, it is safe to assume that you are aware of the latest advances in Natural Language Processing bequeathed by the mighty Transformers. The exceptional developers at Hugging Face in particular have opened the door to this world through their open source contributions. One of their more recent releases implements a breakthrough in Transfer Learning called the Text-to-Text Transfer Transformer or T5 model, originally presented by Raffel et al. in their paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [2].


    T5 allows us to execute various NLP tasks by specifying prefixes to the input text. In my case, I was interested in Abstractive Summarization, so I made use of the summarize prefix.


    Text-to-Text Transfer Transformer [2]

    Abstractive Summarization

    Abstractive Summarization, put simply, is a technique by which a chunk of text is fed to an NLP model and a novel summary of that text is returned. This should not be confused with Extractive Summarization, where sentences are embedded and a clustering algorithm is executed to find those closest to the clusters’ centroids — namely, existing sentences are returned. Abstractive Summarization seemed particularly appealing as a Data Augmentation technique because of its ability to generate novel yet realistic sentences of text.

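The extractive approach described above can be shown with a toy example: given sentence embeddings, the "summary" is simply the existing sentence closest to the centroid. The embedding values below are made up purely for illustration:

```python
import numpy as np

# Toy 2D sentence embeddings (one row per sentence). Extractive
# summarization returns an existing sentence, not a generated one.
embeddings = np.array([
    [0.9, 0.1],
    [1.0, 0.0],
    [0.8, 0.2],
    [0.0, 1.0],   # an outlier sentence
])
centroid = embeddings.mean(axis=0)
# index of the existing sentence closest to the centroid
closest = int(np.argmin(np.linalg.norm(embeddings - centroid, axis=1)))
print(closest)
```

Abstractive summarization, by contrast, generates text that need not appear anywhere in the input, which is what makes it useful for producing genuinely new training rows.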

    Algorithm

    Here are the steps I took to use Abstractive Summarization for Data Augmentation, including code segments illustrating the solution.


    I first needed to determine how many rows each under-represented class required. The number of rows to add for each feature is calculated against a ceiling threshold, and we refer to these as the append_counts. Features with counts above the ceiling are not appended: for example, if a given feature has 1000 rows and the ceiling is 100, its append count will be 0. The following methods trivially achieve this in the situation where features have been one-hot encoded:


    def get_feature_counts(self, df):
        shape_array = {}
        for feature in self.features:
            shape_array[feature] = df[feature].sum()
        return shape_array

    def get_append_counts(self, df):
        append_counts = {}
        feature_counts = self.get_feature_counts(df)
        for feature in self.features:
            if feature_counts[feature] >= self.threshold:
                count = 0
            else:
                count = self.threshold - feature_counts[feature]
            append_counts[feature] = count
        return append_counts

    For each feature, a loop runs from the current append index up to the append count specified for that feature. The append_index variable, along with a tasks array, is introduced to allow for multiprocessing, which we will discuss shortly.


    counts = self.get_append_counts(self.df)
    # Create append dataframe with length of all rows to be appended
    self.df_append = pd.DataFrame(
        index=np.arange(sum(counts.values())),
        columns=self.df.columns
    )
    # Creating array of tasks for multiprocessing
    tasks = []
    # Set all feature values to 0
    for feature in self.features:
        self.df_append[feature] = 0
    for feature in self.features:
        num_to_append = counts[feature]
        for num in range(
            self.append_index,
            self.append_index + num_to_append
        ):
            # functools.partial defers the call so it can later run in a
            # worker process instead of executing immediately
            tasks.append(
                partial(self.process_abstractive_summarization, feature, num)
            )
        # Updating index for insertion into shared appended dataframe
        # to preserve indexing for multiprocessing
        self.append_index += num_to_append

    An Abstractive Summarization is calculated for a subset of a specified size drawn from all rows that uniquely have the given feature, and the result is added to the append DataFrame with its respective feature one-hot encoded.


    df_feature = self.df[
        (self.df[feature] == 1) &
        (self.df[self.features].sum(axis=1) == 1)
    ]
    df_sample = df_feature.sample(self.num_samples, replace=True)
    text_to_summarize = ' '.join(
        df_sample[:self.num_samples]['review_text']
    )
    new_text = self.get_abstractive_summarization(text_to_summarize)
    self.df_append.at[num, 'text'] = new_text
    self.df_append.at[num, feature] = 1

    The Abstractive Summarization itself is generated in the following way:


    t5_prepared_text = "summarize: " + text_to_summarize
    # Move the token tensor to the same device as the model
    tokenized_text = self.tokenizer.encode(
        t5_prepared_text,
        return_tensors=self.return_tensors
    ).to(self.device)
    summary_ids = self.model.generate(
        tokenized_text,
        num_beams=self.num_beams,
        no_repeat_ngram_size=self.no_repeat_ngram_size,
        min_length=self.min_length,
        max_length=self.max_length,
        early_stopping=self.early_stopping
    )
    output = self.tokenizer.decode(
        summary_ids[0],
        skip_special_tokens=self.skip_special_tokens
    )

    In initial tests the summarization calls to the T5 model were extremely time-consuming, reaching up to 25 seconds even on a GCP instance with an NVIDIA Tesla P100. Clearly this needed to be addressed to make this a feasible solution for data augmentation.


    Photo by Brad Neathery on Unsplash

    Multiprocessing

    I introduced a multiprocessing option, whereby the calls to Abstractive Summarization are stored in a tasks array later passed to a sub-routine that runs the calls in parallel using the multiprocessing library. This resulted in a dramatic decrease in runtime. I must thank David Foster for his succinct Stack Overflow contribution [3]!


    from multiprocessing import Process

    running_tasks = [Process(target=task) for task in tasks]
    for running_task in running_tasks:
        running_task.start()
    for running_task in running_tasks:
        running_task.join()

    Simplified Solution

    To make things easier for everybody I packaged this into a library called absum. It can be installed through pip: pip install absum. One can also download it directly from the repository.


    Running the code on your own dataset is then simply a matter of importing the library’s Augmentor class and running its abs_sum_augment method as follows:


    import pandas as pd
    from absum import Augmentor

    csv = 'path_to_csv'
    df = pd.read_csv(csv)
    augmentor = Augmentor(df)
    df_augmented = augmentor.abs_sum_augment()
    df_augmented.to_csv(
        csv.replace('.csv', '-augmented.csv'),
        encoding='utf-8',
        index=False
    )

    absum uses the Hugging Face T5 model by default, but is designed in a modular way to allow you to use any pre-trained or out-of-the-box Transformer models capable of Abstractive Summarization. It is format agnostic, expecting only a DataFrame containing text and one-hot encoded features. If additional columns are present that you do not wish to be considered, you have the option to pass in specific one-hot encoded features as a comma-separated string to the features parameter.


    Also of special note are the min_length and max_length parameters, which determine the size of the resulting summarizations. One trick I found useful is to find the average character count of the text data you’re working with, start a bit below it for the minimum length, and pad it slightly for the maximum. All available parameters are detailed in the documentation.

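That heuristic can be sketched in a few lines of plain Python. The review texts and the 0.8/1.2 factors below are made-up placeholders, not values from absum:

```python
# Made-up sample reviews standing in for the real text column:
reviews = [
    'Great battery life and a sharp screen.',
    'The keyboard feels mushy and the fan is loud.',
    'Solid laptop for the price, would recommend.',
]

# Average character count of the working text data
avg_chars = sum(len(t) for t in reviews) // len(reviews)
min_length = int(avg_chars * 0.8)  # start a bit below the average
max_length = int(avg_chars * 1.2)  # pad slightly above it
print(avg_chars, min_length, max_length)
```

Note that T5's min_length and max_length count tokens rather than characters, so in practice you would convert or tune these values accordingly.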

    Feel free to add any suggestions for improvement in the comments, or better yet in a PR. Happy coding!


    Translated from: https://towardsdatascience.com/abstractive-summarization-for-data-augmentation-1423d8ec079e


  • I wrote a Python program for data augmentation with several main augmentation operations; choose the ones you need. Detection targets will not disappear from the image. Easier to use than augmentation.
  • Data augmentation and dataset expansion

    Viewed 1,000+ times · 2019-02-26 15:06:02

    Dataset expansion methods

    Commonly used on images:

    Methods include horizontal flipping, random cropping, rotation, translation, noise perturbation, brightness and contrast changes, and many other simple, effective techniques.

    Their purpose is to enlarge the dataset and improve generalization; explanations are easy to find online.

    Use on text:

    Methods include:

    Synonym replacement (a significant limitation of this method is that synonyms usually have very similar word vectors in NLP, so from the model's perspective it does not augment the data very effectively)

    Back-translation (a very common data-augmentation method in machine translation: a sentence is machine-translated into another language and then translated back, yielding a sentence with similar meaning but possibly different wording. Beyond replacing synonyms and adding or removing words, it can also adjust sentence structure and word order while preserving the original meaning, making it a very effective augmentation method.)

    Generative adversarial networks (GANs and their many recent variants pit a generator against a discriminator, iterating until the generated data can pass for real, after which the generator is used to produce data in bulk. The difficulty is that the GAN must be trained quite well before it can generate high-quality data, which is relatively complex and labor-intensive.)
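The synonym-replacement idea above can be sketched as follows. The toy SYNONYMS table and the synonym_replace helper are invented for illustration; a real system would draw on WordNet or word embeddings:

```python
import random

# Toy synonym table; a real system would use WordNet or embeddings.
SYNONYMS = {
    'good': ['great', 'fine'],
    'movie': ['film'],
    'boring': ['dull', 'tedious'],
}

def synonym_replace(sentence, p=1.0, seed=0):
    """Replace each word that has an entry in SYNONYMS with probability p."""
    rng = random.Random(seed)
    words = []
    for w in sentence.split():
        if w in SYNONYMS and rng.random() < p:
            words.append(rng.choice(SYNONYMS[w]))
        else:
            words.append(w)
    return ' '.join(words)

print(synonym_replace('a good movie is never boring'))
```

As the passage notes, the resulting sentences tend to sit very close to the originals in embedding space, which limits how much new information this adds for the model.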

    Source:

    https://www.zhihu.com/question/305256736/answer/586459726?utm_source=wechat_session&utm_medium=social&utm_oi=760907773469790208

     

  • image classification - data augmentation (dataset expansion / augmentation / enlargement) - image flipping. OpenCV documentation index https://www.docs.opencv.org/ OpenCV 3.4.6 ...Modules ...

    image classification - data augmentation (dataset expansion / augmentation / enlargement) - image flipping

    OpenCV documentation index
    https://www.docs.opencv.org/

    OpenCV 3.4.6
    https://www.docs.opencv.org/3.4.6/index.html

    Modules
    https://www.docs.opencv.org/3.4.6/modules.html

    1. Core functionality -> Operations on arrays -> flip()

    C++:
    void cv::flip(InputArray src, OutputArray dst, int flipCode)
    Python:
    dst = cv.flip(src, flipCode[, dst])

    #include <opencv2/core.hpp>

    Flips a 2D array around vertical, horizontal, or both axes.

    The function cv::flip flips the array in one of three different ways (row and column indices are 0-based):

    \text{dst}_{ij} = \begin{cases} \text{src}_{\text{src.rows}-i-1,\,j} & \text{if flipCode} == 0 \\ \text{src}_{i,\,\text{src.cols}-j-1} & \text{if flipCode} > 0 \\ \text{src}_{\text{src.rows}-i-1,\,\text{src.cols}-j-1} & \text{if flipCode} < 0 \end{cases}

    The example scenarios of using the function are the following:

    • Vertical flipping of the image (flipCode == 0) to switch between top-left and bottom-left image origin.

    • Horizontal flipping of the image with the subsequent horizontal shift and absolute difference calculation to check for a vertical-axis symmetry (flipCode > 0).

    • Simultaneous horizontal and vertical flipping of the image with the subsequent shift and absolute difference calculation to check for a central symmetry (flipCode < 0). Reversing the order of point arrays (flipCode > 0 or flipCode == 0).


    1.1 Parameters

    src - input array.
    dst - output array of the same size and type as src.
    flipCode - a flag to specify how to flip the array; 0 means flipping around the x-axis, a positive value (for example, 1) means flipping around the y-axis, and a negative value (for example, -1) means flipping around both axes.
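The three flipCode cases can be reproduced with NumPy alone, which is a handy way to sanity-check the semantics without OpenCV (cv2.flip behaves the same way on a loaded image):

```python
import numpy as np

# cv2.flip(src, flipCode) semantics reproduced with NumPy:
#   flipCode == 0 -> flip around the x-axis (vertical flip)
#   flipCode  > 0 -> flip around the y-axis (horizontal flip / mirror)
#   flipCode  < 0 -> flip around both axes
src = np.array([[1, 2],
                [3, 4]])

vertical = np.flipud(src)               # like cv2.flip(src, 0)
horizontal = np.fliplr(src)             # like cv2.flip(src, 1)
both = np.flipud(np.fliplr(src))        # like cv2.flip(src, -1)

print(vertical.tolist())    # [[3, 4], [1, 2]]
print(horizontal.tolist())  # [[2, 1], [4, 3]]
print(both.tolist())        # [[4, 3], [2, 1]]
```

For an image array of shape (rows, cols, channels) the same calls apply unchanged, since both functions operate on the first two axes.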


    1.2 Mirroring == Horizontal flipping


    2. data_augmentation_image_classification_flipping.py

    2.1 directory

    The folder contains cropped image patches, each containing a single object.


    2.2 data_augmentation_image_classification_flipping.py

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    # image_classification_data_augmentation_flipping.py - v1.0
    
    import cv2
    import os
    import numpy as np  # used by the commented-out image-display code below
    
    
    def data_augmentation(image_path, flip_mode):
        flip_abbr = flip_mode.split('_')[-1]
    
        for folderName, subFolders, fileNames in os.walk(image_path):
            print("The current folder is " + folderName)
    
            for subfolder in subFolders:
                print("SUBFOLDER OF " + folderName + ': ' + subfolder)
    
            num = 0
            for filename in fileNames:
                # print("FILE INSIDE " + folderName + ': ' + filename)
    
                if ".jpg" not in filename:
                    print(filename)
                    continue
    
                image_name = (filename.split('/'))[-1]
    
                dst_folder = folderName + '_' + flip_abbr
                if not os.path.exists(dst_folder):
                    os.makedirs(dst_folder)
    
                img_file = folderName + '/' + filename
                src_file = cv2.imread(img_file)
                dst_file = flipping(src_file, flip_mode)
    
                aug_file = dst_folder + '/' + flip_abbr + image_name
    
                cv2.imwrite(aug_file, dst_file, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
                num = num + 1
    
                # cv2.imshow("Image Display", dst_file)
                # htitch = np.hstack((src_file, dst_file))
                # cv2.imshow("Image Display", htitch)
    
                keyboard = cv2.waitKey(10) & 0xFF
                # wait for ESC key to exit
                if keyboard == 27:
                    break
    
        # close the windows
        cv2.destroyAllWindows()
    
    
    def flipping(src_image, flip_mode):
        '''
        flip_code - a flag to specify how to flip the array;
        0 means flipping around the x-axis and positive value (for example, 1) means flipping around y-axis.
        Negative value (for example, -1) means flipping around both axes.
        '''
        if "vertical_flipping_vf" == flip_mode:
            dst_image = cv2.flip(src_image, 0)
        elif "horizontal_flipping_hf" == flip_mode:
            dst_image = cv2.flip(src_image, 1)
        elif "horizontal_vertical_flipping_hvf" == flip_mode:
            dst_image = cv2.flip(src_image, -1)
        else:
            dst_image = src_image
    
        return dst_image
    
    
    if __name__ == "__main__":
        current_dir = os.path.dirname(os.path.abspath(__file__))
    
        # "vertical_flipping_vf" == vertical_flipping
        # "horizontal_flipping_hf" == horizontal_flipping
        # "horizontal_vertical_flipping_hvf" == horizontal_vertical_flipping
        flip_mode = "horizontal_flipping_hf"
    
        image_path = current_dir + "/head_rear_data"
        data_augmentation(image_path, flip_mode)
    
    

    2.3 output


    /usr/bin/python3.5 /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/data_augmentation_image_classification_flipping.py
    The current folder is /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/head_rear_data
    SUBFOLDER OF /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/head_rear_data: head
    SUBFOLDER OF /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/head_rear_data: rear
    The current folder is /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/head_rear_data/head
    The current folder is /home/strong/training_validation_test_sets_auto_parts/vehicle_source_data_20181015/head_rear_data/rear
    
    Process finished with exit code 0
    


  • Expanding datasets with image data augmentation. Image classification is one of the most researched and well-documented task of machine learning. There are lots of benchmarks and large public datasets like ImageNet [1] to ...

    Expanding datasets with image data augmentation

    Image classification is one of the most researched and well-documented task of machine learning. There are lots of benchmarks and large public datasets like ImageNet [1] to compare new models and algorithms to state of the art (SOTA). Every year a couple of new algorithms are published, and the accuracy of SOTA improves rapidly.


    In recent years, a key element of improvement was the data augmentation techniques, especially when one tries to improve the rankings without using external data. Data augmentation, small modifications of the training data helps the models to generalize better for unseen examples. An example of data augmentation working with images is mirroring the picture: a cat is a cat in the mirror too.


    In this story, we analyze the image augmentation techniques used in RandAugment [2], a 2019 algorithm by Google researchers. For this, we will use a pre-trained DenseNet [3] model available in Tensorflow 2.0 Keras. To illustrate the model outputs, we will perform a principal component analysis on the last hidden layer of the model. All codes are available on Google Colab.


    Image classification

    With Keras, image classification is a three-step problem. 1) load the image, 2) load the pre-trained model, 3) decode the output. The following is a small snippet to do it using TensorFlow 2.0 pre-trained Keras DenseNet model.


    Image classification with a pre-trained model in Keras

    If we load the model with include_top, the classifier has an output layer with 1000 classes. decode_predictions collects the most probable categories and adds the names of the classes as well.

    Cat image from Wikipedia and classification output

    For the PCA analysis, we will use the outputs of the last hidden layer of the model (before softmax). In the DenseNet 121, it means a 1024 dimension large vector space. In Keras, we will get the new model using:


    model_last_hidden = tf.keras.models.Model(inputs=model.input, outputs=model.layers[-2].output)

    I use this practice to get a good representation of the model outputs without the flattening effect of the softmax layer.


    An example of the DenseNet model from the original paper

    Identifying classes using the PCA subspace

    Principal Component Analysis (PCA)

    As I discussed in earlier stories, PCA is an orthogonal transformation that we will use to reduce the dimension of the vectors [4,5]. PCA finds special basis vectors (eigenvectors) for the projection in a way that maximizes the variance of the reduced-dimension data. Using PCA has two important benefits. On the one hand, we can project the 1024-dimension vectors onto a 2D subspace where we can plot the data. On the other hand, it keeps the maximum possible variance (projection loses information). Therefore, it might keep enough variance for us to identify the classes in the picture.


    Equation from Wikipedia

    Using PCA with sklearn.decomposition.PCA is a one-liner: pred_pca = PCA(n_components=2).fit_transform(pred)

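For readers curious what that one-liner does under the hood, the same 2D projection can be sketched with plain NumPy via the SVD. The pca_2d helper and the random stand-in data below are mine, not from the story:

```python
import numpy as np

def pca_2d(X):
    """Project rows of X onto the top-2 principal components via SVD,
    equivalent to PCA(n_components=2).fit_transform(X) up to sign."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # PCA requires centred data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                 # coordinates in the 2D subspace

rng = np.random.default_rng(0)
pred = rng.normal(size=(8, 1024))        # stand-in for hidden-layer outputs
pred_pca = pca_2d(pred)
print(pred_pca.shape)  # (8, 2)
```

The first output column carries at least as much variance as the second, which is exactly the "maximum possible variance" property described above.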

    A cat or an elephant?

    To determine a transformation from the 1024 dimension vector space to a 2D vector space, we will use eight images, four cats and four elephants. Later on, we will show the effect of the data augmentation in this projection. Images are from the Wikipedia.


    Images of cats and elephants from Wikipedia

    Illustrating the 2D projection, we can see that the cats are well separated from the elephants.

    2D projection of the DenseNet last hidden layer using PCA

    Image augmentation

    Data augmentation is an important part of training a machine learning model, especially when the training images are limited. For image augmentation, lots of augmentation algorithms are defined. An extensive collection of methods are available at the imgaug package for Python developers.


    For this analysis, we will use the imgaug implementation of the methods used in RandAugment [2], an augmentation scheme that achieved SOTA on ImageNet in 2019 with an EfficientNet model; many other algorithms use the same basic methods as well.

    When it comes to data augmentation, the most crucial part is to determine the intervals of the parameters of the augmentation methods. For example, if we use rotation, a simple image augmentation technique, it is clear that rotating an image of a cat or an elephant by a few degrees does not change the meaning of the picture. However, we usually do not expect to see an elephant rotated 180° in nature. Another example comes when we use brightness or contrast manipulation: too strong a modification might render the data unrecognizable.

    Recent works like Population Based Augmentation [6] aim to adapt the magnitude of the modifications during training, while others like RandAugment [2] use it as a hyperparameter of the training. The AutoAugment study showed that increasing the magnitude of the augmentation during training can improve the performance of the model.

    Augmentation of an elephant image using default intervals of the imgaug augmentation methods

    If we process the augmented images above and project them into the same 2D vector space as the previous cat-elephant images, we can see that the new dots lie around the original image's point. This is the effect of image augmentation:

    Augmentation expands the single point of an elephant image in the classification space to a whole area of elephant images.


    Augmented elephant images in the cat-elephant projection

    One-shot approach

    When one has very few samples in a label class, the problem is called few-shot learning, and data augmentation is a crucial tool to solve this problem. The following experiment tries to prove the concept. Of course, here we have a pre-trained model on a large dataset, so it was not learned in a few-shot setting. However, if we try to generate a projection using only one original elephant image, we can get something similar.

    For this projection, we will fit a new PCA using the original elephant image and its augmented images. The augmented images in this subspace are shown in the following image.

    PCA projection of the augmented images

    But can this projection, using only one elephant image, separate the elephants from the cats? Well, the clusters are not as clear as in the previous case (see the first scatter plot figure), but the cats and the elephants are in fact in different parts of the vector space.

    Elephants and cats in the PCA projection generated from one elephant image

    Summary

    In this story, we illustrated the effect of the data augmentation tools used in the state of the art image classification. We visualized images of cats and elephants and the augmented images of an elephant to understand better how the model sees the augmented images.


    Translated from: https://towardsdatascience.com/analyzing-data-augmentation-for-image-classification-3ed30aa61411

  • Image data for image-processing work: 1356 original images in total, with corresponding XML files as labels. This data can be used for object detection. ...YOLOv4 v3 SSD Faster R-CNN dataset augmentation; common methods and software for dataset expansion and augmentation ...
  • Semantic segmentation: image augmentation and dataset expansion

    Viewed 1,000+ times · popular discussion · 2019-03-19 10:47:02
    This post introduces a rather handy data-augmentation tool: Augmentor. (1) Install Augmentor by entering the command pip install Augmentor in a terminal. (2) Data augmentation: semantic segmentation requires augmenting the original image and the mask image together, ...
  • Dataset expansion: geometric transformations, image filtering, all the tricks in one algorithm, next steps. Here we briefly introduce data-augmentation techniques, then select expansion methods appropriate for our dataset, and finally provide and explain the Python code implementing the augmentation algorithm. ...
  • Discovered a Python package, Augmentor, dedicated to dataset expansion. Link: https://augmentor.readthedocs.io/en/master/userguide/install.html is the development manual, covering installation, the functions in the package, extensibility, and more. The module mainly includes random...
  • Python image data augmentation and expansion

    Viewed 1,000+ times · 2019-08-30 07:59:48
    Python image dataset expansion: # coding:utf-8 ''' Author: TimeVShow, time: 2019/8/29. Effect: find all images listed in the label file, apply horizontal flips, vertical flips, and similar operations, keep the transformed images, and write the image information into the label set. ''' import ...
  • Common methods of dataset expansion

    Viewed 1,000+ times · 2019-07-20 11:48:07
    Dataset expansion (data augmentation), also known as data enhancement or data enlargement. Its essence: when massive data is lacking, stretch every data point as far as it will go to keep model training effective. Why data augmentation is needed: generally speaking, successful neural networks require large numbers of parameters, and many...
  • There are three types of data augmentation; by default, Keras's ImageDataGenerator class performs in-place/on-the-fly data augmentation. Data augmentation is a form of regularization that allows our network to generalize better to the test/validation set. Training without data augmentation tends to cause over...
  • Data-augmentation implementations are summarized as follows: import cv2 import numpy as np import tensorflow as tf import imutils import skimage import pillow path = '/home/zhangwei/workfiles/deeplearning/dogVScat/data/cat_1.jpg' img...
  • To train a network with higher accuracy and less overfitting, a large amount of training data is usually required, but in practice massive data is not simply there for the taking. The usual solution is to artificially expand the dataset by various means. This article will...
  • Image augmentation is simply applying simple deformations to images. image_gen_train.fit(x_train): fit here requires four-dimensional input, so x_train must be reshaped, converting the 60000 28-row, 28-column images into 60000 28x28 single-channel images (channel = 1, i.e. grayscale...
  • Here is the dataset-expansion method I found: brightness enhancement, contrast enhancement, horizontal flipping, and random-angle rotation; my 1406 images were expanded to 7030. Transformation program: from PIL import ImageEnhance import os import numpy as np from PIL import ...
  • 2-8 Dataset expansion

    2019-09-23 12:19:28
    Dataset expansion (data augmentation): most computer-vision tasks use a great deal of data, so data augmentation is a commonly used trick to improve the performance of computer-vision systems. In practice, more data helps most computer-vision tasks; unlike other fields, where...
  • Data augmentation using synthetic data for time series classification with ... Data-augmentation techniques are used very widely in computer vision; for datasets with few samples, models...
  • Jointly optimizing data augmentation and network training: adversarial data augmentation in human pose estimation. Training code for the paper, CVPR 2018. Overview: traditional random augmentation has two limitations. It does not account for individual differences between training samples when augmenting, and it is independent of the training state of the target network. ...
  • Augmentation methods include horizontal flipping, vertical flipping, mirror symmetry, affine transformations, rotation, Gaussian noise, contrast changes, scale changes, translation, and more. 3. Dataset formats? First: files or folders represent labels, with no need to annotate objects in the images. Second: ...
  • Augmentation methods include: # (1) Pixel-level augmentation # 1. Brightness changes # 2. Adding noise # (2) Image-level augmentation # 3. Cropping (bboxes must be updated) # 4. Translation (bboxes must be updated) # 5. Mirroring (bboxes must be updated) # 6. Rotation (bboxes must be updated) # 7. Occlusion
  • 1. Problem description: when collecting data to fine-tune a deep-learning model, one often finds that certain classes are severely under-represented, and the dataset... Color Jittering: colour-based data augmentation: changes in image brightness, saturation, and contrast (I am not sure whether my understanding of colour jitter here is accurate); P...
  • One of the basic ways to avoid overfitting is to obtain more data from the source. When training data is limited, data augmentation can transform the existing data to generate new data and enlarge the training set. Even with large amounts of data, augmentation is still worthwhile, because it can prevent...
  • python image dataset expansion.zip

    2019-11-10 16:44:42
    Data augmentation: sometimes when a dataset is small, you need functions to augment your own data. This code is written in Python and includes rotation, translation, and other operations.
  • Of course, data augmentation is aimed at small datasets; many annotated datasets have only a few hundred images, and augmentation can effectively increase the amount of training data. It is not a cure-all, though: over-trusting augmentation can sometimes backfire and also increases network training time. Use it judiciously...
  • Audio dataset expansion: this repository gives examples of audio data augmentation. Prerequisites: NumPy, Matplotlib, librosa. References
