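  • SQL: computing the overlap between records

    2013-05-16 15:01:36
    -- Rows 1 and 3 overlap from 8:00:00 to 9:00:00; expected result: 1 hour -- Rows 1 and 4 overlap from 9:30:00 to 12:00:00; expected result: 3 hours -- Rows 1 and 5 do not overlap; expected result: 0 hours -- Rows 2 and 3 do not overlap...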
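
The overlap between two time ranges runs from the later start to the earlier end, and is zero when that span is empty. The excerpt is truncated and does not show the table, so the sketch below uses hypothetical ranges and a hypothetical helper named `overlap_hours`. It is written in Python rather than SQL purely to illustrate the arithmetic.

```python
from datetime import datetime


def overlap_hours(start_a, end_a, start_b, end_b):
    """Return the length, in hours, of the overlap between two time ranges."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    seconds = (earliest_end - latest_start).total_seconds()
    # A negative span means the ranges do not intersect at all.
    return max(seconds, 0.0) / 3600.0


def at(hhmm):
    """Build a datetime on an arbitrary (hypothetical) day from 'HH:MM'."""
    return datetime.strptime("2013-05-16 " + hhmm, "%Y-%m-%d %H:%M")


# Hypothetical ranges: 08:00-12:00 against 07:00-09:00, then against 13:00-14:00.
one_hour = overlap_hours(at("08:00"), at("12:00"), at("07:00"), at("09:00"))
no_overlap = overlap_hours(at("08:00"), at("12:00"), at("13:00"), at("14:00"))
```

The same `max(start)` / `min(end)` comparison could be expressed in SQL with a self-join, but the exact query depends on the table layout, which the excerpt does not show.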
  • k值交叉验证 交叉验证集by Shruti Tanwar 通过Shruti Tanwar 如何掌握交叉验证 (How to get a grip on Cross Validations) Lately, I’ve had the chance to be involved in building a product that aims at ...

    k值交叉验证 交叉验证集

    by Shruti Tanwar

    通过Shruti Tanwar

    如何掌握交叉验证 (How to get a grip on Cross Validations)

    Lately, I’ve had the chance to be involved in building a product that aims at accelerating ML/AI (Machine Learning / Artificial Intelligence) adoption for businesses. In the process of developing such an exciting product, I learned a thing or two along the way.

    最近,我有机会参与开发旨在加速企业采用ML / AI(机器学习/人工智能)的产品。 在开发如此令人兴奋的产品的过程中,我一路上学到了一两件事。

    And although ML/AI is too big of an umbrella to be covered in a single article, I’m taking this chance to brighten the light on one of the concepts which will help you in building out a resilient predictive model. A model which is capable of performing reliably in the real-world, and behaves ‘fairly’ on unseen data.

    而且,尽管ML / AI太大了,无法在一篇文章中涵盖,但我还是借此机会来阐明其中一个概念,这将帮助您建立弹性的预测模型。 一种模型,能够在现实世界中可靠地执行,并且在看不见的数据上表现出“公平”的行为。

    You can never be 100% sure about your machine learning model’s behavior. There is always room for improvement, progress, or a certain tweak. Merely fitting the model to your training data and expecting it to perform accurately in the real world would be a poor choice. Certain factors that can guarantee, or at least assure you of, reasonable performance need to be considered before deploying the model to production.

    您永远不可能对机器学习模型的行为有100%的把握。 总是有改进、进步或做某些调整的空间。 仅将模型拟合到您的训练数据并期望它在现实世界中能够准确执行,将是一个糟糕的选择。 在将模型部署到生产之前,需要考虑某些可以保证或至少确保您具有合理性能的因素。

    You need to make sure that your model has an understanding of different patterns in your data — is not under-fit or over-fit — and the bias and variance for the model are on the lower end.

    您需要确保您的模型对数据中的不同模式有所了解,既不欠拟合也不过拟合,并且模型的偏差和方差都处于较低水平。

    “Cross-Validation” ✔ is the technique which helps you validate your model’s performance. It’s a statistical method used to estimate the skill of machine learning models. Wikipedia defines it as follows.

    “交叉验证”✔是帮助您验证模型性能的技术。 这是一种统计方法,用于估计机器学习模型的技能。 维基百科对它的定义如下。

    Cross-validation, sometimes called rotation estimation, or out-of-sample testing is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice.

    交叉验证 (有时称为旋转估计样本外测试)是各种类似的模型验证技术中的任何一种,用于评估统计分析的结果将如何概括为一个独立的数据集。 它主要用于设置,其目的是预测,和一个想要估计如何准确地一个预测模型在实践中执行。

    In extremely simple words, the practical implementation of the above jargon would be as follows:

    用非常简单的话来说,上述术语的实际实现如下:

    While training a model, some of the data is removed before training begins. Upon completion of training, the data that was removed is used to test the performance of the learned model and tweak the parameters to improve the final performance of the model.

    训练模型时,在训练开始之前会删除一些数据。 训练完成后,删除的数据将用于测试学习到的模型的性能,并调整参数以提高模型的最终性能。

    This is the fundamental idea for the whole spectrum of evaluation methods called cross-validation.

    这是称为交叉验证的整个评估方法的基本思想。

    Before discussing the validation techniques though, let us take a quick look at two terms used above. Over-fit and under-fit. What exactly is under-fitting and over-fitting of models and how does it affect the performance of a model with real-world data?

    在讨论验证技术之前,让我们快速看一下上面使用的两个术语:过拟合和欠拟合。 什么是模型的欠拟合和过拟合,以及它如何影响模型在实际数据上的性能?

    We can understand it easily through the following graph.

    通过下图我们可以很容易地理解它。

    A model is said to be underfitting (High Bias) when it performs poorly on the training data. As we can see in the graph on the left, the line doesn’t cover most of the data points on the graph meaning it has been unable to capture the relationship between the input (say X), and the output to be predicted (say Y).

    模型被认为是欠拟合 (高偏差) 当它在训练数据上表现不佳时。 正如我们在左侧图表中看到的那样,该行并未覆盖图表上的大多数数据点,这意味着它无法捕获输入(例如X )和要预测的输出(例如Y )。

    An overfitting model (High Variance), on the other hand, performs well on the training data but does not perform well on the evaluation data. In such a case, the model is memorizing the data it has seen instead of learning and is unable to generalize to unseen data.

    另一方面,过拟合模型(高方差)在训练数据上表现良好,但在评估数据上表现不佳。 在这种情况下,模型只是记住了已看到的数据,而不是从中学习,因此无法泛化到未见过的数据。

    The graph on the right represents the case of over-fitting. We see that the predicted line is covering all the data points in the graph. Though it might seem like this should make the model work even better, sadly, that’s far from the practical truth. The predicted line covering all points which also includes noise and outliers produces poor results due to its complexity.

    右图代表过度拟合的情况。 我们看到预测线覆盖了图中的所有数据点。 尽管看起来这应该会使模型更好地工作,但遗憾的是,这与实际情况相去甚远。 由于其复杂性,覆盖所有点(还包括噪声和异常值)的预测线会产生较差的结果。

    Let’s move on to the various types of cross-validation techniques out there.

    让我们继续进行各种类型的交叉验证技术。

    保持方法 (Holdout Method)

    The simplest type of cross-validation. Here, the data set is separated into two sets, called the training set and the testing set. The model is allowed to fit only on the training set. Then the predictions are made for the data in the testing set (which the model has never seen before). The errors it makes are aggregated to give the mean absolute test set error, which is used to evaluate the model.

    交叉验证的最简单类型。 在这里,数据集分为两组,分别称为训练集和测试集。 该模型仅适用于训练集。 然后对测试集中的数据进行预测(模型从未见过)。 汇总其产生的误差以给出平均绝对测试设置误差,该误差用于评估模型。

    This type of evaluation to an extent is dependent on which data points end up in the training set and which end up in the test set, and thus might affect the evaluation depending on how the division is made.

    这种评估在一定程度上取决于训练集中的数据点和测试集中的数据点,因此可能会取决于划分方式而影响评估。
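
A minimal sketch of the holdout method, assuming a toy dataset and a deliberately trivial "model" (a single slope estimated from the training rows). The names `holdout_split` and `mean_absolute_error` are hypothetical helpers written for this illustration. In scikit-learn the split itself is provided by `train_test_split`.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data following y = 2x exactly (an assumption made for the example).
data = [(float(x), 2.0 * x) for x in range(1, 11)]
train, test = holdout_split(data, test_fraction=0.3, seed=42)

# "Fit" on the training set only: estimate the slope as the mean of y / x.
slope = sum(y / x for x, y in train) / len(train)

# Evaluate on the held-out test set, which the model has never seen.
test_mae = mean_absolute_error([y for _, y in test], [slope * x for x, _ in test])
```

Changing `seed` changes which rows land in the test set, which is exactly the dependence on the split described above.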

    K折交叉验证 (K-fold cross-validation)

    One of the most popular validation techniques is K-fold cross-validation. It is popular because it is simple and generally yields a less biased or less optimistic estimate of model skill than other methods, such as a simple train/test split.

    最受欢迎的验证技术之一是K折交叉验证。 这是由于其简单性,与其他方法(例如简单的训练/测试拆分)相比,该方法通常对模型技能产生较少的偏见或较不乐观的估计。

    Here, the data set is divided into k subsets, and the holdout method is repeated k times. Each time, one of the k subsets is used as the test set and the other k-1 subsets constitute the training set. Then the average error is computed across all k trials.

    在此,将数据集划分为k个子集,并将保留方法重复k次。 每次,将k个子集之一用作测试集,而其他k-1个子集构成训练集。 然后,在所有k试验中计算平均误差。

    The general procedure is as follows:

    一般步骤如下:

    1. Shuffle the dataset randomly and split it into k groups

      随机打乱数据集并将其分成k组

    2. Take one group as a holdout or test data set and the remaining groups as training data set.

      将一组作为保持或测试数据集,将其余组作为训练数据集。
    3. Fit a model on the training set and evaluate it on the test set.

      在训练集上拟合模型并在测试集上对其进行评估。
    4. Retain the evaluation score and discard the model.

      保留评估分数并丢弃模型。
    5. Summarize the skill of the model using the sample of model evaluation scores.

      使用模型评估分数的样本来总结模型的技能。
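
The five steps above can be sketched in plain Python. This is an illustration under stated assumptions: `k_fold_indices` is a hypothetical helper, and the "model" is just the mean of the training targets. scikit-learn offers the same splitting logic as `KFold`.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for each of k folds after one shuffle."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # step 1: shuffle, then split into k groups
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = indices[start:start + size]                 # step 2: one group held out
        train_idx = indices[:start] + indices[start + size:]   # remaining groups train
        yield train_idx, test_idx
        start += size


ys = [float(v) for v in range(1, 11)]  # toy targets (assumption)
scores = []
test_counts = [0] * len(ys)

for train_idx, test_idx in k_fold_indices(len(ys), k=5, seed=0):
    # Step 3: fit on the training folds (here, the mean) and evaluate on the test fold.
    model = sum(ys[i] for i in train_idx) / len(train_idx)
    fold_errors = [abs(ys[i] - model) for i in test_idx]
    # Step 4: keep the score; the model is discarded when the loop moves on.
    scores.append(sum(fold_errors) / len(fold_errors))
    for i in test_idx:
        test_counts[i] += 1

# Step 5: summarize the skill of the model from the per-fold scores.
mean_score = sum(scores) / len(scores)
```

`test_counts` records how often each point is used for testing, so it can be used to check that every point is tested exactly once.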

    The edge this method has over others is that it matters little how the data gets divided. Every data point will get to be in a test set exactly once and will get into training set exactly k-1 times. As k is increased, we see a fall in the variance of the resulting estimate.

    该方法相对于其他方法的优势在于,如何划分数据无关紧要。 每个数据点将被准确地放入测试集中一次,并将被精确地进入k-1次训练集中。 随着k的增加,我们看到结果估计的方差下降。

    One disadvantage of this method can be the computation required during the training. The training algorithm has to be rerun from scratch k times, which means it takes k times as much computation to make an evaluation.

    这种方法的一个缺点是训练期间需要进行计算。 训练算法必须从头开始重新运行k次,这意味着需要k倍的计算才能进行评估。

    留一法交叉验证 (Leave-one-out cross-validation)

    Leave-one-out is sort of like a cousin to K-fold cross-validation where k becomes equal to n, the total number of data points in the set. It's basically a logical extreme version of K-fold validation. How it works practically is by leaving out exactly one data point in each iteration and using that data point to make the prediction.

    留一法有点像K折交叉验证的表亲,其中k等于n ,即集合中数据点的总数。 它基本上是K折验证的逻辑极限版本。 实际上,它的工作方式是在每次迭代中恰好留出一个数据点,然后对该数据点进行预测。

    The function approximator is trained n times, each time on all the data except for one point, and a prediction is made for that point. As before, the average error is computed and used to evaluate the model.

    除一个点外,函数逼近器在所有数据上均进行了n次训练,并对该点进行了预测。 如前所述,计算平均误差并将其用于评估模型。
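
A small worked sketch of leave-one-out, assuming four toy targets and a mean predictor as the model (both are assumptions for illustration). scikit-learn provides the equivalent splitter as `LeaveOneOut`.

```python
def leave_one_out_error(ys):
    """Mean absolute error of a mean predictor under leave-one-out."""
    errors = []
    for i in range(len(ys)):
        train = ys[:i] + ys[i + 1:]          # all data except point i
        prediction = sum(train) / len(train)  # "train" the model n times in total
        errors.append(abs(ys[i] - prediction))
    return sum(errors) / len(errors)


# Left-out 1 -> predict 3, left-out 2 -> predict 8/3,
# left-out 3 -> predict 7/3, left-out 4 -> predict 2.
loocv_error = leave_one_out_error([1.0, 2.0, 3.0, 4.0])
```

The four absolute errors are 2, 2/3, 2/3 and 2, so the averaged error is 4/3.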

    There we go, and that's a wrap. Hope you enjoyed reading it as much as I enjoyed creating it. ❤️ Let me know your thoughts, comments, or advice in the comments below. And while you’re at it, why don’t you go and check out what I build with my team at skyl.ai, and strike up a conversation with me or share your feedback. Cheers.

    就到这里,全文结束。 希望您喜欢阅读它,就像我喜欢创作它一样。❤️ 欢迎在下面的评论中告诉我您的想法、意见或建议。 同时,不妨去看看我和我的团队在skyl.ai所做的工作,与我交流或分享您的反馈。 干杯。

    翻译自: https://www.freecodecamp.org/news/how-to-get-a-grip-on-cross-validations-bb0ba779e21c/

  • k值交叉验证 交叉验证集Cross-Validation also referred to as out of sampling technique is an essential element of a data science project. It is a resampling procedure used to evaluate machine learning ...

    k值交叉验证 交叉验证集

    Cross-validation, also referred to as out-of-sample testing, is an essential element of a data science project. It is a resampling procedure used to evaluate machine learning models and assess how the model will perform on an independent test dataset.

    交叉验证(也称为样本外测试)是数据科学项目的基本要素。 它是一种重采样过程,用于评估机器学习模型,并评估该模型在独立测试数据集上的表现。

    In this article, you can read about 8 different cross-validation techniques having their pros and cons, listed below:

    在本文中,您可以阅读以下大约8种不同的交叉验证技术,各有其优缺点:

    1. Leave p out cross-validation

      省略p交叉验证

    2. Leave one out cross-validation

      留出一个交叉验证

    3. Holdout cross-validation

      保持交叉验证

    4. Repeated random subsampling validation

      重复随机二次抽样验证

    5. k-fold cross-validation

      k折交叉验证

    6. Stratified k-fold cross-validation

      分层k折交叉验证

    7. Time Series cross-validation

      时间序列交叉验证

    8. Nested cross-validation

      嵌套交叉验证

    Before coming to cross-validation techniques let us know why cross-validation should be used in a data science project.

    在介绍交叉验证技术之前,让我们知道为什么在数据科学项目中应使用交叉验证。

    为什么交叉验证很重要? (Why Cross-Validation is Important?)

    We often randomly split the dataset into train data and test data to develop a machine learning model. The training data is used to train the ML model and the same model is tested on independent testing data to evaluate the performance of the model.

    我们经常将数据集随机分为训练数据和测试数据,以开发机器学习模型。 训练数据用于训练ML模型,同一模型在独立的测试数据上进行测试以评估模型的性能。

    With the change in the random state of the split, the accuracy of the model also changes, so we are not able to achieve a fixed accuracy for the model. The testing data should be kept independent of the training data so that no data leakage occurs. During the development of an ML model using the training data, the model performance needs to be evaluated. This is where cross-validation data comes into the picture.

    随着分裂随机状态的变化,模型的准确性也会发生变化,因此我们无法为模型获得固定的准确性。 测试数据应与训练数据无关,以免发生数据泄漏。 在使用训练数据开发ML模型的过程中,需要评估模型的性能。 这就是交叉验证数据的重要性。

    The data needs to be split into:

    数据需要分为:

    • Training data: Used for model development

      训练数据:用于模型开发

    • Validation data: Used for validating the performance of the same model

      验证数据:用于验证相同模型的性能

    (Image by Author), Validation split
    (作者提供的图像),验证拆分

    In simple terms, cross-validation allows us to make better use of our data. Read on for the workings and implementation of the 8 types of cross-validation techniques.

    简单来说,交叉验证使我们可以更好地利用我们的数据。 下面将进一步介绍8种交叉验证技术的原理和实现。

    1.保留p-out交叉验证: (1. Leave p-out cross-validation:)

    Leave-p-out cross-validation (LpOCV) is an exhaustive cross-validation technique that uses p observations as validation data and the remaining data to train the model. This is repeated for every possible way of splitting the original sample into a validation set of p observations and a training set.

    离开p-out交叉验证(LpOCV)是一种详尽的交叉验证技术,涉及使用p观测作为验证数据,而其余数据则用于训练模型。 以所有方式重复此步骤,以在p个观察值的验证集和一个训练集上切割原始样本。

    A variant of LpOCV with p=2, known as leave-pair-out cross-validation, has been recommended as a nearly unbiased method for estimating the area under the ROC curve of a binary classifier.

    已推荐使用p = 2的LpOCV变体(称为留对交叉验证)作为估计二分类器ROC曲线下面积的几乎无偏的方法。
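
To make "every possible way" concrete, the sketch below enumerates all leave-p-out splits with `itertools.combinations`. The helper name `leave_p_out` is hypothetical. The number of splits is the binomial coefficient C(n, p), which is why the method becomes expensive quickly. scikit-learn exposes the same idea as `LeavePOut`.

```python
from itertools import combinations
from math import comb


def leave_p_out(n_samples, p):
    """Yield (train_idx, val_idx) for every choice of p validation samples."""
    all_idx = range(n_samples)
    for val_idx in combinations(all_idx, p):
        held_out = set(val_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        yield train_idx, list(val_idx)


# Leave-pair-out (p = 2) on five samples produces C(5, 2) = 10 splits.
splits = list(leave_p_out(5, 2))
n_expected = comb(5, 2)
```

With p = 1 the same function yields the n splits of leave-one-out.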

    2.留一法交叉验证: (2. Leave-one-out cross-validation:)

    Leave-one-out cross-validation (LOOCV) is an exhaustive cross-validation technique. It is the special case of LpOCV with p=1.

    留一法交叉验证(LOOCV)是一种详尽的交叉验证技术。 它是LpOCV在p = 1时的特例。

    (Source), LOOCV operations
    (来源),LOOCV操作

    For a dataset having n rows, the 1st row is selected for validation, and the remaining (n-1) rows are used to train the model. For the next iteration, the 2nd row is selected for validation and the rest are used to train the model. The process is repeated in the same way for n steps or until the desired number of iterations is reached.

    对于具有n行的数据集,选择第一行进行验证,其余(n-1)行用于训练模型。 对于下一次迭代,选择第二行进行验证,其余行用于训练模型。 类似地,重复该过程,直到完成n个步骤或达到所需的迭代次数。

    Both the above two cross-validation techniques are the types of exhaustive cross-validation. Exhaustive cross-validation methods are cross-validation methods that learn and test in all possible ways. They have the same pros and cons discussed below:

    以上两种交叉验证技术都是穷举性交叉验证的类型。 详尽的交叉验证方法是以各种可能的方式学习和测试的交叉验证方法。 它们具有以下讨论的优点和缺点:

    优点: (Pros:)

    1. Simple, easy to understand, and implement.

      简单,易于理解和实施。

    缺点: (Cons:)

    1. The error estimate has low bias but can have high variance.

      误差估计的偏差较低,但方差可能较高。
    2. The computation time required is high.

      所需的计算时间长。

    3.保留交叉验证: (3. Holdout cross-validation:)

    The holdout technique is a non-exhaustive cross-validation method that randomly splits the dataset into train and test data depending on data analysis.

    保持技术是一种非穷举的交叉验证方法,该方法根据数据分析将数据集随机分为训练数据和测试数据。

    (Image by Author), 70:30 split of Data into training and validation data respectively
    (作者提供的图片),70:30将数据分别分为训练和验证数据

    In the case of holdout cross-validation, the dataset is randomly split into training and validation data. Generally, the training split is larger than the test split. The training data is used to induce the model, and the validation data is used to evaluate the performance of the model.

    在保持交叉验证的情况下,数据集被随机分为训练和验证数据。 通常,训练数据所占的比例大于测试数据。 训练数据用于推导模型,而验证数据用于评估模型的性能。

    The more data is used to train the model, the better the model is. For the holdout cross-validation method, a good amount of data is isolated from training.

    用于训练模型的数据越多,模型越好。 对于保持交叉验证方法,需要从训练中隔离大量数据。

    优点: (Pros:)

    1. Simple, easy to understand, and implement.

      简单,易于理解和实施。

    缺点: (Cons:)

    1. Not suitable for an imbalanced dataset.

      不适合不平衡数据集。
    2. A lot of data is isolated from training the model.

      许多数据与训练模型隔离。

    4. k折交叉验证: (4. k-fold cross-validation:)

    In k-fold cross-validation, the original dataset is equally partitioned into k subparts or folds. Out of the k-folds or groups, for each iteration, one group is selected as validation data, and the remaining (k-1) groups are selected as training data.

    在k倍交叉验证中,原始数据集被平均分为k个子部分或折叠。 从k折或组中,对于每次迭代,选择一组作为验证数据,其余(k-1)个组选择为训练数据。

    (Source), k-fold cross-validation
    (来源),k折交叉验证

    The process is repeated for k times until each group is treated as validation and remaining as training data.

    该过程重复k次,直到将每个组视为验证并保留为训练数据为止。

    (Image by Author), k-fold cross-validation
    (作者提供的图片),k倍交叉验证

    The final accuracy of the model is computed by taking the mean accuracy of the k-models validation data.

    模型的最终精度是通过获取k模型验证数据的平均精度来计算的。


    LOOCV is a variant of k-fold cross-validation where k=n.

    LOOCV是k倍交叉验证的变体,其中k = n。

    优点: (Pros:)

    1. The model has low bias

      该模型偏差低
    2. Low time complexity compared with exhaustive methods such as LOOCV

      与LOOCV等穷举方法相比,时间复杂度低
    3. The entire dataset is utilized for both training and validation.

      整个数据集可用于训练和验证。

    缺点: (Cons:)

    1. Not suitable for an imbalanced dataset.

      不适合不平衡数据集。

    5.重复随机二次抽样验证: (5. Repeated random subsampling validation:)

    Repeated random subsampling validation, also referred to as Monte Carlo cross-validation, splits the dataset randomly into training and validation sets. Unlike k-fold cross-validation, the dataset is not divided into groups or folds; instead, each split is drawn at random.

    重复的随机子采样验证(也称为蒙特卡洛交叉验证)将数据集随机分为训练和验证。 与k折交叉验证不同,数据集不是被划分为若干组或折,而是每次随机进行拆分。

    The number of iterations is not fixed and decided by analysis. The results are then averaged over the splits.

    迭代次数不是固定的,而是由分析决定的。 然后将结果平均化。
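
A minimal sketch of repeated random subsampling, assuming index-only splits and a hypothetical helper `repeated_random_splits`. Each iteration draws a fresh random split, so the number of iterations and the validation fraction are chosen independently. scikit-learn provides this as `ShuffleSplit`.

```python
import random


def repeated_random_splits(n_samples, n_splits, val_fraction, seed=0):
    """Yield n_splits independent random (train_idx, val_idx) splits."""
    rng = random.Random(seed)
    n_val = max(1, int(round(n_samples * val_fraction)))
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle per iteration, not fixed folds
        yield indices[n_val:], indices[:n_val]


splits = list(repeated_random_splits(10, n_splits=4, val_fraction=0.3, seed=1))

# Count how often each sample ends up in a validation set.
times_validated = [0] * 10
for _, val_idx in splits:
    for i in val_idx:
        times_validated[i] += 1
```

Because the splits are independent, `times_validated` is generally uneven: some samples may be validated several times and others never.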

    (Image by Author), Repeated random subsampling validation
    (作者提供的图片),重复随机子采样验证

    优点: (Pros:)

    1. The proportion of train and validation splits is not dependent on the number of iterations or partitions.

      训练和验证拆分的比例不取决于迭代或分区的数量。

    缺点: (Cons:)

    1. Some samples may not be selected for either training or validation.

      某些样本可能无法选择用于训练或验证。
    2. Not suitable for an imbalanced dataset.

      不适合不平衡数据集。

    6.分层k折交叉验证: (6. Stratified k-fold cross-validation:)

    The cross-validation techniques discussed above may not work well with an imbalanced dataset. Stratified k-fold cross-validation addresses this problem.

    对于上面讨论的所有交叉验证技术,它们可能不适用于不平衡的数据集。 分层k折交叉验证解决了数据集不平衡的问题。

    In stratified k-fold cross-validation, the dataset is partitioned into k groups or folds such that each fold keeps approximately the same proportion of each target class label as the full dataset. This ensures that no particular class is over-represented in the validation or train data, especially when the dataset is imbalanced.

    在分层k折交叉验证中,数据集被划分为k个组或折,使每一折中各目标类标签的比例与整个数据集大致相同。 这样可以确保在验证或训练数据中不会有某一个类占比过高,尤其是在数据集不平衡时。

    (Image by Author), Stratified k-fold cross-validation, Each fold has equal instances of the target class
    (作者提供的图片),分层k折交叉验证,每折具有相等的目标类实例

    The final score is computed by taking the mean of scores of each fold.

    最终分数是通过取各折分数的平均值来计算的。
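
One simple way to build stratified folds is to group the sample indices by class and deal each class round-robin across the folds. The sketch below does that. `stratified_folds` is a hypothetical helper, it does not shuffle, and it always starts dealing at fold 0, which is a simplification. scikit-learn's `StratifiedKFold` is the production equivalent.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds while preserving class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the members of each class across the folds in turn.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Imbalanced toy labels: eight samples of class 0 and four of class 1.
labels = [0] * 8 + [1] * 4
folds = stratified_folds(labels, k=4)

# (count of class 0, count of class 1) in each fold.
fold_class_counts = [
    (sum(1 for i in fold if labels[i] == 0), sum(1 for i in fold if labels[i] == 1))
    for fold in folds
]
```

Every fold ends up with the same 2:1 class ratio as the full dataset.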

    优点: (Pros:)

    1. Works well for an imbalanced dataset.

      对于不平衡的数据集,效果很好。

    缺点: (Cons:)

    1. Not suitable for time series datasets.

      不适合时间序列数据集。

    7.时间序列交叉验证: (7. Time Series cross-validation:)

    The order of the data is very important for time-series problems. For a time-related dataset, a random split or k-fold split of the data into train and validation sets may not yield good results.

    数据的顺序对于与时间序列相关的问题非常重要。 对于与时间相关的数据集,将数据随机拆分或k倍拆分为训练和验证可能不会产生良好的结果。

    For a time-series dataset, the data is split into train and validation sets according to time, an approach also referred to as the forward chaining method or rolling cross-validation. For a particular iteration, the instance that follows the train data can be treated as validation data.

    对于时间序列数据集,根据时间将数据分为训练和验证,也称为前向链接方法或滚动交叉验证。 对于特定的迭代,可以将紧跟在训练数据之后的下一个实例视为验证数据。

    (Image by Author), Time Series cross-validation
    (作者提供的图像),时间序列交叉验证

    As shown in the above diagram, for the 1st iteration, the first 3 rows are used as train data and the next instance, T4, is the validation data. The choice of train and validation data then moves forward in time for further iterations.

    如上图所示,对于第一次迭代,前3行被用作训练数据,下一个实例T4是验证数据。 在后续迭代中,训练和验证数据的选择会沿时间向前推移。
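
A sketch of forward chaining on index positions, assuming an expanding training window and one validation instance per iteration. The text does not say whether the window expands or slides, so that choice is an assumption, as is the helper name `forward_chaining_splits`. scikit-learn offers a related splitter as `TimeSeriesSplit`.

```python
def forward_chaining_splits(n_samples, initial_train_size):
    """Yield (train_idx, val_idx) where validation always follows training in time."""
    for end in range(initial_train_size, n_samples):
        # Train on everything before position `end`, validate on position `end`.
        yield list(range(end)), [end]


# Five time-ordered rows T1..T5 (indices 0..4), starting with three training rows.
splits = list(forward_chaining_splits(5, 3))
```

The first split trains on T1 to T3 and validates on T4. The second trains on T1 to T4 and validates on T5. No split ever trains on data that comes after its validation instance.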

    8.嵌套交叉验证: (8. Nested cross-validation:)

    With plain k-fold and stratified k-fold cross-validation, we get a poor estimate of the error when the same folds are also used to select hyperparameters; in the earlier methods, hyperparameter tuning is done separately. When cross-validation is used both for tuning the hyperparameters and for estimating the generalization error, nested cross-validation is required.

    在进行k折和分层k折交叉验证的情况下,我们对训练和测试数据中的错误估计差。 超参数调整是在较早的方法中单独完成的。 当交叉验证同时用于调整超参数和泛化误差估计时,需要嵌套交叉验证。

    Nested Cross Validation can be applicable in both k-fold and stratified k-fold variants. Read the below article to know more about nested cross-validation and its implementation:

    嵌套交叉验证可同时应用于k折和分层k折变体。 阅读以下文章,以了解有关嵌套交叉验证及其实现的更多信息:
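
A compact sketch of nested cross-validation in plain Python. The inner loop picks a hyperparameter using only the outer training data, and the outer loop scores that choice on data the inner loop never saw. Everything here is an assumption made for illustration: the toy data, the 1-D nearest-neighbour regressor, and the helper names. With scikit-learn the usual pattern is a `GridSearchCV` estimator passed to `cross_val_score`.

```python
import random


def make_folds(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def split(data, fold):
    """Return (train, test) where test holds the rows whose index is in fold."""
    held_out = set(fold)
    train = [row for i, row in enumerate(data) if i not in held_out]
    return train, [data[i] for i in fold]


def knn_predict(train, x, n_neighbors):
    """Average the targets of the n_neighbors training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:n_neighbors]
    return sum(y for _, y in nearest) / len(nearest)


def mean_abs_error(train, test, n_neighbors):
    errors = [abs(y - knn_predict(train, x, n_neighbors)) for x, y in test]
    return sum(errors) / len(errors)


def inner_cv_error(data, n_neighbors, n_folds):
    """Plain k-fold error, used only to compare hyperparameter candidates."""
    errors = [mean_abs_error(*split(data, fold), n_neighbors)
              for fold in make_folds(len(data), n_folds)]
    return sum(errors) / len(errors)


def nested_cv(data, candidates, outer_folds=3, inner_folds=2):
    outer_scores, chosen = [], []
    for fold in make_folds(len(data), outer_folds):
        outer_train, outer_test = split(data, fold)
        # Inner loop: tune on the outer training data only.
        best = min(candidates, key=lambda c: inner_cv_error(outer_train, c, inner_folds))
        chosen.append(best)
        # Outer loop: score the tuned model on the untouched outer test fold.
        outer_scores.append(mean_abs_error(outer_train, outer_test, best))
    return outer_scores, chosen


rng = random.Random(0)
data = [(float(x), 2.0 * x + rng.uniform(-1.0, 1.0)) for x in range(12)]
rng.shuffle(data)  # shuffle once so contiguous folds are random subsets
outer_scores, chosen = nested_cv(data, candidates=[1, 2, 3])
```

The mean of `outer_scores` is the error estimate. `chosen` may differ between outer folds, which is expected, because the procedure evaluates the tuning process rather than one fixed hyperparameter.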

    结论: (Conclusion:)

    Cross-validation is used to compare and evaluate the performance of ML models. In this article, we have covered 8 cross-validation techniques along with their pros and cons. k-fold and stratified k-fold cross-validations are the most used techniques. Time series cross-validation works best with time series related problems.

    交叉验证用于比较和评估ML模型的性能。 在本文中,我们介绍了8种交叉验证技术及其优缺点。 k折和分层k折交叉验证是最常用的技术。 时间序列交叉验证最适合与时间序列相关的问题。

    Implementations of these cross-validation techniques can be found in the sklearn package. Read the sklearn documentation for more details.

    这些交叉验证的实现可以在sklearn包中找到。 阅读此sklearn文档以获取更多详细信息。

    翻译自: https://towardsdatascience.com/understanding-8-types-of-cross-validation-80c935a4976d

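  • Edge zero-crossing binarization

    1,000+ views 2016-06-07 09:10:08
    Principle of edge zero-crossing binarization: edges are detected with the LoG operator, and the local pixels on both sides of each edge are divided into two classes, background and object; where a local area is entirely object or entirely background (no edge is present), it is first marked as a pending region, connected-component labeling is then applied, and finally, based on the ... around each labeled connected region...

    Principle of edge zero-crossing binarization:
    Edges are detected with the LoG operator, and the local pixels on both sides of each edge are divided into two classes, background and object. Where a local area is entirely object or entirely background (no edge is present), it is first marked as a pending region. Connected-component labeling is then applied, and finally each pending region is assigned to a class according to the classes of the pixels surrounding its labeled connected region, so that it is binarized correctly.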

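    1. LoG operator edge detection
    LoG edge detection is easy to understand: Gaussian smoothing is applied first, followed by Laplacian edge detection. The resulting image data is a signed second-derivative image.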

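    2. Classification
    Based on the edge zero-crossing result h(x,y) computed by the LoG operator, the image pixels are classified. The difference s between the maximum and minimum gray values of the pixels in a local area is used to decide whether that area is uniform or a true edge, and pixels are divided into foreground (object), background, and pending.
    (1) When s is greater than a threshold and h(x,y)>0, mark the pixel as background, -1.
    (2) When s is greater than a threshold and h(x,y)<0, mark the pixel as object, 1.
    (3) When neither of the two conditions above holds, mark the pixel as pending, 0.

    The pending region can instead be classified again with the Otsu algorithm; in that case, step 3 is skipped.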
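
The three classification rules can be sketched as a small function. This assumes the LoG response `h` and the local max-min difference `s` have already been computed per pixel. The names `classify_pixel` and `classify_image`, the toy values, and the threshold of 10 are all hypothetical.

```python
def classify_pixel(h, s, threshold):
    """Return -1 (background), 1 (object) or 0 (pending) for one pixel."""
    if s > threshold:
        if h > 0:
            return -1  # rule (1): strong local contrast, positive LoG response
        if h < 0:
            return 1   # rule (2): strong local contrast, negative LoG response
    return 0           # rule (3): neither condition holds, so the pixel is pending


def classify_image(log_response, local_diff, threshold):
    """Apply classify_pixel to two equally sized 2-D lists."""
    return [
        [classify_pixel(h, s, threshold) for h, s in zip(h_row, s_row)]
        for h_row, s_row in zip(log_response, local_diff)
    ]


# Toy 2x3 inputs (assumed values): LoG response and local max-min difference.
log_response = [[5.0, -3.0, 0.5], [2.0, -4.0, -0.2]]
local_diff = [[40, 50, 3], [2, 60, 45]]
labels = classify_image(log_response, local_diff, threshold=10)
```

Pixels with a small local difference, or with a LoG response of exactly zero, stay pending regardless of the other value.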

    Code for computing the difference:

    #include <algorithm>
    #include <opencv2/core.hpp>

    // Returns the difference between the largest and smallest gray value in the
    // window centered on center_point, clipped to the image bounds.
    // Expects a single-channel 8-bit image (CV_8UC1). Returns -1 if the clipped
    // window is empty.
    int getNeighborMaxDiff(const cv::Mat &src, cv::Point center_point, int half_size)
    {
    	int max_value = 0;
    	int min_value = 255;

    	// Clip the window to the image; end indices are exclusive. The "+ 1" makes
    	// the window symmetric around the center (the original stopped one pixel
    	// short on the right and bottom sides).
    	int start_x = std::max(center_point.x - half_size, 0);
    	int end_x = std::min(center_point.x + half_size + 1, src.cols);
    	int start_y = std::max(center_point.y - half_size, 0);
    	int end_y = std::min(center_point.y + half_size + 1, src.rows);

    	if (start_x >= end_x || start_y >= end_y)
    	{
    		return -1;
    	}

    	for (int i = start_y; i < end_y; i++)
    	{
    		const uchar *data = src.ptr<uchar>(i);
    		for (int j = start_x; j < end_x; j++)
    		{
    			max_value = std::max<int>(max_value, data[j]);
    			min_value = std::min<int>(min_value, data[j]);
    		}
    	}

    	return max_value - min_value;
    }
    
    

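    3. Classifying the pending region
    The pending region is classified using neighborhood statistics. The pending pixels are labeled into 4-connected regions, and for each labeled region the number of 8-connected object pixels around it, numA, and the number of 8-connected background pixels around it, numB, are counted. Binarization then uses numA and numB according to the following rules:
      (1) When the region's count of 8-connected object pixels numA is greater than its count of 8-connected background pixels numB, mark the region's pixels as object, 1.
      (2) When the region's count of 8-connected object pixels numA is less than or equal to its count of 8-connected background pixels numB, mark the region's pixels as background, -1.
      These steps yield a binary image in which background and object are separated.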
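
A minimal sketch of this step on a small label grid, using a breadth-first search for the 4-connected labeling. The helper name `resolve_pending` is hypothetical, and counting each bordering pixel once per region (rather than once per adjacency) is an assumption, since the text does not specify it.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def resolve_pending(labels):
    """Replace each 4-connected pending region (0) with 1 or -1 by majority vote."""
    rows, cols = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    seen = [[False] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != 0 or seen[r][c]:
                continue
            # Collect one 4-connected pending region with a breadth-first search.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in N4:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Gather the distinct classified pixels that touch the region (8-connected).
            border = set()
            for y, x in region:
                for dy, dx in N8:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] != 0:
                        border.add((ny, nx))
            num_a = sum(1 for y, x in border if labels[y][x] == 1)  # object pixels
            num_b = len(border) - num_a                             # background pixels
            value = 1 if num_a > num_b else -1  # ties go to background, per rule (2)
            for y, x in region:
                result[y][x] = value
    return result


# One pending pixel surrounded by five object and three background pixels.
grid = [[1, 1, 1],
        [1, 0, 1],
        [-1, -1, -1]]
resolved = resolve_pending(grid)
```

In the toy grid the center pixel becomes object because numA = 5 exceeds numB = 3.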

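  • To address the shortcomings of global and local thresholding methods, an edge zero-crossing binarization method based on the LoG operator is proposed, and the binarization quality and running speed of each algorithm are analyzed experimentally.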
  • libjpeg cross-compilation and porting

    2020-09-15 10:28:33
    1. Download the source. Link: jpegsrc.v9d.tar.gz 2. Extract: $ tar xvf jpegsrc.v9d.tar.gz 3. Run configure: CC=arm-linux-gcc \ ./configure \ --host=arm-linux \ --prefix=$(pwd)/target_bin ...4. Run make, then make install

    1. Download the source
    Link: jpegsrc.v9d.tar.gz
    2. Extract the archive

    $ tar xvf jpegsrc.v9d.tar.gz
    

    3. Run configure

    CC=arm-linux-gcc \
    ./configure \
            --host=arm-linux \
            --prefix=$(pwd)/target_bin
    

    4. Run make, then make install

  • 主要代码如下:在有一个问题,上面的n_neighbors(k)为什么是5,其他可不可以?效果会怎么样?如何知道哪个k的效果相对较好?这就需要用到交叉验证。交叉验证有什么用,为什么要用它?关于它有什么用,经过浏览...
  • 摘 要: 针对全局阈值法和局部阈值法的不足,提出一种基于LoG算子的边缘零交叉化方法,并通过实验分析各算法的二化效果及运算速度。  1 现有的图像二化算法  二化是一幅图像包括目标物体、背景还有...
  • 交叉验证 import numpy as np from sklearn.neighbors import KNeighborsClassifier from sklearn import datasets #model_selection :模型选择 # cross_val_score: 交叉 ,validation:验证(测试) #交叉验证 ...
  • Oracle 交叉表 固定 如果速度固定只有0 20 50 100 150这几个,可以: select No, sum(decode(速度,'0',,0)), sum(decode(速度,'20',,0)), ...
  • 维度及交叉维度最佳解决方案1、前言2、事实表与维度表多对多(多维度)3、维表与维表多对多(交叉维度)4、总结 1、前言 正常情况下,维表和事实表之间是一对多的关系,维表中的一行记录会连接事实表中的多行记录...
  • AR轴比(dB)与交叉极化的对应关系
  • 点击取@上级id的显示: String ls_click,ls_col_tmp,ls_value dw_1.Object.DataWindow.Crosstab.StaticMode = True ls_click = dw_1.GetObjectAtPointer ( ) ls_col_tmp = Left(ls_click,Pos(ls_click,
  • 我知道我必须先对数据框进行排序,然后才能使用交叉表,但是如果使用其他任何东西,那么最基本的交叉表功能就会给我带来错误。我也知道排名前10位的聚会应该是唯一的,所以我猜想在某个时候我必须使用独特的功能,...
  • 今天遇到一个问题,在excel中当行和列的相同时,需要这个单元格高亮显示。 解决方法: 1、开始>条件格式>管理规则 2、新建规则,选择最后一个使用公式,=$M2<=Q$2 ,我这里用这个公式是因为我需要将...
  • 交叉验证」到底如何选择K? 原文链接:https://cloud.tencent.com/developer/article/1410946 交叉验证(cross validation)一般被用于评估一个机器学习模型的表现。更多的情况下,我们也用交叉验证来进行模型选择...
  • 用K折交叉验证估计KNN算法中的K

    万次阅读 2018-04-09 23:48:36
    之后看了好多CSDN的博客,发现一般大家除了靠缘分去试K之外,也会采用交叉验证的方法去近似求得K,因此我决定自己实现一下,看看有什么效果。 交叉验证之前老师在将数据挖掘的时候简单的介绍了一下,那时候只...
  • 接触到交叉分组表,如果使用Excel可以使用数据透视表进行组合交叉分组表,但在MySQL中如何创建呢?交叉分组表交叉分组表是一种常用的分类汇总表格,可以显示多变量之间的关系。其表格形式的行和列标签为一个或多个...
  • 全志A64Qt5.6.3移之设置IDE交叉编译工具链 1.打开Q5.6的IDE Qt Creator的"工具"->“选项” 2.然后选择“kits”->编译器 添加编译器 选择右上角添加 选择GCC-C 编译器路径选择交叉编译工具所在的GCC工具 我...
  • 美国国家半导体公司(National Semiconductor Corporation)宣布推出一款全新的低电压差分信号传输(LVDS)4×4交叉点开关DS25CP104。该产品在数据传输速率高达3.125Gbps时,仍可将抖动降至业界最低的水平(典型为10ps)...
  • 近日看到有人提了一个问题,需要更新下图中3位置的,算法为1+2=3 看了图中的数值,我想可以用循环来实现,供参考。 首先建立一个测试表 create table test( ...
  • 在matlab进行曲面插值后,得到插值曲面却无法进一步对数据进行校验,通常插值后需要进行交叉验证,评价算法优劣,苦苦搜寻没有找到资料,索性自己编写以griddata,抛砖引玉函数插值曲面后进行说明,抛砖引玉其他可以...
  • 包含头文件md5.h和C源码md5.c 不到200行可直接编译和交叉编译,用于计算字符的MD5
  • 运用训练集来做交叉验证,从而找到最优k 我们因为要同时观察训练集的子训练集和测试集效果随着k的增加而变化情况,所以这里直接用 sklearn.model_selection 中的 vlidation_curve 来完成。 import ...
  • 文章目录1.10 交叉验证,网格搜索学习目标1 什么是交叉验证(cross validation)1.1 分析1.2 为什么需要交叉验证2 什么是网格搜索(Grid Search)3 交叉验证,网格搜索(模型选择与调优)API:4 鸢尾花案例增加K调优5 ...
  • //提取动态交叉报表动态标题名和动态列: String ls_Str, ls_Str1, ls_Name, ls_Name_Text, ls_Text Long ll_Cnt, ll_CntTmp, i, ll_Row dwobject ldwo environment env GetEnvironment(env) ll_Cnt = Long(dw_2...
