A/B-test / diff in diff. Contents: 01 | A/B-test (1. Definition of A/B testing; 2. Experimental method; 3. A/B-test design process; 4. Case study); 02 | diff in diff
01 | A/B-test
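1. Definition of A/B testing
A/B testing (also called split testing or bucket testing) is a method of comparing two versions of a web page or app against each other to determine which one performs better.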

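In essence it is an experiment: over the same time period, two or more versions of a product are randomly served to users, user-experience and business data are collected for each group, and a significance test is then used to identify the best version, which is formally adopted.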

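2. Experimental method
The test users are randomly split into two groups: group 1 uses version A and group 2 uses version B. After a set test period, the sample observations collected for the two versions are compared, and the better version is chosen according to the result of a significance test.
In an A/B test, the product team creates a second version of the same page; entirely new pages can also be designed. During the test, half of the users are shown the original version of the page (the control) and the other half are shown the modified version (the variant).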

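When users visit the product page, event tracking records their click behavior, and a statistics engine analyzes the data (this is the A/B test). The analysis determines whether the change (the variant) has a positive effect, a negative effect, or no effect on the chosen metric (here, the click-through rate, CTR).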
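The random split into control and variant can be sketched as follows. This is a minimal illustration, not the method of any particular platform: the function name `assign_variant`, the experiment name `demo_experiment`, and the 50/50 split are all assumptions. Hashing the user id keeps each user in the same group across visits.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "demo_experiment") -> str:
    """Deterministically map a user to 'control' or 'variant' (50/50 split)."""
    # Hash the experiment name together with the user id so that
    # different experiments split the same users independently.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    return "control" if bucket < 50 else "variant"

# Hypothetical user ids, for illustration only.
users = [f"user_{i}" for i in range(10_000)]
groups = [assign_variant(u) for u in users]
variant_share = groups.count("variant") / len(groups)
print(f"share of users in the variant: {variant_share:.3f}")
```

Because the assignment depends only on the user id and the experiment name, no lookup table is needed to remember which version a user saw.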

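3. A/B-test design process
① Define the goal: the goal is the metric used to decide whether the variant is more successful than the original version. It can be the click-through rate of a button, the open rate of a link to a product purchase, the sign-up rate of an email registration form, and so on.
② Create the variant: make the desired changes to elements of the existing version of the site. This might be changing the color of a button, swapping the order of elements on the page, hiding navigation elements, or something fully custom.
③ Generate hypotheses: once the goal is defined, you can start generating A/B test ideas and hypotheses, so that statistical analysis can show whether they outperform the current version.
④ Collect data: for the hypothesis about the targeted area, collect the corresponding data for the A/B test analysis.
⑤ Run the experiment: visitors to the site or app are now randomly assigned to the control or the variant. Their interactions with each experience are measured, counted, and compared to determine how each experience performs.
⑥ Analyze the results: once the experiment is complete, the results can be analyzed. The A/B test analysis shows whether there is a statistically significant difference between the two versions.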
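When the metric is a click-through rate, step ⑥ is often carried out as a two-proportion z-test. The sketch below is one possible way to do it, not the analysis used later in this article; the function name and the click and view counts are hypothetical.

```python
import math

def one_sided_ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, p) for H1: CTR_B > CTR_A using a pooled two-proportion z-test."""
    ctr_a = clicks_a / views_a
    ctr_b = clicks_b / views_b
    # Pooled CTR under H0 (both versions share the same true CTR).
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (ctr_b - ctr_a) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only.
z, p = one_sided_ctr_z_test(clicks_a=100, views_a=1000, clicks_b=130, views_b=1000)
print(f"z = {z:.3f}, one-sided p = {p:.4f}")
```

If `p` falls below the chosen significance level (for example 0.05), the variant's CTR is judged significantly higher than the control's.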

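4. Case study

A large tech company's "Guess You Like" module has been connected to a new recommendation algorithm. After development is finished and before launch, the new algorithm must be evaluated, and the evaluation method is an A/B test.
Concretely, two small slices of traffic are sampled from the full traffic; one goes through the new recommendation strategy and the other through the old one. Comparing the metric (here measured by user clicks) across the two slices shows whether the new strategy is better, which in turn decides whether the new algorithm is fully released.
Steps of the example A/B test:

Metric: CTR (click-through rate)
Change under test: the new recommendation algorithm
Hypothesis: the new algorithm brings more user clicks
Data collection: group B holds the results of the new algorithm and group A the results of the old algorithm. Both are made-up data.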

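import matplotlib.pyplot as plt
import numpy as np
import scipy.stats

'''
1. State the hypotheses
H0 (null): A >= B, the new recommendation algorithm brings no more clicks than the old one
H1 (alternative): A < B, the new recommendation algorithm brings more clicks
'''
a_group = np.array([1,4,2,3,5,5,5,7,8,9,10,18])
b_group = np.array([1,2,5,6,8,10,13,14,17,20,13,8])

# Plot both groups for a quick visual comparison
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(a_group, label='a')
ax.plot(b_group, label='b')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.legend()
plt.show()


a_group.mean()
# 6.416666666666667

b_group.mean()
# 9.75

'''
2. Two-sided test
'''
t, pval = scipy.stats.ttest_ind(b_group, a_group)
t, pval
# (1.556783470104261, 0.13379164919826217)

# To get the one-sided result, halve the two-sided p-value (threshold: 0.05).
# Halving is valid here because t > 0, i.e. the observed difference points in the direction of H1.
value = pval / 2
value
# 0.06689582459913108
# Greater than 0.05, so H0 cannot be rejected.
'''
3. Conclusion
According to the documentation of scipy.stats.ttest_ind(x, y), the returned p-value is for a two-sided test.

To get the one-sided result, the computed p-value is divided by 2 (threshold: 0.05).

The two-sided p-value is 0.13379164919826217, so p/2 = 0.0669 > alpha (0.05). H0 therefore cannot be rejected, and for now we cannot conclude that strategy B brings more user clicks.
'''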
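To make the t statistic above less of a black box, it can be recomputed by hand with only the standard library. This is a sketch that assumes the equal-variance (Student) form of the test, which is the default of `scipy.stats.ttest_ind`; the helper name `pooled_t_statistic` is made up for this illustration.

```python
import math
import statistics

a_group = [1, 4, 2, 3, 5, 5, 5, 7, 8, 9, 10, 18]
b_group = [1, 2, 5, 6, 8, 10, 13, 14, 17, 20, 13, 8]

def pooled_t_statistic(x, y):
    """Student's t statistic for mean(x) - mean(y), assuming equal variances."""
    nx, ny = len(x), len(y)
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((nx - 1) * statistics.variance(x)
                  + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    std_err = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / std_err

t_stat = pooled_t_statistic(b_group, a_group)
dof = len(a_group) + len(b_group) - 2
print(f"t = {t_stat:.4f}, degrees of freedom = {dof}")
```

The value agrees with the t reported by `ttest_ind` above. In SciPy 1.6 and later, `scipy.stats.ttest_ind(b_group, a_group, alternative='greater')` should return the one-sided p-value directly, so the manual division by 2 is not needed.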

02 | diff in diff


diff in linux

2014-02-19 10:40:00

"when you use 'diff', what do you think?"
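The first time I used diff, I wondered: two files can differ in many ways, so how does diff handle them?
I searched a few pages about its usage. They all covered options and how to read the output, and none of them answered this question.

diff describes the differences with three kinds of operations (add, change, delete); applying these operations makes the two files identical.

In practice, though, many different sequences of operations can make two files identical, so which algorithm does diff use?
Example:
file1:    file2:
1,        4,
2,        1,
3,        3,
4,        2,
The first operation diff outputs is:
0a1
> 4,
That is, add "4," at the top of file1 so that its first line matches file2.
But other operations could also make the first lines of the two files match:
Option 1: delete the first line of file2;
Option 2: delete the first three lines of file1.

Why did diff choose the output above?
So I searched for more material and found that what lies behind diff is not simple. One thing is certain, though: it comes down to the algorithm!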
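The idea that an edit script is just one of several possible choices can be explored with Python's standard `difflib` module. Note that `difflib.SequenceMatcher` uses its own matching heuristic, not the algorithm of the command-line diff, so this is only an illustration of edit scripts on the example files above.

```python
import difflib

file1 = ["1,", "2,", "3,", "4,"]
file2 = ["4,", "1,", "3,", "2,"]

matcher = difflib.SequenceMatcher(a=file1, b=file2, autojunk=False)
opcodes = matcher.get_opcodes()

rebuilt = []
for tag, i1, i2, j1, j2 in opcodes:
    print(f"{tag:7} file1[{i1}:{i2}] -> file2[{j1}:{j2}]")
    if tag == "equal":
        rebuilt.extend(file1[i1:i2])   # keep the lines both files share
    elif tag in ("insert", "replace"):
        rebuilt.extend(file2[j1:j2])   # take the new lines from file2
    # "delete": the file1 lines are simply dropped

print(rebuilt == file2)
```

Each opcode is one add, change, or delete step; applying them in order turns file1 into file2, which is exactly the property any valid diff output must have.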

Material on the algorithms:
http://en.wikipedia.org/wiki/Diff#Algorithm
http://stackoverflow.com/questions/805626/diff-algorithm
http://c2.com/cgi/wiki?DiffAlgorithm
http://www.faqs.org/rfcs/rfc3284.html
https://github.com/paulgb/simplediff
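Of course, I am not saying you have to understand these algorithms; on the contrary, I don't think you need to.
Still, at this point the question that had been nagging me was settled.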

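So what should diff be used for, when should it be used, and when is its output actually helpful?
I have only just started using it, so I cannot answer yet.
I tried it once, on a modified file and its original.
The result was good: all of my changes were additions, so the diff output showed every line of new content I had added!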

Reposted from: https://my.oschina.net/u/1453251/blog/201079