

Using Python, how can I sample data from a multivariate log-normal distribution? For instance, for a multivariate normal, there are two options. Let's assume we have a 3 x 3 covariance matrix and a 3-dimensional mean vector mu.
import numpy as np
# Method 1
sample = np.random.multivariate_normal(mu, covariance)
# Method 2
L = np.linalg.cholesky(covariance)
sample = L.dot(np.random.randn(3)) + mu
I found numpy's numpy.random.lognormal, but that only seems to work for univariate samples. I also noticed scipy's scipy.stats.lognorm. This does seem to have the potential for multivariate samples. However, I can't figure out how to do this.
Solution
A multivariate log-normal random variable Rv has this property: log(Rv) follows a multivariate normal distribution. Therefore, the problem reduces to generating a multivariate normal random variable and applying np.exp to it.
In [1]: import numpy as np
   ...: import numpy.random as nr
In [2]: cov = np.array([[1.0, 0.2, 0.3],
   ...:                 [0.2, 1.0, 0.3],
   ...:                 [0.3, 0.3, 1.0]])
In [3]: mu = np.log([0.3, 0.4, 0.5])
In [4]: mvn = nr.multivariate_normal(mu, cov, size=5)
In [5]: mvn # This is multivariate normal
Out[5]:
array([[-1.36808854, -1.32562291, -1.9706876 ],
[-2.13119289, 1.28146425, 0.66000019],
[-2.82590272, -1.22500654, -0.32635701],
[-0.4967589 , -0.34469589, -2.04084115],
[-0.85590235, -1.27133544, -0.70959595]])
In [6]: mvln = np.exp(mvn)
In [7]: mvln # This is multivariate log-normal
Out[7]:
array([[ 0.25459314, 0.26563744, 0.139361 ],
[ 0.11869562, 3.60190996, 1.9347927 ],
[ 0.05925514, 0.29375578, 0.72154754],
[ 0.60849968, 0.70843576, 0.12991938],
[ 0.42489961, 0.28045684, 0.49184289]])
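The same idea can be wrapped in a small function and checked numerically. The sketch below is only an illustration: the function name `sample_multivariate_lognormal`, the seed, and the sample size are assumptions, and it uses NumPy's `default_rng` generator instead of the legacy `numpy.random` functions used in the session above. Taking the log of the samples should recover approximately the original `mu` and `cov`.

```python
import numpy as np

def sample_multivariate_lognormal(mu, cov, size, seed=None):
    """Draw `size` vectors whose element-wise log is N(mu, cov).

    Hypothetical helper: mu and cov describe the underlying normal
    distribution, not the mean and covariance of the log-normal itself.
    """
    rng = np.random.default_rng(seed)
    normal_draws = rng.multivariate_normal(mu, cov, size=size)
    return np.exp(normal_draws)

cov = np.array([[1.0, 0.2, 0.3],
                [0.2, 1.0, 0.3],
                [0.3, 0.3, 1.0]])
mu = np.log([0.3, 0.4, 0.5])

samples = sample_multivariate_lognormal(mu, cov, size=100_000, seed=0)

# Sanity check: the log of the samples should look like N(mu, cov)
log_samples = np.log(samples)
print(log_samples.mean(axis=0))           # close to mu
print(np.cov(log_samples, rowvar=False))  # close to cov
```

Note that `np.exp(mu)` is the median of each component, not its mean; the mean of a log-normal component is larger because of the variance term.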

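You say: "I have a sample data, the logarithm of which follows a normal distribution."
Suppose data is the array containing the samples. To fit this data to a log-normal distribution using scipy.stats.lognorm, use:
import numpy as np
from scipy import stats
s, loc, scale = stats.lognorm.fit(data, floc=0)
Now suppose mu and sigma are the mean and standard deviation of the underlying normal distribution. To get estimates of those values from this fit, use:
estimated_mu = np.log(scale)
estimated_sigma = s
(These are not estimates of the mean and standard deviation of the samples in data. The mean and variance of a log-normal distribution are given by separate formulas in terms of mu and sigma.)
To combine the histogram and the PDF, you can use, for example:
import matplotlib.pyplot as plt
# density=True replaces normed=True, which newer matplotlib releases no longer accept
plt.hist(data, bins=50, density=True, color='c', alpha=0.75)
xmin = data.min()
xmax = data.max()
x = np.linspace(xmin, xmax, 100)
pdf = stats.lognorm.pdf(x, s, scale=scale)
plt.plot(x, pdf, 'k')
If you want to look at the log of the data, you can do something like the following. Note that the PDF of the normal distribution is used here.
logdata = np.log(data)
plt.hist(logdata, bins=40, density=True, color='c', alpha=0.75)
xmin = logdata.min()
xmax = logdata.max()
x = np.linspace(xmin, xmax, 100)
pdf = stats.norm.pdf(x, loc=estimated_mu, scale=estimated_sigma)
plt.plot(x, pdf, 'k')
By the way, an alternative to fitting with stats.lognorm is to fit log(data) using stats.norm.fit:
logdata = np.log(data)
estimated_mu, estimated_sigma = stats.norm.fit(logdata)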
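To make the relationship between the fitted parameters and the moments of the log-normal distribution concrete, here is a minimal sketch on synthetic data. The true parameter values, the seed, and the sample size are assumptions chosen for illustration. It recovers `estimated_mu` and `estimated_sigma` from the fit, then converts them to the mean and variance of the log-normal distribution with the standard formulas mean = exp(mu + sigma^2 / 2) and variance = (exp(sigma^2) - 1) * exp(2 * mu + sigma^2).

```python
import numpy as np
from scipy import stats

# Synthetic data with known underlying normal parameters (illustrative values)
rng = np.random.default_rng(0)
true_mu, true_sigma = 1.0, 0.5
data = rng.lognormal(mean=true_mu, sigma=true_sigma, size=50_000)

# Fit with the location fixed at 0, as in the text above
s, loc, scale = stats.lognorm.fit(data, floc=0)
estimated_mu = np.log(scale)
estimated_sigma = s

# Moments of the log-normal distribution expressed through mu and sigma
lognormal_mean = np.exp(estimated_mu + estimated_sigma**2 / 2)
lognormal_var = (np.exp(estimated_sigma**2) - 1) * np.exp(2 * estimated_mu + estimated_sigma**2)

print(estimated_mu, estimated_sigma)
print(lognormal_mean, data.mean())
print(lognormal_var, data.var())
```

The last two lines print the formula-based moments next to the sample moments, which should agree closely for a sample of this size.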
## Python: empirical distribution function plot with a normal distribution curve fit

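This code comes from Mr. Zhu's data analysis course slides.
However, the plotted curve drops abruptly to 0 at the end. This is a shortcoming of the code, and I do not know how to fix it.
The second plot overlays a fitted normal distribution function curve.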
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

datas = np.array([64.3, 65.0, 65.0, 67.2, 67.3, 67.3, 67.3, 67.3, 68.0, 68.0, 68.8, 68.8, 68.8, 69.7,\
69.7, 69.7, 70.3,70.4, 70.4, 70.4, 70.4, 70.4,70.4, 70.4, 71.2, 71.2, 71.2, 71.2,\
72.0, 72.0, 72.0, 72.0, 72.0, 72.0, 72.0, 72.7, 72.7, 72.7, 72.7, 72.7, 72.7, 72.7,\
73.5, 73.5, 73.5, 73.5, 73.5, 73.5, 73.5, 73.5, 73.5,73.5, 73.5, 74.3, 74.3, 74.3,\
74.3, 74.3, 74.3, 74.3, 74.3, 74.7, 75.0, 75.0, 75.0, 75.0, 75.0, 75.0, 75.0, 75.4,\
75.6, 75.8, 75.8, 75.8, 75.8, 75.8, 76.5, 76.5, 76.5, 76.5, 76.5, 76.5, 76.5, 77.2,\
77.2,77.6, 78.0, 78.8, 78.8, 78.8, 79.5, 79.5, 79.5, 80.3, 80.5, 80.5, 81.2, 81.6,\
81.6, 84.3])

# Summary statistics
s = np.std(datas, ddof=1)  # sample standard deviation
xbar = np.mean(datas)  # sample mean

# Visualization: empirical distribution curve
nt, bins, patches = plt.hist(datas, bins=10, histtype='step',
                             cumulative=True, density=True, color='darkcyan')  # datas is the data, bins is the number of bins
plt.title('bins = 10')
plt.savefig('经验函数分布图1.jpg', dpi=200)
plt.show()

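# Visualization: empirical distribution curve
nt, bins, patches = plt.hist(datas, bins=15, histtype='step',
                             cumulative=True, density=True, color='darkcyan')  # datas is the data, bins is the number of bins
plt.title('bins = 15')

# Normal distribution curve fit: evaluate the normal PDF at the bin edges,
# then accumulate and normalize it to approximate the CDF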
y = (1 / (np.sqrt(2 * np.pi) * s)) * np.exp(-0.5 * ((bins - xbar) ** 2 / s ** 2))
y = y.cumsum()
y = y / y[-1]
plt.plot(bins, y, 'tomato', linewidth = 1.5, label = 'Theoretical')
plt.savefig('经验函数分布图2.jpg', dpi=200)
plt.show()
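The abrupt drop to 0 most likely comes from how a cumulative histogram drawn with histtype='step' closes its outline back to the baseline. One possible workaround, sketched here as an alternative and not taken from the course material, is to compute the empirical CDF directly from the sorted data and overlay the normal CDF from `scipy.stats.norm.cdf`. The helper names `ecdf` and `plot_ecdf_with_normal` and the short `sample` array are hypothetical.

```python
import numpy as np
from scipy import stats

def ecdf(values):
    """Return the sorted values and the empirical CDF evaluated at each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y

def plot_ecdf_with_normal(values, path=None):
    """Plot the empirical CDF as a step curve with a fitted normal CDF on top."""
    import matplotlib.pyplot as plt  # imported here so the helpers above work without a display

    x, y = ecdf(values)
    mean, std = x.mean(), x.std(ddof=1)
    grid = np.linspace(x[0], x[-1], 200)
    plt.step(x, y, where='post', color='darkcyan', label='Empirical')
    plt.plot(grid, stats.norm.cdf(grid, loc=mean, scale=std),
             color='tomato', linewidth=1.5, label='Theoretical')
    plt.legend()
    if path is not None:
        plt.savefig(path, dpi=200)
    plt.show()

# Small illustrative sample (not the full data set from the article)
sample = np.array([64.3, 67.2, 68.0, 70.4, 72.0, 72.7, 73.5, 74.3, 75.0, 76.5, 78.8, 84.3])
x, y = ecdf(sample)
print(x)
print(y)
```

Calling `plot_ecdf_with_normal(datas)` with the full array should give a step curve that ends at 1 without returning to 0, and the result no longer depends on the number of bins.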



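1. Generate normally distributed data and plot the probability distribution
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Normal probability density at x for a given mean and standard deviation
def normfun(x, mu, sigma):
    pdf = np.exp(-((x - mu)**2)/(2*sigma**2)) / (sigma * np.sqrt(2*np.pi))
    return pdf
# result = np.random.randint(-65, 80, size=100) # low, high (exclusive), count
result = np.random.normal(15, 44, 100) # mean 15, standard deviation 44
print(result)
x = np.arange(min(result), max(result), 0.1)
# Set up the y axis using the normal density function defined above
print(result.mean(), result.std())
y = normfun(x, result.mean(), result.std())
plt.plot(x, y) # theoretical normal density curve
# Histogram of the actual samples
plt.hist(result, bins=10, rwidth=0.8, density=True) # bins bars; rwidth (0 to 1) is the relative bar width, 1 leaves no gaps
plt.title('distribution')
plt.xlabel('temperature')
plt.ylabel('probability')
# Show the figure
plt.show() # the area under the plotted curve is less than 1 because the normal distribution extends from negative to positive infinity, while only the range from the data minimum to the data maximum is drawn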

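Generate random integers within a range (these are uniformly distributed, not normal):
result = np.random.randint(-65, 80, size=100) # low, high (exclusive), count
Generate normally distributed data from a mean and a standard deviation:
result = np.random.normal(15, 44, 100) # mean 15, standard deviation 44 (the second argument is the standard deviation, not the variance)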
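The difference between the two calls can be checked empirically. The following sketch uses NumPy's newer `default_rng` generator in place of the legacy functions above; the seed and the sample size are arbitrary assumptions. It confirms that the second argument of `normal` acts as a standard deviation and that the integer draws stay inside the half-open range.

```python
import numpy as np

rng = np.random.default_rng(42)

# Normal draws: loc is the mean, scale is the standard deviation
normal_draws = rng.normal(loc=15, scale=44, size=200_000)
print(normal_draws.mean(), normal_draws.std())

# Uniform integer draws: the upper bound 80 is exclusive
uniform_ints = rng.integers(-65, 80, size=200_000)
print(uniform_ints.min(), uniform_ints.max())
```

With this many draws the sample mean and standard deviation should land near 15 and 44, while the integers are spread evenly between -65 and 79.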
2. Test whether a sequence follows a normal distribution
import numpy as np
from scipy import stats
pts = 1000
np.random.seed(28041990)
a = np.random.normal(0, 1, size=pts) # normal sample: mean 0, standard deviation 1, 1000 points
b = np.random.normal(2, 1, size=pts) # normal sample: mean 2, standard deviation 1, 1000 points
x = np.concatenate((a, b)) # concatenating the two samples gives a mixture, which in theory is no longer normal
k2, p = stats.normaltest(x)
alpha = 1e-3
print("p = {:g}".format(p))
# Null hypothesis: x comes from a normal distribution
if p < alpha:
    print("The null hypothesis can be rejected") # x is unlikely to come from a normal distribution
else:
    print("The null hypothesis cannot be rejected") # no evidence against normality
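The same test can be packaged as a reusable check. This is a minimal sketch: the helper name `rejects_normality`, the seed, and the exponential test sample are assumptions chosen so that the departure from normality is obvious.

```python
import numpy as np
from scipy import stats

def rejects_normality(sample, alpha=1e-3):
    """Return True when the D'Agostino-Pearson test rejects normality at level alpha."""
    statistic, p_value = stats.normaltest(sample)
    return bool(p_value < alpha)

rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=2000)  # strongly right-skewed sample
print(rejects_normality(skewed))
```

Failing to reject does not prove that the data are normal. It only means that the test found no strong evidence against normality at the chosen level.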
3. Compute the confidence interval and outliers
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import pandas as pd
# Find the outliers in a list of values
def get_outer_data(data_list):
    df = pd.DataFrame(data_list, columns=['value'])
    df = df.iloc[:, 0]
    # Lower and upper quartiles
    Q1 = df.quantile(q=0.25)
    Q3 = df.quantile(q=0.75)
    # Whisker positions at 1.5 times the interquartile range
    low_whisker = Q1 - 1.5 * (Q3 - Q1)
    up_whisker = Q3 + 1.5 * (Q3 - Q1)
    # Select the outliers
    kk = df[(df > up_whisker) | (df < low_whisker)]
    data1 = pd.DataFrame({'id': kk.index, 'outlier': kk})
    return data1
N = 100
result = np.random.normal(0, 1, N)
# result = np.random.randint(-65, 80, size=N) # low, high (exclusive), count
mean, std = result.mean(), result.std(ddof=1) # mean and sample standard deviation
# Compute the interval; 0.9 is the confidence level
conf_intveral = stats.norm.interval(0.9, loc=mean, scale=std) # 90% probability
print('Confidence interval:', conf_intveral)
x = np.arange(0, len(result), 1)
# Find the outliers
outer = get_outer_data(result)
print(outer, type(outer))
x1 = outer.iloc[:, 0]
y1 = outer.iloc[:, 1]
plt.scatter(x1, y1, marker='x', color='r') # outliers
plt.scatter(x, result, marker='.', color='g') # all sample points
plt.plot([0, len(result)], [conf_intveral[0], conf_intveral[0]])
plt.plot([0, len(result)], [conf_intveral[1], conf_intveral[1]])
plt.show()
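The outlier rule used in `get_outer_data` is the common 1.5 x IQR whisker rule. As a minimal sketch, the same rule can be written with NumPy alone. The function name `iqr_outliers` and the toy input are hypothetical, and `np.percentile` is assumed to use its default linear interpolation, which matches the default of pandas' `quantile`.

```python
import numpy as np

def iqr_outliers(values, k=1.5):
    """Return (indices, values) of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    low_whisker = q1 - k * iqr
    up_whisker = q3 + k * iqr
    mask = (values < low_whisker) | (values > up_whisker)
    return np.flatnonzero(mask), values[mask]

# Toy input: Q1 = 2, Q3 = 4, so the whiskers are -1 and 7 and only 100 falls outside
idx, vals = iqr_outliers([1, 2, 3, 4, 100])
print(idx, vals)
```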

4. Scatter plot of the samples and probability plot
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import pandas as pd
import time
print(time.strftime('%Y-%m-%d %H:%M:%S')) # %d is the day of the month; the original %D is not portable
# Normal probability density at x for a given mean and standard deviation
def _normfun(x, mu, sigma):
    pdf = np.exp(-((x - mu)**2)/(2*sigma**2)) / (sigma * np.sqrt(2*np.pi))
    return pdf
# Find the outliers in a list of values
def get_outer_data(data_list):
    df = pd.DataFrame(data_list, columns=['value'])
    df = df.iloc[:, 0]
    # Lower and upper quartiles
    Q1 = df.quantile(q=0.25)
    Q3 = df.quantile(q=0.75)
    # Whisker positions at 1.5 times the interquartile range
    low_whisker = Q1 - 1.5 * (Q3 - Q1)
    up_whisker = Q3 + 1.5 * (Q3 - Q1)
    # Select the outliers
    kk = df[(df > up_whisker) | (df < low_whisker)]
    data1 = pd.DataFrame({'id': kk.index, 'outlier': kk})
    return data1
N = 100
result = np.random.normal(0, 1, N)
# result = np.random.randint(-65, 80, size=N) # low, high (exclusive), count
# result = [100]*100 # all values identical
# result = np.array(result)
mean, std = result.mean(), result.std(ddof=1) # mean and sample standard deviation
# Compute the interval; 0.9 is the confidence level
if std == 0: # if all values are identical the standard deviation is 0 and the interval cannot be computed
    conf_intveral = [min(result)-1, max(result)+1]
else:
    conf_intveral = stats.norm.interval(0.9, loc=mean, scale=std) # 90% probability
# print('Confidence interval:', conf_intveral)
# Find the outliers
outer = get_outer_data(result)
# Scatter plot
fig = plt.figure()
fig.add_subplot(2, 1, 1) # assumed layout: scatter plot on top so the two plots do not share axes
x = np.arange(0, len(result), 1)
plt.scatter(x, result, marker='.', color='g') # all sample points
plt.scatter(outer.iloc[:, 0], outer.iloc[:, 1], marker='x', color='r') # outliers
plt.plot([0, len(result)], [conf_intveral[0], conf_intveral[0]]) # lower interval bound
plt.plot([0, len(result)], [conf_intveral[1], conf_intveral[1]]) # upper interval bound
plt.text(0, conf_intveral[0], '{:.2f}'.format(conf_intveral[0])) # label the lower bound
plt.text(0, conf_intveral[1], '{:.2f}'.format(conf_intveral[1])) # label the upper bound
info = 'outer count:{}'.format(len(outer.iloc[:, 0]))
plt.text(min(x), max(result)-((max(result)-min(result)) / 2), info) # show the number of outliers
plt.xlabel('sample count')
plt.ylabel('value')
# Probability plot
fig.add_subplot(2, 1, 2) # assumed layout: histogram and density curve below
if std != 0: # the values are not all identical
    x = np.arange(min(result), max(result), 0.1)
    y = _normfun(x, result.mean(), result.std())
    plt.plot(x, y) # theoretical normal density curve
    plt.hist(result, bins=10, rwidth=0.8, density=True) # bins bars; rwidth (0 to 1) is the relative bar width, 1 leaves no gaps
    info = 'mean:{:.2f}\nstd:{:.2f}\nmedian:{:.2f}'.format(mean, std, np.median(result))
    plt.text(min(x), max(y) / 2, info)
    plt.xlabel('value')
    plt.ylabel('Probability')
else:
    info = 'non-normal distribution!!\nmean:{:.2f}\nstd:{:.2f}\nmedian:{:.2f}'.format(mean, std, np.median(result))
    plt.text(0.5, 0.5, info)
    plt.xlabel('value')
    plt.ylabel('Probability')
plt.savefig('./distribution.jpg')
plt.show()
print(time.strftime('%Y-%m-%d %H:%M:%S'))
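A note on terminology: `stats.norm.interval(0.9, loc=mean, scale=std)` returns the range expected to contain about 90% of individual values under the fitted normal distribution. A confidence interval for the mean is a different, much narrower quantity. The sketch below contrasts the two. The seed and sample are arbitrary assumptions, and the t-based interval is a standard textbook alternative, not something taken from the article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(0, 1, size=100)
mean, std = sample.mean(), sample.std(ddof=1)

# Range covering about 90% of individual values under the fitted normal
spread_interval = stats.norm.interval(0.9, loc=mean, scale=std)

# 90% confidence interval for the mean, based on the standard error
mean_ci = stats.t.interval(0.9, df=sample.size - 1, loc=mean, scale=stats.sem(sample))

print(spread_interval)
print(mean_ci)
```

With 100 samples the second interval is roughly ten times narrower than the first, because the standard error is the standard deviation divided by the square root of the sample size.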
