• 2021-07-27 01:20:27

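The interquartile range (IQR), also called the quartile spread, is the difference between the upper quartile (QU, at the 75% position) and the lower quartile (QL, at the 25% position). The formula is...

Sort all values by size and divide them into four equal parts; the values at the three cut points are the quartiles. The smallest of them is called the lower quartile: one quarter of all values lie below it, and...

Sort the array in ascending order. The median is the middle value, the lower quartile is the value at the 1/4 position, and the upper quartile is the value at the 3/4 position. In Excel (with the data in $A$1:$A$9): minimum = QUARTILE...

Could someone explain in detail how quartiles are actually calculated? Let me give an example. Here...

Quartiles: in statistics, all values are sorted in ascending order and divided into four equal parts; the values at the three cut points are the quartiles. The first quartile (Q1), also called the "lower quartile", ...

There is a function specifically for quartiles: =QUARTILE(A1:A10,1)

Quartiles belong to the same family of concepts as the median. Sort a data set by size and divide it into four parts by count; the values at the three cut points are called quartiles, namely the 1st quartile, the 2nd...

In statistics, all values are sorted in ascending order and divided into four equal parts; the values at the three cut points are the quartiles. The first quartile (Q1), also called the "lower quartile", equals the value in the sample, sorted in ascending order, ...

Hold on, this looks like a finance method rather than a Buddhist teaching. See whether this explanation works: the quartile method is an analysis technique from statistics. Put simply, all the data are sorted in ascending order, so that...

Mode = 10, median = 10.5, lower quartile = 9.25, upper quartile = 13.5, mean = 11.1667, standard deviation = 2.7579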

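As the title says: is it a single number, such as 10, or a range, such as 2-12? How do I find the interquartile range?

Quartiles divide the data into four equal parts, each containing 25% of the data; the values at the dividing points are the quartiles. As one form of quantile, quartiles have a very important ... in statistics

I need the calculation steps. How is it worked out?

Sorted in ascending order: 17,19,20,22,23,23,,24,25. The lower quartile is the value at the 25% position of the sorted sample, i.e. the 2nd number, 19. The upper quartile is the value in the sorted sample...

There are two quartile cut points, 25% and 75%. Sort the data by size: the first is the value that has 25% of the data before it, and the second is the value that has 75% of the data before it.

I understand the 1/4 one; how is the 3/4 one calculated?

Use Excel's QUARTILE function. Syntax: QUARTILE(array, quart). The argument array is the array or cell range of numeric values for which you want the quartile; quart determines which quartile is returned. If quart is 0, 1, 2, 3 or 4, QUARTILE returns...

The interquartile range is the difference between the upper and lower quartiles. It is mainly used to measure the dispersion of ordinal data. It can of course also be computed for numeric data, but it is not suitable for categorical data...

Hi OP. IQR = Q3 − Q1. The interquartile range is typically used to construct box plots and to give a brief graphical summary of a probability distribution. For a symmetric distribution (whose median necessarily equals the arithmetic ... of the third and first quartiles...

75, 85, 87, 95, 99, 100, 101, 105, 113, 115, 125. The first quartile: ...

75 85 87 | 95 99 100 101 105 | 113 115 125. Split into 4 segments, with 100 as the midpoint: Q1 = (87+95)/2 = 91, Q2 = 100, Q3 = (105+113)/2 = 109. Quartiles: sort all values by size and divide them into four equal parts, ...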
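The worked example above can be checked in Python. This is a minimal sketch, assuming the answer interpolates the way the standard library's "inclusive" method does; the variable names `data`, `q1`, `q2`, `q3` and `iqr` are illustrative and do not come from the original answers.

```python
import statistics

# The eleven values from the worked example, already in ascending order.
data = [75, 85, 87, 95, 99, 100, 101, 105, 113, 115, 125]

# method="inclusive" treats the data as the whole population and
# interpolates linearly between neighbouring values at each cut point.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# The interquartile range is the spread of the middle half of the data.
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 91.0 100.0 109.0 18.0
```

With eleven values the cut points fall at positions 3.5, 6 and 8.5, which is why Q1 and Q3 are averages of two neighbours while Q2 is an actual data point.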

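Yes, ideally with an example, explained clearly and in your own words so that it is easier to follow...

The English term is quartile? Are you asking about the lower quartile and the upper quartile? Sort all samples in ascending order and divide them into four equal parts; the values at the three cut points (each a single number) are the quartiles. The smallest...

How do I find the lower quartile? And how do I find the upper extreme and the lower extreme? I am in the US.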


Quartiles and Percentiles

## Quartiles

Calculating a quartile of a sample is easy in theory and much like calculating the median. The difficult part is the implementation: unlike the median, no single method stands above the rest or can be considered the "best" among the roughly twenty known methods for calculating a quartile. The "best" method is the one that fits the purpose or, in some fields, is considered a de facto standard.

Why, how, and when to calculate quartiles with which method is outside the scope of this article; many articles and even books have been written on that. However, the day you face the task of calculating a quartile by some specific method, the functions here will help you.

## Methods

It is hard even to obtain a list of the known methods for calculating a quartile, let alone verified results from them. The best source I have located (see the bottom of the article) is quite old and lists 14 methods.

I found the additional six methods here and there. Unfortunately, those sources have vanished.

If you are aware of any good source, please add a comment to the article.

The methods have been collected in an enum whose in-line comments give their names, applications, and sources, together with the basic position formulas for the first and the third quartile (the second is always calculated as the median):

' Quartile calculation methods.
' Values equal those listed in the source. See function Quartile.
'
' Common names of variables used in calculation formulas.
'
' L: Q1, Lower quartile.
' H: Q3, Higher quartile.
' M: Q2, Median (not used here).
' n: Count of elements.
' p: Calculated position of quartile.
' j: Element of dataset.
' g: Decimal part of p to be used for interpolation between j and j+1.
'
Public Enum ApQuartileMethod
[_First] = 1

' Basic calculation methods.

' Step. Mendenhall and Sincich method.
'   SAS #3.
'   Round up to actual element of dataset.
'   L:  -Int(-n/4)
'   H: n-Int(-n/4)
apMendenhallSincich = 1

' Average step.
'   SAS #5, Minitab (%DESCRIBE), GLIM (percentile).
'   Add bias of one on basis of n/4.
'   L:   CLng((n+2)/2)/2
'   H: n-CLng((n+2)/2)/2
'   Note:
'       Replaces these original formulas that don't return the expected values.
'   L:   (Int((n+1)/4)+Int(n/4))/2+1
'   H: n-(Int((n+1)/4)+Int(n/4))/2+1
apAverage = 2

' Nearest integer to np.
'   SAS #2.
'   Round to nearest integer on basis of n/4.
'   L:   CLng(n/4)
'   H: n-CLng(n/4)
'   Note:
'       Replaces these original formulas that don't return the expected values.
'   L:   Int((n+2)/4)
'   H: n-Int((n+2)/4)
apNearestInteger = 3

' Parzen method.
'   Method 1 with interpolation.
'   SAS #1.
'   L: n/4
'   H: 3n/4
apParzen = 4

' Hazen method.
'   Values midway between method 1 steps.
'   GLIM (interpolate).
'   Wikipedia method 3.
'   Add bias of 2, don't round to actual element of dataset.
'   L: (n+2)/4
'   H: 3(n+2)/4-1
apHazen = 5

' Weibull method.
'   SAS #4. Minitab (DESCRIBE), SPSS, BMDP, Excel exclusive.
'   Add bias of 1, don't round to actual element of dataset.
'   L: (n+1)/4
'   H: 3(n+1)/4
apWeibull = 6

' Freund, J. and Perles, B., Gumbell method.
'   S-PLUS, R, Excel legacy, Excel inclusive, Star Office Calc.
'   Add bias of 3, don't round to actual element of dataset.
'   L: (n+3)/4
'   H: (3n+1)/4
apFreundPerlesGumbell = 7

' Median Position.
'   Median unbiased.
'   L: (3n+5)/12
'   H: (9n+7)/12
apMedianPosition = 8

' Bernard and Bos-Levenbach.
'   L: (n/4)+0.4
'   H: (3n/4)+0.6
'   Note:
'       Reference claims L to be (n/4)+0.31.
apBernardBosLevenbach = 9

' Blom's Plotting Position.
'   Better approximation when the distribution is normal.
'   L: (4n+7)/16
'   H: (12n+9)/16
apBlom = 10

' Moore's first method.
'   Add bias of one half step.
'   L: (n+0.5)/4
'   H: n-(n+0.5)/4
apMooreFirst = 11

' Moore's second method.
'   Add bias of one or two steps on basis of (n+1)/4.
'   L:   (Int((n+1)/4)+Int(n/4))/2+1
'   H: n-(Int((n+1)/4)+Int(n/4))/2+1
apMooreSecond = 12

' John Tukey's method.
'   Include median from odd dataset in dataset for quartile.
'   Wikipedia method 2.
'   L:   (1-Int(-n/2))/2
'   H: n-(-1-Int(-n/2))/2
apTukey = 13

' Moore and McCabe (M & M), variation of John Tukey's method.
'   TI-83.
'   Wikipedia method 1.
'   Exclude median from odd dataset in dataset for quartile.
'   L:   (Int(n/2)+1)/2
'   H: n-(Int(n/2)-1)/2
apTukeyMooreMcCabe = 14

' Additional variations between Weibull's and Hazen's methods, from
'   (i-0.000)/(n+1.00)
' to
'   (i-0.500)/(n+0.00)

' Variation of Weibull.
'   L: n(n/4-0)/(n+1)
'   H: n(3n/4-0)/(n+1)
apWeibullVariation = 15

' Variation of Blom.
'   L: n(n/4-3/8)/(n+1/4)
'   H: n(3n/4-3/8)/(n+1/4)
apBlomVariation = 16

' Variation of Tukey.
'   L: n(n/4-1/3)/(n+1/3)
'   H: n(3n/4-1/3)/(n+1/3)
apTukeyVariation = 17

' Variation of Cunnane.
'   L: n(n/4-2/5)/(n+1/5)
'   H: n(3n/4-2/5)/(n+1/5)
apCunnaneVariation = 18

' Variation of Gringorten.
'   L: n(n/4-0.44)/(n+0.12)
'   H: n(3n/4-0.44)/(n+0.12)
apGringortenVariation = 19

' Variation of Hazen.
'   L: n(n/4-1/2)/n
'   H: n(3n/4-1/2)/n
apHazenVariation = 20

[_Last] = 20
End Enum


The actual calculation methods have been tweaked slightly to fit VBA and to correct odd results when a sample consists of very few elements.
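To make the position formulas concrete, here is a minimal Python sketch (not the article's VBA) for four of the listed methods. It follows the variable names used in the enum comments (n, p, j, g). The names `LOWER_POSITION`, `value_at` and `lower_quartile` are hypothetical, and clamping the position to the data range is a simplification, not the article's handling of very small samples.

```python
import math

# 1-based position p of the lower quartile, copied from the "L:" formulas
# in the enum comments. The dictionary keys are illustrative names.
LOWER_POSITION = {
    "parzen": lambda n: n / 4,
    "hazen": lambda n: (n + 2) / 4,
    "weibull": lambda n: (n + 1) / 4,
    "freund_perles": lambda n: (n + 3) / 4,
}

def value_at(sorted_values, p):
    """Linearly interpolate the value at 1-based position p."""
    n = len(sorted_values)
    p = min(max(p, 1), n)  # simplification: clamp to the data range
    j = math.floor(p)      # element j
    g = p - j              # decimal part used for interpolation
    if g == 0:
        return float(sorted_values[j - 1])
    return sorted_values[j - 1] + g * (sorted_values[j] - sorted_values[j - 1])

def lower_quartile(values, method="freund_perles"):
    """Sort the sample and read the value at the method's Q1 position."""
    ordered = sorted(values)
    return value_at(ordered, LOWER_POSITION[method](len(ordered)))

sample = range(1, 101)  # the sample 1..100
for name in LOWER_POSITION:
    print(name, lower_quartile(sample, name))
```

Because the sample is simply 1..100, each result equals the position itself, which makes it easy to compare the four methods side by side.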

## Functions

The main function is named Quartile and is modeled on the native domain aggregate functions (DAvg etc.): it takes an Expression, a Domain, and a Criteria (filter) as arguments. The other arguments are the quartile Part to return and the Method to use:

Expression: Name of the field or an expression to analyse.
Domain    : Name of the source/query, or an SQL select query, to analyse.
Criteria  : Optional. A filter expression for Domain.
Part      : Optional. Which median/quartile or min/max value to return.
            Default is the median value.
Method    : Optional. Method for calculation of lower/higher quartile.
            Default is the method by Freund, Perles, and Gumbell (Excel).


The function can be regarded as having four main parts:

1. Build the SQL to retrieve the ordered samples
2. Calculate either the minimum or maximum value, the first or third quartile, or the median
3. Prepare for interpolation
4. Calculate the final output

Public Function Quartile( _
ByVal Expression As String, _
ByVal Domain As String, _
Optional ByVal Criteria As String, _
Optional ByVal Part As ApQuartilePart = ApQuartilePart.apMedian, _
Optional ByVal Method As ApQuartileMethod = ApQuartileMethod.apFreundPerlesGumbell) _
As Double

' SQL.
Const SqlMask           As String = "Select {0} From {1} {2}"
Const SqlLead           As String = "Select "
Const SubMask           As String = "({0}) As T"
Const FilterMask        As String = "Where {0} "
Const OrderByMask       As String = "Order By {0} Asc"

Dim Records     As DAO.Recordset

Dim Sql         As String
Dim SqlSub      As String
Dim Filter      As String
Dim Count       As Long     ' n.
Dim Position    As Double   ' p.
Dim Element     As Long     ' j.
Dim Interpolate As Double   ' g.
Dim ValueOne    As Double
Dim ValueTwo    As Double
Dim Value       As Double

' Return default quartile part if choice of part is
' outside the range of ApQuartilePart.
If Not IsQuartilePart(Part) Then
Part = ApQuartilePart.apMedian
End If

' Use a default calculation method if choice of method is
' outside the range of ApQuartileMethod.
If Not IsQuartileMethod(Method) Then
Method = ApQuartileMethod.apFreundPerlesGumbell
End If

If Domain <> "" And Expression <> "" Then
' Build SQL to lookup values.
If InStr(1, LTrim(Domain), SqlLead, vbTextCompare) = 1 Then
' Domain is an SQL expression.
SqlSub = Replace(SubMask, "{0}", Domain)
Else
' Domain is a table or query name.
SqlSub = Domain
End If
If Trim(Criteria) <> "" Then
' Build Where clause.
Filter = Replace(FilterMask, "{0}", Criteria)
End If
' Build final SQL.
Sql = Replace(Replace(Replace(SqlMask, "{0}", Expression), "{1}", SqlSub), "{2}", Filter) & _
Replace(OrderByMask, "{0}", Expression)
Set Records = CurrentDb.OpenRecordset(Sql, dbOpenSnapshot)

With Records
If Not .EOF = True Then
If Part = ApQuartilePart.apMinimum Then
' No need to count records.
Count = 1
Else
' Count records.
.MoveLast
Count = .RecordCount
End If
Select Case Part
Case ApQuartilePart.apMinimum
' Current record is first record.
' Read value of this record.
Case ApQuartilePart.apMaximum
' Current record is last record.
' Read value of this record.
Case ApQuartilePart.apMedian
' Locate position of median.
Position = (Count + 1) / 2
Case ApQuartilePart.apLower
Select Case Method
Case ApQuartileMethod.apMendenhallSincich
Position = -Int(-Count / 4)
Case ApQuartileMethod.apAverage
Position = CLng((Count + 2) / 2) / 2
Case ApQuartileMethod.apNearestInteger
Position = CLng(Count / 4)
Case ApQuartileMethod.apParzen
Position = Count / 4
Case ApQuartileMethod.apHazen
Position = (Count + 2) / 4
Case ApQuartileMethod.apWeibull
Position = (Count + 1) / 4
Case ApQuartileMethod.apFreundPerlesGumbell
Position = (Count + 3) / 4
Case ApQuartileMethod.apMedianPosition
Position = (3 * Count + 5) / 12
Case ApQuartileMethod.apBernardBosLevenbach
Position = (Count / 4) + 0.4
Case ApQuartileMethod.apBlom
Position = (4 * Count + 7) / 16
Case ApQuartileMethod.apMooreFirst
Position = (Count + 0.5) / 4
Case ApQuartileMethod.apMooreSecond
Position = (Int((Count + 1) / 4) + Int(Count / 4)) / 2 + 1
Case ApQuartileMethod.apTukey
Position = (1 - Int(-Count / 2)) / 2
Case ApQuartileMethod.apTukeyMooreMcCabe
Position = (Int(Count / 2) + 1) / 2
Case ApQuartileMethod.apWeibullVariation
Position = Count * (Count / 4) / (Count + 1)
Case ApQuartileMethod.apBlomVariation
Position = Count * (Count / 4 - 3 / 8) / (Count + 1 / 4)
Case ApQuartileMethod.apTukeyVariation
Position = Count * (Count / 4 - 1 / 3) / (Count + 1 / 3)
Case ApQuartileMethod.apCunnaneVariation
Position = Count * (Count / 4 - 2 / 5) / (Count + 1 / 5)
Case ApQuartileMethod.apGringortenVariation
Position = Count * (Count / 4 - 0.44) / (Count + 0.12)
Case ApQuartileMethod.apHazenVariation
Position = Count * (Count / 4 - 1 / 2) / Count
End Select
Case ApQuartilePart.apUpper
' Default position for very low counts for several methods
Position = Count
Select Case Method
Case ApQuartileMethod.apMendenhallSincich
If Count > 2 Then
Position = Count - (-Int(-Count / 4))
End If
Case ApQuartileMethod.apAverage
If Count > 2 Then
Position = Count - CLng((Count + 2) / 2) / 2
End If
Case ApQuartileMethod.apNearestInteger
Position = Count - CLng(Count / 4)
Case ApQuartileMethod.apParzen
Position = 3 * Count / 4
Case ApQuartileMethod.apHazen
If Count > 1 Then
Position = 3 * (Count + 2) / 4 - 1
End If
Case ApQuartileMethod.apWeibull
If Count > 2 Then
Position = 3 * (Count + 1) / 4
End If
Case ApQuartileMethod.apFreundPerlesGumbell
Position = (3 * Count + 1) / 4
Case ApQuartileMethod.apMedianPosition
If Count > 2 Then
Position = (9 * Count + 7) / 12
End If
Case ApQuartileMethod.apBernardBosLevenbach
If Count > 2 Then
Position = (3 * Count / 4) + 0.6
End If
Case ApQuartileMethod.apBlom
If Count > 2 Then
Position = (12 * Count + 9) / 16
End If
Case ApQuartileMethod.apMooreFirst
Position = Count - (Count + 0.5) / 4
Case ApQuartileMethod.apMooreSecond
' Basic calculation method. Will fail for 2 or 3 elements.
'   Position = Count - (Int((Count + 1) / 4) + Int(Count / 4)) / 2 + 1
' Calculation method adjusted to accept 2 or 3 elements.
Position = Count - (Int((Count + Int((Count * 2) / (Count + 4))) / 4) + Int(Count / 4)) / 2 + 1
Case ApQuartileMethod.apTukey
Position = Count - (-1 - Int(-Count / 2)) / 2
Case ApQuartileMethod.apTukeyMooreMcCabe
If Count > 1 Then
Position = Count - (Int(Count / 2) - 1) / 2
End If
Case ApQuartileMethod.apWeibullVariation
Position = Count * (3 * Count / 4) / (Count + 1)
Case ApQuartileMethod.apBlomVariation
Position = Count * (3 * Count / 4 - 3 / 8) / (Count + 1 / 4)
Case ApQuartileMethod.apTukeyVariation
Position = Count * (3 * Count / 4 - 1 / 3) / (Count + 1 / 3)
Case ApQuartileMethod.apCunnaneVariation
Position = Count * (3 * Count / 4 - 2 / 5) / (Count + 1 / 5)
Case ApQuartileMethod.apGringortenVariation
Position = Count * (3 * Count / 4 - 0.44) / (Count + 0.12)
Case ApQuartileMethod.apHazenVariation
Position = Count * (3 * Count / 4 - 1 / 2) / Count
End Select
End Select
Select Case Part
Case ApQuartilePart.apMinimum, ApQuartilePart.apMaximum
Case Else
.MoveFirst
' Find position of first observation to retrieve.
' If Element is 0, then upper position is first record.
' If Element is not 0 and position is not an integer, then
' read the next observation too.
Element = Fix(Position)
Interpolate = Position - Element
If Count = 1 Then
' Nowhere else to move.
If Interpolate < 0 Then
' Prevent values to be created by extrapolation beyond zero from observation one
' for these methods:
'   ApQuartileMethod.apBlomVariation
'   ApQuartileMethod.apTukeyVariation
'   ApQuartileMethod.apCunnaneVariation
'   ApQuartileMethod.apGringortenVariation
'   ApQuartileMethod.apHazenVariation
'
' Comment this line out, if reading by extrapolation *is* requested.
Interpolate = 0
End If
ElseIf Element > 1 Then
' Move to the record to read.
.Move Element - 1
' Special case for apMooreSecond and upper quartile for 2 and 3 elements.
If .EOF Then
.MoveLast
End If
End If
End Select
' Retrieve value from first observation.
ValueOne = .Fields(0).Value

Select Case Part
Case ApQuartilePart.apMinimum, ApQuartilePart.apMaximum
Value = ValueOne
Case Else
If Interpolate = 0 Then
' Only one observation to read.
If Element = 0 Then
' Return 0.
Else
Value = ValueOne
End If
Else
If Element = 0 Or Element = Count Then
' No first/last observation to retrieve.
ValueTwo = ValueOne
If ValueOne > 0 Then
' Use 0 as other observation.
ValueOne = 0
Else
ValueOne = 2 * ValueOne
End If
Else
' Move to next observation.
.MoveNext
' Retrieve value from second observation.
ValueTwo = .Fields(0).Value
End If
' For positive values interpolate between 0 and ValueOne.
' For negative values interpolate between 2 * ValueOne and ValueOne.
' Calculate quartile using linear interpolation.
Value = ValueOne + Interpolate * CDec(ValueTwo - ValueOne)
End If
End Select
End If
.Close
End With
End If

Quartile = Value

End Function


Two important features are that the Domain argument can be an SQL select query and that the samples in the passed records do not have to be sorted; the function itself takes care of sorting them.

Typical usages are listed here, with the resulting SQL included to show how the function parses the Domain argument:

' Example calls and the internally generated SQL:
'
'   With fieldname as expression, table (or query) as domain, no filter, and default sorting:
'       Q1 = Quartile("Data", "Observation", , apFirst, apFreundPerlesGumbell)
'       Select Data From Observation Order By Data Asc
'
'   With two fieldnames as expression, table (or query) as domain, no filter, and sorting on two fields:
'       Q1 = Quartile("Data, Step", "Observation", , apFirst, apFreundPerlesGumbell)
'       Select Data, Step From Observation Order By Data, Step Asc
'
'   With fieldname as expression, SQL as domain, no filter, and default sorting:
'       Q1 = Quartile("Data", "Select Data From Observation", , apFirst, apFreundPerlesGumbell)
'       Select Data From (Select Data From Observation) As T Order By Data Asc
'
'   With fieldname as expression, SQL as domain, simple filter, and sorting on one field:
'       Q1 = Quartile("Data", "Select Data, Step From Observation", "Step = 10", apFirst, apFreundPerlesGumbell)
'       Select Data From (Select Data, Step From Observation) As T Where Step = 10 Order By Data Asc
'
'   With calculated expression, SQL as domain, extended filter, and sorting on one field:
'       Q1 = Quartile("Data * 10", "Select Data, Step From Observation", "Step = 10 And Data <= 40", apFirst, apFreundPerlesGumbell)
'       Select Data * 10 From (Select Data, Step From Observation) As T Where Step = 10 And Data <= 40 Order By Data * 10 Asc
'
'   With filtered SQL domain, additional filter, and sorting on one field:
'       Q1 = Quartile("Data", "Select Data, Step From Observation Where Step = 10", "Data <= 40", apFirst, apFreundPerlesGumbell)
'       Select Data From (Select Data, Step From Observation Where Step = 10) As T Where Data <= 40 Order By Data Asc
'
'   With filtered SQL domain, additional filter, and sorting on two fields:
'       Q1 = Quartile("Step, Data", "Select Data, Step From Observation Where Step = 10", "Data <= 40", apFirst, apFreundPerlesGumbell)
'       Select Step, Data From (Select Data, Step From Observation Where Step = 10) As T Where Data <= 40 Order By Step, Data Asc
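The Domain parsing shown in these example calls can be sketched in Python. This is an illustration only, assuming the string masks declared at the top of Quartile are filled in the way the generated SQL suggests; `build_sql` and the constant names are hypothetical, not part of the VBA module.

```python
# Masks transliterated from the constants declared in the VBA function.
SQL_MASK = "Select {0} From {1} {2}"
SQL_LEAD = "select "
SUB_MASK = "({0}) As T"
FILTER_MASK = "Where {0} "
ORDER_BY_MASK = "Order By {0} Asc"

def build_sql(expression, domain, criteria=""):
    """Assemble the ordered select statement from the three arguments."""
    if domain.lstrip().lower().startswith(SQL_LEAD):
        # Domain is an SQL select statement: wrap it as a subquery.
        source = SUB_MASK.format(domain)
    else:
        # Domain is a table or query name.
        source = domain
    # Only add a Where clause when a filter was passed.
    where = FILTER_MASK.format(criteria) if criteria.strip() else ""
    return SQL_MASK.format(expression, source, where) + ORDER_BY_MASK.format(expression)

print(build_sql("Data", "Observation"))
print(build_sql("Data", "Select Data, Step From Observation", "Step = 10"))
```

As in the VBA, the statement is built by plain string substitution, so the arguments should come from trusted code rather than from user input.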


Note that the function is heavily documented in-line, as the code would otherwise be hard to follow.

## Domain functions

To ease use, especially in queries, two domain functions supplement the main function:

DMedian

DQuartile

These mimic the native Dxxx domain aggregate functions and take only the arguments needed, using default values for the rest: DMedian for the part to return, and DQuartile for the calculation method to use. That method has been chosen to be the original method used by Excel (formulas QUARTILE and QUARTILE.INC):

' Returns the median of a field of a table/query.
'
' Parameters:
'   Expression: Name of the field or an expression to analyse.
'   Domain    : Name of the source/query, or an SQL select query, to analyse.
'   Criteria  : Optional. A filter expression for Domain.
'
' Reference and examples: See function Quartile.
'
' Data must be in ascending order by Field.
'
' 2019-08-15. Gustav Brock, Cactus Data ApS, CPH.
'
Public Function DMedian( _
ByVal Expression As String, _
ByVal Domain As String, _
Optional ByVal Criteria As String) _
As Double

Dim Value       As Double

Value = Quartile(Expression, Domain, Criteria)

DMedian = Value

End Function

' Returns the upper or lower quartile or the median or the
' minimum or maximum value of a field of a table/query
' using the method by Freund, Perles, and Gumbell (Excel).
'
' Parameters:
'   Expression: Name of the field or an expression to analyse.
'   Domain    : Name of the source/query, or an SQL select query, to analyse.
'   Criteria  : Optional. A filter expression for Domain.
'   Part      : Optional. Which median/quartile or min/max value to return.
'               Default is the median value.
'
' Reference and examples: See function Quartile.
'
' 2019-08-15. Gustav Brock, Cactus Data ApS, CPH.
'
Public Function DQuartile( _
ByVal Expression As String, _
ByVal Domain As String, _
Optional ByVal Criteria As String, _
Optional ByVal Part As ApQuartilePart = ApQuartilePart.apMedian) _
As Double

Dim Value       As Double

Value = Quartile(Expression, Domain, Criteria, Part)

DQuartile = Value

End Function


## Results

An example workbook with results generated by the Excel formulas is attached for reference.

The output from the function ListExcelQuartile, found in the attached Access example file, lists identical values.

The two Excel methods correspond to our methods 7 and 6, the enum elements apFreundPerlesGumbell and apWeibull:

               100           99            98            97            96            95
INCLUDE (LEGACY)
7            25,75         25,50         25,25         25,00         24,75         24,50
7            50,50         50,00         49,50         49,00         48,50         48,00
7            75,25         74,50         73,75         73,00         72,25         71,50

EXCLUDE
6            25,25         25,00         24,75         24,50         24,25         24,00
6            50,50         50,00         49,50         49,00         48,50         48,00
6            75,75         75,00         74,25         73,50         72,75         72,00 
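The values in this listing can be cross-checked outside Access and Excel. The sketch below assumes that the "inclusive" and "exclusive" methods of Python's `statistics.quantiles` play the roles of methods 7 and 6; on the samples tested here they return the same numbers as the listing. The helper name `sample_quartiles` is hypothetical.

```python
import statistics

def sample_quartiles(n_max, method):
    """Quartiles Q1, Q2, Q3 of the sample 1..n_max for the given method."""
    return statistics.quantiles(range(1, n_max + 1), n=4, method=method)

# Print both variants for the first three sample sizes in the listing.
for n_max in (100, 99, 98):
    print(n_max,
          sample_quartiles(n_max, "inclusive"),
          sample_quartiles(n_max, "exclusive"))
```

Note that the listing uses a decimal comma (25,75), while Python prints a decimal point (25.75).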


Likewise, the function ListFirstQuartile returns output similar to the results from the main source (table H-4 at top):

               40            50            60            70
1            10,00         20,00         20,00         20,00
2            15,00         20,00         20,00         20,00
3            10,00         10,00         20,00         20,00
4            10,00         12,50         15,00         17,50
5            15,00         17,50         20,00         22,50
6            12,50         15,00         17,50         20,00
7            17,50         20,00         22,50         25,00
8            14,17         16,67         19,17         21,67
9            14,00         16,50         19,00         21,50
10           14,38         16,88         19,38         21,88
11           11,25         13,75         16,25         18,75
12           20,00         20,00         20,00         25,00
13           15,00         20,00         20,00         25,00
14           15,00         15,00         20,00         20,00
15           8,00          10,42         12,86         15,31
16           5,88          8,33          10,80         13,28
17           6,15          8,59          11,05         13,52
18           5,71          8,17          10,65         13,13
19           5,44          7,91          10,39         12,88
20           5,00          7,50          10,00         12,50

               100           99            98            97            96            95
1            25,00         25,00         25,00         25,00         24,00         24,00
2            25,50         25,00         25,00         25,00         24,50         24,00
3            25,00         25,00         24,00         24,00         24,00         24,00
4            25,00         24,75         24,50         24,25         24,00         23,75
5            25,50         25,25         25,00         24,75         24,50         24,25
6            25,25         25,00         24,75         24,50         24,25         24,00
7            25,75         25,50         25,25         25,00         24,75         24,50
8            25,42         25,17         24,92         24,67         24,42         24,17
9            25,40         25,15         24,90         24,65         24,40         24,15
10           25,44         25,19         24,94         24,69         24,44         24,19
11           25,13         24,88         24,63         24,38         24,13         23,88
12           26,00         25,50         25,00         25,00         25,00         24,50
13           25,50         25,50         25,00         25,00         24,50         24,50
14           25,50         25,00         25,00         24,50         24,50         24,00
15           24,75         24,50         24,25         24,00         23,75         23,50
16           24,56         24,31         24,06         23,81         23,56         23,31
17           24,58         24,33         24,08         23,83         23,58         23,33
18           24,55         24,30         24,05         23,80         23,55         23,30
19           24,53         24,28         24,03         23,78         23,53         23,28
20           24,50         24,25         24,00         23,75         23,50         23,25         


Note that columns 100-97 here contain the correct values, while in Table H-4 they hold the values for samples 99-96 (see note 1 under Sources).

The two small examples found on Wikipedia display results from three different methods, which equal our methods 14, 13, and 5 respectively, the enum elements apTukeyMooreMcCabe, apTukey, and apHazen:

#### Example 2

These can be reproduced by the function ListWikipediaSamples:

              Method 1      Method 2      Method 3

Q1             15            25,5          20,25
Q2             40            40            40
Q3             43            42,5          42,75

Q1             15            15            15
Q2             37,5          37,5          37,5
Q3             40            40            40 
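The first two columns (Wikipedia methods 1 and 2) are median-split methods and are easy to sketch in Python. The two data sets below are assumed to be the ones used in Wikipedia's examples, since the article's screenshots are not reproduced here; they do yield the values listed above. The function name `median_split_quartiles` is hypothetical, and the sketch does not handle a single-element sample.

```python
import statistics

def median_split_quartiles(values, include_median):
    """Return (Q1, Q2, Q3) as medians of the lower half, whole set, upper half.

    For an odd count, include_median decides whether the median itself
    belongs to both halves (method 2) or to neither half (method 1).
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = ordered[:half + 1], ordered[half:]
    elif n % 2 == 1:
        lower, upper = ordered[:half], ordered[half + 1:]
    else:
        # Even count: both methods split the data into two equal halves.
        lower, upper = ordered[:half], ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

example_1 = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # assumed data, odd count
example_2 = [7, 15, 36, 39, 40, 41]                     # assumed data, even count

print(median_split_quartiles(example_1, include_median=False))
print(median_split_quartiles(example_1, include_median=True))
print(median_split_quartiles(example_2, include_median=False))
```

For the even-sized second example both variants agree, which matches the identical columns in the second block of the listing.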


Also included is a query, FirstQuartileAllMethods, which lists the lower-quartile results of all 20 methods for every sample set from 1 to 100.

Finally, a form is included that lets you select any method and then lists the results for all three quartiles for every sample set from 1 to 100.

## Implementation

To calculate quartiles, import the module QuartileCode into your application. That's all.

The other module, QuartileDemo, is needed only for testing and for displaying the demo form (also named QuartileDemo).

Bonus tip: Study the form's code to see how to right-align numbers in a Listbox column.

## Conclusion

From the sparse sources that could be located, a function has been created that, for just about any practical purpose, allows the quartiles of a sample of records to be calculated by twenty different methods.

In addition, simplified functions that supplement the native domain aggregate functions have been presented, along with a collection of functions and a query for testing and demonstration.

## Sources

Original source (now off-line) by David A. Heiser: http://www.daheiser.info/excel/notes/NOTE%20N.pdf

Archived source at The Internet Archive: NOTE 20

Notes:

1. Table H-4, p. 4, has correct data for the dataset 1-96, while the datasets shown for 1-100 to 1-97 are actually those for 1-99 to 1-96, shifted one column left. Thus the dataset for 1-100 is missing, and that for 1-96 is listed twice.
2. Method 3b is not implemented, as no one seems to use it, nor is any example data given for it. Thus method 3a has here been labeled method

Further notes on quartiles and methods can be found here:

Wikipedia

Math Forum

HaiWeb

murdoch.edu.au (archived)

Should you be aware of any good source that can supplement or improve this article, please do not hesitate to post a link as a comment.

The full and current code is available for download at GitHub: VBA.Quartiles

The code and a demo application are also here: Quartiles 1.0.1.zip

An Excel workbook with the presented example: Quartiles.xlsx

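• Database queries: minimum, maximum, mean, upper quartile, median, lower quartile

Computing the mean score, upper quartile, lower quartile, and median per group in a database

1. Create a score table with the fields: serial number, class id, subject name, score, student id
2. For class id (CLASSID) 9, query the mean score, upper quartile, lower quartile, and median for each subject name
"""Create the SUBJECT table first, then insert the data."""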
import random

import pymysql

dbs = pymysql.connect(host='localhost', user='root', password='root', database='demo', port=3306)
db = dbs.cursor()
# Create the score table: serial number, class id, subject name, score, student id
db.execute("""
CREATE TABLE SUBJECT(
ID INT(3) PRIMARY KEY,
CLASSID INT(10),
SUBJECTNAME VARCHAR(20),
STATYHOUR float(4),
GARDEID int(10))
"""
)
# Insert the data
CLASSIDS = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
SUBJECTNAMEs = ['python','java','c','c++','mysql','db','gauss']
# Parameterised statement: the driver quotes the values itself.
sql = "insert into SUBJECT(ID,CLASSID,SUBJECTNAME,STATYHOUR,GARDEID) values (%s,%s,%s,%s,%s)"
for a in range(1, 1001):  # insert exactly 1000 rows, ID 1..1000
    CLASSID = random.choice(CLASSIDS)
    SUBJECTNAME = random.choice(SUBJECTNAMEs)
    STATYHOUR = round(random.uniform(1, 150), 2)
    GARDEID = random.randint(1, 100)
    db.execute(sql, (a, CLASSID, SUBJECTNAME, STATYHOUR, GARDEID))
    print(a)
# Commit once after all rows have been inserted.
dbs.commit()
db.close()
dbs.close()


First, simulate a set of database rows and analyse the lower quartile (Q1) in an Excel sheet with the function =QUARTILE(B$3:B4,1).

Findings:

When the row count N is a multiple of 4 plus 1, Q1 lands exactly on a row, which I read as row number R = (N-1)/4+1. Checking: N=5 gives R=2, N=9 gives R=3, N=1 gives R=1.

When N is not a multiple of 4 plus 1, apply the same formula (R = (N-1)/4*1+1). N=6 gives R=2.25, so the neighbouring rows are 2 and 3. With sample data set 1, rows 2 and 3 hold the values 2 and 3, Q1 is 2.25, and the difference between the two neighbouring values is 1. This suggests:

Q1 = (upper neighbouring value - lower neighbouring value) × (R - lower row number) + lower neighbouring value

1. Verification

N=6 gives R=2.25: sample data set 3, Q1 = (33-12)×(2.25-2)+12 = 17.25
N=7 gives R=2.5: sample data set 3, Q1 = (33-12)×(2.5-2)+12 = 22.5
N=8 gives R=2.75: sample data set 3, Q1 = (33-12)×(2.75-2)+12 = 27.75
N=10 gives R=3.25: sample data set 1, Q1 = … = 3.25
N=11 gives R=3.5: sample data set 1, Q1 = … = 3.5
N=12 gives R=3.75: sample data set 1, Q1 = … = 3.75
…

Checked against the spreadsheet, the formula holds.

First version of the Q1 SQL:

with t1 as (
    select * from SUBJECT where CLASSID = 9
),
t2 as (
    select SUBJECTNAME as b,
           round((case count(STATYHOUR)
                      when '1' then max(STATYHOUR)
                      else ((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*1+1)-min(r))+min(STATYHOUR))
                  end), 2) as lowerQuartile
    from (
        select * from (
            select *,
                   row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                   count(*) over(partition by SUBJECTNAME) as n
            from t1
        ) as a1
        where r >= ((n-1)/4*1+1-0.75) and r <= ((n-1)/4*1+1+0.75)
    ) as t2
    group by SUBJECTNAME
)
select * from t2;
/*
c       29.8
c++     47.72
db      29.18
gauss   13.55
java    58.96
python  59.52
*/

-- Optimized SQL: the case ... else ... is removed. The results are still correct, and it saves two lines of code.

with t1 as (
    select * from SUBJECT where CLASSID = 9
),
t2 as (
    select SUBJECTNAME as b,
           round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*1+1)-min(r))+min(STATYHOUR)), 2) as lowerQuartile
    from (
        select * from (
            select *,
                   row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                   count(*) over(partition by SUBJECTNAME) as n
            from t1
        ) as a1
        where r >= ((n-1)/4*1+1-0.75) and r <= ((n-1)/4*1+1+0.75)
    ) as a2
    group by SUBJECTNAME
)
select * from t2;
/*
c       29.8
c++     47.72
db      29.18
gauss   13.55
java    58.96
python  59.52
*/

Next step: if R = (N-1)/4*1+1 gives Q1, then the median should use *2 and the upper quartile (Q3) should use *3. Trying that in SQL:

with t1 as (
    select * from SUBJECT where CLASSID = 9
),
t2 as (
    select SUBJECTNAME as b,
           round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*1+1)-min(r))+min(STATYHOUR)), 2) as lowerQuartile
    from (
        select * from (
            select *,
                   row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                   count(*) over(partition by SUBJECTNAME) as n
            from t1
        ) as a1
        where r >= ((n-1)/4*1+1-0.75) and r <= ((n-1)/4*1+1+0.75)
    ) as a2
    group by SUBJECTNAME
),
t3 as (
    select SUBJECTNAME as b,
           round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*2+1)-min(r))+min(STATYHOUR)), 2) as median
    from (
        select * from (
            select *,
                   row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                   count(*) over(partition by SUBJECTNAME) as n
            from t1
        ) as a1
        where (r >= ((n-1)/4*2+1-0.75) and r <= ((n-1)/4*2+1+0.75))
    ) as a2
    group by SUBJECTNAME
),
t4 as (
    select SUBJECTNAME as b,
           round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*3+1)-min(r))+min(STATYHOUR)), 2) as upperQuartitle
    from (
        select * from (
            select *,
                   row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                   count(*) over(partition by SUBJECTNAME) as n
            from t1
        ) as a1
        where (r >= ((n-1)/4*3+1-0.75) and r <= ((n-1)/4*3+1+0.75))
    ) as a2
    group by SUBJECTNAME
)
select t2.b as SUBJECTNAME, lowerQuartile, median, upperQuartitle
from t2, t3, t4
where t2.b = t3.b and t3.b = t4.b;
/*
c       29.8    50.36   107.87
c++     47.72   73.86   112.25
db      29.18   60.29   137.41
gauss   13.55   29.61   92.71
java    58.96   107.58  127.41
python  59.52   71.44   112.83
*/

The SQL runs, but are the results correct? The formula was inferred from results and then turned into SQL, so the SQL output still needs independent verification. My first thought was Excel again. Then I remembered that I am a tester, so why not try a Python library?

# Build a small list and check whether the quantiles can be found
# pip install numpy
import numpy as np
number = [1,2,3,4,5]
# Newer NumPy (1.22 and later) spells this argument method='midpoint'
z = np.percentile(number, (25,50,75), interpolation='midpoint')
print(z)
print(type(z))
>> [2. 3. 4.]
>> <class 'numpy.ndarray'>

Verification in Python:

import numpy as np
import pymysql

calculated_value = {}         # quantiles returned by the SQL query
verify_calculated_value = {}  # quantiles recomputed with NumPy


class MySqlS():
    """Check that the Q1, median and Q3 returned by the database match NumPy."""

    def __init__(self):
        self.dbs = pymysql.connect(host='localhost', user='root', password='root', db='demo', port=3306)
        self.db = self.dbs.cursor()

    def __del__(self):
        self.db.close()
        self.dbs.close()

    def query_table_data(self):
        """Run the SQL under test; store {SUBJECTNAME: {SUBJECTNAME, lowerQuartile (Q1), median, upperQuartitle (Q3)}}."""
        sql = """
        with t1 as (
            select * from SUBJECT where CLASSID = 9
        ),
        t2 as (
            select SUBJECTNAME as b,
                   round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*1+1)-min(r))+min(STATYHOUR)), 2) as lowerQuartile
            from (
                select * from (
                    select *,
                           row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                           count(*) over(partition by SUBJECTNAME) as n
                    from t1
                ) as a1
                where r >= ((n-1)/4*1+1-0.75) and r <= ((n-1)/4*1+1+0.75)
            ) as t2
            group by SUBJECTNAME
        ),
        t3 as (
            select SUBJECTNAME as b,
                   round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*2+1)-min(r))+min(STATYHOUR)), 2) as median
            from (
                select * from (
                    select *,
                           row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                           count(*) over(partition by SUBJECTNAME) as n
                    from t1
                ) as a1
                where (r >= ((n-1)/4*2+1-0.75) and r <= ((n-1)/4*2+1+0.75))
            ) as t2
            group by SUBJECTNAME
        ),
        t4 as (
            select SUBJECTNAME as b,
                   round(((max(STATYHOUR)-min(STATYHOUR))*(((max(n)-1)/4*3+1)-min(r))+min(STATYHOUR)), 2) as upperQuartitle
            from (
                select * from (
                    select *,
                           row_number() over(partition by SUBJECTNAME order by STATYHOUR) as r,
                           count(*) over(partition by SUBJECTNAME) as n
                    from t1
                ) as a1
                where (r >= ((n-1)/4*3+1-0.75) and r <= ((n-1)/4*3+1+0.75))
            ) as t2
            group by SUBJECTNAME
        )
        select t2.b as SUBJECTNAME, lowerQuartile, median, upperQuartitle
        from t2, t3, t4
        where t2.b = t3.b and t3.b = t4.b;
        """
        self.db.execute(sql)
        for i in self.db.fetchall():
            # Each row is (SUBJECTNAME, lowerQuartile, median, upperQuartitle)
            calculated_value[i[0]] = {"SUBJECTNAME": i[0], "lowerQuartile": i[1], "median": i[2], "upperQuartitle": i[3]}

    def query_sql_data(self):
        """Recompute the same three values with NumPy from the raw rows of each subject."""
        for SUBJECTNAME in calculated_value.keys():
            sql = "select STATYHOUR from SUBJECT where CLASSID = 9 and SUBJECTNAME = '{0}' order by STATYHOUR asc;".format(SUBJECTNAME)
            self.db.execute(sql)
            hours = []
            for i in self.db.fetchall():
                hours += list(i)
            z = np.percentile(hours, (25, 50, 75), interpolation='midpoint')
            verify_calculated_value[SUBJECTNAME] = {
                "SUBJECTNAME": SUBJECTNAME,
                "lowerQuartile": round(z[0], 2),
                "median": round(z[1], 2),
                "upperQuartitle": round(z[2], 2),
            }

    def verify_sql_data(self):
        """Compare the SQL values with the NumPy values; report the first mismatch per subject."""
        for SUBJECTNAME in calculated_value.keys():
            expected = calculated_value.get(SUBJECTNAME)
            actual = verify_calculated_value.get(SUBJECTNAME)
            try:
                for key in ('lowerQuartile', 'median', 'upperQuartitle'):
                    assert expected.get(key) == actual.get(key), \
                        '{0}=={1} mismatch\t{2}'.format(expected.get(key), actual.get(key), (expected, actual))
            except AssertionError as err:
                print(err)


if __name__ == '__main__':
    checker = MySqlS()
    checker.query_table_data()
    checker.query_sql_data()
    checker.verify_sql_data()

>> 47.72==44.49 mismatch ({'SUBJECTNAME': 'c++', 'lowerQuartile': 47.72, 'median': 73.86, 'upperQuartitle': 112.25}, {'SUBJECTNAME': 'c++', 'lowerQuartile': 44.49, 'median': 73.86, 'upperQuartitle': 112.65})
>> 13.55==14.71 mismatch ({'SUBJECTNAME': 'gauss', 'lowerQuartile': 13.55, 'median': 29.61, 'upperQuartitle': 92.71}, {'SUBJECTNAME': 'gauss', 'lowerQuartile': 14.71, 'median': 29.62, 'upperQuartitle': 81.44})
>> 59.52==59.53 mismatch ({'SUBJECTNAME': 'python', 'lowerQuartile': 59.52, 'median': 71.44, 'upperQuartitle': 112.83}, {'SUBJECTNAME': 'python', 'lowerQuartile': 59.53, 'median': 71.44, 'upperQuartitle': 112.83})

OK, I ran it, and the assertions failed. Run SQL to pull out the rows for the three failing subjects:

select STATYHOUR from SUBJECT where CLASSID = 9 and SUBJECTNAME = 'c++' order by STATYHOUR asc;
select STATYHOUR from SUBJECT where CLASSID = 9 and SUBJECTNAME = 'gauss' order by STATYHOUR asc;
select STATYHOUR from SUBJECT where CLASSID = 9 and SUBJECTNAME = 'python' order by STATYHOUR asc;

Pasting the data into Excel, the median it gives matches the median returned by my query. Pasting the same data into Python:

import numpy as np
c = [21.86,38.03,50.95,72.3,75.42,111.86,113.43,147.58,21.86,38.03,50.95,72.3,75.42,111.86,113.43,147.58]
gauss = [1.59,3.52,12.4,17.01,20.29,38.94,58.88,103.99,144.65,144.93]
python = [16.74,19.82,59.21,59.84,66.57,71.44,87.05,111.84,113.82,122.19,123.31]
c_lowerQuartile = np.percentile(c, (25), interpolation='midpoint')
gauss_lowerQuartile = np.percentile(gauss, (25), interpolation='midpoint')
python_lowerQuartile = np.percentile(python, (25), interpolation='midpoint')
print(c_lowerQuartile)
print(gauss_lowerQuartile)
print(python_lowerQuartile)
>> 44.49
>> 14.705000000000002
>> 59.525000000000006

The problem now is that the Q1 computed by Excel and the Q1 computed by Python (NumPy) do not agree. Working hypothesis:

Excel Q1 = (upper neighbouring value - lower neighbouring value) × (R - lower row number) + lower neighbouring value
NumPy Q1 with interpolation='midpoint' = (upper neighbouring value + lower neighbouring value) / 2

The two formulas give different results whenever the fractional part of R is not 0.5. For the python subject R = 3.5, so both give 59.525 and the mismatch there is only in how the last digit is rounded.

Digging further: I have more than these three groups of data, so why did the assertions pass for the others? The plan was to find the passing data and test the rule on it. While working through the data that passed last time, a new problem appeared: on the second run of the assertion test, the java subject also failed, so the number of errors went from three to four.

>> 47.72==44.49 mismatch ({'SUBJECTNAME': 'c++', 'lowerQuartile': 47.72, 'median': 73.86, 'upperQuartitle': 112.25}, {'SUBJECTNAME': 'c++', 'lowerQuartile': 44.49, 'median': 73.86, 'upperQuartitle': 112.65})
>> 13.55==14.71 mismatch ({'SUBJECTNAME': 'gauss', 'lowerQuartile': 13.55, 'median': 29.61, 'upperQuartitle': 92.71}, {'SUBJECTNAME': 'gauss', 'lowerQuartile': 14.71, 'median': 29.62, 'upperQuartitle': 81.44})
>> 58.96==58.22 mismatch ({'SUBJECTNAME': 'java', 'lowerQuartile': 58.96, 'median': 107.58, 'upperQuartitle': 127.41}, {'SUBJECTNAME': 'java', 'lowerQuartile': 58.22, 'median': 107.58, 'upperQuartitle': 128.8})
>> 59.52==59.53 mismatch ({'SUBJECTNAME': 'python', 'lowerQuartile': 59.52, 'median': 71.44, 'upperQuartitle': 112.83}, {'SUBJECTNAME': 'python', 'lowerQuartile': 59.53, 'median': 71.44, 'upperQuartitle': 112.83})

A simple input:

z = [1,2,3,4,5,6]
x = np.percentile(z, (25), interpolation='midpoint')
print(x)
>> 2.5

Worked by hand, Q1 here should be 2.25. For now I treat Excel's result as correct, and therefore the results of the SQL I wrote as correct. I will look into NumPy's logic later to see where the difference comes from.

Tags: database, mysql
• China High-Tech Enterprises, No. 20, 2009 (cumulative No. 131): "Calculating quartiles in statistics", Zhang Yunhua, Jiangxi Vocational College of Finance and Economics, Jiangxi
• The first line of the script contains a sample data set. ... mean, 1-sigma (standard deviation), median, first quartile (25th percentile), second quartile (50th percentile), third quartile (75th percentile), k-th percentile, IQR ... matlab
• ## Quartiles

1,000+ reads 2019-09-30 10:41:42
The quartile is an important concept in statistics. Box plots used in practice are built on it, and you can only read a box plot once you understand quartiles. Here I use a practical case to walk through how quartiles are computed. First, let's look at the data...
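The quartile is an important concept in statistics. Box plots used in practice are built on it, and you can only read a box plot once you understand quartiles. Here I use a practical case to walk through how quartiles are computed.

First, let's look at the data, shown in the figure below. There are 10 values in total. [figure: raw data]

1. Before computing quartiles, the data must first be sorted in ascending order, as in the figure below. [figure: sorted data]

2. To find a quartile, first find its position. The formula below gives the position of each quartile in the sorted data. [figure: position formula] Using this formula in Excel to compute the positions of quartiles 0, 1, 2, 3 and 4 gives the following result. [figure: computed positions]

3. The quartile values are then obtained with the following calculation rule. [figure: calculation rule]

The formula may look complicated, so here is an explanation:

1. The final quartile value is the sum of two parts: the data value at the integer part of the quartile position, plus the fractional part of the quartile position multiplied by a difference.
2. The difference is the data value that follows the one at the integer part of the position, minus the data value at the integer part of the position.

Implementing that formula in Excel gives the following result. [figure: Excel result]

Note: if the total count n minus 1 is a multiple of 4, every quartile position is a whole number and each quartile can be read directly from the sorted list; otherwise the fractional part has to be handled as described above. For example, with n equal to 5, 9, 13 and so on, each quartile can be found directly in the list.

Finally, a figure showing the elements of a box plot. [figure: box plot elements]

1. Q1 is the first quartile, Q2 is the second quartile (the median), and Q3 is the third quartile.
2. IQR is the difference between the third quartile and the first quartile.
3. The tick at the far left of the dashed line marks Q1 - 1.5 × IQR; the tick at the far right marks Q3 + 1.5 × IQR.
4. The black dots on either side are outliers; the leftmost outlier is the minimum and the rightmost outlier is the maximum.

These results clear up some common misconceptions about quartiles:

1. The median is not twice the first quartile, and the third quartile is not three times the first quartile.
2. Do not read the ticks at the ends of the dashed line in a box plot as the 0th and 4th quartiles. Those two quartiles are the minimum and the maximum, and they may turn out to be outliers.

Reposted from: https://www.cnblogs.com/alexywt/p/11408460.html
• # Compute the quartiles of the data in a list import pandas as pd import numpy as np import math # Load as a DataFrame df = pd.read_csv('test.csv', sep=',', header=None) # Extract the column and convert it to a list data1 = df.... python
• There are three quartiles: the first is called the lower quartile, the second is the median, and the third is called the upper quartile, written Q1, Q2 and Q3. Statistical explanation: there are two ways to determine quartile positions. One is the method of the Excel function QUARTILE.EXC... python, quartiles
• I have a list of numbers [1, 2, 3, 4, 5, 6, 7] and I want to have a function to return the interquartile range of this list of numbers. The interquartile range is the difference between the upper and ...
• Once you are fluent with Excel it can make office work more efficient. Next, here is how to compute quartiles in Excel. Steps: 1. Open the Excel sheet. Open the sheet whose quartiles you want to compute, select the data cells to use, and after the minimum...
• 1. Box plot: quartiles. To filter abnormal data with a box plot, first compute the upper and lower quartiles, then compute the minimum min and maximum max to obtain the threshold range [min, max] for judging outliers. Quartiles use 3 points (Q1, Q2, Q3) to divide all the data into 4 equal parts, where... python, box plot
• Oracle quartile functions --1) Create the test table CREATE TABLE palan.quartitles_test( age VARCHAR(255) ); --2) Insert the test data INSERT INTO palan.quartitles_test VALUES('6'); INSERT INTO palan.quartitles_test VALUES...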
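data mining, artificial intelligence, oracle
• ## Median and quartiles

1,000+ reads 2021-04-19 04:05:40
3. Quartiles in Excel and Matlab. Excel first: MEDIAN(array) returns the median. QUARTILE(array,quart), where the second argument is: 0 for the minimum (same as min), 1 for the 25% value, 2 for the 50% value (same as Median), 3 for the 75% value, 4 for the maximum (same as max). Percentile ...
• Before talking about quartiles, let's look at what the median is. Computing the median takes 2 steps. If the procedure sounds like a headache, don't worry: the picture below uses a concrete example to show how the median is computed. Step 2: compute the middle position. The total count, 4, is even, so the middle value is...
• | 19| 001| 94| | 20| 001| 220| +--------------+--------------+-------------+ I want to compute the height quartiles for each "ID2" and classify each as tall, medium or short according to the following criteria: ... Looking through the pyspark.sql module I found a summary() ...
• Two ways of finding quartiles. In our introductory data course we learned how to find quartiles, and it is not hard in practice: first use (n+1) / 4 * i to compute the position of the quartile, then find the value at that position. For a data set such as [1, 3, 6, 8, 10], first apply the formula...
• Overview. Interquartile range: P25, P50 and P75 divide a set of values into four equal parts. P25 is called the lower quartile and P75 the upper quartile, and the difference between P75 and P25 is defined as the interquartile range. It is the difference between the upper and lower quartiles and reflects the degree of variation. That is: ...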
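• ## How quartiles are calculated

1,000+ reads 2020-12-01 01:32:20
I was studying statistics today and got thoroughly stuck on quartiles. The material online is unreliable (Baidu was no help). In the end I pieced together the method from several sources and wrote this article, my first, as a personal record. Whether it is Python (the describe method) or Excel's...
• ## Quartiles in SQL

1,000+ reads 2021-02-22 08:42:29
Quartiles: in statistics, sort all values in ascending order and divide them into four equal parts; the values at the three split points are the quartiles. The first quartile (Q1), also called the "lower quartile", is the value at the 25% position after sorting the sample in ascending order. The second quartile ...
• ## Introduction to quartiles

1,000+ reads 2019-12-25 00:02:38
Descriptive statistics reduces a set of complex data to a few descriptive numbers that stand in for all of the data. Four key ideas are the mean (μ), quartiles, the standard deviation (σ) and the standard score (z). A brief introduction to quartiles ...
• Enter a data sequence separated by spaces, tabs, line breaks or (half-width) commas and click Calculate to get the element count, the ascending sort, the quartile positions, the quartiles, the interquartile range and other results. Steps: type the data directly or copy it from a record sheet, paste it into the input box, and click Calculate...
• Quartiles of amount contribution int[] param = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,12,13}; // BigDecimal[] datas = new BigDecimal[param.length]; for (int i = 0; i < param.length; i++) { datas[i] = BigDecimal.... java, programming languages, backend
• After grouping the data, a few groups have only 2 values or 1 value. How should quartiles be reported in that case? 1. For example, one group has only two values, 0.62 and 0.9. SPSS gives P25 as 0.62 and P50 as 0.76, with no P75. I report it as 0.76 (0.62, -), where the - means there is no...
• While learning data analysis in Python I ran into the problem of computing quartiles: different formulas give different results. Baidu only shows one method, which easily confuses beginners, so here is a summary. Method for determining the three quartiles: first sort in ascending order,...
• This installment of the data analysis basics series covers the principle and use of quartiles and how to compute them, then draws a box plot from the quartiles and briefly explains how to detect outliers with it. It uses case studies on school grades and income, and the content...
• ## How to calculate quartiles

1,000+ reads 2021-03-11 17:14:08
1. Sort the data in ascending order as an array a(1 to n), where n is the length of the data. 2. Determine the position of the quartile: b = 1+(n-1) × 0.25 = 2.25. Call the integer part of b c and the fractional part of b d. Compute...
• Function prototype: DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear'). Parameters: q : float or array-like, default 0.5 (the 50% quantile, i.e. the median or second quartile) 0
• A small summary, for reference only. 1. When the sequence has an odd number of items: 3, 5, 9, 11, 17, 19, 35. First compute the position, then use the position to find the corresponding value ... When the index is exactly a whole number, the corresponding values are Q1=5, Q2=11, Q3=19 ... Q2: (n+1)*...  ...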
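
The position-first approach in the last entry can be sketched as follows, assuming the rule is position = (n + 1) × i / 4 (1-based, i = 1, 2, 3) with linear interpolation when the position is fractional. The function name `quartiles_n_plus_1` is a hypothetical helper introduced for this illustration, and the interpolation step is an assumption, since the entry is truncated before it describes that case.

```python
import math

def quartiles_n_plus_1(values):
    """Q1, Q2, Q3 at 1-based positions (n + 1) * i / 4, interpolating linearly."""
    data = sorted(values)
    n = len(data)
    result = []
    for i in (1, 2, 3):
        pos = (n + 1) * i / 4            # 1-based position in the sorted data
        pos = min(max(pos, 1), n)        # clamp so very small samples stay in range
        lower = math.floor(pos)          # row at or just below the position
        frac = pos - lower               # fractional part of the position
        value = data[lower - 1]
        if frac:                         # fractional position: move towards the next value
            value += (data[lower] - data[lower - 1]) * frac
        result.append(value)
    return result

print(quartiles_n_plus_1([3, 5, 9, 11, 17, 19, 35]))  # [5, 11, 19]
```

For the seven values in the entry the positions are 2, 4 and 6, so the quartiles are read straight from the sorted list. Python's standard library uses the same (n + 1) positions in `statistics.quantiles(data, n=4, method='exclusive')`, while `method='inclusive'` corresponds to the (n - 1) rule used by Excel's QUARTILE.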