精华内容
下载资源
问答
  • 四分位数

    千次阅读 2019-09-30 10:41:42
    四分位数是统计学里一个很重要的概念,实际应用中,所画出来的箱图,就使用到了这个概念,只有懂了四分位的概念才能看懂箱图所表达的意思。我这里通过一个实际的案例来说明四分位数的求取过程。 首先我们看下数据的...

    四分位数是统计学里一个很重要的概念,实际应用中,所画出来的箱图,就使用到了这个概念,只有懂了四分位的概念才能看懂箱图所表达的意思。我这里通过一个实际的案例来说明四分位数的求取过程。

     

    首先我们看下数据的情况,如下图所示,数据的总个数为10个

    1、在求取四分位数据时,首先必须做的是要对数据进行升序排序,如下图。

    image

     

    2、四分位求取,首先需要得出该四分位数的位置,如下是四分位数在排序后数据中的位置的公式

    image

    在Excel中使用以上的公式来计算第0、1、2、3、4个四分位值处的位置,得出如下结果

    image

     

    3、四分位数的求取,使用的是以下所说明的一套计算规则

    image

    公式可能看起来较为复杂,说明一下:

    1、四分位数最终的结果由2部分相加得到,其一是四分位位置的整数部分对应的数据,其二是四分位位置的小数部分乘以差值得到

    2、差值是四分位位置整数部分对应数据的下一个数据减去四分位位置整数部分对应的数据。

     

    在Excel中将上面那个公式实现的结果如下:

    image

     

    注意:如果数据的总个数n减去1,得到的结果是4的整数倍的话,那么四分位数的位置都是整数值,各四分位数就可以在升序排序后的数列中直接找到,否则就需要通过以上方式计算小数部分。

    例如:n的值为5、9、13等等,就是可以在数列中直接找到各四分位数。

     

    最后我们用一张图来看看箱图中各个图形元素:

    1、Q1表示第1个四分位值,Q2表示第2个四分位值,即中位数,Q3表示第3个中位数

    2、IQR是第3个中位数与第1个中位数的差值

    3、虚线最左侧的划线表示Q1-1.5倍IQR,虚线最右侧的划线表示Q1+1.5倍IQR

    4、左右的黑点是离群点,最左侧的离群点是最小值,最右侧离群点是最大值

    image

     

    从以上结果可以解答以往对四分位数的误区:

    1、四分之一分位数并非中位数的2倍,也并非是四分之三分为的3倍

    2、不要将箱图虚线左右侧的划线理解为四分之0分位,四分之四分位值,这2个分位值分别是最小值和最大值,它们可能会成为离群点

    转载于:https://www.cnblogs.com/alexywt/p/11408460.html

    展开全文
  • 四分位数和百分位数_20种四分位数

    千次阅读 2020-07-22 10:43:06
    四分位数和百分位数 四分位数 (Quartiles) To calculate a quartile of a sample is in theory easy, and is much like calculating the median. The difficult part is the implementation; contrary to ...

    四分位数和百分位数

    四分位数 (Quartiles)

    To calculate a quartile of a sample is in theory easy, and is much like calculating the median. The difficult part is the implementation; contrary to calculating the median, there exists no single specific method that stands above the rest or can be considered the "best" method among the about twenty known methods for calculating a quartile. The "best" method will be the method that fits the purpose or - in some areas - is considered a de-facto standard.

    从理论上说,计算样本的四分位数很容易,并且很像计算中位数。 困难的部分是执行; 与计算中位数相反,在计算四分位数的大约二十种已知方法中,没有任何一种特定的方法可以胜过其他方法,也可以认为是“最佳”方法。 “最佳”方法将是适合目的的方法,或者在某些方面被认为是事实上的标准。

    Why, how, and when to calculate quartiles using which method is outside the scope of this article. Many articles and even books covering this have been written. However, the day you face the task to calculate a quartile using some specific method, the functions here will help you.

    为什么,如何以及何时 使用哪种方法来计算四分位数不在本文的讨论范围之内。 已经写了许多有关此的文章,甚至书籍。 但是,当您面对使用某种特定方法计算四分位数的任务时,此处的功能将为您提供帮助。

    方法 (Methods)

    It is quite hard to even obtain a list of known methods for calculating a quartile, not to say proven results from these. The best source, I've located (see bottom of the article), is quite old and lists 14 methods:

    甚至很难获得已知的计算四分位数的方法的列表,更不用说从中得出的可靠结果。 我找到的最好的资源(请参阅本文的底部)已经很旧了,并列出了14种方法:

    The additional six methods, I have located here and there. Unfortunately, the sources have vanished.

    我已经在这里和那里找到了另外六个方法。 不幸的是,消息来源已经消失了。

    If you are aware of any good source, please add a comment to the article.

    如果您知道任何好的来源,请在文章中添加评论。

    The methods have been collected as an enum including as in-line comments their names, applications, and sources, together with their basic calculation methods for the first and the third quartile (the second is always calculated as the median):

    这些方法已作为一个枚举收集,其中包括它们的名称,应用程序和来源以及它们在第一第三四分位数中的基本计算方法(作为内联注释,以内嵌注释)( 第二个始终以中位数计算):

    ' Quartile calculation methods.
    ' Values equal those listed in the source. See function Quartile.
    '
    ' Common names of variables used in calculation formulas.
    '
    ' L: Q1, Lower quartile.
    ' H: Q3, Higher quartile.
    ' M: Q2, Median (not used here).
    ' n: Count of elements.
    ' p: Calculated position of quartile.
    ' j: Element of dataset.
    ' g: Decimal part of p to be used for interpolation between j and j+1.
    '
    Public Enum ApQuartileMethod
        [_First] = 1
        
        ' Basic calculation methods.
        
        ' Step. Mendenhall and Sincich method.
        '   SAS #3.
        '   Round up to actual element of dataset.
        '   L:  -Int(-n/4)
        '   H: n-Int(-n/4)
        apMendenhallSincich = 1
        
        ' Average step.
        '   SAS #5, Minitab (%DESCRIBE), GLIM (percentile).    '
        '   Add bias of one on basis of n/4.
        '   L:   CLng((n+2)/2)/2
        '   H: n-Clng((n+2)/2)/2
        '   Note:
        '       Replaces these original formulas that don't return the expected values.
        '   L:   (Int((n+1)/4)+Int(n/4))/2+1
        '   H: n-(Int((n+1)/4)+Int(n/4))/2+1
        apAverage = 2
        
        ' Nearest integer to np.
        '   SAS #2.
        '   Round to nearest integer on basis of n/4.
        '   L:   CLng(n/4)
        '   H: n-CLng(n/4)
        '   Note:
        '       Replaces these original formulas that don't return the expected values.
        '   L:   Int((n+2)/4)
        '   H: n-Int((n+2)/4)
        apNearestInteger = 3
        
        ' Parzen method.
        '   Method 1 with interpolation.
        '   SAS #1.
        '   L: n/4
        '   H: 3n/4
        apParzen = 4
        
        ' Hazen method.
        '   Values midway between method 1 steps.
        '   GLIM (interpolate).
        '   Wikipedia method 3.
        '   Add bias of 2, don't round to actual element of dataset.
        '   L: (n+2)/4
        '   H: 3(n+2)/4-1
        apHazen = 5
        
        ' Weibull method.
        '   SAS #4. Minitab (DECRIBE), SPSS, BMDP, Excel exclusive.
        '   Add bias of 1, don't round to actual element of dataset.
        '   L: (n+1)/4
        '   H: 3(n+1)/4
        apWeibull = 6
        
        ' Freund, J. and Perles, B., Gumbell method.
        '   S-PLUS, R, Excel legacy, Excel inclusive, Star Office Calc.
        '   Add bias of 3, don't round to actual element of dataset.
        '   L: (n+3)/4
        '   H: (3n+1)/4
        apFreundPerlesGumbell = 7
        
        ' Median Position.
        '   Median unbiased.
        '   L: (3n+5)/12
        '   H: (9n+7)/12
        apMedianPosition = 8
        
        ' Bernard and Bos-Levenbach.
        '   L: (n/4)+0.4
        '   H: (3n/4)/+0.6
        '   Note:
        '       Reference claims L to be (n/4)+0.31.
        apBernardBosLevenbach = 9
        
        ' Blom's Plotting Position.
        '   Better approximation when the distribution is normal.
        '   L: (4n+7)/16
        '   H: (12n+9)/16
        apBlom = 10
        
        ' Moore's first method.
        '   Add bias of one half step.
        '   L: (n+0.5)/4
        '   H: n-(n+0.5)/4
        apMooreFirst = 11
        
        ' Moore's second method.
        '   Add bias of one or two steps on basis of (n+1)/4.
        '   L:   (Int((n+1)/4)+Int(n/4))/2+1
        '   H: n-(Int((n+1)/4)+Int(n/4))/2+1
        apMooreSecond = 12
        
        ' John Tukey's method.
        '   Include median from odd dataset in dataset for quartile.
        '   Wikipedia method 2.
        '   L:   (1-Int(-n/2))/2
        '   H: n-(-1-Int(-n/2))/2
        apTukey = 13
        
        ' Moore and McCabe (M & M), variation of John Tukey's method.
        '   TI-83.
        '   Wikipedia method 1.
        '   Exclude median from odd dataset in dataset for quartile.
        '   L:   (Int(n/2)+1)/2
        '   H: n-(Int(n/2)-1)/2
        apTukeyMooreMcCabe = 14
        
        ' Additional variations between Weibull's and Hazen's methods, from
        '   (i-0.000)/(n+1.00)
        ' to
        '   (i-0.500)/(n+0.00)
        
        ' Variation of Weibull.
        '   L: n(n/4-0)/(n+1)
        '   H: n(3n/4-0)/(n+1)
        apWeibullVariation = 15
        
        ' Variation of Blom.
        '   L: n(n/4-3/8)/(n+1/4)
        '   H: n(3n/4-3/8)/(n+1/4)
        apBlomVariation = 16
        
        ' Variation of Tukey.
        '   L: n(n/4-1/3)/(n+1/3)
        '   H: n(3n/4-1/3)/(n+1/3)
        apTukeyVariation = 17
        
        ' Variation of Cunnane.
        '   L: n(n/4-2/5)/(n+1/5)
        '   H: n(3n/4-2/5)/(n+1/5)
        apCunnaneVariation = 18
        
        ' Variation of Gringorten.
        '   L: n(n/4-0.44)/(n+0.12)
        '   H: n(3n/4-0.44)/(n+0.12)
        apGringortenVariation = 19
        
        ' Variation of Hazen.
        '   L: n(n/4-1/2)/n
        '   H: n(3n/4-1/2)/n
        apHazenVariation = 20
        
        [_Last] = 20
    End Enum 
    

    The actual calculation methods have been tweaked a little to fit VBA and to correct for weird results when a sample consists of very few elements.

    实际计算方法已进行了一些调整,以适合VBA并在样本包含很少元素的情况下纠正怪异的结果。

    功能 (Functions)

    The main function is named Quartile and has the native domain aggregate functions, DAvg etc., in mind as it takes an Expression, a Domain, and a Criteria (filter) as arguments. Other arguments are the quartile Part to return and the Method to use:

    主函数被命名为四分位数,并具有本机域聚合函数DAvg等,因为它需要一个表达式,一个和一个条件 (过滤器)作为参数。 其他参数是要返回的四分位数部分和要使用的方法

    Expression: Name of the field or an expression to analyse.
    Domain    : Name of the source/query, or an SQL select query, to analyse.
    Criteria  : Optional. A filter expression for Domain.
    Part      : Optional. Which median/quartile or min/max value to return.
                Default is the median value.
    Method    : Optional. Method for calculation of lower/higher quartile.
                Default is the method by Freund, Perles, and Gumbell (Excel).  
    

    The function can be regarded to have four main parts:

    该功能可以认为具有四个主要部分:

    1. Build the SQL to retrieve the ordered samples

      构建SQL以检索有序的样本
    2. Calculate either the minimum or maximum value, the first or third quartile, or the median

      计算最小值或最大值,第一或第三四分位数或中位数
    3. Prepare for interpolation

      准备插值
    4. Calculate the final output

      计算最终输出
    Public Function Quartile( _
        ByVal Expression As String, _
        ByVal Domain As String, _
        Optional ByVal Criteria As String, _
        Optional ByVal Part As ApQuartilePart = ApQuartilePart.apMedian, _
        Optional ByVal Method As ApQuartileMethod = ApQuartileMethod.apFreundPerlesGumbell) _
        As Double
      
        ' SQL.
        Const SqlMask           As String = "Select {0} From {1} {2}"
        Const SqlLead           As String = "Select "
        Const SubMask           As String = "({0}) As T"
        Const FilterMask        As String = "Where {0} "
        Const OrderByMask       As String = "Order By {0} Asc"
        
        Dim Records     As DAO.Recordset
        
        Dim Sql         As String
        Dim SqlSub      As String
        Dim Filter      As String
        Dim Count       As Long     ' n.
        Dim Position    As Double   ' p.
        Dim Element     As Long     ' j.
        Dim Interpolate As Double   ' g.
        Dim ValueOne    As Double
        Dim ValueTwo    As Double
        Dim Value       As Double
        
        ' Return default quartile part if choice of part is
        ' outside the range of ApQuartilePart.
        If Not IsQuartilePart(Part) Then
            Part = ApQuartilePart.apMedian
        End If
        
        ' Use a default calculation method if choice of method is
        ' outside the range of ApQuartileMethod.
        If Not IsQuartileMethod(Method) Then
            Method = ApQuartileMethod.apFreundPerlesGumbell
        End If
        
        If Domain <> "" And Expression <> "" Then
            ' Build SQL to lookup values.
            If InStr(1, LTrim(Domain), SqlLead, vbTextCompare) = 1 Then
                ' Domain is an SQL expression.
                SqlSub = Replace(SubMask, "{0}", Domain)
            Else
                ' Domain is a table or query name.
                SqlSub = Domain
            End If
            If Trim(Criteria) <> "" Then
                ' Build Where clause.
                Filter = Replace(FilterMask, "{0}", Criteria)
            End If
            ' Build final SQL.
            Sql = Replace(Replace(Replace(SqlMask, "{0}", Expression), "{1}", SqlSub), "{2}", Filter) & _
                Replace(OrderByMask, "{0}", Expression)
            Set Records = CurrentDb.OpenRecordset(Sql, dbOpenSnapshot)
          
            With Records
                If Not .EOF = True Then
                    If Part = ApQuartilePart.apMinimum Then
                        ' No need to count records.
                        Count = 1
                    Else
                        ' Count records.
                        .MoveLast
                        Count = .RecordCount
                    End If
                    Select Case Part
                        Case ApQuartilePart.apMinimum
                            ' Current record is first record.
                            ' Read value of this record.
                        Case ApQuartilePart.apMaximum
                            ' Current record is last record.
                            ' Read value of this record.
                        Case ApQuartilePart.apMedian
                            ' Locate position of median.
                            Position = (Count + 1) / 2
                        Case ApQuartilePart.apLower
                            Select Case Method
                                Case ApQuartileMethod.apMendenhallSincich
                                    Position = -Int(-Count / 4)
                                Case ApQuartileMethod.apAverage
                                    Position = CLng((Count + 2) / 2) / 2
                                Case ApQuartileMethod.apNearestInteger
                                    Position = CLng(Count / 4)
                                Case ApQuartileMethod.apParzen
                                    Position = Count / 4
                                Case ApQuartileMethod.apHazen
                                    Position = (Count + 2) / 4
                                Case ApQuartileMethod.apWeibull
                                    Position = (Count + 1) / 4
                                Case ApQuartileMethod.apFreundPerlesGumbell
                                    Position = (Count + 3) / 4
                                Case ApQuartileMethod.apMedianPosition
                                    Position = (3 * Count + 5) / 12
                                Case ApQuartileMethod.apBernardBosLevenbach
                                    Position = (Count / 4) + 0.4
                                Case ApQuartileMethod.apBlom
                                    Position = (4 * Count + 7) / 16
                                Case ApQuartileMethod.apMooreFirst
                                    Position = (Count + 0.5) / 4
                                Case ApQuartileMethod.apMooreSecond
                                    Position = (Int((Count + 1) / 4) + Int(Count / 4)) / 2 + 1
                                Case ApQuartileMethod.apTukey
                                    Position = (1 - Int(-Count / 2)) / 2
                                Case ApQuartileMethod.apTukeyMooreMcCabe
                                    Position = (Int(Count / 2) + 1) / 2
                                Case ApQuartileMethod.apWeibullVariation
                                    Position = Count * (Count / 4) / (Count + 1)
                                Case ApQuartileMethod.apBlomVariation
                                    Position = Count * (Count / 4 - 3 / 8) / (Count + 1 / 4)
                                Case ApQuartileMethod.apTukeyVariation
                                    Position = Count * (Count / 4 - 1 / 3) / (Count + 1 / 3)
                                Case ApQuartileMethod.apCunnaneVariation
                                    Position = Count * (Count / 4 - 2 / 5) / (Count + 1 / 5)
                                Case ApQuartileMethod.apGringortenVariation
                                    Position = Count * (Count / 4 - 0.44) / (Count + 0.12)
                                Case ApQuartileMethod.apHazenVariation
                                    Position = Count * (Count / 4 - 1 / 2) / Count
                            End Select
                        Case ApQuartilePart.apUpper
                            ' Default position for very low counts for several methods
                            Position = Count
                            Select Case Method
                                Case ApQuartileMethod.apMendenhallSincich
                                    If Count > 2 Then
                                        Position = Count - (-Int(-Count / 4))
                                    End If
                                Case ApQuartileMethod.apAverage
                                    If Count > 2 Then
                                        Position = Count - CLng((Count + 2) / 2) / 2
                                    End If
                                Case ApQuartileMethod.apNearestInteger
                                    Position = Count - CLng(Count / 4)
                                Case ApQuartileMethod.apParzen
                                    Position = 3 * Count / 4
                                Case ApQuartileMethod.apHazen
                                    If Count > 1 Then
                                        Position = 3 * (Count + 2) / 4 - 1
                                    End If
                                Case ApQuartileMethod.apWeibull
                                    If Count > 2 Then
                                        Position = 3 * (Count + 1) / 4
                                    End If
                                Case ApQuartileMethod.apFreundPerlesGumbell
                                    Position = (3 * Count + 1) / 4
                                Case ApQuartileMethod.apMedianPosition
                                    If Count > 2 Then
                                        Position = (9 * Count + 7) / 12
                                    End If
                                Case ApQuartileMethod.apBernardBosLevenbach
                                    If Count > 2 Then
                                        Position = (3 * Count / 4) + 0.6
                                    End If
                                Case ApQuartileMethod.apBlom
                                    If Count > 2 Then
                                        Position = (12 * Count + 9) / 16
                                    End If
                                Case ApQuartileMethod.apMooreFirst
                                    Position = Count - (Count + 0.5) / 4
                                Case ApQuartileMethod.apMooreSecond
                                    ' Basic calculation method. Will fail for 2 or 3 elements.
                                    '   Position = Count - (Int((Count + 1) / 4) + Int(Count / 4)) / 2 + 1
                                    ' Calculation method adjusted to accept 2 or 3 elements.
                                    Position = Count - (Int((Count + Int((Count * 2) / (Count + 4))) / 4) + Int(Count / 4)) / 2 + 1
                                Case ApQuartileMethod.apTukey
                                    Position = Count - (-1 - Int(-Count / 2)) / 2
                                Case ApQuartileMethod.apTukeyMooreMcCabe
                                    If Count > 1 Then
                                        Position = Count - (Int(Count / 2) - 1) / 2
                                    End If
                                Case ApQuartileMethod.apWeibullVariation
                                    Position = Count * (3 * Count / 4) / (Count + 1)
                                Case ApQuartileMethod.apBlomVariation
                                    Position = Count * (3 * Count / 4 - 3 / 8) / (Count + 1 / 4)
                                Case ApQuartileMethod.apTukeyVariation
                                    Position = Count * (3 * Count / 4 - 1 / 3) / (Count + 1 / 3)
                                Case ApQuartileMethod.apCunnaneVariation
                                    Position = Count * (3 * Count / 4 - 2 / 5) / (Count + 1 / 5)
                                Case ApQuartileMethod.apGringortenVariation
                                    Position = Count * (3 * Count / 4 - 0.44) / (Count + 0.12)
                                Case ApQuartileMethod.apHazenVariation
                                    Position = Count * (3 * Count / 4 - 1 / 2) / Count
                            End Select
                    End Select
                    Select Case Part
                        Case ApQuartilePart.apMinimum, ApQuartilePart.apMaximum
                            ' Read current row.
                        Case Else
                            .MoveFirst
                            ' Find position of first observation to retrieve.
                            ' If Element is 0, then upper position is first record.
                            ' If Element is not 0 and position is not an integer, then
                            ' read the next observation too.
                            Element = Fix(Position)
                            Interpolate = Position - Element
                            If Count = 1 Then
                                ' Nowhere else to move.
                                If Interpolate < 0 Then
                                    ' Prevent values to be created by extrapolation beyond zero from observation one
                                    ' for these methods:
                                    '   ApQuartileMethod.apBlomVariation
                                    '   ApQuartileMethod.apTukeyVariation
                                    '   ApQuartileMethod.apCunnaneVariation
                                    '   ApQuartileMethod.apGringortenVariation
                                    '   ApQuartileMethod.apHazenVariation
                                    '
                                    ' Comment this line out, if reading by extrapolation *is* requested.
                                    Interpolate = 0
                                End If
                            ElseIf Element > 1 Then
                                ' Move to the record to read.
                                .Move Element - 1
                                ' Special case for apMooreSecond and upper quartile for 2 and 3 elements.
                                If .EOF Then
                                    .MoveLast
                                End If
                            End If
                    End Select
                    ' Retrieve value from first observation.
                    ValueOne = .Fields(0).Value
              
                    Select Case Part
                        Case ApQuartilePart.apMinimum, ApQuartilePart.apMaximum
                            Value = ValueOne
                        Case Else
                            If Interpolate = 0 Then
                                ' Only one observation to read.
                                If Element = 0 Then
                                    ' Return 0.
                                Else
                                    Value = ValueOne
                                End If
                            Else
                                If Element = 0 Or Element = Count Then
                                    ' No first/last observation to retrieve.
                                    ValueTwo = ValueOne
                                    If ValueOne > 0 Then
                                        ' Use 0 as other observation.
                                        ValueOne = 0
                                    Else
                                        ValueOne = 2 * ValueOne
                                    End If
                                Else
                                    ' Move to next observation.
                                    .MoveNext
                                    ' Retrieve value from second observation.
                                    ValueTwo = .Fields(0).Value
                                End If
                                ' For positive values interpolate between 0 and ValueOne.
                                ' For negative values interpolate between 2 * ValueOne and ValueOne.
                                ' Calculate quartile using linear interpolation.
                                Value = ValueOne + Interpolate * CDec(ValueTwo - ValueOne)
                            End If
                    End Select
                End If
                .Close
            End With
        End If
          
        Quartile = Value
    
    End Function 
    

    Two important features are, that the Domain argument can be an SQL select query, and the samples in the passed records do not have to be sorted. The function will itself take care of sorting the samples. 

    两个重要功能是,Domain参数可以是SQL select查询 ,并且传递记录中的样本不必排序 。 该函数本身将负责对样本进行排序。

    Thus, typical usages can be as listed here where the resulting SQL has been included for better understanding of the parsing of the Domain argument done by the function:

    因此,典型用法可以列在此处,其中包括了生成SQL,以更好地理解函数完成的Domain参数的解析:

    ' Example calls and the internally generated SQL:
    '
    '   With fieldname as expression, table (or query) as domain, no filter, and default sorting:
    '       Q1 = Quartile("Data", "Observation", , apFirst, apFreundPerlesGumbell)
    '       Select Data From Observation Order By Data Asc
    '
    '   With two fieldnames as expression, table (or query) as domain, no filter, and sorting on two fields:
    '       Q1 = Quartile("Data, Step", "Observation", , apFirst, apFreundPerlesGumbell)
    '       Select Data, Step From Observation Order By Data, Step Asc
    '
    '   With fieldname as expression, SQL as domain, no filter, and default sorting:
    '       Q1 = Quartile("Data", "Select Data From Observation", , apFirst, apFreundPerlesGumbell)
    '       Select Data From (Select Data From Observation) As T Order By Data Asc
    '
    '   With fieldname as expression, SQL as domain, simple filter, and sorting on one field:
    '       Q1 = Quartile("Data", "Select Data, Step From Observation", "Step = 10", apFirst, apFreundPerlesGumbell)
    '       Select Data From (Select Data, Step From Observation) As T Where Step = 10 Order By Data Asc
    '
    '   With calculated expression, SQL as domain, extended filter, and sorting on one field:
    '       Q1 = Quartile("Data * 10", "Select Data, Step From Observation", "Step = 10 And Data <= 40", apFirst, apFreundPerlesGumbell)
    '       Select Data * 10 From (Select Data, Step From Observation) As T Where Step = 10 And Data <= 40 Order By Data * 10 Asc
    '
    '   With filtered SQL domain, additional filter, and sorting on one field:
    '       Q1 = Quartile("Data", "Select Data, Step From Observation Where Step = 10", "Data <= 40", apFirst, apFreundPerlesGumbell)
    '       Select Data From (Select Data, Step From Observation Where Step = 10) As T Where Data <= 40 Order By Data Asc
    '
    '   With filtered SQL domain, additional filter, and sorting on two fields:
    '       Q1 = Quartile("Step, Data", "Select Data, Step From Observation Where Step = 10", "Data <= 40", apFirst, apFreundPerlesGumbell)
    '       Select Step, Data From (Select Data, Step From Observation Where Step = 10) As T Where Data <= 40 Order By Step, Data Asc 
    

    Note please, that the function is heavily in-line documented as the code otherwise would be uncomprehensive.

    请注意,该函数已大量内联文档,否则代码将不完整。

    域功能 (Domain functions)

    To ease the use, indeed in queries, two domain functions supplement the main function:

    为了简化在查询中的使用,两个域函数补充了主要功能:

    DMedian

    DMedian

    DQuartile

    四分位数

    These mimic the native Dxxx domain aggregate functions and take only the arguments needed, using default values - for DMedian, for the part to return and, for DQuartile, for the calculation method to use; that method has been chosen to be the original method used by Excel (formulas QUARTILE and QUARTILE.INCL):

    它们模仿本地的Dxxx域聚合函数,并使用默认值仅接受所需的参数-对于DMedian,用于返回的部分,对于DQuartile,用于使用的计算方法; 该方法已被选为Excel所使用的原始方法(公式QUARTILE和QUARTILE.INCL):

    ' Returns the median of a field of a table/query.
    '
    ' Parameters:
    '   Expression: Name of the field or an expression to analyse.
    '   Domain    : Name of the source/query, or an SQL select query, to analyse.
    '   Criteria  : Optional. A filter expression for Domain.
    '
    ' Reference and examples: See function Quartile.
    '
    ' Data must be in ascending order by Field.
    '
    ' 2019-08-15. Gustav Brock, Cactus Data ApS, CPH.
    '
    Public Function DMedian( _
        ByVal Expression As String, _
        ByVal Domain As String, _
        Optional ByVal Criteria As String) _
        As Double
        
        Dim Value       As Double
        
        Value = Quartile(Expression, Domain, Criteria)
        
        DMedian = Value
    
    End Function 
    
    ' Returns the upper or lower quartile or the median or the
    ' minimum or maximum value of a field of a table/query
    ' using the method by Freund, Perles, and Gumbell (Excel).
    '
    ' Parameters:
    '   Expression: Name of the field or an expression to analyse.
    '   Domain    : Name of the source/query, or an SQL select query, to analyse.
    '   Criteria  : Optional. A filter expression for Domain.
    '   Part      : Optional. Which median/quartile or min/max value to return.
    '               Default is the median value.
    '
    ' Reference and examples: See function Quartile.
    '
    ' 2019-08-15. Gustav Brock, Cactus Data ApS, CPH.
    '
    Public Function DQuartile( _
        ByVal Expression As String, _
        ByVal Domain As String, _
        Optional ByVal Criteria As String, _
        Optional ByVal Part As ApQuartilePart = ApQuartilePart.apMedian) _
        As Double
        
        Dim Value       As Double
        
        Value = Quartile(Expression, Domain, Criteria, Part)
        
        DQuartile = Value
    
    End Function 
    

    结果 (Results)

    An example workbook with generated results from the Excel formulas is attached for reference. 

    随附一个示例工作簿,其中包含从Excel公式生成的结果,以供参考。

    It displays like this:

    它显示如下:

    The output from the function ListExcelQuartile, found in the attached Access example file, lists identical values.

    在附件的Access示例文件中找到的ListExcelQuartile函数的输出列出了相同的值。

    The two methods are our methods 7 and 6, or the enum elements apFreundPerlesGumbell and apWeibull:

    这两种方法是我们的​​方法76,或者是枚举元素apFreundPerlesGumbellapWeibull

                   100           99            98            97            96            95 
    INCLUDE (LEGACY)
     7            25,75         25,50         25,25         25,00         24,75         24,50         
     7            50,50         50,00         49,50         49,00         48,50         48,00         
     7            75,25         74,50         73,75         73,00         72,25         71,50         
    
    EXCLUDE
     6            25,25         25,00         24,75         24,50         24,25         24,00         
     6            50,50         50,00         49,50         49,00         48,50         48,00         
     6            75,75         75,00         74,25         73,50         72,75         72,00  
    

    Likewise, the function ListFirstQuartile returns an output similar to the results from the main source (table H-4 at top):

    同样,函数ListFirstQuartile返回的输出类似于主源的结果(顶部的表H-4):

                   40            50            60            70 
     1            10,00         20,00         20,00         20,00         
     2            15,00         20,00         20,00         20,00         
     3            10,00         10,00         20,00         20,00         
     4            10,00         12,50         15,00         17,50         
     5            15,00         17,50         20,00         22,50         
     6            12,50         15,00         17,50         20,00         
     7            17,50         20,00         22,50         25,00         
     8            14,17         16,67         19,17         21,67         
     9            14,00         16,50         19,00         21,50         
     10           14,38         16,88         19,38         21,88         
     11           11,25         13,75         16,25         18,75         
     12           20,00         20,00         20,00         25,00         
     13           15,00         20,00         20,00         25,00         
     14           15,00         15,00         20,00         20,00         
     15           8,00          10,42         12,86         15,31         
     16           5,88          8,33          10,80         13,28         
     17           6,15          8,59          11,05         13,52         
     18           5,71          8,17          10,65         13,13         
     19           5,44          7,91          10,39         12,88         
     20           5,00          7,50          10,00         12,50         
    
                   100           99            98            97            96            95 
     1            25,00         25,00         25,00         25,00         24,00         24,00         
     2            25,50         25,00         25,00         25,00         24,50         24,00         
     3            25,00         25,00         24,00         24,00         24,00         24,00         
     4            25,00         24,75         24,50         24,25         24,00         23,75         
     5            25,50         25,25         25,00         24,75         24,50         24,25         
     6            25,25         25,00         24,75         24,50         24,25         24,00         
     7            25,75         25,50         25,25         25,00         24,75         24,50         
     8            25,42         25,17         24,92         24,67         24,42         24,17         
     9            25,40         25,15         24,90         24,65         24,40         24,15         
     10           25,44         25,19         24,94         24,69         24,44         24,19         
     11           25,13         24,88         24,63         24,38         24,13         23,88         
     12           26,00         25,50         25,00         25,00         25,00         24,50         
     13           25,50         25,50         25,00         25,00         24,50         24,50         
     14           25,50         25,00         25,00         24,50         24,50         24,00         
     15           24,75         24,50         24,25         24,00         23,75         23,50         
     16           24,56         24,31         24,06         23,81         23,56         23,31         
     17           24,58         24,33         24,08         23,83         23,58         23,33         
     18           24,55         24,30         24,05         23,80         23,55         23,30         
     19           24,53         24,28         24,03         23,78         23,53         23,28         
     20           24,50         24,25         24,00         23,75         23,50         23,25          
    

    Note please, that column 100-96 here contain the correct values, while in Table H-4 they hold the values for samples 99-95.

    请注意,此处的100-96列包含正确的值,而在表H-4中,它们保留了样本99-95的值。

    The two small examples found on Wikipedia display the results using three different methods which equal our methods 14, 13, and 5 respectively, or the enum elements apTukeyMooreMcCabe, apTukey, and apHazen:

    维基百科上发现的两个小的例子显示使用,它们分别等于我们的方法14,图13,图5三种不同的方法,或枚举元素apTukeyMooreMcCabe,apTukey和 apHazen结果

    例子1
    例子1
    (Example 1 )

    例子2 (Example 2)

    These can be reproduced by the function ListWikipediaSamples:

    这些可以由功能ListWikipediaSamples复制

                  Method 1      Method 2      Method 3
                  
    Q1             15            25,5          20,25 
    Q2             40            40            40 
    Q3             43            42,5          42,75 
    
    Q1             15            15            15 
    Q2             37,5          37,5          37,5 
    Q3             40            40            40  
    

    Also, a query, FirstQuartileAllMethods, is included which will list the results for all sets of samples between 1 and 100 for all 20 methods for the lower quartile. Here's a snip:

    此外,还包含一个查询FirstQuartileAllMethods ,它将针对下四分位数的所有20种方法列出1至100之间的所有样本集的结果。 这是一个片段:

    Finally, a form is included which lets you select any method and then have the results for all three quartiles for every sample between 1 and 100 listed:

    最后,包含一个表格,您可以选择任何方法,然后列出列出的1至100之间的每个样本的所有三个四分位数的结果:


    (
    )

    实作 (Implementation)

    To be able to calculate quartiles, import the module QuartileCode in your application. That's all.

    为了能够计算四分位数,请在您的应用程序中导入模块QuartileCode 。 就这样。

    The other module, QuartileDemo, is only needed for testing and for the demo form (also named QuartileDemo) to display.

    其他模块QuartileDemo仅用于测试和显示的演示表单(也称为QuartileDemo)。

    Bonus tip: Study the form's code to see how to right-align numbers in a Listbox column.

    温馨提示: 研究表单的代码以查看如何在“列表框”列中将数字右对齐。

    结论 (Conclusion)

    From the sparse sources to be located, a function has been created that for just about any practical purpose will allow for the quartiles of a sample of records to be calculated by twenty different methods.

    从要定位的稀疏源中创建了一个函数,该函数几乎可以用于任何实际目的,从而可以通过二十种不同的方法来计算记录样本的四分位数。

    In addition, simplified functions intended to supplement the native domain aggregate functions have been presented. Also, a collection of functions and a query for testing and demonstration have been included.

    另外,已经提出了旨在补充本地域聚合功能的简化功能。 此外,还包括功能集合以及用于测试和演示的查询。

    资料来源 (Sources)

    Original source (now off-line) by David A. Heiser: http://www.daheiser.info/excel/notes/NOTE%20N.pdf

    David A. Heiser的原始资源(现已离线): http ://www.daheiser.info/excel/notes/NOTE%20N.pdf

    Archived source at The Internet Archive: NOTE 20

    Internet存档中的存档源: NOTE 20

    Notes: 

    笔记:

    1. Table H-4, p. 4, has correct data for the dataset for 1-96 while the datasets for 1-100 to 1-97 actually are the datasets for 1-99 to 1-96 shifted one column left. Thus, the dataset for 1-100 is missing, and that for 1-96 is listed twice.

      表H-4,第6页。 4,具有1-96数据集的正确数据,而1-100到1-97的数据集实际上是1-99到1-96的数据集向左移动了一列。 因此,缺少1-100的数据集,并且两次列出了1-96的数据集。
    2. Method 3b is not implemented as no one seems to use it. Neither is no example data given. Thus method 3a has here been labeled method

      方法3b未实现,因为似乎没有人使用它。 没有给出示例数据。 因此,方法3a在这里被标记为方法

    Further notes on quartiles and methods can be found here:

    有关四分位数和方法的更多说明,请参见:

    Wikipedia

    维基百科

    Math Forum

    数学论坛

    HaiWeb

    海网

    murdoch.edu.au (archived)

    murdoch.edu.au(已归档)

    Should you be aware of any good source that can supplement or improve this article, please do not hesitate posting a link as comment.

    如果您知道可以补充或改进本文的任何好的资源,请不要犹豫发布链接作为评论。

    下载 (Download)

    The full and current code is available for download at GitHub: VBA.Quartiles

    完整和当前的代码可从GitHub下载: VBA.Quartiles

    Also, code and a demo application is here: Quartiles 1.0.1.zip 

    另外,代码和演示应用程序也在这里: Quartiles 1.0.1.zip

    An Excel workbook with the presented example: Quartiles.xlsx

    一个带有示例的Excel工作簿: Quartiles.xlsx

    I hope you found this article useful. You are encouraged to ask questions, report any bugs or make any other comments about it below.

    希望本文对您有所帮助。 鼓励您在下面提出问题,报告任何错误或对此作出任何其他评论。

    Note: If you need further "Support" about this topic, please consider using the Ask a Question feature of Experts Exchange. I monitor questions asked and would be pleased to provide any additional support required in questions asked in this manner, along with other EE experts.

    注意 :如果您需要有关此主题的更多“支持”,请考虑使用Experts Exchange 的“提问”功能。 我会监督提出的问题,并很高兴与其他电子工程师一起为以这种方式提出的问题提供所需的任何其他支持。

    Please do not forget to press the "Thumbs Up" button if you think this article was helpful and valuable for EE members.

    如果您认为本文对EE成员有用且有价值,请不要忘记按下“竖起大拇指”按钮。

    翻译自: https://www.experts-exchange.com/articles/33718/20-Varieties-of-Quartiles.html

    四分位数和百分位数

    展开全文
  • 四分位数求法

    2020-03-28 17:52:45
    四分位数间距:是上四分位数与下四分位数之差,用四分位数间距可反映变异程度的大小. 即:Q3 --Q1 四分位数求法 第一步  确定四分位数的位置 四分位数是将数列等分成四个部分的数,一个数列有三个四分位数,设下四分...

    四分位数间距:是上四分位数与下四分位数之差,用四分位数间距可反映变异程度的大小.

    即:Q3 --Q1

    四分位数求法
    第一步
      确定四分位数的位置

    四分位数是将数列等分成四个部分的数,一个数列有三个四分位数,设下四分位数、中位数和上四分位数分别为Q1、Q2、Q3,则:Q1、Q2、Q3的位置可由下述公式确定:

    Q1的位置 (n+1)/4

    Q2的位置 (n+1) /2

    Q3的位置 3(n+1)/4

    式中n表示资料的项数

    第二步
      根据第一步所确定的四分位数的位置,确定其相应的四分位数。

    例1
      例如:某车间某月份的工人生产某产品的数量分别为13、13.5、13.8、13.9、14、14.6、14.8、15、15.2、15.4、15.7公斤,则三个四分位数的位置分别为:

    Q1的位置 (n+1)/4 =(11+1)/4=3

    Q2的位置 (n+1) /2=(11+1)/2=6

    Q3的位置 3(n+1)/4=3(11+1)/4=9

    即变量数列中的第三个、第六个、第九个工人的某种产品产量分别为下四分位数、中位数和上四分位数。即:

    Q1 = 13.8公斤、Q2 = 14.6公斤、Q3 = 15.2公斤

    展开全文
  • 四分位数介绍

    千次阅读 2019-12-25 00:02:38
    描述统计学就是将一系列复杂的数据减少为几个能够起到描述作用的数字,用这些有代表性的数字来代表所有的数据,其中有4个很重要的知识点,分别是平均值(μ)、四分位数、标准差(σ)、标准分(z) 四分位数简介 ...

    前言

    描述统计学就是将一系列复杂的数据减少为几个能够起到描述作用的数字,用这些有代表性的数字来代表所有的数据,其中有4个很重要的知识点,分别是平均值(μ)、四分位数、标准差(σ)、标准分(z)

    四分位数简介

    四分位数(Quartile)也称四分位点,是指在统计学中把所有数值由小到大排列并分成四等份,处于三个分割点位置的数值。多应用于统计学中的箱线图绘制。它是一组数据排序后处于25%和75%位置上的值。四分位数是通过3个点将全部数据等分为4部分,其中每部分包含25%的数据。很显然,中间的四分位数就是中位数,因此通常所说的四分位数是指处在25%位置上的数值(称为下四分位数)和处在75%位置上的数值(称为上四分位数)。与中位数的计算方法类似,根据未分组数据计算四分位数时,首先对数据进行排序,然后确定四分位数所在的位置,该位置上的数值就是四分位数。与中位数不同的是,四分位数位置的确定方法有几种,每种方法得到的结果会有一定差异,但差异不会很大

    实例介绍

     

    首先我们看下数据的情况,如下图所示,数据的总个数为10个

    1、在求取四分位数据时,首先必须做的是要对数据进行升序排序,如下图。

    image

     

    2、四分位求取,首先需要得出该四分位数的位置,如下是四分位数在排序后数据中的位置的公式

    image

    在Excel中使用以上的公式来计算第0、1、2、3、4个四分位值处的位置,得出如下结果

    image

     

    3、四分位数的求取,使用的是以下所说明的一套计算规则

    image

    公式可能看起来较为复杂,说明一下:

    1、四分位数最终的结果由2部分相加得到,其一是四分位位置的整数部分对应的数据,其二是四分位位置的小数部分乘以差值得到

    2、差值是四分位位置整数部分对应数据的下一个数据减去四分位位置整数部分对应的数据。

     

    在Excel中将上面那个公式实现的结果如下:

    image

     

    注意:如果数据的总个数n减去1,得到的结果是4的整数倍的话,那么四分位数的位置都是整数值,各四分位数就可以在升序排序后的数列中直接找到,否则就需要通过以上方式计算小数部分。

    例如:n的值为5、9、13等等,就是可以在数列中直接找到各四分位数。

     

    最后我们用一张图来看看箱图中各个图形元素:

    1、Q1表示第1个四分位值,Q2表示第2个四分位值,即中位数,Q3表示第3个中位数

    2、IQR是第3个中位数与第1个中位数的差值

    3、虚线最左侧的划线表示Q1-1.5倍IQR,虚线最右侧的划线表示Q1+1.5倍IQR

    4、左右的黑点是离群点,最左侧的离群点是最小值,最右侧离群点是最大值

    image

     

    从以上结果可以解答以往对四分位数的误区:

    1、四分之一分位数并非中位数的2倍,也并非是四分之三分为的3倍

    2、不要将箱图虚线左右侧的划线理解为四分之0分位,四分之四分位值,这2个分位值分别是最小值和最大值,它们可能会成为离群点

    参考链接

    后记

    平安夜,一周年纪念

    展开全文
  • spss四分位数

    千次阅读 2016-06-26 19:23:30
    四分位数:将所有数值按大小顺序排列并分成四等份,处于三个分割点位置即为四分位数。 Q1=下四分位数,即第25百分位数; Q2=中位数,即第50百分位数; Q3=上四分位数,即第75百分位数。 通过Q1,Q2,Q3比较,分析其...
  • 四分位数SQL实现

    2021-02-22 08:42:29
    四分位数(Quartile),即统计学中,把所有数值由小到大排列并分成四等份,处于三个分割点位置的得分就是四分位数 第一四分位数 (Q1),又称'较小四分位数',等于该样本中所有数值由小到大排列后第25%的数字 第二四分位数 ...
  • 描述性统计,最有代表性的统计量。...#依次为:最小值、下四分位数、中位数、上四分位数、最大值 验证: > min(x) [1] 3> max(x)[1] 10> median(x)[1] 5> quantile(x) 0% 25% 50% 75% 100% 3 4 5 ...
  • 四分位数的两种计算方法

    千次阅读 2020-12-29 20:16:54
    关于四分位数的两种求法 在数据导论课上,我们学习了如何求解四分位数的方法,其实操作起来也不难先用 (n+1) / 4 * i 计算出四分位数的位置,再求出该位置上的数的值即可。如一组数据 【1,3,6,8,10】 根据公式先...
  • 箱线图&上下四分位数

    千次阅读 2020-09-01 20:20:28
    文章内容输出来源:拉勾数据...很显然,中间的四分位数就是中位数,因此通常所说的四分位数是指处在25%位置上的数值(称为下四分位数)和处在75%位置上的数值(称为上四分位数)。与中位数的计算方法类似,根据未分组数据
  • 2009年第20期中国高新技术企业NO.20.2009总第131期Chinesehi-techenterprisesCumulativetyNO.131统计学中四分位数的计算张云华江西财经职业学院江西
  • 四分位数计算方法

    千次阅读 2019-07-20 12:45:00
    四分位数的位置Q1=(n+1)*0.25 中位数的位置Q2=(n+1)*0.5 上四分位数的位置Q3=(n+1)*0.75 转载于:https://www.cnblogs.com/bravesunforever/p/11217344.html...
  • 四分位数与pandas中的quantile函数

    千次阅读 2020-05-23 21:17:41
    四分位数与pandas中的quantile函数 1.分位数概念 统计学上的有分位数这个概念,一般用p来表示。原则上p是可以取0到1之间的任意值的。但是有一个四分位数是p分位数中较为有名的。 所谓四分位数;即把数值由小到大排列...
  • 四分位数的数学计算以及使用pandas计算前言参考举例1(奇数个)第一四分位数(下四分位数)中位数第三四分位数(上四分位数)举例2(偶数个)第一四分位数(下四分位数)中位数第三四分位数(上四分位数) ...
  • 统计学的Python实现-009:四分位数

    千次阅读 2020-05-20 12:10:25
    四分位数有三个,第一个四分位数称为下四分位数,第二个四分位数就是中位数,第三个四分位数称为上四分位数,分别用Q1、Q2、Q3表示。 统计学解释 四分位数位置的确定方法有两种。其一是Excel函数QUARTILE.EXC的方法...
  • 四分位数与pandas中的quantile函数 1.分位数概念 统计学上的有分位数这个概念,一般用p来表示。原则上p是可以取0到1之间的任意值的。但是有一个四分位数是p分位数中较为有名的。 所谓四分位数;即把数值由小到大排列...
  • 箱线图和四分位数

    千次阅读 2018-01-22 18:01:37
    四分位数(Quartile),即统计学中,把所有数值由小到大排列并分成四等份,处于三个分割点位置的得分就是四分位数。   简介 第一四分位数(Q1),又称“较小四分位数”,等于该样本中所有数值由小到大排列后第25%...
  • 四分位数及matlab实现

    千次阅读 2019-06-18 15:15:00
    四分位数(quantile),解释及调用形式如下。 quantile(x,y,z)的三个参数的说明如下:x表示要求的矩阵或者向量;y的取值为表示要求的分位数,如四分之一中位数0.25,四分之三中位数0.75等;z的取值为1或者2,若值为1则...
  • 该指标显示样本的第一, 第二和第三的四分位数
  • 脚本的第一行包含一个示例数据集。... 吝啬的1-sigma(标准偏差) 中位数第一个四分位数(第 25 个百分位数) 第二个四分位数(第 50 个百分位数) 第三四分位数(第 75 个百分位数) 第 k 个百分位智商标准识别码
  • 四分位数和箱线图

    千次阅读 2019-10-25 11:31:02
    四分位数 四分位数(Quartile)也称四分位点,是指在统计学中把所有数值由小到大排列并分成四等份,处于三个分割点位置的数值。 箱线图 利用数据中的五个统计量:最小值、第一四分位数、中位数、第三四分位数与...
  • 四分位数 箱线图

    万次阅读 2017-07-28 15:39:08
    四分位数 箱线图 建议采用Origin来制作箱线图 作用 箱线图(Boxplot)也称箱须图(Box-whisker Plot),它是用一组数据中的最小值、第一四分位数、中位数、第三四分位数和最大值来反映数据分布的中心位置和散布...
  • 四分位数分析

    2017-01-05 00:14:00
    四分位数分析 import numpy as np import pandas as pd import matplotlib.pyplot as plt import datetime as dt from pandas import Series,DataFrame from datetime import datetime from dateutil.parser import ...
  • 通常称它们为四分位数四分位数给出分布的中心、散布和形状的某种指示。第1个四分位数记作,是第25个百分位数。第3个四分位数记作,是第75个百分位数。 第1个和第3个四分位数之间的距离是散布的一种...
  • 四分位数Quartile算法C语言四分位数Quartile算法完整源码(main函数测试) C语言四分位数Quartile算法完整源码(main函数测试) #include <math.h> #include <stdio.h> #include <stdlib.h> int ...
  • 解决如何在分组数据寻找四分位数

    千次阅读 2020-08-24 16:12:55
    本博文源于对统计学知识基础的学习,旨在解决如何在分组数据中寻求四分位数

空空如也

空空如也

1 2 3 4 5 ... 20
收藏数 866
精华内容 346
关键字:

四分位数