  • How scale() scaling works

    2021-04-16 22:33:23

    1. Radial scaling (scaling outward from the center)
    Code:

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Document</title>
        <style type="text/css">
        /* .c{
            height: 100vh;
            background: brown;
            position: relative;
        } */
     /*    .a{
            width: 100%;
            height: 400px;
            background-color: brown;
        } */ 
        .b{
            /* position: relative;
            left: 100px; */
            width: 100px;
            height: 300px;
            background-color: aqua;
            transform: scale(2); /* scale .b outward from its center */
        }
        </style>
    </head>
    <body>
    <!-- <div class="c"> -->
        <div class="b"></div>
    <!-- </div> -->
    </body>
    </html>
    

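    Result:
    [image]
    The visible box is not the 200×600 we might expect. But after moving the element 100px to the right:
    [image]
    the full 200px width becomes visible. The height is still short of 600px, but moving the element down should presumably reveal that as well. The cause is radial scaling: scale() enlarges the element outward from its center (the default transform-origin), so part of the scaled box extends past the top and left edges of the page. (Note: transform does not change the element's layout size or position, so the layout dimensions reported in the developer tools' box model remain 100×300.)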
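    The geometry described above can be checked with a little arithmetic. The Python sketch below is illustrative only: `scaled_rect` and `visible_size` are hypothetical helpers, and the sketch assumes the element sits at the page origin with no body margin and the default transform-origin of 50% 50%.

    ```python
    def scaled_rect(left, top, width, height, factor, origin=(0.5, 0.5)):
        """Return (left, top, width, height) of a box after scaling about `origin`.

        `origin` is a fraction of the box size, mirroring CSS transform-origin
        (0.5, 0.5 is the default center). Hypothetical helper for illustration.
        """
        ox = left + width * origin[0]
        oy = top + height * origin[1]
        new_left = ox - (ox - left) * factor
        new_top = oy - (oy - top) * factor
        return (new_left, new_top, width * factor, height * factor)


    def visible_size(rect):
        """Width and height of the part of `rect` at non-negative page coordinates."""
        left, top, width, height = rect
        right, bottom = left + width, top + height
        return (max(0, right) - max(0, left), max(0, bottom) - max(0, top))


    box = scaled_rect(0, 0, 100, 300, 2)      # .b with transform: scale(2)
    moved = scaled_rect(100, 0, 100, 300, 2)  # the same box shifted 100px right

    print(box, visible_size(box))
    print(moved, visible_size(moved))
    ```

    Under these assumptions the scaled box really is 200×600, but it starts at (-50, -150), so only part of it is on screen until the element is moved.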

    Here is another example.
    Code:

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Document</title>
        <style type="text/css">
        body{
            height: 100vh;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        .a{
            display: flex;
            align-items: center;
            justify-content: center;
            width: 300px;
            height: 600px;
            background-color: #7fff00;
        }
        .b{
            width: 200px;
            height: 200px;
            background-color: cornflowerblue;
            /* transform: scale(1.5); */ /* no transform applied yet */
        }
        </style>
    </head>
    <body>
        <div class="a">
            <div class="b"></div>
        </div>
    </body>
    </html>
    

    Result:
    [image]
    After adding scale:
    Code:

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Document</title>
        <style type="text/css">
        body{
            height: 100vh;
            display: flex;
            align-items: center;
            justify-content: center;
        }
        .a{
            display: flex;
            align-items: center;
            justify-content: center;
            width: 300px;
            height: 600px;
            background-color: #7fff00;
        }
        .b{
            width: 200px;
            height: 200px;
            background-color: cornflowerblue;
            transform: /* translateZ(-50px) */ scale(1.5);
        }
        </style>
    </head>
    <body>
        <div class="a">
            <div class="b"></div>
        </div>
    </body>
    </html>
    

    Result:
    [image]

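  • Nominal vs. ordinal data

    Nominal data are categorical, with no order among the categories. For example, fruit types: apple, peach, pear.

    Ordinal data are ordered categories. For example, age groups: young, middle-aged, elderly. That is, the categories have a rank order.

    The four basic data types in data processing are ratio, interval, nominal, and ordinal.
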
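  • Data classification: explanations and examples of the nominal and ordinal data types found in the literature. This resource is a PPT on how statistics divides data into different types.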
  • 7.1 Basic Concepts of Statistics

    7.1.1 Descriptive Statistics vs. Inferential Statistics

    • Descriptive statistics: describe the characteristics of large data sets.
    • Inferential statistics: used to make forecasts, estimates, or judgments about a large data set on the basis of the statistical characteristics of a sample.

    forecast: n. (天气、财经等的)预测,预报;预想

    7.1.2 Populations and Samples

    • Population: the set of all possible members of a stated group.
    • Sample: a subset of the population.
    • Parameter: any descriptive measure of a population characteristic.
    • Sample statistic: a quantity computed from or used to describe a sample.

    possible: adj. 可能的;合理的;合适的

    subset: n. [数] 子集;子设备;小团体

    compute: vt. 计算;估算;用计算机计算

    measure: n. 测量;措施;程度;尺寸

    7.1.3 The Four Measurement Scales

    1. Nominal Scale

    • contains the least information;
    • classifies observations by characteristics without order.

    least: adj. 最少的;最小的

    observation: n. 观察;监视;观察报告

    2. Ordinal Scale

    • a higher level of measurement than the nominal scale;
    • classifies observations by characteristics with order;
    • can tell which observation ranks higher or lower;
    • but cannot tell how large the difference between any two observations is.

    3. Interval Scale

    • provides relative ranking, like the ordinal scale;
    • can tell how large the difference is (that is, it provides equal differences between scale values);
    • does not have a true zero (a value of zero does not mean the absence of the quantity).

    ranking: n. 等级;地位

    4. Ratio Scale

    • most refined level of measurement;
    • provide ranking and equal differences between scale values;
    • have true zero as the origin.

    refined: adj. 精炼的;精确的;微妙的;有教养的

    origin: n. 起源;原点;出身;开端

    7.1.4 Presenting Data

    • Holding period return formula

    $$
    \begin{array}{l}
    R_t = \frac{P_t - P_{t-1} + D_t}{P_{t-1}} \\
    \text{where:} \\
    P_t = \text{price per share at the end of time period } t \\
    P_{t-1} = \text{price per share at the end of time period } t-1 \text{, the time period immediately preceding time period } t \\
    D_t = \text{cash distributions received during time period } t
    \end{array}
    $$

    • two characteristics:
      • First, it has an element of time attached to it.
      • Second, rate of return has no currency unit attached to it.
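
    As a quick worked instance of the holding period return formula, here is a minimal Python sketch. The function name and the sample prices are assumptions chosen for illustration, not values from the source.

    ```python
    def holding_period_return(p_begin, p_end, distribution=0.0):
        """R_t = (P_t - P_{t-1} + D_t) / P_{t-1}."""
        if p_begin <= 0:
            raise ValueError("beginning price must be positive")
        return (p_end - p_begin + distribution) / p_begin


    # Hypothetical share: bought at 100, worth 105 at period end, paid 2 in cash.
    r = holding_period_return(100.0, 105.0, 2.0)
    print(f"{r:.2%}")
    ```

    The result is a pure number with no currency unit, and it only has meaning together with the length of the holding period.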

    formula: n. 公式,准则;配方;婴儿食品

    holding period return (HPR): 持有期收益,同holding period yield(HPY)

    preceding: adj. 在前的;前述的

    distribution: n. 分布;分配;供应

    attached: adj. 附加的;依恋的,充满爱心的

    element: n. 元素;要素;原理;成分;自然环境

    currency: n. 货币;通货

    unit: n. 单位,单元;装置;部队;部件

    1. Frequency Distribution

    • Frequency distribution: summarizes statistical data by assigning the data to specified groups or intervals.

    • Steps to construct a frequency distribution:

      1. Step 1: Define the intervals (sort the data in ascending order, calculate the range of the data, decide on the number of intervals (k) in the frequency distribution, and determine the interval width)
      2. Step 2: Tally the observations (assign each observation to its interval)
      3. Step 3: Count the observations (count the number of observations in each interval and construct a table of the intervals)
      • As the above procedure makes clear, a frequency distribution groups data into a set of intervals.
      • Each observation falls into only one interval (mutually exclusive), and the total number of intervals covers all the values represented in the data.
      • In practice, we may want the intervals to begin and end with whole numbers for ease of interpretation.
      • In practice, we also need to explain the choice of the number of intervals, k.
        • A large number of empty intervals may indicate that we are trying to organize the data to present too much detail.
        • In that case, we can consider increasingly larger intervals (smaller values of k) until we have a frequency distribution that effectively summarizes the distribution.
    • Modal interval: for any frequency distribution, the interval with the greatest frequency.

    • Absolute frequency (or simply the frequency): the actual number of observations in a given interval.

    • Relative frequency: the absolute frequency of each interval divided by the total number of observations.
      $$\text{relative frequency} = \frac{\text{absolute frequency}}{\text{total number of observations}}$$

    • Cumulative frequency (cumulative absolute frequency, cumulative relative frequency): sum the absolute or relative frequencies starting at the lowest interval and progressing through the highest.

      • Cumulative frequency distribution:

        • plot either the cumulative absolute or cumulative relative frequency against the upper interval limit.
        • X-axis: upper limits;
        • Y-axis: cumulative (absolute/relative) frequency.
        • allows us to see how many or what percent of the observations lie below a certain value.
          [image]
      • The change in the cumulative relative (or absolute) frequency as we move from one interval to the next is the next interval's relative (or absolute) frequency.

      • The fact that the slope is steep indicates that these frequencies are large.
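
    The construction steps and the absolute, relative, and cumulative frequencies defined above can be sketched in Python as follows. This is a minimal illustration: the function name, the equal-width intervals, the rule that the maximum goes into the last interval, and the sample data are all assumptions.

    ```python
    from collections import Counter


    def frequency_distribution(data, k):
        """Group data into k equal-width intervals and tabulate frequencies.

        Assumes max(data) > min(data); the maximum is placed in the last interval.
        """
        lo, hi = min(data), max(data)
        width = (hi - lo) / k
        counts = Counter()
        for x in data:
            idx = min(int((x - lo) / width), k - 1)
            counts[idx] += 1

        n = len(data)
        rows, cum_abs = [], 0
        for i in range(k):
            absolute = counts[i]
            cum_abs += absolute
            rows.append({
                "lower": lo + i * width,
                "upper": lo + (i + 1) * width,
                "absolute": absolute,
                "relative": absolute / n,
                "cum_absolute": cum_abs,
                "cum_relative": cum_abs / n,
            })
        return rows


    returns = [-8, -3, -1, 0, 2, 4, 5, 7, 9, 12]  # hypothetical annual returns in %
    table = frequency_distribution(returns, k=4)
    for row in table:
        print(row)
    ```

    Plotting `cum_relative` against `upper` would give the cumulative frequency distribution described above.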

    summarize: vt. 总结;概述

    assigning: v. 分配,分派(工作);指定;委派;确定(价值、时间等);把……归因于;转让

    specified: adj. 规定的;详细说明的

    tally: vt. 使符合;计算;记录

    procedure: n. 程序,手续;步骤

    mutually: adv. 互相地;互助

    exclusive: adj. 独有的;排外的;专一的

    represented: v. 代表;表现;描写

    ease: n. 容易;舒适;安逸

    ​ v. 减轻,缓解;小心缓缓地移动;使容易;放松;(使)贬值;(股票价格、利率等)下降,下跌

    interpretation: n. 解释;翻译;演出

    choice: n. 选择;选择权;精选品

    indicate: vt. 表明;指出;预示;象征

    effectively: adv. 有效地,生效地;有力地;实际上

    plot: vt. 密谋;绘图;划分;标绘

    against: prep. 反对,违反;对……不利;紧靠,倚,碰撞;针对;预防,抵御;(体育比赛)对阵;以……为背景;参照,和……相比;(赌博)预计……的失败

    slope: n. 斜坡;倾斜;斜率;扛枪姿势

    steep: adj. 陡峭的;不合理的;夸大的;急剧升降的

    2. Histogram vs. Frequency Polygon

    • Histogram

      • the bar chart of absolute frequency distribution;
      • X-axis: intervals;
      • Y-axis: absolute frequency of intervals.
      • we can quickly find the modal interval (where the observations are most concentrated) from the histogram.
        [image]

    concentrated: adj. 集中的;浓缩的;全神贯注的

    • Frequency Polygon

      • X-axis: the midpoint of the interval;
      • Y-axis: absolute frequency of intervals.
        [image]

    midpoint: n. 中点;正中央

    7.2 Central Tendency

    • Measures of location include not only measures of central tendency but other measures that illustrate the location or distribution of data.
    • A measure of central tendency specifies where the data are centered.
      • the arithmetic mean, the median, the mode, the weighted mean, and the geometric mean.

    tendency: n. 倾向,趋势;癖好

    illustrate: vt. 阐明,举例说明;图解

    specifies: vt. 指定;详细说明;列举;把…列入说明书

    7.2.1 Mean

    1. Arithmetic Mean

    • arithmetic mean: the sum of the observation values divided by the number of observations (single-period).

      • population mean:
        $$\mu = \frac{\sum_{i=1}^N X_i}{N}$$
        where N is the population size.

      • sample mean:
        $$\bar X = \frac{\sum_{i=1}^n X_i}{n}$$
        where n is the sample size.

      • Features of arithmetic means:

        • All interval and ratio data sets have an arithmetic mean.
        • All data values (including outliers) are considered and included in the arithmetic mean computation.
        • A data set has only one arithmetic mean.
    • Tips:

      • Because the arithmetic mean uses all observations, it may not represent the data set well when there are outliers.

      • Because the arithmetic mean uses all observations, it reflects every value in the data set and can serve as an estimate of the value of the next observation.

      • The arithmetic mean is the only measure of central tendency for which the sum of the deviations from the mean is zero. Mathematically, this property can be expressed as follows:
        $$\text{sum of mean deviations} = \sum_{i=1}^n (X_i - \bar X) = 0$$
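
    A small Python sketch can illustrate two of the points above: deviations from the arithmetic mean sum to zero, and a single outlier pulls the mean strongly. The data values are hypothetical.

    ```python
    import math
    from statistics import mean

    values = [2.0, 4.0, 4.0, 5.0, 10.0]  # hypothetical observations
    x_bar = mean(values)
    deviation_sum = math.fsum(x - x_bar for x in values)

    # Replace the largest value with an outlier and recompute the mean.
    with_outlier = values[:-1] + [100.0]
    x_bar_outlier = mean(with_outlier)

    print(x_bar, deviation_sum, x_bar_outlier)
    ```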

    feature: n. 特色,特征;容貌;特写或专题节目

    outliers: n. 离开主体的人(或物);(地质)外露层;(统计)异常值;局外人(远离业务、职务)

    computation: n. 估计,计算

    representation: n. 代表;表现;表示法;陈述

    deviation: n. 偏差;误差;背离

    mathematically: adv. 算术地,数学上地

    2. Geometric Mean

    • calculates investment returns over multiple periods (for the past or for the future);

    • measures compound growth rates.
      $$G = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n} = (X_1 \times X_2 \times \cdots \times X_n)^{\frac{1}{n}}$$
      $$R_G = \sqrt[n]{(1+R_1) \times (1+R_2) \times \cdots \times (1+R_n)} - 1$$
      where $R_n$ is the return in period n.

    • Tips:

      1. The geometric mean is always less than or equal to the arithmetic mean.
      2. When all values are equal, the geometric mean and the arithmetic mean are the same.
      3. In general, the difference between the arithmetic and geometric means increases with the variability in the period-by-period observations.
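
    The relationship between the two means can be seen in a short Python sketch. The function names and the return series are assumptions for illustration.

    ```python
    import math


    def geometric_mean_return(returns):
        """R_G = ((1 + R_1) * ... * (1 + R_n)) ** (1 / n) - 1."""
        growth = math.prod(1.0 + r for r in returns)
        return growth ** (1.0 / len(returns)) - 1.0


    def arithmetic_mean_return(returns):
        return sum(returns) / len(returns)


    volatile = [0.50, -0.50]      # hypothetical: +50% then -50%
    steady = [0.05, 0.05, 0.05]   # hypothetical: identical returns each period

    print(arithmetic_mean_return(volatile), geometric_mean_return(volatile))
    print(arithmetic_mean_return(steady), geometric_mean_return(steady))
    ```

    For the volatile series the arithmetic mean is zero even though the investment lost a quarter of its value, which the negative geometric mean reflects. For the steady series the two means coincide.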

    variability: n. 可变性,变化性;变异性

    3. Harmonic Mean

    • Harmonic mean: for computing the average cost of shares purchased over time.

    $$\bar X_{Harmonic} = \frac{N}{\sum_{i=1}^N \frac{1}{X_i}}$$

    • Tips:
      1. for values that are not all equal: harmonic mean < geometric mean < arithmetic mean
      2. the geometric mean of past annual returns is the appropriate measure of past performance and of future multi-year performance; it gives the average annual compound return.
      3. the arithmetic mean of past annual returns is the statistically best estimator of the next year's (one-year) return.
      4. for a forward-looking model, compounding with the arithmetic mean of future returns is preferable to compounding with the geometric mean of future returns.
        • Uncertainty in cash flows or returns causes the arithmetic mean to be larger than the geometric mean.
        • The more uncertain the returns, the more divergence exists between the arithmetic and geometric means.
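
    The average-cost use of the harmonic mean and the ordering of the three means can be checked with Python's standard `statistics` module. The purchase prices and the fixed 1,000 invested per period are hypothetical.

    ```python
    from statistics import geometric_mean, harmonic_mean, mean

    prices = [8.0, 10.0, 12.5]   # hypothetical share price at each purchase
    invested_each_time = 1000.0  # the same amount is invested at every price

    shares_bought = sum(invested_each_time / p for p in prices)
    average_cost = invested_each_time * len(prices) / shares_bought

    h = harmonic_mean(prices)
    g = geometric_mean(prices)
    a = mean(prices)

    print(average_cost, h, g, a)
    ```

    When equal amounts of money are invested each period, the average cost per share equals the harmonic mean of the purchase prices.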

    appropriate: adj. 适当的;恰当的;合适的

    uncertainty: n. 犹豫;不确定的事物;不确定度

    divergence: n. 差异;分歧;分散,发散;(气流或海洋的)分开处

    4. Weighted Mean

    • weighted mean: assigns different weights to different observations.
      $$\bar X_w = \sum_{i=1}^n w_i X_i = (w_1 X_1 + w_2 X_2 + \cdots + w_n X_n)$$
      where $X_1, X_2, \cdots, X_n$ are the observed values and $w_1, w_2, \cdots, w_n$ are the corresponding weights, with $\sum w_i = 1$.
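
    A typical use is a portfolio return, sketched below in Python. The function name, the weights, and the asset returns are assumptions for illustration.

    ```python
    import math


    def weighted_mean(values, weights):
        """Sum of w_i * X_i, requiring the weights to sum to 1."""
        if len(values) != len(weights):
            raise ValueError("values and weights must have the same length")
        if not math.isclose(sum(weights), 1.0):
            raise ValueError("weights must sum to 1")
        return sum(w * x for w, x in zip(weights, values))


    asset_returns = [0.10, 0.04, -0.02]  # hypothetical returns on three assets
    portfolio_weights = [0.5, 0.3, 0.2]  # hypothetical portfolio weights

    portfolio_return = weighted_mean(asset_returns, portfolio_weights)
    print(portfolio_return)
    ```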

    corresponding: adj. 相当的,相应的;一致的;通信的

    7.2.2 Median

    • median: the midpoint of a data set when the data are arranged in ascending or descending order.

      • the median is not affected by outliers, which makes it a better measure of central tendency than the arithmetic mean when outliers are present.

      • the median is less mathematically tractable than the mean.

        • The median, however, does not use all the information about the size and magnitude of the observations.
        • Calculating the median is also more complex.

    arranged: adj. 安排的

    tractable: adj. 易于管教的;易驾驭的;易处理的;驯良的

    mathematically: adv. 算术地,数学上地

    magnitude: n. 大小;量级;震级;重要;光度

    7.2.3 Mode

    • mode: the value that occurs most frequently in a data set.
      • there may be more than one mode or no mode.
      • unimodal, bimodal, trimodal.
      • When all the values in a data set are different, the distribution has no mode because no value occurs more frequently than any other value.

    unimodal: adj. 单峰的;用单峰分布描述的

    bimodal: adj. 双峰的;有两种统计方式的

    trimodal: adj. 三峰的;有三种统计方式的

    7.3 Dispersion

    • Dispersion: deviation around the central tendency.

    7.3.1 Absolute Dispersion

    • If mean return addresses reward, dispersion addresses risk.
      • Absolute dispersion is the amount of variability present without comparison to any reference point or benchmark.
        • range, mean absolute deviation, variance, and standard deviation.

    reward: n. 报酬;报答;酬谢

    comparison: n. 比较;对照;比喻;比较关系

    reference: n. 参考,参照;涉及,提及;参考书目;介绍信;证明书

    benchmark: n. 基准;标准检查程序

    variance: n. 变异;变化;不一致;分歧;方差

    1. Quantiles

    • Quartiles: divided into quarters

    • Quintiles: divided into fifths

    • Deciles: divided into tenths

    • Percentiles: divided into hundredths

    • Given a set of observations, the yth percentile is the value at or below which y percent of observations lie.

    • To find quantiles:

      1. determine the position using the following formula:
        $$L_y = (n + 1)\frac{y}{100}$$
        where y is the given percentile, n is the number of observations sorted in ascending order, and $L_y$ is the location of the yth percentile.

      2. When $L_y$ is a whole number, read the value at that position directly; when $L_y$ is not a whole number, use linear interpolation between the two adjacent values.

    lie: vi. 躺;说谎;位于;展现

    formula: n. 公式,准则;配方;婴儿食品

    linear: adj. 线的,线型的;直线的,线状的;长度的

    interpolation: n. 插值;内插
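
    The two-step procedure for finding a percentile can be sketched in Python as follows. The function name and data are hypothetical, and clamping locations that fall outside the data to the smallest or largest value is an assumption the source does not discuss.

    ```python
    def percentile(data, y):
        """Value at the y-th percentile using L_y = (n + 1) * y / 100."""
        ordered = sorted(data)
        n = len(ordered)
        loc = (n + 1) * y / 100  # 1-based location in the sorted data
        if loc <= 1:
            return ordered[0]    # assumption: clamp to the smallest value
        if loc >= n:
            return ordered[-1]   # assumption: clamp to the largest value
        lower = int(loc)         # whole-number part of the location
        frac = loc - lower       # fractional part drives the interpolation
        below, above = ordered[lower - 1], ordered[lower]
        return below + frac * (above - below)


    ten_values = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
    eleven_values = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]

    print(percentile(ten_values, 75))     # location 8.25: between 8th and 9th values
    print(percentile(eleven_values, 75))  # location 9: the 9th value itself
    ```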

    2. Range

    • Range: the distance between the largest and the smallest value in a data set.
      • Range = maximum value − minimum value
      • One advantage of the range is ease of computation.
      • A disadvantage is that the range uses only two pieces of information from the distribution. It cannot tell us how the data are distributed (that is, the shape of the distribution).

    3. Mean Absolute Deviation (MAD)

    • MAD: the mean of the absolute values of the deviations of individual observations from the arithmetic mean.
      $$MAD = \frac{\sum_{i=1}^n \mid X_i - \bar X \mid}{n}$$

      • The mean absolute deviation uses all of the observations in the sample and is thus superior to the range as a measure of dispersion.
      • One technical drawback of MAD is that it is difficult to manipulate mathematically compared with the next measure we will introduce, variance.

    individual: adj. 个人的;个别的;独特的

    absolute value: 绝对值

    technical: adj. 工艺的,科技的;技术上的;专门的

    drawback: n. 缺点,不利条件;退税

    manipulate: vt. 操纵;操作;巧妙地处理;篡改

    4. Variance and Standard Deviation

    • Variance is defined as the average of the squared deviations around the mean.

    • Standard deviation is the positive square root of the variance.

    • Population variance: the arithmetic average of the squared deviations from the mean.
      $$\sigma^2 = \frac{\sum_{i=1}^N (X_i - \mu)^2}{N}$$

    • Population standard deviation: the positive square root of the population variance.
      $$\sigma = \sqrt{\frac{\sum_{i=1}^N (X_i - \mu)^2}{N}}$$

    • Sample variance: the statistic that measures dispersion for a sample of n observations from a population.
      $$s^2 = \frac{\sum_{i=1}^n (X_i - \bar X)^2}{n - 1}$$
      Using the entire number of sample observations, n, instead of n − 1 would systematically underestimate the population parameter $\sigma^2$, particularly for small sample sizes.

    • Sample standard deviation: the positive square root of the sample variance.
      $$s = \sqrt{\frac{\sum_{i=1}^n (X_i - \bar X)^2}{n - 1}}$$
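
    The measures of absolute dispersion above map directly onto Python's standard `statistics` module, as the following sketch shows. The data set is hypothetical; MAD is computed by hand because it is written here exactly as in the formula above.

    ```python
    from statistics import mean, pstdev, pvariance, stdev, variance

    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations

    x_bar = mean(data)
    value_range = max(data) - min(data)
    mad = sum(abs(x - x_bar) for x in data) / len(data)

    pop_var = pvariance(data)    # divides by N
    pop_sd = pstdev(data)
    sample_var = variance(data)  # divides by n - 1
    sample_sd = stdev(data)

    print(value_range, mad, pop_var, pop_sd, sample_var, sample_sd)
    ```

    Treating the same eight values as a sample rather than a population gives a larger variance, because the divisor is n − 1 instead of N.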

    positive: adj. 积极的;正的,阳性的;确定的,肯定的;实际的,真实的;绝对的

    square root: 平方根;二次根

    entire: adj. 全部的,整个的;全体的

    systematically: adv. 有系统地;有组织地

    underestimate: vt. 低估;看轻

    particularly: adv. 异乎寻常地;特别是;明确地

    7.3.2 Relative Dispersion

    1. Coefficient of Variation (CV)

    • Relative dispersion: the amount of variability in a distribution relative to a reference point or benchmark.

    • Relative dispersion is commonly measured with the coefficient of variation (CV), which is the ratio of the standard deviation of a set of observations to their mean value.

      $$CV = \frac{S_X}{\bar X} = \frac{\text{standard deviation of X}}{\text{average value of X}}$$

    • CV measures the amount of dispersion relative to the distribution’s mean.

    • CV measures the risk (variability) per unit of expected return (mean).

    • the coefficient of variation permits direct comparisons of dispersion across different data sets.

    • the coefficient of variation is a scale-free measure (that is, it has no units of measurement).

    • The coefficient of variation measures risk per unit of return, whereas the Sharpe ratio measures excess return per unit of risk.
      $$\text{Sharpe ratio} = \frac{R_P - R_F}{\sigma_P}$$
      where $R_P$ is the return on asset P, $\sigma_P$ is the standard deviation of the asset's returns, $R_F$ is the risk-free rate, and $R_P - R_F$ is the excess return of asset P over the risk-free rate.

      • Unlike the coefficient of variation, for a risk-averse investor a larger Sharpe ratio is better.
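
    Both ratios can be computed from the same return series, as in this Python sketch. The returns and the risk-free rate are hypothetical, and using the sample standard deviation and the mean return for $\sigma_P$ and $R_P$ is an assumption.

    ```python
    from statistics import mean, stdev

    asset_returns = [0.04, 0.06, 0.08, 0.10, 0.12]  # hypothetical annual returns
    risk_free_rate = 0.02                           # hypothetical risk-free rate

    mean_return = mean(asset_returns)
    sd_return = stdev(asset_returns)  # sample standard deviation

    cv = sd_return / mean_return                       # risk per unit of return
    sharpe = (mean_return - risk_free_rate) / sd_return  # excess return per unit of risk

    print(round(cv, 4), round(sharpe, 4))
    ```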

    permit: vi. 许可;允许

    2. Chebyshev's Inequality

    • Chebyshev's inequality: According to Chebyshev's inequality, for any distribution with finite variance, the proportion of the observations within k standard deviations of the arithmetic mean is at least $1 - \frac{1}{k^2}$ for all k > 1.
    • The importance of Chebyshev’s inequality stems from its generality. The inequality holds for samples and populations and for discrete and continuous data regardless of the shape of the distribution.
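
    The bound can be compared against an actual data set with a short Python sketch. The function names and the deliberately skewed sample are assumptions for illustration.

    ```python
    from statistics import mean, pstdev


    def chebyshev_lower_bound(k):
        """Minimum proportion of observations within k standard deviations."""
        if k <= 1:
            raise ValueError("the bound is only informative for k > 1")
        return 1.0 - 1.0 / k ** 2


    def proportion_within(data, k):
        """Observed share of data within k standard deviations of the mean."""
        mu, sigma = mean(data), pstdev(data)
        inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
        return inside / len(data)


    skewed = [1, 1, 1, 1, 2, 2, 3, 5, 8, 40]  # hypothetical, far from normal

    for k in (1.5, 2, 3):
        print(k, chebyshev_lower_bound(k), proportion_within(skewed, k))
    ```

    Even for this heavily skewed sample, the observed proportion stays at or above the bound for each k.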

    finite: adj. 有限的;限定的

    proportion: n. 比例,占比;部分;面积;均衡

    stems: n. 干;茎;船首;血统

    generality: n. 概论;普遍性;大部分

    discrete: adj. 离散的,不连续的

    continuous: adj. 连续的,持续的;继续的;连绵不断的

    regardless: adj. 不管的,不顾的;不注意的

    7.4 Skewness and Kurtosis

    7.4.1 Skewness

    • Symmetrical distribution: shaped identically on both sides of its mean.

    • One of the most important distributions is the normal distribution, which has the following characteristics:

      • Its mean, median, and mode are equal.
      • It is completely described by two parameters—its mean and variance.
      • Roughly 68 percent of its observations lie between plus and minus one standard deviation from the mean; 95 percent lie between plus and minus two standard deviations; and 99 percent lie between plus and minus three standard deviations.
        [image]
    • Skewness: the extent to which a distribution is not symmetrical (typically the result of outliers).

      • positively skewed (right skewed): many outliers in the right tail; long upper (right) tail; frequent small losses and a few extreme gains.

      • negatively skewed (left skewed): many outliers in the left tail; long lower (left) tail; frequent small gains and a few extreme losses.

      • For a skewed unimodal distribution, the median typically lies between the mode and the mean.
        [image]
        [image]

    • Investors should be attracted by a positive skew because the mean return falls above the median. Relative to the mean return, positive skew amounts to a limited, though frequent, downside compared with a somewhat unlimited, but less frequent, upside.

    • Skewness (sometimes referred to as relative skewness) is computed as the average cubed deviation from the mean standardized by dividing by the standard deviation cubed to make the measure free of scale.
      $$\text{sample skewness} = \frac{1}{n} \frac{\sum_{i=1}^n (X_i - \bar X)^3}{s^3}$$
      where s is the sample standard deviation.

    • symmetric: sample skewness=0.

    • right skewed: sample skewness>0; the average magnitude of positive deviations is larger than the average magnitude of negative deviations.

    • left skewed: sample skewness<0.
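
    The sign convention can be checked with a direct Python transcription of the sample skewness formula above. The function name and the three small data sets are assumptions for illustration.

    ```python
    from statistics import mean, variance


    def sample_skewness(data):
        """(1 / n) * sum((X_i - mean)^3) / s^3, with s the sample standard deviation."""
        n = len(data)
        x_bar = mean(data)
        s_cubed = variance(data) ** 1.5
        return sum((x - x_bar) ** 3 for x in data) / n / s_cubed


    symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
    right_skewed = [1.0, 1.0, 2.0, 2.0, 10.0]   # one large value in the right tail
    left_skewed = [-x for x in right_skewed]    # mirror image of the above

    print(sample_skewness(symmetric))
    print(sample_skewness(right_skewed))
    print(sample_skewness(left_skewed))
    ```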

    symmetrical: adj. 匀称的,对称的

    identically: adv. 同一地;相等地

    roughly: adv. 粗糙地;概略地

    extent: n. 程度;范围;长度

    somewhat: adv. 有点,稍微

    cubed: adj. 切成小方块的

    7.4.2 Kurtosis

    • Kurtosis is a measure of the combined weight of the tails of a distribution relative to the rest of the distribution—that is, the proportion of the total probability that is in the tails.

      • Leptokurtic: more peaked than a normal distribution (more peaked, fat tails)
      • Platykurtic: less peaked or flatter than a normal distribution (less peaked, thin tails)
      • Mesokurtic: same kurtosis with a normal distribution

      $$\text{sample kurtosis} = \frac{1}{n} \frac{\sum_{i=1}^n (X_i - \bar X)^4}{s^4}$$
      where s is the sample standard deviation.

    • Leptokurtic

      • The leptokurtic distribution is more likely than the normal distribution to generate observations in the tail regions defined by the intersection of graphs near a standard deviation of about ±2.5.
      • The leptokurtic distribution is also more likely to generate observations that are near the mean, defined here as the region ±1 standard deviation around the mean.
      • In compensation, to have probabilities sum to 1, the leptokurtic distribution generates fewer observations in the regions between the central region and the two tail regions.
        [image]
    • Excess kurtosis

      • excess kurtosis: more or less kurtosis than the normal distribution.

      $$\text{excess kurtosis} = \text{sample kurtosis} - 3$$

    • For a sample of 100 or larger taken from a normal distribution, a sample excess kurtosis of 1.0 or larger would be considered unusually large.

    • In general, greater positive kurtosis and more negative skew in returns distributions indicates increased risk.
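
    Sample kurtosis and excess kurtosis as defined above can be transcribed into Python as follows. The function names and data sets are assumptions, and the samples are far too small to draw real conclusions from.

    ```python
    from statistics import mean, variance


    def sample_kurtosis(data):
        """(1 / n) * sum((X_i - mean)^4) / s^4, with s the sample standard deviation."""
        n = len(data)
        x_bar = mean(data)
        s_fourth = variance(data) ** 2
        return sum((x - x_bar) ** 4 for x in data) / n / s_fourth


    def excess_kurtosis(data):
        """Sample kurtosis minus 3, the kurtosis of a normal distribution."""
        return sample_kurtosis(data) - 3.0


    evenly_spread = [1.0, 2.0, 3.0, 4.0, 5.0]
    heavy_tailed = [0.0] * 8 + [-10.0, 10.0]  # mostly at the center, two extremes

    print(sample_kurtosis(evenly_spread), excess_kurtosis(evenly_spread))
    print(sample_kurtosis(heavy_tailed), excess_kurtosis(heavy_tailed))
    ```

    The evenly spread values give negative excess kurtosis (platykurtic), while the sample concentrated at the center with two extreme values gives positive excess kurtosis (leptokurtic).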

  • Chapter 1: Data and Statistics

    1.2 Data


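    Data are measured on scales of measurement, which fall into the following types: nominal scale (labels or names; the value identifies an attribute of the element directly), ordinal scale (the data can be ranked or ordered, for example by service quality), interval scale (interval data are always numeric), and ratio scale.

    Summary:

    Nominal: the values are independent labels with no relationship among them.
    Ordinal: the values are related by order, so they can be compared.
    Interval: builds on ordinal by also measuring the distance between values.
    Ratio: builds on interval by also making ratios of values meaningful.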


    Categorical and Quantitative Data

    The statistical method appropriate for summarizing data depends upon whether the data are categorical or quantitative.

    Categorical data use either the nominal or ordinal scale of measurement.

    Quantitative data are obtained using either the interval or ratio scale of measurement.

    Even when categorical data are identified by a numerical code, arithmetic operations such as addition, subtraction, multiplication, and division do not provide meaningful results.

    Arithmetic operations provide meaningful results for quantitative variables. More alternatives for statistical analysis are possible when data are quantitative.
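
    A brief Python sketch shows why arithmetic on category codes is not meaningful. The blood-type codes and the sample are hypothetical.

    ```python
    from statistics import mean, mode

    # Hypothetical numeric coding of a nominal variable.
    blood_type_codes = {"A": 1, "B": 2, "AB": 3, "O": 4}
    sample = ["A", "O", "O", "B", "A", "O"]

    codes = [blood_type_codes[s] for s in sample]
    average_code = mean(codes)     # a number, but it corresponds to no category
    most_common = mode(sample)     # counting categories is still meaningful

    print(average_code, most_common)
    ```

    The average of the codes depends entirely on the arbitrary numbering, while the most frequent category does not.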

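    Summary:

    Of categorical and quantitative data, quantitative data offer more options for statistical analysis.


    Cross-Sectional (collected at a single point in time) and Time Series (collected over a long span of time) Data

    Data covering a long time span are more valuable for statistical analysis.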

    1.3 Data Sources

    Data can be obtained from existing sources or from surveys and experimental studies designed to collect new data.

    Statistical Studies

    Statistical studies can be classified as either experimental or observational.

    In an experimental study, a variable of interest is first identified.

    Nonexperimental, or observational, statistical studies make no attempt to control the variables of interest. A survey is perhaps the most common type of observational study. For instance, in a personal interview survey, research questions are first identified

    Data Acquisition Errors

    Data analysts also review data with unusually large and small values, called outliers, which are candidates for possible data errors. 

    The sentence above gives the definition of an outlier.

    1.4 Descriptive Statistics

    The most common numerical descriptive statistic is the average, or mean.

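  • Element: collected... Nominal scale: no order and no rank, e.g., member vs. non-member. Ordinal scale: order and rank are meaningful, but the intervals between values are not, e.g., grade A, grade B. Interval scale: has the order...
  • Nominal scale: the values of a nominal attribute are symbols or names of things. Each value represents a category, code, or state, so nominal attributes are also regarded as categorical. The values have no meaningful order and are not quantitative. Examples: blood type, ID number...
  • Nominal, ordinal, interval, and ratio are the four levels of measurement. Nominal values only classify, such as male and female for a gender variable; ordinal values can be ranked but not added or subtracted, such as school grade level; interval values are numeric and can be added and subtracted; ratio values and interval values...
  • 1. Standardization (Z-score), also called mean removal and variance scaling. The formula is (X - mean) / std, computed for each attribute/each... There are two ways to implement it: use the sklearn.preprocessing.scale() function, which standardizes the given data directly; use sklearn.prep
  • What is the difference between categorical, ordinal and interval ...In talking about variables, sometimes you hear variables being described as categorical (or sometimes nominal), or ordinal, o...
  • Example: temperature on the Fahrenheit scale: 10, 20, 30, etc. Note that 20F is not twice as cold as 40F. So multiplication does not make sense on interval data. But addition and subtraction work. Which ...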