  • decision tree

    2021-04-21 20:25:27

    References:
    Decision tree: https://github.com/apachecn/AiLearning/blob/master/src/py3.x/ml/3.DecisionTree/DecisionTree.py
    Plotting: https://github.com/apachecn/AiLearning/blob/master/src/py3.x/ml/3.DecisionTree/decisionTreePlot.py
    Docs: https://apachecn.gitee.io/ailearning/#/docs/ml/3
    Accessing dict keys by index: https://www.runoob.com/python/python-dictionary.html

  • Decision Tree

    2018-07-13 23:06:00

    Decision Tree builds classification or regression models in the form of a tree structure. It breaks the dataset down into smaller and smaller subsets while an associated decision tree is incrementally developed at the same time.
    Decision Tree learning uses a top-down recursive method. The basic idea is to construct the tree along which information entropy declines fastest, so that the entropy of the instances at each leaf node is zero. Each internal node of the tree corresponds to an attribute, and each leaf node corresponds to a class label.
    Advantages:

    • A decision tree is easy to explain: it results in a set of rules, and it follows the same approach humans generally use when making decisions.
    • Even a complex decision tree can be simplified through visualization, so it can be understood by everyone.
    • It has almost no hyper-parameters.

    Information Gain

    • The entropy is:
      $H(X) = -\sum_{i=1}^{n} p_i \log p_i$
      where $p_i = P(X = x_i)$.
    • From the information entropy, we can calculate the empirical entropy of a dataset $D$ with $K$ classes:
      $H(D) = -\sum_{k=1}^{K} \frac{|C_k|}{|D|} \log_2 \frac{|C_k|}{|D|}$
      where:
      $|C_k|$ is the number of samples in class $k$ and $|D|$ is the total number of samples.
    • We can also calculate the empirical conditional entropy, where feature $A$ partitions $D$ into subsets $D_1, \dots, D_n$:
      $H(D \mid A) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i)$
    • From these, we can calculate the information gain (computed in the sketch after this list):
      $g(D, A) = H(D) - H(D \mid A)$
    • Information gain ratio:
      $g_R(D, A) = \frac{g(D, A)}{H_A(D)}$, where $H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}$
    • Gini index:
      $\mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$
      For binary classification:
      $\mathrm{Gini}(p) = 2p(1 - p)$
      For binary classification and on the condition of feature $A$ partitioning $D$ into $D_1$ and $D_2$:
      $\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|} \mathrm{Gini}(D_1) + \frac{|D_2|}{|D|} \mathrm{Gini}(D_2)$
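    These computations are easy to check in code. Below is a minimal Python sketch of the empirical entropy and information gain; the function names and the tiny ages/buys toy data are made up for illustration:

        import math
        from collections import Counter

        def entropy(labels):
            """Empirical entropy H(D) of a list of class labels."""
            total = len(labels)
            return -sum((n / total) * math.log2(n / total)
                        for n in Counter(labels).values())

        def information_gain(feature_values, labels):
            """g(D, A) = H(D) - H(D|A) for a single feature column A."""
            total = len(labels)
            cond = 0.0  # empirical conditional entropy H(D|A)
            for v in set(feature_values):
                subset = [y for x, y in zip(feature_values, labels) if x == v]
                cond += (len(subset) / total) * entropy(subset)
            return entropy(labels) - cond

        # Toy data: a feature that perfectly separates the labels
        # has information gain equal to H(D).
        ages = ["youth", "youth", "senior", "senior"]
        buys = ["no", "no", "yes", "yes"]
        print(entropy(buys))                 # 1.0
        print(information_gain(ages, buys))  # 1.0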

    Three Building Algorithms

    • ID3: maximizes information gain.
    • C4.5: maximizes the information gain ratio.
    • CART
      • Regression tree: minimizes the squared error.
      • Classification tree: minimizes the Gini index.

    Decision Tree Algorithm Pseudocode

    1. Place the best attribute of the dataset at the root of the tree. The criteria for selecting the best attribute are listed under Three Building Algorithms above.
    2. Split the training set into subsets by the best attribute.
    3. Repeat steps 1 and 2 on each subset until every branch of the tree ends in a leaf node (see the sketch after this list).
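    As a rough illustration of this recursion, here is a minimal, self-contained Python sketch that uses ID3-style information gain as the selection criterion; the names (build_tree, rows as feature dicts) are my own for the example, not taken from any referenced implementation:

        import math
        from collections import Counter

        def entropy(labels):
            total = len(labels)
            return -sum((n / total) * math.log2(n / total)
                        for n in Counter(labels).values())

        def build_tree(rows, labels, features):
            """rows: list of dicts mapping feature name -> value."""
            # Leaf: the node is pure, or no features remain -> majority class.
            if len(set(labels)) == 1 or not features:
                return Counter(labels).most_common(1)[0][0]

            def gain(f):  # information gain of splitting on feature f
                cond = 0.0
                for v in set(r[f] for r in rows):
                    sub = [y for r, y in zip(rows, labels) if r[f] == v]
                    cond += (len(sub) / len(labels)) * entropy(sub)
                return entropy(labels) - cond

            best = max(features, key=gain)        # step 1: best attribute
            tree = {best: {}}
            for v in set(r[best] for r in rows):  # step 2: split by it
                sub_rows = [r for r in rows if r[best] == v]
                sub_labels = [y for r, y in zip(rows, labels) if r[best] == v]
                rest = [f for f in features if f != best]
                tree[best][v] = build_tree(sub_rows, sub_labels, rest)  # step 3
            return tree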

    Random Forest

    Random Forest classifiers work around the instability and overfitting of a single tree by creating a whole bunch of decision trees (hence "forest"), each trained on a random subset of the training samples (bagging, drawn with replacement) and a random subset of the features (drawn without replacement). The trees then vote together to produce the final result.
    In one word, it builds on CART with randomness.

    • Randomness 1: train each tree on a subset of the training set selected by bagging (sampling with replacement).
    • Randomness 2: train each tree on a subset of the features (sampling without replacement). For example, select 10 features from 100 features in the dataset.
    • Randomness 3: add new features by low-dimensional projection (a scikit-learn sketch follows this list).
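    As a concrete starting point, here is a minimal scikit-learn sketch; the iris dataset and the parameter values are my own choices for illustration. Note that scikit-learn applies the feature-subset randomness per split (via max_features) rather than once per tree:

        from sklearn.datasets import load_iris
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import train_test_split

        X, y = load_iris(return_X_y=True)
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

        clf = RandomForestClassifier(
            n_estimators=100,  # number of CART trees in the forest
            bootstrap=True,    # randomness 1: each tree sees a bootstrap sample
            max_features=2,    # randomness 2: 2 of the 4 iris features per split
            random_state=0,
        )
        clf.fit(X_train, y_train)
        print(clf.score(X_test, y_test))  # accuracy on the held-out split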

    Postscript

    I tried to show off by writing this post in English to practice my writing, and reality slapped me in the face ( ̄ε(# ̄)

    Ref:
    https://clyyuanzi.gitbooks.io/julymlnotes/content/rf.html
    http://www.saedsayad.com/decision_tree.htm
    http://dataaspirant.com/2017/01/30/how-decision-tree-algorithm-works/
    统计学习方法 (Statistical Learning Methods), Li Hang

    Reposted from: https://www.cnblogs.com/mengnan/p/9307613.html

  • Decision tree

    2019-07-27 16:23:37

    Decision tree

    Principle

    Choosing the attribute for a decision node: the larger the information gain (IG), the higher up in the tree the attribute is placed.

    Information gain: Gain(A) = Info(D) - Info_A(D), i.e. the entropy of the dataset D minus the conditional entropy of D given attribute A.

    Procedure

    (1) Read in each full row of the data.

    (2) Split each row into features and a label.

    (3) Represent each row's features as booleans (for example, if age takes the values youth, middle_age, and senior, encode them as 0, 0, 1, turning one dimension into three).

    (4) Call DecisionTreeClassifier(criterion='entropy').

    (5) You can also build the boolean data for a new instance in the same way to obtain a predicted output (see the sketch below).
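    Putting steps (1) through (5) together, here is a minimal sketch. The file name AllElectronics.csv and its layout (a header row, feature columns, the label in the last column) are assumptions for the example, and the boolean encoding of step (3) is done with scikit-learn's DictVectorizer rather than by hand:

        import csv
        from sklearn.feature_extraction import DictVectorizer
        from sklearn.tree import DecisionTreeClassifier

        # (1) Read each full row; "rt" and next(reader) per the Debugging notes below.
        with open("AllElectronics.csv", "rt") as f:  # hypothetical file name
            reader = csv.reader(f)
            headers = next(reader)
            feature_dicts, labels = [], []
            for row in reader:
                # (2) Split into features and a label (assumed last column).
                feature_dicts.append(dict(zip(headers[:-1], row[:-1])))
                labels.append(row[-1])

        # (3) One-hot encode the categorical features: age = youth/middle_age/senior
        # becomes three 0/1 columns.
        vec = DictVectorizer()
        X = vec.fit_transform(feature_dicts).toarray()

        # (4) Fit the classifier with the entropy criterion.
        clf = DecisionTreeClassifier(criterion="entropy")
        clf.fit(X, labels)  # scikit-learn trees accept string class labels

        # (5) Encode a new instance the same way to get a prediction.
        new_dict = dict(feature_dicts[0], age="youth")  # hypothetical new case
        print(clf.predict(vec.transform([new_dict]).toarray()))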

     

     

    Debugging

    (1) AttributeError: '_csv.reader' object has no attribute 'next' → change reader.next() to next(reader) (Python 3 removed the .next() method).

    (2) "Iterator should return strings, not bytes" → open the file with "rt" instead of "rb".

    Advantages

    Intuitive, easy to understand, and effective on small datasets.

    Disadvantages

    (1) Handles continuous variables poorly.

    (2) When there are many classes, errors accumulate quickly.

    (3) Scalability is mediocre.

  • DecisionTree

    2016-05-30 15:30:16
    A simple C++ implementation of the ID3 decision-tree algorithm from statistical learning.
