  • 2016-02-15 15:25:38

    What Is Heuristic Search?

    Earlier we mentioned that an important function to optimize is gen, which, as the name suggests, generates the candidate positions to search. At the moment it does something very simple: it keeps any empty position that has at least one neighbor. That is actually quite unreasonable, in two respects:

    1. The candidates are not sorted at all; they simply follow the array traversal order. Yet the efficiency of alpha-beta pruning depends heavily on node order, which we discuss next.
    2. Unnecessary nodes are not excluded. Cutting needless candidates shrinks the M in M^N, so the payoff is very noticeable.

    Next, let us look at how to optimize the heuristic search function on each of these two points.

    Sorting the candidate nodes

    [Figure: the alpha-beta pruning example from the previous chapter]

    This is the same figure as in the previous chapter. In the second layer the first node's value is 3, which happens to be the minimum of that layer, so both of the following nodes can be pruned (the second child of the second node can be cut as well). This is the best case: when the minimum of a MIN layer sits at the first node, every later node can be pruned against it. Even when the minimum is not first, as long as the nodes are in roughly ascending order, many nodes still get cut. If, unluckily, the layer happens to be sorted in descending order, pruning does nothing at all.

    The same reasoning applies to beta pruning. In short, the efficiency of alpha-beta pruning depends on the node order within each layer. We cannot sort exactly, because a node's value cannot be computed directly; it requires recursively evaluating its children. But we can still order the nodes approximately, and as noted, even a rough ordering already improves pruning a great deal.
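
    To see the effect concretely, here is a minimal sketch (mine, separate from the project's code) of plain alpha-beta search over a toy two-level tree. Counting visited nodes shows how a good child ordering prunes far more than a bad one:

    def alphabeta(node, alpha, beta, maximizing, counter):
        counter[0] += 1                  # count every visited node
        if not isinstance(node, list):   # a leaf holds a static score
            return node
        if maximizing:
            value = float('-inf')
            for child in node:
                value = max(value, alphabeta(child, alpha, beta, False, counter))
                alpha = max(alpha, value)
                if alpha >= beta:        # cutoff: the remaining children are pruned
                    break
            return value
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True, counter))
            beta = min(beta, value)
            if alpha >= beta:
                break
        return value

    # The same six leaf values under a MAX root, ordered well and badly:
    good = [[3, 4], [2, 9], [1, 9]]   # small MIN values met early: later leaves pruned
    bad  = [[9, 1], [9, 2], [4, 3]]   # large values first: almost nothing is pruned
    for tree in (good, bad):
        c = [0]
        print(alphabeta(tree, float('-inf'), float('inf'), True, c), 'visited:', c[0])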

    So how do we sort? We score every candidate position and order candidates by score. Note that this scorer rates a single empty point; it is not the evaluate function that scores the whole board, although the basic scoring principles are the same: the score depends on whether the position can form a five, a live four, a live three, and so on. The code is a bit long, so it is not shown here; see evaluate-point.js.
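
    The idea behind that file can be sketched roughly as follows. This is a simplification written for illustration, with an assumed score table (the project keeps its real values in score.js); the real scorer also distinguishes open ends from blocked ones, double threes, and so on. It walks the four directions through the point, counts how long a line a stone placed there would join, and maps the counts to scores:

    FIVE, FOUR, THREE, TWO, ONE = 100000, 10000, 1000, 100, 10  # assumed values

    def score_point(board, x, y, role, size=15):
        total = 0
        for dx, dy in ((1, 0), (0, 1), (1, 1), (1, -1)):
            count = 1                      # the stone we pretend to place at (x, y)
            for sign in (1, -1):           # walk both ways along this direction
                i, j = x + sign * dx, y + sign * dy
                while 0 <= i < size and 0 <= j < size and board[i][j] == role:
                    count += 1
                    i += sign * dx
                    j += sign * dy
            total += {5: FIVE, 4: FOUR, 3: THREE, 2: TWO}.get(min(count, 5), ONE)
        return total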

    With the scores in hand, we sort by score. Concretely, the implementation orders candidates as: five, live four, double three, live three, everything else.

    Removing unnecessary nodes

    This part is also fairly hard. What we do for now follows the same order: five, live four, double three. These three are forcing (must-kill) shapes, so as soon as one appears we need not consider any other nodes; only when none exists do we consider the rest.

    Implementation

    Combining the two ideas above, the heuristic search function looks like this:

    var R = require("./role.js");
    var scorePoint = require("./evaluate-point.js");
    var S = require("./score.js");
    
    var gen = function(board, deep) {
    
      var fives = [];
      var fours = [];
      var twothrees = [];
      var threes = [];
      var twos = [];
      var neighbors = [];
      var nextNeighbors = [];
    
      for(var i=0;i<board.length;i++) {
        for(var j=0;j<board[i].length;j++) {
          if(board[i][j] == R.empty) {
            if(hasNeighbor(board, [i, j], 1, 1)) { // only positions with a neighbor qualify
              var scoreHum = scorePoint(board, [i,j], R.hum);
              var scoreCom = scorePoint(board, [i,j], R.com);

              if(scoreCom >= S.FIVE) { // first: can the computer make five?
                return [[i, j]];
              } else if(scoreHum >= S.FIVE) { // then: can the player make five?
                // don't return yet: the scan isn't finished, and the computer
                // might still be able to make five itself
                fives.push([i, j]);
              } else if(scoreCom >= S.FOUR) {
                fours.unshift([i,j]);
              } else if(scoreHum >= S.FOUR) {
                fours.push([i,j]);
              } else if(scoreCom >= 2*S.THREE) {
                // a double three also works as a winning threat
                twothrees.unshift([i,j]);
              } else if(scoreHum >= 2*S.THREE) {
                twothrees.push([i,j]);
              } else if(scoreCom >= S.THREE) {
                threes.unshift([i, j]);
              } else if(scoreHum >= S.THREE) {
                threes.push([i, j]);
              } else if(scoreCom >= S.TWO) {
                twos.unshift([i, j]);
              } else if(scoreHum >= S.TWO) {
                twos.push([i, j]);
              } else {
                neighbors.push([i, j]);
              }
            } else if(deep >= 2 && hasNeighbor(board, [i, j], 2, 2)) {
              nextNeighbors.push([i, j]);
            }
          }
        }
      }
    
      // a five is a must-kill move; return it alone immediately
      if(fives.length) return [fives[0]];
    
      if(fours.length) return fours;
    
      if(twothrees.length) return twothrees;
    
      return threes.concat(
          twos.concat(
            neighbors.concat(nextNeighbors)
          )
        );
    }
    

    Optimization results

    As mentioned in the previous chapter, with alpha-beta pruning alone each move still evaluates roughly 500,000 nodes on average and takes about 10 seconds. After adding the heuristic function, each move averages around 10,000 nodes and takes under a second. The heuristic search function is clearly a huge win: efficiency improved about 50x.

    Further optimization

    The current code on master includes all the optimizations described above. Further work will include a transposition table, along with further improvements to the pruning and to the heuristic search function. The exact approach is still being worked out.
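
    The transposition-table design is left open above; one common approach is Zobrist hashing. A minimal sketch (not the author's plan), assuming a 15x15 board and two players: each (square, player) pair gets a random 64-bit key, a position's hash is the XOR of the keys of its stones, and placing or removing a stone updates the hash in O(1), so previously searched positions can be looked up cheaply.

    import random

    SIZE, ROLES = 15, 2
    ZOBRIST = [[[random.getrandbits(64) for _ in range(ROLES)]
                for _ in range(SIZE)] for _ in range(SIZE)]

    def toggle(h, x, y, role):
        """XOR a stone into the hash; the same call undoes it."""
        return h ^ ZOBRIST[x][y][role - 1]

    table = {}  # position hash -> (search depth, score)

    def probe(h, depth):
        entry = table.get(h)
        if entry and entry[0] >= depth:  # only reuse results searched at least this deep
            return entry[1]
        return None

    def store(h, depth, score):
        table[h] = (depth, score)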

  • C++ AI Gomoku based on heuristic game-tree search (.zip)



    Earlier we saw that pruning, as an optimization of the minimax game algorithm, is a good way to raise efficiency. But we should not stop there, because an ordinary person can think ahead 6 plies or more. To push efficiency further, we have two more techniques. Imagine the figure below: if the computer's candidate moves drop from 4 to 2, efficiency doubles.



    Digging deeper, these optimizations turn out to be quite natural, because some obviously bad points can simply be excluded, just as, at the very start of a game, nobody would play at (0,0) in the top-left corner. In practice, my optimizations are the following:


    1. I only place stones within distance 2 of existing stones. As the figure below of the possible move range shows, this directly shrinks the search area.



    bool Game::hasNeighbor(int x,int y)  // returns true if any stone lies within distance 2
    {
        int redirect[8][2] = {{1,0},{1,1},{1,-1},{0,1},{0,-1},{-1,0},{-1,1},{-1,-1}};
        for(int j = 1;j<=2;j++)
        {
            for(int i = 0;i<8;i++)
            {
                int m = j*redirect[i][0]+x;
                int n = j*redirect[i][1]+y;
                if(m>=0 && m<15 && n>=0 && n<15 && chess[m][n])
                {
                    return true;
                }
            }
        }
        return false;
    }
    
    

    2. Heuristic search:

    In practice, you can use some heuristics to judge that certain points are obviously good or bad. Remember the strategy table discussed earlier? The strategy table I use includes a formula that rates a single point. With a good strategy table, the best move is very likely to be among the 10 highest-ranked points, so I can restrict each step's candidates to at most 10 points, which greatly improves efficiency.

    3. Another very important point for the pruning optimization is node ordering. Consider the figure we used before.

    If the order is like this:

    then we can directly eliminate two branches. That is the power of even a rough sort.

    Here insertion sort is used to keep the 10 highest-scoring points in sorted order. You could also use quicksort. But be sure to release the memory.

    This greatly improves efficiency.

    QVector<QPoint> Game::gen(int deep,int type) // returns the candidate points
    {
        int top = -1;
        bool first = true;
        QVector<node *> result(12);

        for(int x = 0;x<15;x++)
        {
            for(int y = 0;y<15;y++)
            {
                if(chess[x][y]==0 && hasNeighbor(x,y))
                {
                    // skip points we have already recorded
                    int flag = 1;
                    for(int i = 0;i<=top;i++)
                    {
                        if(result[i]->point.x()==x && result[i]->point.y()==y)
                        {
                            flag = 0;
                            break;
                        }
                    }
                    if(!flag)
                    {
                        continue;
                    }

                    node *snode = new node;
                    snode->point = QPoint(x,y);
                    int tmpscore = snode->sscore = evaluteScore2(QPoint(x,y),type);
                    // evaluteScore2 is an evaluation function similar to the judgment used
                    // by my earlier strategy table: it scores every five-in-a-row window
                    // (quintuple) that passes through this point

                    if(first)
                    {
                        result[0] = snode;
                        first = !first;
                        top = 0;
                    }
                    else
                    {
                        if(top==9 && tmpscore<result[9]->sscore)
                        {
                            delete snode; // too weak to enter the top 10
                            continue;
                        }

                        // insertion sort: shift lower-scoring points one slot right
                        int tmp = top;
                        while(tmp>=0 && result[tmp]->sscore<tmpscore)
                        {
                            result[tmp+1] = result[tmp];
                            tmp--;
                        }
                        result[tmp+1] = snode;
                        top++;
                        if(top>9)
                        {
                            delete result[10]; // evicted from the top 10
                            top = 9;
                        }
                    }
                }
            }
        }

        QVector<QPoint> temp;
        for(int i = 0;i<=top;i++)
        {
            temp.push_back(result[i]->point);
        }
        for(int i = 0;i<=top;i++)
        {
            delete result[i]; // allocated with new, so release with delete
        }

        return temp;
    }
    
    
    
    

    Summary: 1. With the above techniques, 6-ply search is effortless, and within acceptable time it can even reach 8 plies, which is enough to beat casual players with ease.

    2. In Qt, always release memory; the consequences of not doing so are severe. During development I found that, because this gen function is called throughout the min-max recursion, a single move could grow memory by 200 MB if I did not free it.

    3. The three optimizations: distance restriction, pre-evaluating the point, and sorting.

  • A Quick and Simple AI-Inspired Way to Learn a Language

    Introduction

    The special thing I found when I first started diving into the field of Artificial Intelligence was the infinite amount of parallels between how neural networks learn and my subjective experience of my own intelligence.

    So, I decided to write a series of posts about such parallels. I would summarize this first one with the following proposition:

    We can learn new skills faster if we frame the process as a supervised learning task.

    Learning as Supervised Learning

    When I reference supervised learning here, I am talking about the idea of models learning by leveraging exposure to relevant examples using an advanced pattern recognition system.

    If you are involved with AI you have seen a picture like this:

    [Figure: a typical neural-network diagram with inputs, outputs, and a cost function]

    Yes, the inputs, the outputs, the cost functions…

    But, on its core, supervised learning fascinated me because it directly spoke to the way I learn all sorts of different subjects.

    This becomes clearer if we break down the intuitions behind the main concepts involved in a basic neural network.

    Inputs: data with patterns one wishes to grasp.

    A forward pass: transformation of this input

    A cost function representing how far the output is from an ideal goal and a backward pass where the model can update its strategies to learn better.

    It is easy to see how such steps relate to our human experience of learning. You dive into a new field like math, combat sports, or cooking and you don’t know much about it, but the more you are exposed to examples of these topics, the more you learn even without consciously trying to!

    Learning through exposure

    Take kids for example,

    Photo by Ben White on Unsplash

    in this very enlightening talk, psychologist Chris Lonsdale points out that babies learn to speak partially because they are constantly exposed to adults who are experts at that particular language and can provide them with helpful feedback, so they pick up on the language just by basic trial and error.

    Now, I don’t want to get into what kind of complicated loss function these kid’s brains are trying to optimize, because that is beside the point (although a very interesting one).

    The point is: they manage to learn without having to appeal to some complicated system or rule.

    They learn it through their exposure.

    Consider how many adults who start to learn a language try to do it by attempting to learn the rules for making grammatic expressions rather than just exposing themselves to the target language and allowing that exposure, combined with an acute awareness, to guide their learning process.

    At this point you might be thinking: “ok, we learn from examples, so what? For complicated stuff we still need to sit down and read countless books, process an infinite amount of information, get bored and procrastinate.”

    That might be true but there is an interesting way to tinker with our brain’s innate capacity to make associations, which means treating ourselves, to some limited degree, like a learning machine, meaning (for the purpose of this article) to automate exposure just as we would do with a machine learning task.

    Learning French through programmatic exposure

    To clarify, I’ll give you an example. When I was in Brazil, one of my big dreams was to learn French. I, like many Brazilians, did not have the opportunity to go to France and hang out there to learn it so, for a while, I was quite upset.

    However, I came upon this idea of systematic immersion in the environment of the target language, so I wondered: if I programmatically expose myself to French will I learn enough to acquire the language?

    Upon some reflection and realizing I did not have enough time to just study French all day, I decided to do something a bit unconventional: I gave a “pythonesque” twist to things and decided to automate part of this process.

    Steps to automate exposure

    As I quickly mentioned before, in most deep learning projects you usually deal with a pipeline composed of the following: a dataset (with the input data the model needs to learn the task), a forward pass where the data goes through the model and is transformed to an output representing the model prediction, a cost function that tells you how wrong the output of your model is, a backward pass to update the units allowing the model to learn, and some type of evaluation process enabling you to know how good the model is when applied to data it has never seen before.

    Given these structural components, what I did was to use python to build a simple system that resembled this type of pipeline so that I could automate my exposure to relevant content.

    Let’s go through each step one by one.

    Dataset

    I built a learning curriculum made out of links to videos, articles, and textbooks with all the content from the web I thought could help me understand the language (I suggest around 20 to 40 links covering video, audio, and text).

    https://www.youtube.com/watch?v=6TUNC31t73w
    https://www.youtube.com/watch?v=vXVg-XjmY-w
    https://www.youtube.com/watch?v=ujDtm0hZyII
    https://www.youtube.com/watch?v=X8gno6Uzuo8
    https://tel.archives-ouvertes.fr/file/index/docid/825854/filename/1996_Framling_Kary.pdf
    https://www.persee.fr/doc/intel_0769-4113_1987_num_2_1_1804
    https://lejournal.cnrs.fr/dossiers/comment-lintelligence-artificielle-va-changer-nos-vies
    ...

    I wrote a script to transfer the links to a .csv file and create the following columns:

    • links with all the sources to learn

    • attention_level how much attention you are putting in a given source (from just background noise to focused) as a number from 0 to 10

    • session_time to record how long you spend on each link

    • date to record the date of the session

    • session_score to record a subjective score of your understanding of that particular source

    • last_index for when you stop learning in the middle of the dataset and do not cover all the sources. This way, in your next training session you can pick up from where you left off

    Forward pass

    I set up a script to go systematically through the entire list of sources in the .csv file, using python’s webbrowser framework to access the URLs (a sketch of such a script follows this list).

    • At each step, the user is prompted to give a score for their attention_level on that content, which is subjective, but it gives you the opportunity to grade your attention as you move through the content

    • At the end of each link the user is prompted to give a score for their “performance” (session_score)

    • One epoch would be going through the entire material on a batch of the dataset, meaning going through one or a few links per epoch depending on where you stop
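
    A minimal sketch of the kind of script described above. This is an illustration, not the author's code (which is linked at the end of the article); the file name curriculum.csv is assumed, and the columns match the ones listed earlier:

    import csv
    import webbrowser
    from datetime import date

    with open('curriculum.csv', newline='') as f:
        rows = list(csv.DictReader(f))           # columns: links, attention_level, ...

    start = int(rows[0].get('last_index') or 0)  # resume where the last session stopped
    for i, row in enumerate(rows[start:], start):
        webbrowser.open(row['links'])            # open the source: the exposure itself
        row['attention_level'] = input('attention level (0-10): ')
        row['session_score'] = input('session score: ')
        row['date'] = date.today().isoformat()
        if input('continue? [y/n] ') != 'y':
            rows[0]['last_index'] = str(i + 1)   # remember where we stopped
            break

    with open('curriculum.csv', 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)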

    Cost Function

    I see a cost function as a proxy for a learning goal to which you compare the output of your model at each step of the way to know if you are doing well.

    In this case, the cost would be a metric of performance at a language task which varied depending on the type of skill at hand: reading, listening, writing, speaking.

    I measured each individually.

    • Listening: I measured comprehension by listening to videos and assessing how much I understood by rewatching them with subtitles.

    • Speaking: I measured only pronunciation (this is the easiest thing to measure when you are starting out), so I would repeat certain chosen phrases from the material until I felt like I sounded reasonably well.

    • Writing: I practiced on Duolingo for a few minutes because it is easy and it gives you quick feedback. In addition, I would finish the day with a little writing sample written in French summarizing what I had learned and then compare it with what I meant using google translate.

    • Reading: I would read an article and again double-check the meaning.

    I graded all these skills and stored them in the session_score variable that I collected with the script.

    Backward pass (or updating your strategies online)

    Although we lack access to the gradients of our neurons, we do have access to our thoughts and mental awareness as we learn, which we can use to assess what could be updated to increase learning.

    Below are examples of my mental process for updating my learning strategies:

    • For the low-attention sessions, I would just leave the audio in the background and repeat the words I could understand out loud to practice pronunciation as I mentioned before.

    • If I was sensing that the low-attention sessions were working poorly I would do more focused ones (with a higher attention level).

    • If the high-attention ones required too much motivation, I would switch back.

    • This procedure did not guarantee an “increase in accuracy” but it guaranteed constant awareness of my mental state as I learned the content. This became a lateral skill that improved my learning process.

    The main idea here was to be mindful of what was working and what was not, so in this case the idea of a backward pass just serves as a euphemism for updating your strategies according to your online perception of your learning performance.

    Evaluation

    Evaluation, in this case, is tricky, because I was not paying full attention to everything in the same way. Therefore, any feedback mechanism here would have been noisy, so I used a few different approaches that did not involve any automation per se but, again, only a guided self-awareness:

    • Language parent. Another great technique from Mr. Lonsdale: a language parent would be someone that helps you learn without negative feedback. For me this was a friend who was a native French speaker. In the absence of someone qualified, you can use foreign meetups to be around people that can give you feedback on how well you are doing based on how much you can communicate with them (expose yourself!).

    • Watching novel content with French subtitles. This was an easier and effective evaluation technique. I would watch something new with subtitles in French and do a personal evaluation of how much I felt I could understand by rewatching it with the original subtitles.

    Reviewing the Cycle

    At this point the idea was to use the same dataset for two or three weeks, doing one full run of the entire dataset every two or three days. In this way, at the end of the cycle you feel that you have captured the maximum that those sources could provide.

    After going through the entire dataset of sources at least 20 times, I re-evaluated whether it was worth adding more content or removing some.

    Personal results

    I was quite pleased that after around one year of implementing this method I learned to speak enough French to communicate with any French-speaking person without relying on English.

    The fascinating thing that I learned with this experience was how much I could absorb just by automating the learning process to remove the mental energy involved with the entry barrier to start a study session (in this case just running a single line of code in the terminal):

    (env_1) C:\path\to\folder> py -i learn_this.py

    Conclusion

    The most important point is that my main source of learning was the exposure.

    The amazing advantage for humans in comparison to neural networks is that we can create our own datasets for our own learning goals and we can control what goes inside our minds and how.

    By combining programmatic exposure with acute awareness, I was able to learn a complex language without the hurdles of a classical chapter-by-chapter learning procedure.

    When you are exposed to anything in your life, you are, even in some minimal sense, learning something, be it the patterns that make up the events that happen before your eyes or the chains of potential causes that might be responsible for each of them.

    When we try to learn by going through the full rule structure that makes up that process, like learning mathematics by memorizing all its theorems, we are implicitly telling ourselves that understanding that subject involves the exhaustive mapping of all its rules, which is something that carries within itself a tiresome psychological load that can wear out even the most prolific learner.

    As humans, we have the disadvantage of being wired for constant self-assessment, which inherently leads us towards paths of self-sabotage when we decide to tackle a novel subject.

    However, it is this same self-assessing nature that allows us to implement powerful changes to guide us towards higher peaks in our learning ventures.

    Learning to deal with this mental trade-off is the key to becoming skilled at anything.

    There is a lot we can take from this simple idea that our brains are wired for pattern recognition. The main one is that potentially countless subjects (not all!) can be framed as a supervised learning task.

    All we need to do is set up our training dataset (the exposure to examples) and make sure we don’t second-guess our updating mechanisms as well as our performance throughout the process (meaning we don’t think too hard about how well or how fast we learn).

    The source code for this project can be found here.

    In my next post, I intend to go further into the many parallels between learning as a human and learning as a machine, using the literature available as my guide and practical examples to clarify.

    Leave a comment to let me know what you think!

    That’s all for today, thanks, see you next time!

    Translated from: https://medium.com/swlh/a-quick-and-simple-ai-inspired-way-to-learn-a-language-e762754b8a5a

  • Final project of the C course for 2015 undergraduates at UCAS. Implemented in C on OS X; it also works on Windows with a few small changes (output format optimization)... Uses a scoring mechanism, hash-table caching, and heuristic search plus kill-move search. Can search more than 12 plies within 5 seconds (the depth is adjustable).
  • Python Gomoku AI implementation (4): heuristic evaluation

    2019-06-06 20:10:26
    Heuristic evaluation

    The key to alpha-beta pruning efficiency is to get high-scoring positions searched earlier, so that pruning can happen sooner.
    For example, Figure 1 shows the example from the previous article; with this structure, only the second child of node C gets pruned.

    [Figure 1: the pruning example from the previous article]

    With heuristic evaluation we can estimate the scores of nodes A, B, and C in advance. Suppose the estimates match the actual situation and come out as node B > node C > node A. Then, when generating the game tree, adjusting the order of the children lets pruning happen sooner.
    Figure 2 shows the same game tree after the children are reordered by their estimated scores. Now the second children of both node C and node A are pruned, speeding up the search.

    [Figure 2: the same tree with children reordered by estimated score]

    To achieve this, we estimate a score for every playable position and put higher-scoring positions first. The current estimation method: for an empty position, place a white or a black stone on it in turn, find the chess shapes the point forms in the four directions, and score them.

    Implementation

    In the genmove function, compared with the earlier article, the way positions are scored has changed: for each empty position it calls evaluatePointScore to score the board that results from placing our own or the opponent's stone there, and high-scoring positions (those with a five, a live four, or a chong four) go into separate lists, because once any of those appears, lower-scoring positions no longer need to be considered.

    At the end of the function there is a cap, AI_LIMITED_MOVE_NUM, on the number of positions: since every playable empty position has already been evaluated, after sorting by score the positions ranked near the bottom are almost never the best move, so they can be dropped in advance, reducing the number of nodes the game-tree search must visit.

    Getting the position scores

    AI_LIMITED_MOVE_NUM = 20
    
    	# get all positions near chess
    	def genmove(self, board, turn):
    		fives = []
    		mfours, ofours = [], []
    		msfours, osfours = [], []
    		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
    			mine = 1
    			opponent = 2
    		else:
    			mine = 2
    			opponent = 1
    
    		moves = []
    		radius = 1
    
    		for y in range(self.len):
    			for x in range(self.len):
    				if board[y][x] == 0 and self.hasNeighbor(board, x, y, radius):
    					mscore, oscore = self.evaluatePointScore(board, x, y, mine, opponent)
    					point = (max(mscore, oscore), x, y)
    
    					if mscore >= SCORE_FIVE or oscore >= SCORE_FIVE:
    						fives.append(point)
    					elif mscore >= SCORE_FOUR:
    						mfours.append(point)
    					elif oscore >= SCORE_FOUR:
    						ofours.append(point)
    					elif mscore >= SCORE_SFOUR:
    						msfours.append(point)
    					elif oscore >= SCORE_SFOUR:
    						osfours.append(point)
    
    					moves.append(point)
    
    		if len(fives) > 0: return fives
    
    		if len(mfours) > 0: return mfours
    
    		if len(ofours) > 0:
    			if len(msfours) == 0:
    				return ofours
    			else:
    				return ofours + msfours
    
    		moves.sort(reverse=True)
    
    		# FIXME: decrease think time: only consider limited moves with higher scores
    		if self.maxdepth > 2 and len(moves) > AI_LIMITED_MOVE_NUM:
    			moves = moves[:AI_LIMITED_MOVE_NUM]
    		return moves
    

    The single-position evaluation function

    evaluatePointScore places our own stone and then the opponent's stone at the point; for each resulting board it calls evaluatePoint to scan the four directions through the point and collect the shape counts, then calls getPointScore to turn those counts into the score for our own or the opponent's stone at that position.

    	# evaluate score of point, to improve pruning efficiency
    	def evaluatePointScore(self, board, x, y, mine, opponent):
    		for i in range(len(self.count)):
    			for j in range(len(self.count[0])):
    				self.count[i][j] = 0
    				
    		board[y][x] = mine
    		self.evaluatePoint(board, x, y, mine, opponent, self.count[mine-1])
    		mine_count = self.count[mine-1]
    		board[y][x] = opponent
    		self.evaluatePoint(board, x, y, opponent, mine, self.count[opponent-1])
    		opponent_count = self.count[opponent-1]
    		board[y][x] = 0
    
    		mscore = self.getPointScore(mine_count)
    		oscore = self.getPointScore(opponent_count)
    
    		return (mscore, oscore)
    

    The shape-scoring function for a single position

    getPointScore totals the scores of all the shapes. The gaps between the shape scores are deliberately large, so that the sum of several low-priority shapes can never exceed a higher-priority shape and cause a misjudgment. If the shapes include a five or a live four, the function returns immediately. A single chong four on its own is about as effective as a live three, so it is only given the live-three score. Shapes at or below live three are simply summed.

    SCORE_FIVE, SCORE_FOUR, SCORE_SFOUR = 100000, 10000, 1000
    SCORE_THREE, SCORE_STHREE, SCORE_TWO, SCORE_STWO = 100, 10, 8, 2
    
    	def getPointScore(self, count):
    		score = 0
    		if count[FIVE] > 0:
    			return SCORE_FIVE
    
    		if count[FOUR] > 0:
    			return SCORE_FOUR
    		
    		# FIXME: the score of one chong four and no live three should be low, set it to live three
    		if count[SFOUR] > 1:
    			score += count[SFOUR] * SCORE_SFOUR
    		elif count[SFOUR] > 0 and count[THREE] > 0:
    			score += count[SFOUR] * SCORE_SFOUR
    		elif count[SFOUR] > 0:
    			score += SCORE_THREE 
    
    		if count[THREE] > 1:
    			score += 5 * SCORE_THREE
    		elif count[THREE] > 0:
    			score += SCORE_THREE
    
    		if count[STHREE] > 0:
    			score += count[STHREE] * SCORE_STHREE
    		if count[TWO] > 0:
    			score += count[TWO] * SCORE_TWO
    		if count[STWO] > 0:
    			score += count[STWO] * SCORE_STWO
    
    		return score
    

    AI search time

    With the search depth AI_SEARCH_DEPTH set to 4, compared with the tests in the previous article, both the average search time and the maximum search time observed in testing improve a great deal: the search time stays around 2 seconds, and the maximum is under 5 seconds.
    time[0.78] (8, 8), score[-10] alpha[4023] belta[905]
    time[1.69] (10, 8), score[-394] alpha[7833] belta[1870]
    time[0.90] (8, 7), score[-398] alpha[3431] belta[1007]
    time[0.96] (9, 6), score[-400] alpha[4675] belta[792]
    time[0.74] (5, 9), score[-396] alpha[2644] belta[535]
    time[0.50] (10, 4), score[-398] alpha[1667] belta[337]
    time[2.71] (10, 5), score[-4] alpha[11497] belta[1258]
    time[1.61] (10, 7), score[-2] alpha[5080] belta[819]
    time[1.48] (11, 4), score[-6] alpha[4381] belta[767]
    time[1.73] (12, 4), score[-6] alpha[5090] belta[853]
    time[1.51] (13, 4), score[-8] alpha[4139] belta[796]
    time[1.52] (8, 5), score[-16] alpha[4068] belta[778]
    time[2.08] (7, 6), score[-408] alpha[4769] belta[1215]
    time[1.68] (7, 9), score[-402] alpha[2620] belta[1229]
    time[1.00] (2, 4), score[-392] alpha[1512] belta[628]
    time[2.47] (8, 9), score[-16] alpha[6789] belta[1187]
    time[2.45] (8, 10), score[-20] alpha[6682] belta[1113]
    time[2.81] (11, 7), score[-18] alpha[7496] belta[1166]

    Complete code

    There are three files in total: main.py, GameMap.py, and ChessAI.py. Only ChessAI.py changed this time; the first two files are in the earlier articles.

    ChessAI.py

    from GameMap import *
    from enum import IntEnum
    from random import randint
    import time
    
    AI_SEARCH_DEPTH = 4
    AI_LIMITED_MOVE_NUM = 20
    
    
    class CHESS_TYPE(IntEnum):
    	NONE = 0,
    	SLEEP_TWO = 1,
    	LIVE_TWO = 2,
    	SLEEP_THREE = 3
    	LIVE_THREE = 4,
    	CHONG_FOUR = 5,
    	LIVE_FOUR = 6,
    	LIVE_FIVE = 7,
    	
    CHESS_TYPE_NUM = 8
    
    FIVE = CHESS_TYPE.LIVE_FIVE.value
    FOUR, THREE, TWO = CHESS_TYPE.LIVE_FOUR.value, CHESS_TYPE.LIVE_THREE.value, CHESS_TYPE.LIVE_TWO.value
    SFOUR, STHREE, STWO = CHESS_TYPE.CHONG_FOUR.value, CHESS_TYPE.SLEEP_THREE.value, CHESS_TYPE.SLEEP_TWO.value
    
    SCORE_MAX = 0x7fffffff
    SCORE_MIN = -1 * SCORE_MAX
    SCORE_FIVE, SCORE_FOUR, SCORE_SFOUR = 100000, 10000, 1000
    SCORE_THREE, SCORE_STHREE, SCORE_TWO, SCORE_STWO = 100, 10, 8, 2
    
    class ChessAI():
    	def __init__(self, chess_len):
    		self.len = chess_len
    		# [horizon, vertical, left diagonal, right diagonal]
    		self.record = [[[0,0,0,0] for x in range(chess_len)] for y in range(chess_len)]
    		self.count = [[0 for x in range(CHESS_TYPE_NUM)] for i in range(2)]
    		
    	def reset(self):
    		for y in range(self.len):
    			for x in range(self.len):
    				for i in range(4):
    					self.record[y][x][i] = 0
    
    		for i in range(len(self.count)):
    			for j in range(len(self.count[0])):
    				self.count[i][j] = 0
    
    	
    	def click(self, map, x, y, turn):
    		map.click(x, y, turn)
    		
    	def isWin(self, board, turn):
    		return self.evaluate(board, turn, True)
    	
    	# evaluate score of point, to improve pruning efficiency
    	def evaluatePointScore(self, board, x, y, mine, opponent):
    		dir_offset = [(1, 0), (0, 1), (1, 1), (1, -1)] # direction from left to right
    		for i in range(len(self.count)):
    			for j in range(len(self.count[0])):
    				self.count[i][j] = 0
    				
    		board[y][x] = mine
    		self.evaluatePoint(board, x, y, mine, opponent, self.count[mine-1])
    		mine_count = self.count[mine-1]
    		board[y][x] = opponent
    		self.evaluatePoint(board, x, y, opponent, mine, self.count[opponent-1])
    		opponent_count = self.count[opponent-1]
    		board[y][x] = 0
    
    		mscore = self.getPointScore(mine_count)
    		oscore = self.getPointScore(opponent_count)
    
    		return (mscore, oscore)
    
    	# check if has a none empty position in it's radius range
    	def hasNeighbor(self, board, x, y, radius):
    		start_x, end_x = (x - radius), (x + radius)
    		start_y, end_y = (y - radius), (y + radius)
    
    		for i in range(start_y, end_y+1):
    			for j in range(start_x, end_x+1):
    				if i >= 0 and i < self.len and j >= 0 and j < self.len:
    					if board[i][j] != 0:
    						return True
    		return False
    
    	# get all positions near chess
    	def genmove(self, board, turn):
    		fives = []
    		mfours, ofours = [], []
    		msfours, osfours = [], []
    		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
    			mine = 1
    			opponent = 2
    		else:
    			mine = 2
    			opponent = 1
    
    		moves = []
    		radius = 1
    
    		for y in range(self.len):
    			for x in range(self.len):
    				if board[y][x] == 0 and self.hasNeighbor(board, x, y, radius):
    					mscore, oscore = self.evaluatePointScore(board, x, y, mine, opponent)
    					point = (max(mscore, oscore), x, y)
    
    					if mscore >= SCORE_FIVE or oscore >= SCORE_FIVE:
    						fives.append(point)
    					elif mscore >= SCORE_FOUR:
    						mfours.append(point)
    					elif oscore >= SCORE_FOUR:
    						ofours.append(point)
    					elif mscore >= SCORE_SFOUR:
    						msfours.append(point)
    					elif oscore >= SCORE_SFOUR:
    						osfours.append(point)
    
    					moves.append(point)
    
    		if len(fives) > 0: return fives
    
    		if len(mfours) > 0: return mfours
    
    		if len(ofours) > 0:
    			if len(msfours) == 0:
    				return ofours
    			else:
    				return ofours + msfours
    
    		moves.sort(reverse=True)
    
    		# FIXME: decrease think time: only consider limited moves with higher scores
    		if self.maxdepth > 2 and len(moves) > AI_LIMITED_MOVE_NUM:
    			moves = moves[:AI_LIMITED_MOVE_NUM]
    		return moves
    	
    	def __search(self, board, turn, depth, alpha = SCORE_MIN, beta = SCORE_MAX):
    		score = self.evaluate(board, turn)
    		if depth <= 0 or abs(score) >= SCORE_FIVE: 
    			return score
    
    		moves = self.genmove(board, turn)
    		bestmove = None
    		self.alpha += len(moves)
    		
    		# if there are no moves, just return the score
    		if len(moves) == 0:
    			return score
    
    		for _, x, y in moves:
    			board[y][x] = turn
    			
    			if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
    				op_turn = MAP_ENTRY_TYPE.MAP_PLAYER_TWO
    			else:
    				op_turn = MAP_ENTRY_TYPE.MAP_PLAYER_ONE
    
    			score = - self.__search(board, op_turn, depth - 1, -beta, -alpha)
    
    			board[y][x] = 0
    			self.belta += 1
    
    			# alpha/beta pruning
    			if score > alpha:
    				alpha = score
    				bestmove = (x, y)
    				if alpha >= beta:
    					break
    
    		if depth == self.maxdepth and bestmove:
    			self.bestmove = bestmove
    				
    		return alpha
    
    	def search(self, board, turn, depth = 4):
    		self.maxdepth = depth
    		self.bestmove = None
    		score = self.__search(board, turn, depth)
    		x, y = self.bestmove
    		return score, x, y
    		
    	def findBestChess(self, board, turn):
    		time1 = time.time()
    		self.alpha = 0
    		self.belta = 0
    		score, x, y = self.search(board, turn, AI_SEARCH_DEPTH)
    		time2 = time.time()
    		print('time[%.2f] (%d, %d), score[%d] alpha[%d] belta[%d]' % ((time2-time1), x, y, score, self.alpha, self.belta))
    		return (x, y)
    	
    	def getPointScore(self, count):
    		score = 0
    		if count[FIVE] > 0:
    			return SCORE_FIVE
    
    		if count[FOUR] > 0:
    			return SCORE_FOUR
    		
    		# FIXME: the score of one chong four and no live three should be low, set it to live three
    		if count[SFOUR] > 1:
    			score += count[SFOUR] * SCORE_SFOUR
    		elif count[SFOUR] > 0 and count[THREE] > 0:
    			score += count[SFOUR] * SCORE_SFOUR
    		elif count[SFOUR] > 0:
    			score += SCORE_THREE 
    
    		if count[THREE] > 1:
    			score += 5 * SCORE_THREE
    		elif count[THREE] > 0:
    			score += SCORE_THREE
    
    		if count[STHREE] > 0:
    			score += count[STHREE] * SCORE_STHREE
    		if count[TWO] > 0:
    			score += count[TWO] * SCORE_TWO
    		if count[STWO] > 0:
    			score += count[STWO] * SCORE_STWO
    
    		return score
    
    	# calculate score, FIXME: May Be Improved
    	def getScore(self, mine_count, opponent_count):
    		mscore, oscore = 0, 0
    		if mine_count[FIVE] > 0:
    			return (SCORE_FIVE, 0)
    		if opponent_count[FIVE] > 0:
    			return (0, SCORE_FIVE)
    				
    		if mine_count[SFOUR] >= 2:
    			mine_count[FOUR] += 1
    		if opponent_count[SFOUR] >= 2:
    			opponent_count[FOUR] += 1
    				
    		if mine_count[FOUR] > 0:
    			return (9050, 0)
    		if mine_count[SFOUR] > 0:
    			return (9040, 0)
    			
    		if opponent_count[FOUR] > 0:
    			return (0, 9030)
    		if opponent_count[SFOUR] > 0 and opponent_count[THREE] > 0:
    			return (0, 9020)
    			
    		if mine_count[THREE] > 0 and opponent_count[SFOUR] == 0:
    			return (9010, 0)
    			
    		if (opponent_count[THREE] > 1 and mine_count[THREE] == 0 and mine_count[STHREE] == 0):
    			return (0, 9000)
    
    		if opponent_count[SFOUR] > 0:
    			oscore += 400
    
    		if mine_count[THREE] > 1:
    			mscore += 500
    		elif mine_count[THREE] > 0:
    			mscore += 100
    			
    		if opponent_count[THREE] > 1:
    			oscore += 2000
    		elif opponent_count[THREE] > 0:
    			oscore += 400
    
    		if mine_count[STHREE] > 0:
    			mscore += mine_count[STHREE] * 10
    		if opponent_count[STHREE] > 0:
    			oscore += opponent_count[STHREE] * 10
    			
    		if mine_count[TWO] > 0:
    			mscore += mine_count[TWO] * 6
    		if opponent_count[TWO] > 0:
    			oscore += opponent_count[TWO] * 6
    				
    		if mine_count[STWO] > 0:
    			mscore += mine_count[STWO] * 2
    		if opponent_count[STWO] > 0:
    			oscore += opponent_count[STWO] * 2
    		
    		return (mscore, oscore)
    
    	def evaluate(self, board, turn, checkWin=False):
    		self.reset()
    		
    		if turn == MAP_ENTRY_TYPE.MAP_PLAYER_ONE:
    			mine = 1
    			opponent = 2
    		else:
    			mine = 2
    			opponent = 1
    		
    		for y in range(self.len):
    			for x in range(self.len):
    				if board[y][x] == mine:
    					self.evaluatePoint(board, x, y, mine, opponent)
    				elif board[y][x] == opponent:
    					self.evaluatePoint(board, x, y, opponent, mine)
    		
    		mine_count = self.count[mine-1]
    		opponent_count = self.count[opponent-1]
    		if checkWin:
    			return mine_count[FIVE] > 0
    		else:	
    			mscore, oscore = self.getScore(mine_count, opponent_count)
    			return (mscore - oscore)
    	
    	def evaluatePoint(self, board, x, y, mine, opponent, count=None):
    		dir_offset = [(1, 0), (0, 1), (1, 1), (1, -1)] # direction from left to right
    		ignore_record = True
    		if count is None:
    			count = self.count[mine-1]
    			ignore_record = False
    		for i in range(4):
    			if self.record[y][x][i] == 0 or ignore_record:
    				self.analysisLine(board, x, y, i, dir_offset[i], mine, opponent, count)
    
    	
    	# line is fixed len 9: XXXXMXXXX
    	def getLine(self, board, x, y, dir_offset, mine, opponent):
    		line = [0 for i in range(9)]
    		
    		tmp_x = x + (-5 * dir_offset[0])
    		tmp_y = y + (-5 * dir_offset[1])
    		for i in range(9):
    			tmp_x += dir_offset[0]
    			tmp_y += dir_offset[1]
    			if (tmp_x < 0 or tmp_x >= self.len or 
    				tmp_y < 0 or tmp_y >= self.len):
    				line[i] = opponent # set out of range as opponent chess
    			else:
    				line[i] = board[tmp_y][tmp_x]
    						
    		return line
    		
    	def analysisLine(self, board, x, y, dir_index, dir, mine, opponent, count):
    		# record line range[left, right] as analysized
    		def setRecord(self, x, y, left, right, dir_index, dir_offset):
    			tmp_x = x + (-5 + left) * dir_offset[0]
    			tmp_y = y + (-5 + left) * dir_offset[1]
    			for i in range(left, right+1):
    				tmp_x += dir_offset[0]
    				tmp_y += dir_offset[1]
    				self.record[tmp_y][tmp_x][dir_index] = 1
    	
    		empty = MAP_ENTRY_TYPE.MAP_EMPTY.value
    		left_idx, right_idx = 4, 4
    		
    		line = self.getLine(board, x, y, dir, mine, opponent)
    
    		while right_idx < 8:
    			if line[right_idx+1] != mine:
    				break
    			right_idx += 1
    		while left_idx > 0:
    			if line[left_idx-1] != mine:
    				break
    			left_idx -= 1
    		
    		left_range, right_range = left_idx, right_idx
    		while right_range < 8:
    			if line[right_range+1] == opponent:
    				break
    			right_range += 1
    		while left_range > 0:
    			if line[left_range-1] == opponent:
    				break
    			left_range -= 1
    		
    		chess_range = right_range - left_range + 1
    		if chess_range < 5:
    			setRecord(self, x, y, left_range, right_range, dir_index, dir)
    			return CHESS_TYPE.NONE
    		
    		setRecord(self, x, y, left_idx, right_idx, dir_index, dir)
    		
    		m_range = right_idx - left_idx + 1
    		
    		# M:mine chess, P:opponent chess or out of range, X: empty
    		if m_range >= 5:
    			count[FIVE] += 1
    		
    		# Live Four : XMMMMX 
    		# Chong Four : XMMMMP, PMMMMX
    		if m_range == 4:
    			left_empty = right_empty = False
    			if line[left_idx-1] == empty:
    				left_empty = True			
    			if line[right_idx+1] == empty:
    				right_empty = True
    			if left_empty and right_empty:
    				count[FOUR] += 1
    			elif left_empty or right_empty:
    				count[SFOUR] += 1
    		
    		# Chong Four : MXMMM, MMMXM, the two types can both exist
    		# Live Three : XMMMXX, XXMMMX
    		# Sleep Three : PMMMX, XMMMP, PXMMMXP
    		if m_range == 3:
    			left_empty = right_empty = False
    			left_four = right_four = False
    			if line[left_idx-1] == empty:
    				if line[left_idx-2] == mine: # MXMMM
    					setRecord(self, x, y, left_idx-2, left_idx-1, dir_index, dir)
    					count[SFOUR] += 1
    					left_four = True
    				left_empty = True
    				
    			if line[right_idx+1] == empty:
    				if line[right_idx+2] == mine: # MMMXM
    					setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
    					count[SFOUR] += 1
    					right_four = True 
    				right_empty = True
    			
    			if left_four or right_four:
    				pass
    			elif left_empty and right_empty:
    				if chess_range > 5: # XMMMXX, XXMMMX
    					count[THREE] += 1
    				else: # PXMMMXP
    					count[STHREE] += 1
    			elif left_empty or right_empty: # PMMMX, XMMMP
    				count[STHREE] += 1
    		
    		# Chong Four: MMXMM, only check right direction
    		# Live Three: XMXMMX, XMMXMX the two types can both exist
    		# Sleep Three: PMXMMX, XMXMMP, PMMXMX, XMMXMP
    		# Live Two: XMMX
    		# Sleep Two: PMMX, XMMP
    		if m_range == 2:
    			left_empty = right_empty = False
    			left_three = right_three = False
    			if line[left_idx-1] == empty:
    				if line[left_idx-2] == mine:
    					setRecord(self, x, y, left_idx-2, left_idx-1, dir_index, dir)
    					if line[left_idx-3] == empty:
    						if line[right_idx+1] == empty: # XMXMMX
    							count[THREE] += 1
    						else: # XMXMMP
    							count[STHREE] += 1
    						left_three = True
    					elif line[left_idx-3] == opponent: # PMXMMX
    						if line[right_idx+1] == empty:
    							count[STHREE] += 1
    							left_three = True
    						
    				left_empty = True
    				
    			if line[right_idx+1] == empty:
    				if line[right_idx+2] == mine:
    					if line[right_idx+3] == mine:  # MMXMM
    						setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
    						count[SFOUR] += 1
    						right_three = True
    					elif line[right_idx+3] == empty:
    						#setRecord(self, x, y, right_idx+1, right_idx+2, dir_index, dir)
    						if left_empty:  # XMMXMX
    							count[THREE] += 1
    						else:  # PMMXMX
    							count[STHREE] += 1
    						right_three = True
    					elif left_empty: # XMMXMP
    						count[STHREE] += 1
    						right_three = True
    						
    				right_empty = True
    			
    			if left_three or right_three:
    				pass
    			elif left_empty and right_empty: # XMMX
    				count[TWO] += 1
    			elif left_empty or right_empty: # PMMX, XMMP
    				count[STWO] += 1
    		
    		# Live Two: XMXMX, XMXXMX only check right direction
    		# Sleep Two: PMXMX, XMXMP
    		if m_range == 1:
    			left_empty = right_empty = False
    			if line[left_idx-1] == empty:
    				if line[left_idx-2] == mine:
    					if line[left_idx-3] == empty:
    						if line[right_idx+1] == opponent: # XMXMP
    							count[STWO] += 1
    				left_empty = True
    
    			if line[right_idx+1] == empty:
    				if line[right_idx+2] == mine:
    					if line[right_idx+3] == empty:
    						if left_empty: # XMXMX
    							#setRecord(self, x, y, left_idx, right_idx+2, dir_index, dir)
    							count[TWO] += 1
    						else: # PMXMX
    							count[STWO] += 1
    				elif line[right_idx+2] == empty:
    					if line[right_idx+3] == mine and line[right_idx+4] == empty: # XMXXMX
    						count[TWO] += 1
    						
    		return CHESS_TYPE.NONE
    
  • Why heuristic search is needed: alpha-beta pruning optimizes over the generated sequence of empty points and depends on the order of that sequence. If the minimal alpha value is only met last and the other children are nearly in reverse order, alpha pruning has essentially no effect. The principle: sort the sequence of empty points by priority. Kill moves are sorted first, then...
  • I am a student and wrote a human-vs-computer Gomoku game myself. The AI is decent.
  • Design an interactive application: the user left-clicks on the board to place a stone, then the Gomoku AI analyzes the position and plays where it thinks best; the two sides alternate until one wins or the game is drawn. In analyzing the problem, we assume the graphical user interface is already finished and supports...
  • Heuristic search in game trees

    2022-04-26 20:31:36
    Chinese name: 博弈树启发式搜索; discipline: computer science; applicable field: computer game playing. Capture heuristic: the capture heuristic (Capturing Heuristic), also called the static heuristic (Static Heuristic), applies only to capturing moves. It orders captures via MVV/LVA (Most Valuable ...
  • Requirements: implement an interactive Gomoku program using a game-tree algorithm. (Figure 1: Gomoku example.) Notes: either minimax search or the alpha-beta pruning algorithm may be used, pick one; an interactive UI is required, refer to the given C++ interface; any programming language is allowed.
  • Python Gomoku AI code

    2019-12-10 07:55:15
    Gomoku AI program written with python pygame; the AI increases the search depth using minimax search, alpha-beta pruning, heuristic evaluation, and other methods.
  • Minimax algorithms and machine learning techniques... By analyzing the win percentage when our version plays against online opponents with different search depths, we found that the heuristic based on the minimax algorithm is perfect in the early stage of a zero-sum game. Because some nodes of the game tree, for the minimax...
  • An online Gomoku program written in javascript, mainly using minimax search, Alpha Beta pruning, a heuristic evaluation function, a Zobrist cache, iterative deepening, and other algorithms.
  • On top of classic minimax search and alpha-beta pruning, this adds duplicate-position detection and heuristic optimizations, searching only the few most "promising" candidate moves at each step to reduce the search volume, then adds the playing strategies of professional Gomoku players and an improved threat-space search algorithm, making the computer's search process more like...
  • 五子棋.zip

    2019-08-08 18:09:12
    A Gomoku board written in Java, supporting human-vs-computer and computer-vs-computer play; the computer uses the minimax strategy, pruning, heuristic search, and more.
  • Gomoku is a two-player pure-strategy board game, one of the traditional black-and-white board games originating in ancient China. Modern Gomoku is called 连珠 ("Renju") in Japanese; in English it is called "Gobang" or "FIR" (short for Five in a Row), and it is also known as...
  • Intelligent Gomoku design based on game-tree search algorithms. 0. Introduction: search is indispensable to intelligent behavior and is a fundamental problem in artificial intelligence, because AI mainly studies problem domains that lack mature, ready-made methods and must be solved step by step by search. In a game, how to find the...
  • In these parts you can find the most classic search algorithms and some basic heuristic functions. For the detailed ideas, see the ReadMe.pdf file. Don't know what Sokoban is? Don't worry, just google it. I am not sure whether it works; if you run into any problems, contact me...
  • Artificial intelligence Gomoku

    2016-12-27 16:15:52
    AI Gomoku using game trees, heuristic search, and alpha-beta pruning, based on swift3.
  • Gomoku AI algorithm (3)

    2020-05-06 00:26:35
    Gomoku AI algorithm, chapter 3: alpha-beta pruning and heuristic search.
  • Python Gomoku AI implementation (3): alpha-beta pruning and heuristic evaluation. Contents: the alpha-beta pruning algorithm, heuristic evaluation, implementation, complete code.
  • Designing Gomoku with minimax search

    2018-07-26 11:49:30
    A heuristic function evaluates the overall board position and serves as the basis for minimax search. 1. Introduction. 1.1 Problem statement: this assignment requires designing a Gomoku AI that can play against humans and has a good interactive user interface. 1.2 Background: Gomoku is...
  • Gomoku AI algorithm, part 2: the minimax search algorithm

    2016-02-02 18:17:02
    The basic idea of the AI: minimax search. Gomoku seems to allow all kinds of moves, but if you expand the possibilities move by move, you get a huge game tree. In this tree, with the root at level 0, odd levels represent the computer's possible moves and even levels the player's possible moves. Suppose...
