How can chatbots become truly intelligent by combining five different models of conversation?
Conversational AI is all about making machines communicate with us in natural language. These machines go by various names: chatbots, voice bots, virtual assistants, etc. In reality, they may differ slightly from each other. However, one key feature that ties them all together is their ability to understand natural language commands and requests from us, the human users.
In the back end, these agents have to carry out the request and engage in a conversation. Based on how an agent processes the input natural language (NL) request and maps it to a response, we can classify Conversational AI into five models:
- Interactive FAQ
- Form filling
- Question Answering
- NL interface for databases
- Dialogue Planning
Interactive FAQ
Frequently Asked Questions (FAQ) pages are a common part of business websites, where the questions customers ask most often are listed and answered. Instead of having customers go through the list to find answers, the Interactive FAQ model for chatbots lets users ask questions in their own words, matches each customer question to the list of known questions, and then serves the prepared answer for the matched question. This enables customers to find answers quickly instead of having to go through a long list of questions.
Single-vs-multi turn: in this model, a simple customer query can be answered immediately within a single turn. For other queries, the chatbot may need to ask a few questions to get more information from the user before answering.
Intent-vs-pattern recognition: the question being asked can be identified in several ways. Intent classification is a popular approach. Here, each question for which we know the answer is labelled with an intent name (i.e. what the user intends to say or ask). Each intent is then given a number of example variations of the same question. These examples are fed into a machine learning algorithm that learns to classify a new, unseen question from the user as one of the intents. Once the intent is identified, the answer can be served.
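To make the intent classification idea concrete, here is a minimal sketch. It assumes a tiny multinomial Naive Bayes classifier written in plain Python; the intent names, example questions and answers are all made up for illustration, and a real chatbot platform would use a far stronger model and many more examples per intent.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical intents with example variations of the same question.
TRAINING_EXAMPLES = {
    "opening_hours": [
        "when are you open",
        "what are your opening hours",
        "what time do you close today",
    ],
    "refund_policy": [
        "how do i get a refund",
        "can i return my order for a refund",
        "what is your refund policy",
    ],
}

# Hypothetical prepared answers, one per intent.
ANSWERS = {
    "opening_hours": "We are open 9am to 5pm, Monday to Friday.",
    "refund_policy": "You can request a refund within 30 days of purchase.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each word appears in the examples of each intent."""
    word_counts = defaultdict(Counter)
    for intent, questions in examples.items():
        for question in questions:
            word_counts[intent].update(tokenize(question))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, vocabulary


def classify(question, word_counts, vocabulary):
    """Return the intent with the highest Laplace-smoothed log-likelihood.

    Intents are assumed equally likely, so no prior term is added.
    """
    scores = {}
    for intent, counts in word_counts.items():
        total = sum(counts.values())
        scores[intent] = sum(
            math.log((counts[word] + 1) / (total + len(vocabulary)))
            for word in tokenize(question)
        )
    return max(scores, key=scores.get)


word_counts, vocabulary = train(TRAINING_EXAMPLES)
intent = classify("are you open on saturday", word_counts, vocabulary)
print(intent, "->", ANSWERS[intent])
```

The new question shares no exact wording with the training examples, yet it is still assigned to an intent because of the words it has in common with that intent's examples.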
The other approach has existed since chatbots were born (e.g. ELIZA). The user utterance is matched against pre-defined patterns, and the pre-defined answer for the matching pattern is served. Several tools are available to implement pattern-based conversation management (e.g. Pandorabots).
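Pattern matching can be sketched in a few lines. The example below assumes plain regular expressions rather than the pattern language of any particular tool, and the rules and responses are invented for illustration.

```python
import re

# Hypothetical rules: each pattern is paired with a canned response template.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\bmy name is (\w+)", re.I), "Nice to meet you, {0}."),
    (re.compile(r"\b(opening hours|open)\b", re.I),
     "We are open 9am to 5pm, Monday to Friday."),
]
FALLBACK = "Sorry, I did not understand that."


def respond(utterance):
    """Serve the response of the first rule whose pattern matches the utterance."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Captured groups (such as the name) fill the template's placeholders.
            return template.format(*match.groups())
    return FALLBACK


print(respond("My name is Ada"))
print(respond("When are you open?"))
```

Rules are tried in order, so the more specific patterns should come first if two of them could match the same utterance.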
Recent advances in deep learning can also be used to build seq2seq models, which take a sequence of words as input and output another sequence of words. This approach can be used to build interactive single-turn FAQ models. The Interactive FAQ model of conversational AI suits use cases like FAQ, troubleshooting, small talk, etc.
Form Filling
Form-filling, as the name says, is a model of conversation that involves filling in a form. A user request is mapped to an intent or a pattern that triggers a form, and to fill it in, the chatbot has to ask a number of questions. Once filled, the form can be used to perform either a database search or a database update.
Take a travel agent chatbot, for instance. It will ask a series of questions to fill in fields like source, destination, date of travel, etc. and then run a database search for flights. Once you choose a flight, its details are added to a larger form to make a booking (i.e. a database update). Both the search and the update need information that is gathered by asking questions driven by the form. The downside, however, is that intents need to be created and the conversation needs to be defined meticulously at every step: filling in the form, submitting it and handling the database results.
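The flight search part of this example can be sketched as follows. This is only an illustration under simple assumptions: the form fields, the questions, and the in-memory list standing in for a flights database are all hypothetical, and the user's replies are passed in as a list instead of being read from a live chat.

```python
# Hypothetical form: each field (slot) is paired with the question that fills it.
FLIGHT_SEARCH_FORM = [
    ("source", "Where are you flying from?"),
    ("destination", "Where are you flying to?"),
    ("date", "What date do you want to travel?"),
]

# Made-up flight records standing in for a real database table.
FLIGHTS = [
    {"id": "F1", "source": "London", "destination": "Paris", "date": "2024-05-01"},
    {"id": "F2", "source": "London", "destination": "Rome", "date": "2024-05-01"},
]


def next_question(form, filled):
    """Return (slot, question) for the first unfilled slot, or None if the form is complete."""
    for slot, question in form:
        if slot not in filled:
            return slot, question
    return None


def fill_form(form, replies):
    """Ask the form's questions in order; `replies` stands in for the user's answers."""
    filled = {}
    replies = iter(replies)
    step = next_question(form, filled)
    while step is not None:
        slot, _question = step  # a real bot would send the question to the user here
        filled[slot] = next(replies)
        step = next_question(form, filled)
    return filled


def search_flights(filled, flights):
    """Database search: keep the flights that match every filled field."""
    return [flight["id"] for flight in flights
            if all(flight[slot] == value for slot, value in filled.items())]


form = fill_form(FLIGHT_SEARCH_FORM, ["London", "Paris", "2024-05-01"])
print(search_flights(form, FLIGHTS))
```

Notice that the conversation is driven entirely by the form definition: adding a field means adding one more question to the dialogue.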
Form-filling and FAQ models are currently the most popular, as they take care of the most mundane, repetitive conversations customers tend to engage in. Platforms like IBM Watson, Dialogflow, etc. provide tools to handle these models.
Question Answering
Open-domain question answering is a sub-field of Natural Language Processing research with the objective of understanding user questions in natural language and extracting answers from a large corpus of text. This is a way of reducing the human effort in curating answers to the questions customers ask, since it may be nearly impossible to create an exhaustive list of prepared questions and answers. To address this problem, chatbots should use QA models that can extract answers from a large corpus of text on the fly.
The QA model of conversation can be used where there is a large body of text that customers could query, and where creating an intent and a curated answer for each question-answer pair would be an expensive proposition.
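To give a feel for answering from text without curated question-answer pairs, here is a deliberately simple sketch. It is not a neural QA model: it assumes a word-overlap heuristic that returns the sentence sharing the most content words with the question, and the corpus and stopword list are made up for illustration.

```python
import re

# A small hypothetical stopword list; real systems use much larger ones.
STOPWORDS = {"a", "an", "the", "is", "are", "what", "when", "where", "who", "how",
             "do", "does", "of", "to", "in", "for", "your", "you", "i", "can"}

# Made-up text standing in for a large body of documentation.
CORPUS = (
    "Orders are shipped within two business days. "
    "Refunds are available within 30 days of purchase. "
    "Our support team can be reached by email."
)


def tokenize(text):
    """Lowercase, split into words and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def answer(question, corpus):
    """Return the sentence with the largest word overlap, or None if nothing overlaps."""
    sentences = re.split(r"(?<=[.!?])\s+", corpus.strip())
    question_words = set(tokenize(question))
    best = max(sentences, key=lambda s: len(question_words & set(tokenize(s))))
    return best if question_words & set(tokenize(best)) else None


print(answer("When are orders shipped?", CORPUS))
```

No intent or prepared answer was written for the shipping question; the answer is pulled straight from the text, which is the property that makes this model attractive for large document collections.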
Recent advances in transformer-based models like BERT and GPT-3 have made robust QA models for conversational AI possible. The following is an example of a QA model (from the DeepPavlov.ai toolkit) in action.
NL Database Interfaces
The fourth type of conversational model is one where the user utterance can be mapped directly on to a database query. For instance, let us assume a relational database containing customer transaction data. To let customers interact with this database using natural language, the form-filling model can be used. However, there are many ways to query a relational database, and with the form-filling model you may have to design many conversational forms to fulfil your customers' needs. Instead, if you can translate customer requests in natural language into database queries, you can run each query and respond appropriately without the need to create forms and intents.
Query language: depending on the type of database, the target query language will vary. For relational databases, NL queries may need to be translated into SQL. For graph databases like Neo4j and for RDF triple stores, they may need to be translated into Cypher and SPARQL respectively.
How? There are deep learning approaches, such as Seq2Seq models, that can translate NL queries into a query language. Recently, GPT-3, one of the largest pre-trained language models at the time of writing, has been used to translate NL into SQL queries using few-shot learning.
This model allows customers to ask a wide range of questions about the data in natural language without constraining them to pre-defined forms.
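The overall flow (translate the request, run the query, respond) can be sketched with Python's built-in sqlite3 module. Note the big assumption: the translation step here is a toy rule-based function that understands only two hypothetical request shapes. In a real system this is the piece a Seq2Seq or few-shot language model would replace. The table and its rows are invented for illustration.

```python
import re
import sqlite3


def nl_to_sql(request):
    """Toy translator: map two hypothetical request shapes to (sql, params)."""
    text = request.lower()
    match = re.search(r"total spent by (\w+)", text)
    if match:
        return ("SELECT SUM(amount) FROM transactions WHERE customer = ?",
                (match.group(1),))
    match = re.search(r"transactions over (\d+)", text)
    if match:
        return ("SELECT customer, amount FROM transactions "
                "WHERE amount > ? ORDER BY amount",
                (int(match.group(1)),))
    raise ValueError("request not understood")


def build_demo_db():
    """Create an in-memory database with made-up transaction rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)",
                     [("alice", 120), ("bob", 40), ("alice", 30)])
    return conn


def ask(conn, request):
    """Translate the request, run the query and return the rows."""
    sql, params = nl_to_sql(request)
    return conn.execute(sql, params).fetchall()


conn = build_demo_db()
print(ask(conn, "What is the total spent by Alice?"))
print(ask(conn, "Show transactions over 35"))
```

Values extracted from the request are passed as query parameters instead of being pasted into the SQL string, which matters even more when the query text comes from a learned model.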
Dialogue Planning
The final model in my list is Dialogue Planning. This model uses an AI planning approach to drive the conversation. AI planning is an Artificial Intelligence approach to intelligent problem solving. In a dialogue planning model, we treat conversation as a planning problem with an initial state and a final goal state. The AI planner's task is then to find an optimal sequence of steps from the initial state to the goal state. In a conversation, these steps include asking the customer for answers to specific questions, fetching information from or updating it in a back-end system, etc.
For instance, to book a flight ticket, the agent will come up with a plan to: ask a series of questions (destination, date, etc.), search for flights, summarise them, help the user choose one, ask further questions (passenger name, age, meals, etc.), make a booking and send a confirmation email. In a form-filling model, the above sequence has to be authored by hand, whereas in a planning model only a set of actions needs to be provided. The agent could use the same set of actions to create another sequence to achieve a different goal. By analogy, it is as if the agent is given a number of LEGO bricks that it can put together in various ways to build different things.
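Here is a minimal sketch of that idea. It assumes a very simple planner: states are sets of facts, each action has preconditions and effects, and a breadth-first search finds a shortest sequence of actions from the initial state to the goal. The action names and facts are hypothetical, and real AI planners are considerably more sophisticated.

```python
from collections import deque

# Hypothetical dialogue actions: name -> (preconditions, effects), both sets of facts.
ACTIONS = {
    "ask_destination": (set(), {"destination_known"}),
    "ask_date": (set(), {"date_known"}),
    "search_flights": ({"destination_known", "date_known"}, {"flights_found"}),
    "ask_choice": ({"flights_found"}, {"flight_chosen"}),
    "ask_passenger_details": (set(), {"passenger_known"}),
    "make_booking": ({"flight_chosen", "passenger_known"}, {"booked"}),
    "send_confirmation": ({"booked"}, {"confirmation_sent"}),
}


def plan(initial, goal, actions=ACTIONS):
    """Breadth-first search for a shortest action sequence that reaches the goal."""
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, (preconditions, effects) in actions.items():
            if preconditions <= state:  # the action is applicable here
                next_state = frozenset(state | effects)
                if next_state not in seen:
                    seen.add(next_state)
                    queue.append((next_state, steps + [name]))
    return None  # no sequence of actions reaches the goal


# Same bricks, different goals.
print(plan(set(), {"flights_found"}))
print(plan({"destination_known"}, {"flights_found"}))
print(plan(set(), {"confirmation_sent"}))
```

Nobody scripted the booking conversation here. The planner derives it from the actions, and it skips the destination question when the destination is already known in the initial state.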
Like NL Database Interfaces and QA models, it allows users to define initial and final states using natural language without being constrained by pre-defined conversational pathways. Instead, using AI planning, new pathways are created from a library of planning operators (or dialogue actions). Dialogue planning is still largely an area of research, and the lack of available toolkits makes it hard to implement this model in a production environment.
Furthermore, planning approaches can be combined with deep reinforcement learning to optimize generated plans based on experience and reward from the environment. This will turn them into learning agents as well.
Hybrid Assistants
Truly intelligent conversational agents will need to combine the above models in a meaningful way. Such an assistant will be a hybrid, with the skill to combine various conversational models based on the needs of the customer and on the relative success and cost of each model competing to solve the same problem. Combining these approaches will come with its own set of problems: the need for unified knowledge representation mechanisms, explainability and control, etc. But with problems, solutions will come too.
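One very rough way to picture such a hybrid is a router that lets each model bid for the utterance. The sketch below is only an assumption about how that could look: each skill returns a confidence and a response, each has a made-up cost, and the router picks the best confidence-minus-cost score. The skills themselves are stubs.

```python
def faq_skill(utterance):
    """Hypothetical FAQ skill: confident only when it recognises a known topic."""
    if "refund" in utterance.lower():
        return 0.9, "You can request a refund within 30 days of purchase."
    return 0.1, "Sorry, I do not have a prepared answer for that."


def qa_skill(utterance):
    """Hypothetical open QA skill: moderately confident on any question."""
    return 0.6, "Here is the closest passage I could find in the documentation."


# Skill name -> (handler, cost). The costs are invented for illustration.
SKILLS = {"faq": (faq_skill, 0.0), "open_qa": (qa_skill, 0.2)}


def route(utterance, skills=SKILLS):
    """Return (skill name, response) for the best confidence-minus-cost score."""
    candidates = []
    for name, (handler, cost) in skills.items():
        confidence, response = handler(utterance)
        candidates.append((confidence - cost, name, response))
    _score, name, response = max(candidates)
    return name, response


print(route("How do refunds work?"))
print(route("Tell me about shipping"))
```

A real hybrid would also need the shared knowledge representation and control mechanisms mentioned above; this sketch covers only the selection step.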
While FAQ and form-filling models are particularly popular now, the need for models like Open QA, NL database interfaces and Dialogue Planning is becoming more prominent, as not every conversational pathway can be pre-determined, planned and scripted by human content developers. Developments in NLP and machine/deep learning over recent years (transformers like BERT, GPT-3 and T5, reinforcement learning systems like AlphaGo, etc.) show promising traits and, I believe, will help us achieve our goal of building truly intelligent conversational AI.
Hope you enjoyed this write up. Please do share your comments.