Recommended classic books on NLP sentiment analysis
A simple tutorial to analyse the sentiment of a book in Python
Getting Started
In this tutorial, I will show you how to apply sentiment analysis to the text of a book through an unsupervised learning (UL) technique based on the AFINN lexicon. This tutorial uses the afinn Python package, which is available only for English and Danish. If your text is written in a different language, you could first translate it into English and then use the afinn package.
This notebook applies sentiment analysis to Saint Augustine's Confessions, which can be downloaded from the Project Gutenberg page. The masterpiece is split into 13 books (or chapters). We have stored each book in a separate file named number.txt (e.g. 1.txt and 2.txt). Each line of every file contains just one sentence.
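The tutorial does not say how the one-sentence-per-line files were produced. As a minimal sketch, assuming a naive punctuation-based splitter is good enough, the preparation step could look like this. The function name split_sentences and the sample text are hypothetical, not taken from the original repository.

```python
import re


def split_sentences(text):
    """Naively split text into sentences at '.', '!' or '?' followed by whitespace."""
    # Collapse line breaks and repeated spaces so each sentence fits on one line.
    flat = " ".join(text.split())
    # Split after sentence-ending punctuation. Abbreviations such as "St."
    # will be split incorrectly; a dedicated tokenizer would handle them better.
    parts = re.split(r"(?<=[.!?])\s+", flat)
    return [part for part in parts if part]


# Hypothetical sample text, used only to show the output shape.
sample = "This is the first sentence.\nIs this\nthe second? Yes!"
one_per_line = "\n".join(split_sentences(sample))
print(one_per_line)
```

The resulting string can be written to a file such as 1.txt so that every line holds exactly one sentence.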
You can download the code from my GitHub repository: https://github.com/alod83/papers/tree/master/aiucd2021
First of all, import the Afinn class from the afinn package.
from afinn import Afinn
Then create a new Afinn object, specifying the language to use.
afinn = Afinn(language='en')
Calculate the Sentiment
Use the score given by Afinn to calculate the sentiment
The afinn object provides a method called score(), which receives a sentence as input and returns a score as output. The score may be positive, negative or zero (neutral). To describe the sentiment of a book, we score every sentence and count how many sentences fall into each class. We define three variables, pos, neg and neutral, which store respectively the number of positive, negative and neutral sentences of a book.
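To give an intuition for what a lexicon-based scorer does, here is a toy sketch that sums the valence of every known word in a sentence. It is not the afinn implementation: the lexicon, its values and the names toy_score and classify are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon for illustration only; these are NOT the AFINN values.
TOY_LEXICON = {"good": 2, "love": 3, "bad": -2, "hate": -3}


def toy_score(sentence, lexicon=TOY_LEXICON):
    """Sum the valence of every word of the sentence found in the lexicon."""
    words = re.findall(r"[a-z']+", sentence.lower())
    # Words missing from the lexicon contribute 0 to the total.
    return sum(lexicon.get(word, 0) for word in words)


def classify(score):
    """Map a numeric score to the three classes used in this tutorial."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in ["I love this good book", "I hate bad weather", "The book is on the table"]:
    print(sentence, "->", classify(toy_score(sentence)))
```

Note that a sentence mixing positive and negative words can sum to zero and would then be counted as neutral under this scheme.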
First, we define three lists, which will hold the sentiment indexes of each book.
pos_index = []
neg_index = []
neutral_index = []
We open the file corresponding to each book through the open() function, read all its lines through file.readlines() and calculate the score of each line.
Then, we can define three indexes to calculate the sentiment of a book: the positive sentiment index (pi), the negative sentiment index (ni) and the neutral sentiment index (nui). The pi of a book is the number of positive sentences in the book divided by the total number of sentences in the book. The ni and nui are calculated in the same way from the negative and neutral sentences.
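for book in range(1, 14):
    # Each book is stored in sources/<number>.txt, one sentence per line.
    with open('sources/' + str(book) + '.txt') as file:
        lines = file.readlines()
    pos = 0
    neg = 0
    neutral = 0
    for line in lines:
        # Classify the sentence by the sign of its score.
        score = int(afinn.score(line))
        if score > 0:
            pos += 1
        elif score < 0:
            neg += 1
        else:
            neutral += 1
    # Fraction of positive, negative and neutral sentences in the book.
    n = len(lines)
    pos_index.append(pos / n)
    neg_index.append(neg / n)
    neutral_index.append(neutral / n)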
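The loop above divides by n, so it would fail on an empty file. As a minimal sketch, the index computation can be isolated in a small helper that works on a list of sentence scores and guards against that case. The name sentiment_indexes is hypothetical, and returning zeros for an empty book is an assumption rather than the author's choice.

```python
def sentiment_indexes(scores):
    """Return (pi, ni, nui): the fractions of positive, negative and neutral scores."""
    n = len(scores)
    if n == 0:
        # Assumption: an empty book gets all-zero indexes instead of raising an error.
        return 0.0, 0.0, 0.0
    pos = sum(1 for score in scores if score > 0)
    neg = sum(1 for score in scores if score < 0)
    neutral = n - pos - neg
    return pos / n, neg / n, neutral / n


# Hypothetical scores of four sentences: two positive, one negative, one neutral.
pi, ni, nui = sentiment_indexes([3, -1, 0, 2])
print(pi, ni, nui)
```

By construction the three indexes of a non-empty book always sum to 1.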
Plot results
Give a graphical representation of the results
Finally, we can plot the results using the matplotlib package.
import matplotlib.pyplot as plt
import numpy as np

X = np.arange(1,14)
plt.plot(X,pos_index,'-.',label='pos')
plt.plot(X,neg_index, '--',label='neg')
plt.plot(X,neutral_index,'-',label='neu')
plt.legend()
plt.xticks(X)
plt.xlabel('Books')
plt.ylabel('Indices')
plt.grid()
plt.savefig('plots/afinn-bsi.png')
plt.show()

Lessons Learned
In this tutorial, I have shown you a simple strategy to calculate the sentiment of the text contained in a book. This can be achieved through the afinn package for Python. You can follow these steps:
- organise the text of the book in chapters and store each chapter in a separate file
- split the sentences of each chapter onto separate lines
- calculate the score of each sentence separately through the score() method provided by the Afinn class
- calculate the positive sentiment index, the negative sentiment index and the neutral sentiment index of each chapter
- plot the results.
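The steps above, except for plotting, can be sketched as one reusable function. This is a sketch under stated assumptions, not the code from the repository: the name analyse_chapters and its parameters are hypothetical, the scorer is passed in as score_fn so the block does not depend on any particular package, and blank lines are skipped, which the original loop does not do.

```python
from pathlib import Path


def analyse_chapters(directory, score_fn, n_chapters):
    """Compute per-chapter sentiment indexes for files 1.txt .. n_chapters.txt."""
    pos_index, neg_index, neutral_index = [], [], []
    for chapter in range(1, n_chapters + 1):
        path = Path(directory) / f"{chapter}.txt"
        text = path.read_text(encoding="utf-8")
        # Assumption: blank lines are not sentences, so they are ignored.
        sentences = [line.strip() for line in text.splitlines() if line.strip()]
        scores = [score_fn(sentence) for sentence in sentences]
        n = len(scores) or 1  # avoid dividing by zero for an empty chapter
        pos = sum(1 for score in scores if score > 0)
        neg = sum(1 for score in scores if score < 0)
        neutral = len(scores) - pos - neg
        pos_index.append(pos / n)
        neg_index.append(neg / n)
        neutral_index.append(neutral / n)
    return pos_index, neg_index, neutral_index
```

With the afinn package installed, one could pass Afinn(language='en').score as score_fn, 'sources' as directory and 13 as n_chapters, then plot the three returned lists as shown earlier.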
If you want to apply Supervised Learning techniques to perform sentiment analysis, stay tuned :).