  • Python web scraping and visualization
    2020-12-26 11:08:43

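    I. Approach

    This article scrapes a Bitcoin trading site (https://www.ibtctrade.com/) to collect each coin's price, CNY figure, market capitalization and so on, then exports the collected data to an Excel file and a SQLite database. The visualization site is built with the Flask framework in PyCharm: simple web pages are rendered from the SQLite data, together with line charts, bar charts, scatter plots and other charts.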

    II. Scraping the data

    1. Import the libraries

    The code is as follows:

    from bs4 import BeautifulSoup
    import re
    import urllib.error,urllib.request
    import xlwt
    import sqlite3
    

    2. Fetch the target pages

    The code is as follows:

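    baseURL = 'https://www.ibtctrade.com/cryptocurrency/p_'  # The site spreads its data over 27 pages; appending the page number to this URL gives each page

    def askurl(url):
        # Send a browser-like User-Agent with the request
        head = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'}
        request = urllib.request.Request(url, headers=head)
        html = ""
        try:
            response = urllib.request.urlopen(request)
            html = response.read().decode('utf-8')
        except urllib.error.URLError as e:
            # Report the failure; an empty string is returned for this page
            print(e)
        # print(html)
        return html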
    

    This step requests each page's data over the network by its URL.
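As a minimal sketch of the paging scheme described above, the page URLs can be generated up front and checked before any request is sent. `build_page_urls` is a hypothetical helper that is not part of the original script; the `.html` suffix and the page count of 27 are taken from the code in this article.

```python
BASE_URL = 'https://www.ibtctrade.com/cryptocurrency/p_'


def build_page_urls(base_url, pages=27):
    """Return the listing URLs p_1.html .. p_<pages>.html (hypothetical helper)."""
    return [f"{base_url}{i}.html" for i in range(1, pages + 1)]


urls = build_page_urls(BASE_URL)
print(urls[0])
print(len(urls))
```

Building the list first makes it easy to confirm the first and last page addresses by eye before fetching anything.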

    2. Parse the pages

    The code is as follows:

    findjname = re.compile(r'<strong data-v-2dd1dc90="">(.*?)</strong>')
    findname = re.compile(r'<span data-v-2dd1dc90="">(.*?)</span>')
    findnewprice = re.compile(r'<li data-v-2dd1dc90="">\n(.*?)</li>',re.S)
    findtwofourzhangdie = re.compile(r'<li class=".*" data-v-2dd1dc90="">(.*?)</li>',re.S)
    # findtwofourdie = re.compile(r'<li class="down" data-v-2dd1dc90="">(.*?)</li>',re.S)
    findcny = re.compile(r'<li data-v-2dd1dc90="">\n(.*?)</li>',re.S)
    findshizhi = re.compile(r'<li data-v-2dd1dc90="">\n(.*?)</li>',re.S)
        
    
    def getdata(baseURL):
        datalist = []
        for i in range(1, 28):  # pages 1 to 27
            url = baseURL + str(i) + '.html'
            html = askurl(url)

            soup = BeautifulSoup(html, 'html.parser')
            # Each coin is one <a> element directly under .content
            for item in soup.select('.content>a'):
                data = []
                # print(item)
                item = str(item)
                jname = re.findall(findjname, item)[0]  # short name
                data.append(jname)
                name = re.findall(findname, item)[0]  # full name
                data.append(name)
                # print(data)
                newprice = re.findall(findnewprice, item)[0]  # latest price: first plain <li>
                data.append(newprice.strip())
                twofourzhangdie = re.findall(findtwofourzhangdie, item)[0]  # 24h change
                data.append(twofourzhangdie.strip())
                cny = re.findall(findcny, item)[1]  # second plain <li>
                data.append(cny.strip())
                shizhi = re.findall(findshizhi, item)[2]  # market cap: third plain <li>
                data.append(shizhi.strip())

                datalist.append(data)

        # print(datalist)
        return datalist
    

    Regular expressions are used to filter and clean the data.
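The filtering step can be tried on a small fragment without touching the network. The sketch below reuses two of the patterns from this article; the HTML fragment and its values are invented for illustration and only mimic the structure the patterns expect (a newline directly after each plain `<li>` tag).

```python
import re

# Patterns copied from the article
findjname = re.compile(r'<strong data-v-2dd1dc90="">(.*?)</strong>')
findli = re.compile(r'<li data-v-2dd1dc90="">\n(.*?)</li>', re.S)

# Invented sample fragment, not real site output
sample = (
    '<a><strong data-v-2dd1dc90="">BTC</strong>'
    '<li data-v-2dd1dc90="">\n  $100.5\n</li>'
    '<li class="up" data-v-2dd1dc90="">+1.20%</li>'
    '<li data-v-2dd1dc90="">\n  $2.0\n</li>'
    '<li data-v-2dd1dc90="">\n  $30.0\n</li></a>'
)

jname = findjname.findall(sample)[0]
# The plain <li> cells come back in document order; strip() removes the padding
cells = [c.strip() for c in findli.findall(sample)]
print(jname, cells)
```

Because the same pattern matches all three plain `<li>` cells, the original code picks the fields apart by index (`[0]`, `[1]`, `[2]`) rather than by pattern.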

    3. Save the data to Excel

    The code is as follows:

    path = "比特币简易数据.xls"
    dbpath = "比特币.db"
    # askurl(baseURL)
    def savedata(datalist, path):
        print('Saving...')
        book = xlwt.Workbook(encoding='utf-8', style_compression=0)
        sheet = book.add_sheet('Bitcoin data', cell_overwrite_ok=True)
        col = ('Short name', 'Full name', 'Latest price', '24h change', '24h volume', 'Market cap')
        # Header row
        for i in range(0, 6):
            sheet.write(0, i, col[i])
        # Write at most 700 rows, and never more than were actually scraped
        for i in range(0, min(700, len(datalist))):
            data = datalist[i]
            for j in range(0, 6):
                sheet.write(i + 1, j, data[j])
        book.save(path)
    

    4. Save the data to a SQLite database

    The code is as follows:

    path = "比特币简易数据.xls"
    dbpath = "比特币.db"
    # askurl(baseURL)
    def savedb(datalist, dbpath):
        init_db(dbpath)
        conn = sqlite3.connect(dbpath)
        cur = conn.cursor()
        # Use ? placeholders so sqlite3 handles the quoting of each value
        sql = """
            insert into bitebi750
            (jname, name, newprice, twofourzhangdie, cny, shizhi)
            values (?, ?, ?, ?, ?, ?)"""
        for data in datalist:
            cur.execute(sql, data)
        conn.commit()
        cur.close()
        conn.close()
    
    
    
    
    def init_db(dbpath):
        sql = '''
            create table bitebi750
                (id integer primary key autoincrement,
                    jname text, 
                    name text,
                    newprice text,
                    twofourzhangdie text,
                    cny text,
                    shizhi text)
                
        
        
        '''
        conn =sqlite3.connect(dbpath)
        cursor =conn.cursor()
        cursor.execute(sql)
        conn.commit()
        conn.close()
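As a small, self-contained sketch of the storage step, the example below creates the same `bitebi750` table in an in-memory database and inserts rows with `?` placeholders. The two sample rows are invented; the second one contains double quotes to show that placeholder binding stores such values unchanged, which hand-built SQL strings would not handle.

```python
import sqlite3

# Invented sample rows in the same column order as the article's data lists
rows = [
    ['BTC', 'Bitcoin', '$100.5', '+1.20%', '$2.0', '$30.0'],
    ['O"K', 'Quote "test" coin', '$1.0', '-0.50%', '$0.1', '$0.2'],
]

conn = sqlite3.connect(':memory:')  # in-memory database, nothing is written to disk
cur = conn.cursor()
cur.execute('''
    create table bitebi750
        (id integer primary key autoincrement,
         jname text, name text, newprice text,
         twofourzhangdie text, cny text, shizhi text)
''')
cur.executemany(
    'insert into bitebi750 (jname, name, newprice, twofourzhangdie, cny, shizhi) '
    'values (?, ?, ?, ?, ?, ?)',
    rows,
)
conn.commit()
stored = cur.execute('select jname, name from bitebi750 order by id').fetchall()
conn.close()
print(stored)
```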
    
    

    III. Visualization with the Flask framework

    app.py

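    app.py reads and processes the data from the SQLite database and passes the resulting values as parameters to the templates that draw the charts. The HTML of each page is too long to post in full; the pages can be adapted from index.html as needed.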

    from flask import Flask,render_template
    import sqlite3
    app = Flask(__name__)
    
    
    @app.route('/')
    def index():
        return render_template('index.html')
    @app.route('/shuju')
    def e():
        datalist = []
        con = sqlite3.connect("比特币.db")
        cur = con.cursor()
        sql = "select*from bitebi750"
        data = cur.execute(sql)
        for item in data:
            datalist.append(item)
        cur.close()
        con.close()
        return render_template('shuju.html',movies = datalist)
    
    @app.route('/zhangdie')
    def zhangdie():
        num = []
        sum = []
        con = sqlite3.connect("比特币.db")
        cur = con.cursor()
        sql = "select jname,twofourzhangdie from bitebi750 limit 0,70"
        data = cur.execute(sql)
        for item in data:
            num.append(str(item[0]))
            sum.append(float(item[1][:-1]))
        cur.close()
        con.close()
        return render_template("zhangdie.html",num = num ,sum = sum)
    @app.route('/wordcloud')
    def wordcloud():
        return render_template('wordcloud.html')
    
    
    @app.route('/qujian')
    def qujian():
        num = []
        sum = []
        con = sqlite3.connect("比特币.db")
        cur = con.cursor()
        sql = "select jname,newprice from bitebi750 limit 0,15"
        data = cur.execute(sql)
        for item in data:
            num.append(str(item[0]))
            sum.append(float(item[1][1:]))
        cur.close()
        con.close()
    
        return render_template('qujian.html',num = num ,sum = sum)
    
    @app.route('/sandian')
    def sandian():
        num = []
        sum = []
        yum = []
        con = sqlite3.connect("比特币.db")
        cur = con.cursor()
        sql = "select jname,twofourzhangdie,shizhi from bitebi750 limit 0,50"
        data = cur.execute(sql)
        for item in data:
            num.append(str(item[0]))
            sum.append(float(item[1][:-1]))
            yum.append(float(item[2][1:-1]))
        cur.close()
        con.close()
        return render_template('sandian.html',num = num ,sum = sum ,yum =yum)
    @app.route('/shuliang')
    def shuliang():
        # Counters for the seven market-cap ranges shown on the bar chart
        q = 0
        w = 0
        e = 0
        r = 0
        t = 0
        y = 0
        u = 0

        con = sqlite3.connect("比特币.db")
        cur = con.cursor()
        # First 204 rows: [1:-1] drops the leading symbol and the trailing unit character
        sum = []
        sql = "select jname,shizhi from bitebi750 limit 0,204"
        data = cur.execute(sql)
        for item in data:
            sum.append(float(item[1][1:-1]))
        for i in sum:
            if i>500 and i<1000:
                q += 1
            elif i>100 and i<500:
                w+=1
            elif i>1 and i<100:
                e+=1
        # Remaining rows: collect them in a fresh list and count each value once
        sum = []
        sql = "select jname,shizhi from bitebi750 limit 204,700"
        data = cur.execute(sql)
        for item in data:
            sum.append(float(item[1][1:-1]))
        for i in sum:
            if i>100 and i<=1000:
                r+=1
            elif i>1000 and i<9999:
                y+=1
            elif i > 1 and i < 10:
                t+=1
            elif i > 10 and i < 100:
                u+=1

        cur.close()
        con.close()
        return render_template("shuliang.html",q=q,w=w,e=e,r=r,t=t,y=y,u=u)
    if __name__ == '__main__':
        app.run()
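The routes above turn the stored text into numbers by slicing (`[:-1]` for percentages, `[1:-1]` for amounts) and then count values per range. The sketch below isolates those two steps; the helper names, the sample strings and the `Y` unit character are hypothetical and only illustrate the slicing, and the ranges use strict comparisons on both ends.

```python
def parse_percent(text):
    """'+1.20%' -> 1.2: drop the trailing percent sign, like item[1][:-1]."""
    return float(text[:-1])


def parse_amount(text):
    """Drop one leading symbol and one trailing unit character, like item[1][1:-1]."""
    return float(text[1:-1])


def count_buckets(values, bounds):
    """Count how many values fall strictly inside each (low, high) range."""
    counts = [0] * len(bounds)
    for v in values:
        for k, (low, high) in enumerate(bounds):
            if low < v < high:
                counts[k] += 1
                break  # each value is counted at most once
    return counts


# Invented sample strings: symbol + number + one unit character
amounts = [parse_amount(s) for s in ['$750.0Y', '$120.5Y', '$3.2Y', '$99.0Y']]
counts = count_buckets(amounts, [(500, 1000), (100, 500), (1, 100)])
print(counts)
```

Note that a value sitting exactly on a boundary (for example 500) falls into no range with strict comparisons, which is also how the conditions in `shuliang()` behave for most boundaries.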
    
    
    

    index.html

    Mamba Bootstrap Template - Index
    <!-- ======= About Us Section ======= -->
    <!-- End About Us Section -->
    
    <!-- ======= About Lists Section ======= -->
    
    
    <!-- ======= Counts Section ======= -->
    
    
    <!-- ======= Services Section ======= -->
    <section id="services" class="services">
      <div class="container">
    
        <div class="section-title">
          <h2>Services</h2>
        </div>
    
        <div class="row">
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up">
              <div class="icon"><a href="/shuju"><i class="icofont-computer"></i></a></div>
            <h4 class="title"><a href="/shuju">Data overview</a></h4>
            <p class="description">741 records collected for analysis</p>
          </div>
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up" data-aos-delay="100">
              <div class="icon"><a href="/zhangdie"><i class="icofont-chart-bar-graph"></i></a></div>
            <h4 class="title"><a href="/zhangdie">Price change by coin</a></h4>
            <p class="description">Follow the policy and never look back</p>
          </div>
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up" data-aos-delay="200">
            <div class="icon"><a href="/shuliang"><i class="icofont-earth"></i></a></div>
            <h4 class="title"><a href="/shuliang">Number of coins per market-cap range</a></h4>
            <p class="description">Surely a normal distribution again</p>
          </div>
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up" data-aos-delay="300">
              <div class="icon"><a href="/qujian"><i class="icofont-image"></i></a></div>
            <h4 class="title"><a href="/qujian">Most competitive coins</a></h4>
            <p class="description">See which one comes out on top</p>
          </div>
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up" data-aos-delay="400">
            <div class="icon"><a href="/sandian"><i class="icofont-settings"></i></a></div>
            <h4 class="title"><a href="/sandian">Market cap vs. price change for popular coins</a></h4>
            <p class="description">If it's popular, it's bound to go up</p>
          </div>
          <div class="col-lg-4 col-md-6 icon-box" data-aos="fade-up" data-aos-delay="500">
            <div class="icon"><a href="/wordcloud"><i class="icofont-tasks-alt"></i></a></div>
            <h4 class="title"><a href="/wordcloud">Coin name word cloud</a></h4>
            <p class="description">Guess which word is the biggest</p>
          </div>
        </div>
    
      </div>
    </section><!-- End Services Section -->
    
    <!-- ======= Our Portfolio Section ======= -->
    <section id="portfolio" class="portfolio section-bg">
      <div class="container" data-aos="fade-up" data-aos-delay="100">
    
        <div class="section-title">
          <h2>Our Portfolio</h2>
          <p>Magnam dolores commodi suscipit. Necessitatibus eius consequatur ex aliquid fuga eum quidem. Sit sint consectetur velit. Quisquam quos quisquam cupiditate. Et nemo qui impedit suscipit alias ea. Quia fugiat sit in iste officiis commodi quidem hic quas.</p>
        </div>
    
        <div class="row">
          <div class="col-lg-12">
            <ul id="portfolio-flters">
              <li data-filter="*" class="filter-active">All</li>
              <li data-filter=".filter-app">App</li>
              <li data-filter=".filter-card">Card</li>
              <li data-filter=".filter-web">Web</li>
            </ul>
          </div>
        </div>
    
        <div class="row portfolio-container">
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-app">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/1.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>App 1</h4>
                <p>App</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/1.jpg" data-gall="portfolioGallery" class="venobox" title="App 1"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-web">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/2.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Web 3</h4>
                <p>Web</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/2.jpg" data-gall="portfolioGallery" class="venobox" title="Web 3"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-app">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/3.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>App 2</h4>
                <p>App</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/3.jpg" data-gall="portfolioGallery" class="venobox" title="App 2"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-card">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/4.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Card 2</h4>
                <p>Card</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/4.jpg" data-gall="portfolioGallery" class="venobox" title="Card 2"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-web">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/5.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Web 2</h4>
                <p>Web</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/5.jpg" data-gall="portfolioGallery" class="venobox" title="Web 2"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-app">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/6.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>App 3</h4>
                <p>App</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/6.jpg" data-gall="portfolioGallery" class="venobox" title="App 3"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-card">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/7.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Card 1</h4>
                <p>Card</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/7.jpg" data-gall="portfolioGallery" class="venobox" title="Card 1"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-card">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/8.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Card 3</h4>
                <p>Card</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/8.jpg" data-gall="portfolioGallery" class="venobox" title="Card 3"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
          <div class="col-lg-4 col-md-6 portfolio-item filter-web">
            <div class="portfolio-wrap">
              <img src="static/assets/img/portfolio/9.jpg" class="img-fluid" alt="">
              <div class="portfolio-info">
                <h4>Web 3</h4>
                <p>Web</p>
                <div class="portfolio-links">
                  <a href="static/assets/img/portfolio/9.jpg" data-gall="portfolioGallery" class="venobox" title="Web 3"><i class="icofont-eye"></i></a>
                  <a href="#" title="More Details"><i class="icofont-external-link"></i></a>
                </div>
              </div>
            </div>
          </div>
    
        </div>
    
      </div>
    </section><!-- End Our Portfolio Section -->
    
    <!-- ======= Our Team Section ======= -->
    <section id="team" class="team">
      <div class="container">
    
        <div class="section-title">
          <h2>Our Team</h2>
          <p>Magnam dolores commodi suscipit. Necessitatibus eius consequatur ex aliquid fuga eum quidem.</p>
        </div>
    
        <div class="row">
    
    
    
          <div class="col-xl-3 col-lg-4 col-md-6" data-aos="fade-up" data-aos-delay="200">
            <div class="member">
              <div class="pic"><img src="static/assets/img/team/team-3.jpg" class="img-fluid" alt=""></div>
              <div class="member-info">
                <h4>xiangbo zhu</h4>
                <span>Team leader</span>
    
              </div>
            </div>
          </div>
    
          <div class="col-xl-3 col-lg-4 col-md-6" data-aos="fade-up" data-aos-delay="300">
            <div class="member">
              <div class="pic"><img src="static/assets/img/team/team-4.jpg" class="img-fluid" alt=""></div>
              <div class="member-info">
                <h4>Amanda Jepson</h4>
                <span>Close companion</span>
    
              </div>
            </div>
          </div>
    
        </div>
    
      </div>
    </section><!-- End Our Team Section -->
    
    <!-- ======= Frequently Asked Questions Section ======= -->
    
    
    <!-- ======= Contact Us Section ======= -->
    <section id="contact" class="contact">
      <div class="container">
    
        <div class="section-title">
          <h2>Contact Us</h2>
        </div>
    
        <div class="row">
    
          <div class="col-lg-6 d-flex align-items-stretch" data-aos="fade-up">
            <div class="info-box">
              <i class="bx bx-map"></i>
              <h3>Address</h3>
              <p>Data Analysis Lab, Arts and Sciences Building, Changshan Campus, 江大</p>
            </div>
          </div>
    
          <div class="col-lg-3 d-flex align-items-stretch" data-aos="fade-up" data-aos-delay="100">
            <div class="info-box">
              <i class="bx bx-envelope"></i>
              <h3>Email Us</h3>
              <p>869676614.com<br>10086.com</p>
            </div>
          </div>
    
          <div class="col-lg-3 d-flex align-items-stretch" data-aos="fade-up" data-aos-delay="200">
            <div class="info-box ">
              <i class="bx bx-phone-call"></i>
              <h3>Call Us</h3>
              <p>17836925032<br>17851006312</p>
            </div>
          </div>
    
    
    
    
    
      </div>
    </section><!-- End Contact Us Section -->
    
      <div class="container">
    
      </div>
    
    
      <div class="copyright">
        &copy; Copyright <strong><span>Mamba</span></strong>. All Rights Reserved
      </div>
      <div class="credits">
        <!-- All the links in the footer should remain intact. -->
        <!-- You can delete the links only if you purchased the pro version. -->
        <!-- Licensing information: https://bootstrapmade.com/license/ -->
        <!-- Purchase the pro version with working PHP/AJAX contact form: https://bootstrapmade.com/mamba-one-page-bootstrap-template-free/ -->
        Designed by <a href="https://bootstrapmade.com/">BootstrapMade</a>
      </div>
    

    qujian.html

    Only the main part is shown here; the rest is omitted.

        <div class="section-title">
          <h2>Bitcoin data</h2>
    
        </div>
    
    
        <!-- Prepare a DOM element with an explicit size (width and height) for ECharts -->
        <div id="main" style="width : 700px ;height:800px;"></div>
          <script type="text/javascript">
            var dom = document.getElementById("main");
            var myChart = echarts.init(dom);
            var app = {};
            option = null;
    

            option = {
                title: {
                    text: 'Bitcoin price pie chart',
                    subtext: 'Top 15',
                    left: 'center'
                },
                tooltip: {
                    trigger: 'item',
                    formatter: '{a} <br/>{b} : {c} ({d}%)'
                },
                legend: {
                    orient: 'vertical',
                    left: 'left',
                    data: {{ num|tojson }}
                },
                series: [
                    {
                        name: 'Traffic source',
                        type: 'pie',
                        radius: '55%',
                        center: ['50%', '60%'],
                        data: [
                            {% for i in range(15) %}
                            {value: {{ sum[i]|tojson }}, name: {{ num[i]|tojson }}},
                            {% endfor %}
                        ],
                        emphasis: {
                            itemStyle: {
                                shadowBlur: 10,
                                shadowOffsetX: 0,
                                shadowColor: 'rgba(0, 0, 0, 0.5)'
                            }
                        }
                    }
                ]
            };
            if (option && typeof option === "object") {
                myChart.setOption(option, true);
            }
          </script>

      </div>
    
      </div>
    </section><!-- End Counts Section -->
    
      </div>
    </section><!-- End Our Team Section -->
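The templates rely on the `tojson` filter to turn the Python lists passed by `render_template` into valid JavaScript literals. As a rough sketch of what ends up in the page, the standard-library `json.dumps` produces essentially the same text (the Flask filter additionally escapes a few HTML-sensitive characters). The sample names and values below are invented.

```python
import json

# Invented sample data standing in for the num / sum lists built in app.py
num = ['BTC', 'ETH', 'USDT']
sum_values = [100.5, 20.0, 1.0]

# Text that would replace the legend placeholder in the template
legend_js = json.dumps(num)
# Text for the pie series: one {value, name} object per coin
series_js = json.dumps([{'value': v, 'name': n} for v, n in zip(sum_values, num)])

print(legend_js)
print(series_js)
```

Serializing the whole list at once avoids indexing every element by hand and keeps working when the number of rows changes.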
    

    sandian.html

    <div class="container">
    
        <div class="section-title">
          <h2>Bitcoin data</h2>
    
        </div>
    
    
        <!-- Prepare a DOM element with an explicit size (width and height) for ECharts -->
        <div id="main" style="width : 1000px ;height:800px;"></div>
          <script type="text/javascript">
            var dom = document.getElementById("main");
            var myChart = echarts.init(dom);
            var app = {};
            option = null;
    

    var data = [[1,{{ sum[0]|tojson }},{{ yum[0]|tojson }},{{ num[0]|tojson }}],
    [2,{{ sum[1]|tojson }},{{ yum[1]|tojson }},{{ num[1]|tojson }}],
    [3,{{ sum[2]|tojson }},{{ yum[2]|tojson }},{{ num[2]|tojson }}],
    [4,{{ sum[3]|tojson }},{{ yum[3]|tojson }},{{ num[3]|tojson }}],
    [5,{{ sum[4]|tojson }},{{ yum[4]|tojson }},{{ num[4]|tojson }}],
    [6,{{ sum[5]|tojson }},{{ yum[5]|tojson }},{{ num[5]|tojson }}],
    [7,{{ sum[6]|tojson }},{{ yum[6]|tojson }},{{ num[6]|tojson }}],
    [8,{{ sum[7]|tojson }},{{ yum[7]|tojson }},{{ num[7]|tojson }}],
    [9,{{ sum[8]|tojson }},{{ yum[8]|tojson }},{{ num[8]|tojson }}],
    [10,{{ sum[9]|tojson }},{{ yum[9]|tojson }},{{ num[9]|tojson }}],
    [11,{{ sum[10]|tojson }},{{ yum[10]|tojson }},{{ num[10]|tojson }}],
    [12,{{ sum[11]|tojson }},{{ yum[11]|tojson }},{{ num[11]|tojson }}],
    [13,{{ sum[12]|tojson }},{{ yum[12]|tojson }},{{ num[12]|tojson }}],
    [14,{{ sum[13]|tojson }},{{ yum[13]|tojson }},{{ num[13]|tojson }}],
    [15,{{ sum[14]|tojson }},{{ yum[14]|tojson }},{{ num[14]|tojson }}],
    [16,{{ sum[15]|tojson }},{{ yum[15]|tojson }},{{ num[15]|tojson }}],
    [17,{{ sum[16]|tojson }},{{ yum[16]|tojson }},{{ num[16]|tojson }}],
    [18,{{ sum[17]|tojson }},{{ yum[17]|tojson }},{{ num[17]|tojson }}],
    [19,{{ sum[18]|tojson }},{{ yum[18]|tojson }},{{ num[18]|tojson }}],
    [20,{{ sum[19]|tojson }},{{ yum[19]|tojson }},{{ num[19]|tojson }}],
    [21,{{ sum[20]|tojson }},{{ yum[20]|tojson }},{{ num[20]|tojson }}],
    [22,{{ sum[21]|tojson }},{{ yum[21]|tojson }},{{ num[21]|tojson }}],
    [23,{{ sum[22]|tojson }},{{ yum[22]|tojson }},{{ num[22]|tojson }}],
    [24,{{ sum[23]|tojson }},{{ yum[23]|tojson }},{{ num[23]|tojson }}],
    [25,{{ sum[24]|tojson }},{{ yum[24]|tojson }},{{ num[24]|tojson }}],
    [26,{{ sum[25]|tojson }},{{ yum[25]|tojson }},{{ num[25]|tojson }}],
    [27,{{ sum[26]|tojson }},{{ yum[26]|tojson }},{{ num[26]|tojson }}],
    [28,{{ sum[27]|tojson }},{{ yum[27]|tojson }},{{ num[27]|tojson }}],
    [29,{{ sum[28]|tojson }},{{ yum[28]|tojson }},{{ num[28]|tojson }}],
    [30,{{ sum[29]|tojson }},{{ yum[29]|tojson }},{{ num[29]|tojson }}]

            ];

            option = {
                backgroundColor: new echarts.graphic.RadialGradient(0.3, 0.3, 0.8, [{
                    offset: 0,
                    color: '#f7f8fa'
                }, {
                    offset: 1,
                    color: '#cdd0d5'
                }]),
                title: {
                    text: 'Market cap vs. price change for the top 50 coins'
                },
                legend: {
                    right: 10,
                    data: ['1', '2']
                },
                xAxis: {
                    itemStyle: {
                        normal: {
                            label: {
                                show: true,
                                position: 'top',
                                formatter: '{c}%'
                            }
                        }
                    },
                    splitLine: {
                        lineStyle: {
                            type: 'dashed'
                        }
                    }
                },
                yAxis: {
                    axisLabel: {
                        formatter: '{value} %'
                    },
                    splitLine: {
                        lineStyle: {
                            type: 'dashed'
                        }
                    },
                    scale: true
                },
                series: [{
                    name: '-',
                    data: data,
                    type: 'scatter',
                    // Bubble size grows with the square root of the market cap
                    symbolSize: function (data) {
                        return Math.sqrt(data[2]);
                    },
                    emphasis: {
                        label: {
                            show: true,
                            // Show the coin name on hover
                            formatter: function (param) {
                                return param.data[3];
                            },
                            position: 'top'
                        }
                    },
                    itemStyle: {
                        shadowBlur: 10,
                        shadowColor: 'rgba(120, 36, 50, 0.5)',
                        shadowOffsetY: 5,
                        color: new echarts.graphic.RadialGradient(0.4, 0.3, 1, [{
                            offset: 0,
                            color: 'rgb(251, 118, 123)'
                        }, {
                            offset: 1,
                            color: 'rgb(204, 46, 72)'
                        }])
                    }
                }]
            };

            if (option && typeof option === "object") {
                myChart.setOption(option, true);
            }
                   </script>
    
      </div>
    
      </div>
    </section><!-- End Counts Section -->
    
      </div>
    </section><!-- End Our Team Section -->
    

    shuju.html

    <section class="counts section-bg">
      <div class="container">
    
    
          <table class="table table-striped">
              <tr>
                    <td>Rank</td>
                    <td>Short name</td>
                    <td>Full name</td>
                    <td>Current price</td>
                    <td>24h change</td>
                    <td>Trading volume</td>
                    <td>Market cap</td>
              </tr>
    
              {% for movie in movies %}
                <tr>
                    <td>{{ movie[0] }}</td>
                    <td>
    
                        {{ movie[1] }}
    
                    </td>
                    <td>{{ movie[2] }}</td>
                    <td>{{ movie[3] }}</td>
                    <td>
    
                        {{ movie[4] }}
    
                    </td>
                    <td>{{ movie[5] }}</td>
                    <td>{{ movie[6] }}</td>
              </tr>
              {% endfor %}
          </table>
    
    
      </div>
    </section><!-- End Counts Section -->
    
      </div>
    </section><!-- End Our Team Section -->
    

    shuliang.html

    <div class="container">
    
        <div class="section-title">
          <h2>Bitcoin data</h2>
    
        </div>
    
    
        <!-- Prepare a DOM element with an explicit size (width and height) for ECharts -->
        <div id="main" style="width : 1000px ;height:800px;"></div>
          <script type="text/javascript">
            var dom = document.getElementById("main");
            var myChart = echarts.init(dom);
            var app = {};
    

            option = {
                color: ['#3398DB'],
                tooltip: {
                    trigger: 'axis',
                    axisPointer: {        // axis pointer, effective when triggered by an axis
                        type: 'shadow'    // defaults to a line; options: 'line' | 'shadow'
                    }
                },
                grid: {
                    left: '3%',
                    right: '4%',
                    bottom: '3%',
                    containLabel: true
                },
                xAxis: [
                    {
                        type: 'category',
                        data: ['50-100 billion', '10-50 billion', '0.1-10 billion', '10-99.99 million', '1-10 million', '0.1-1 million', '10-100 thousand'],
                        axisTick: {
                            alignWithLabel: true
                        }
                    }
                ],
                yAxis: [
                    {
                        type: 'value'
                    }
                ],
                series: [
                    {
                        name: 'Data',
                        type: 'bar',
                        barWidth: '60%',
                        data: [{{ q }}, {{ w }}, {{ e }}, {{ y }}, {{ r }}, {{ u }}, {{ t }}]
                    }
                ]
            };

            if (option && typeof option === "object") {
                myChart.setOption(option, true);
            }
                   </script>
    
      </div>
    
      </div>
    </section><!-- End Counts Section -->
    
      </div>
    </section><!-- End Our Team Section -->
    

    zhangdie.html

        <div class="section-title">
          <h2>Bitcoin data</h2>
    
        </div>
    
    
        <!-- Prepare a DOM element with an explicit size (width and height) for ECharts -->
        <div id="main" style="width: 1000px ;height:750px;"></div>
          <script type="text/javascript">
            var dom = document.getElementById("main");
            var myChart = echarts.init(dom);
            var app = {};
            option = null;
    

            option = {
                tooltip: {
                    trigger: 'axis',
                    position: function (pt) {
                        return [pt[0], '10%'];
                    }
                },
                title: {
                    left: 'center',
                    text: 'Price change of the top 70 coins by market cap'
                },
                toolbox: {
                    feature: {
                        dataZoom: {
                            yAxisIndex: 'none'
                        },
                        restore: {},
                        saveAsImage: {}
                    }
                },
                xAxis: {
                    type: 'category',
                    boundaryGap: false,
                    data: {{ num|tojson }}
                },
                yAxis: {
                    axisLabel: {
                        formatter: '{value} %'
                    },
                    type: 'value',
                    boundaryGap: [0, '100%']
                },
                dataZoom: [{
                    type: 'inside',
                    start: 0,
                    end: 10
                }, {
                    start: 0,
                    end: 10,
                    handleIcon: 'M10.7,11.9v-1.3H9.3v1.3c-4.9,0.3-8.8,4.4-8.8,9.4c0,5,3.9,9.1,8.8,9.4v1.3h1.3v-1.3c4.9-0.3,8.8-4.4,8.8-9.4C19.5,16.3,15.6,12.2,10.7,11.9z M13.3,24.4H6.7V23h6.6V24.4z M13.3,19.6H6.7v-1.4h6.6V19.6z',
                    handleSize: '80%',
                    handleStyle: {
                        color: '#fff',
                        shadowBlur: 3,
                        shadowColor: 'rgba(0, 0, 0, 0.6)',
                        shadowOffsetX: 2,
                        shadowOffsetY: 2
                    }
                }],
                series: [
                    {
                        name: 'Data',
                        type: 'line',
                        smooth: true,
                        symbol: 'none',
                        sampling: 'average',
                        itemStyle: {
                            color: 'rgb(255, 70, 131)'
                        },
                        areaStyle: {
                            color: new echarts.graphic.LinearGradient(0, 0, 0, 1, [{
                                offset: 0,
                                color: 'rgb(255, 158, 68)'
                            }, {
                                offset: 1,
                                color: 'rgb(255, 70, 131)'
                            }])
                        },
                        data: {{ sum|tojson }}
                    }
                ]
            };
            if (option && typeof option === "object") {
                myChart.setOption(option, true);
            }
                   </script>
    
      </div>
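
    The zhangdie.html template above fills the chart with `{{ num|tojson }}` and `{{ sum|tojson }}`. The sketch below shows, with the standard library only, why the values are serialized as JSON rather than printed as Python objects. The sample values and the helper name `to_js_literal` are assumptions made for illustration; Flask's own `tojson` filter additionally escapes HTML-sensitive characters.

```python
import json


def to_js_literal(values):
    """Serialize a Python list into text that is also a valid JavaScript literal."""
    return json.dumps(values, ensure_ascii=False)


# Hypothetical values standing in for the `num` and `sum` template variables
num = [1, 2, 3]
change = [0.52, -1.3, None]

# str() produces Python syntax: None is not valid JavaScript
python_text = str(change)
# json.dumps() produces JSON: None becomes null
js_text = to_js_literal(change)

print(python_text)
print(js_text)
print(to_js_literal(num))
```

    A missing value therefore reaches the browser as `null`, which the chart can treat as a gap, instead of breaking the script.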
    

    wordcloud.html

        <div class="row no-gutters">
          <div class="col-lg-6 video-box">
            <img src="static/assets/234.jpg" class="img-fluid" alt="">
    
          </div>
    
          <div class="col-lg-6 d-flex flex-column justify-content-center about-content">
    
            <div class="section-title">
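              <h2>Word Cloud</h2>
              <p>The word cloud is built from the coin names. Words such as network, 币, 比特, coin, and chain appear very frequently, which suggests that the names are tied to what the coins themselves stand for.</p>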
            </div>
    
            <div class="icon-box" data-aos="fade-up" data-aos-delay="100">
              <div class="icon"><i class="bx bx-fingerprint"></i></div>
              <h4 class="title"><a href="">Lorem Ipsum</a></h4>
              <p class="description">222</p>
            </div>
    
    
    
          </div>
        </div>
    
      </div>
    </section><!-- End About Us Section -->
    

    The source code is available from the WeChat official account "一团追梦喵" by replying "python爬虫及其可视化".

  • Python Crawler Visualization Output

    2021-12-01 17:11:02

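    I. Overview

    COVID-19 is currently a major concern for people across the country, both in the daily Weibo trending topics and in news reports. Our group therefore scraped national and global epidemic data in real time from Tencent News's COVID-19 live tracker, so that the national and global trends can be seen more clearly and intuitively, and then turned the data into bar charts, pie charts, and other visual outputs.

    In this project I was mainly responsible for visualizing the Excel data produced by the Python web crawler.

    II. Code walkthrough

    1. Comparison bar chart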

    i = datetime.datetime.now()
    forei_top20 = forei_df.head(20)    # take the first 20 rows (top 20 countries)
    # Plot:
    plt.title('截止北京时间'+ str('%s'%i)[:16]  + '外国top20累计确诊人数柱形图')
    # plt.xlabel('国家')
    # plt.ylabel('人数')
    x = forei_top20['国家']
    y = forei_top20['累计确诊']
    # plt.ylim(0, max(int_top20['累计确诊']))
    plt.xticks(rotation=65)
    plt.bar(range(len(x)), y,color='r',tick_label=x)
    name = "外国累计确诊及死亡人数top20对比柱状图"
    plt.savefig(r'D:\A疫情数据\{}.png'.format(name))
    plt.show()
    plt.close()
    print("图片已存储至指定路径")
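
    The chart title above trims the timestamp with `str('%s'%i)[:16]`. As a minimal sketch, the same "date plus hour and minute" text can be produced with `strftime`; the helper name `minute_stamp` and the fixed example datetime are assumptions used only to make the comparison reproducible.

```python
import datetime


def minute_stamp(moment):
    """Format a datetime as 'YYYY-MM-DD HH:MM'."""
    return moment.strftime('%Y-%m-%d %H:%M')


# Fixed example value so the result is reproducible
moment = datetime.datetime(2021, 12, 1, 17, 11, 2, 123456)

# The slicing approach used in the listing above
sliced = str('%s' % moment)[:16]

print(sliced)
print(minute_stamp(moment))
```

    Both lines print the same text; the `strftime` form states the intended format explicitly instead of relying on a character count.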


    2. Comparison bar chart

    cn_province_df = cn_province_df.sort_values('累计确诊', ascending=False)   # sort by cumulative confirmed cases, descending
    i = datetime.datetime.now()
    # Plot:
    plt.title('截止'+str('%s'%i)[:16]  + '的top20累计确诊人数柱形图')
    label_list = cn_province_df.head(20)['省份']
    x = range(len(cn_province_df.head(20)['省份']))
    y = cn_province_df.head(20)['累计确诊']
    plt.ylim(0, max(cn_province_df.head(20)['累计确诊']))
    plt.xticks(rotation=90)
    plt.bar(x=x, height=y, width=0.6, alpha=0.8, color='c')
    plt.ylim(0, 70000)
    plt.ylabel('人数')
    plt.xticks([index + 0.2 for index in x], label_list, rotation=65)
    name = "累计确诊及死亡人数top20对比柱状图"
    plt.savefig(r'D:\A疫情数据\{}.png'.format(name))
    plt.show()
    plt.close()
    print("图片已存储至指定路径")

    3. Pie chart

    plt.title('死亡率top10')
    plt.pie(labels=cn_province_df.head(10)['省份'] ,x=cn_province_df.head(10)['死亡率'],colors=['b','r','yellow','c','orange','lime'])
    name = "死亡率饼状图"
    plt.savefig(r'D:\A疫情数据\{}.png'.format(name))
    plt.show()
    plt.close()
    print("图片已存储至指定路径")

    III. Libraries used

    import matplotlib.pyplot as plt

    import datetime

    import pandas as pd

    (PS: some sensitive wording has been removed for fear of the content being censored.)

  • Python Crawler Data Visualization

    2022-05-15 10:58:23

    Python crawler: data visualization with matplotlib and pandas

    Import the required third-party libraries

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import requests
    import urllib3
    import matplotlib as mpl

    Configure the fonts so that Chinese text is displayed correctly

    mpl.rcParams['font.sans-serif'] = ['KaiTi']
    mpl.rcParams['font.serif'] = ['KaiTi']
    mpl.rcParams['axes.unicode_minus'] = False
    sns.set_style("darkgrid", {"font.sans-serif": ['KaiTi', 'Arial']})

    Scrape the data (this method only works for data held in HTML tables)

    urllib3.disable_warnings()
    url = "http://www.stats.gov.cn/ztjc/zdtjgz/zgrkpc/dqcrkpc/ggl/202105/t20210519_1817699.html"
    response = requests.get(url, verify=False)
    response.encoding = response.apparent_encoding
    html = response.text
    data = pd.read_html(html, header=0)[1]
    print(data)
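
    The note that this method only works for tables follows from how table extraction works: only `<table>` elements are collected, and everything else on the page is ignored. The sketch below illustrates that idea with the standard library's `html.parser`. It is a simplified stand-in, not how pandas implements `read_html`; the class name `TableParser` and the sample fragment are assumptions, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'table':
            self.tables.append([])
        elif tag == 'tr' and self.tables:
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (including text outside any table) is dropped
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None and self._row is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


# Hypothetical page fragment, not the real census page
sample = """
<p>text outside a table is ignored</p>
<table>
  <tr><th>地区</th><th>2020年</th><th>2010年</th></tr>
  <tr><td>A</td><td>10</td><td>8</td></tr>
</table>
"""
parser = TableParser()
parser.feed(sample)
print(parser.tables)
```

    Each extracted table is a list of rows, which is the shape a DataFrame can be built from; a page that presents its figures in paragraphs or scripts yields nothing.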

    Extract individual columns

    city = list(data["地区"])  # the column named "地区" (region)
    year1 = list(data["2020年"])  # the column named "2020年"
    year2 = list(data["2010年"])  # the column named "2010年"

    Visualize the data as a line chart

    he = {"2020年": year1, "2010年": year2}  # series labels: year1 is shown as "2020年", year2 as "2010年"
    df = pd.DataFrame(he, city)  # he holds the y values; city becomes the index (x axis)
    df.plot.line()  # draw the line chart
    plt.show()  # display the chart

    The complete code:

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import requests
    import urllib3
    import matplotlib as mpl
    
    # Configure the fonts so that Chinese text is displayed correctly
    mpl.rcParams['font.sans-serif'] = ['KaiTi']
    mpl.rcParams['font.serif'] = ['KaiTi']
    mpl.rcParams['axes.unicode_minus'] = False
    sns.set_style("darkgrid", {"font.sans-serif": ['KaiTi', 'Arial']})
    
    # Scrape the data (only works for data held in HTML tables)
    urllib3.disable_warnings()
    url = "http://www.stats.gov.cn/ztjc/zdtjgz/zgrkpc/dqcrkpc/ggl/202105/t20210519_1817699.html"
    response = requests.get(url, verify=False)
    response.encoding = response.apparent_encoding
    html = response.text
    data = pd.read_html(html, header=0)[1]
    print(data)
    
    # Extract individual columns
    city = list(data["地区"])
    year1 = list(data["2020年"])
    year2 = list(data["2010年"])
    
    # Visualize the data as a line chart
    he = {"2020年": year1, "2010年": year2}
    df = pd.DataFrame(he, city)
    df.plot.line()
    plt.show()
    

    [Figure: the resulting line chart]
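  • Python Crawler and Data Visualization

    2022-02-11 10:17:54

    Having studied Python crawlers and data visualization, I wanted to write a summary to consolidate what I learned and perhaps give other readers a few ideas.

    1. Before doing anything, be clear about your goal: what you want to build, how you plan to build it, and how far you intend to take it. Plans never quite keep up with change, but following a plan and adapting as problems come up is still a sound approach.

    2. This project scrapes the Douban Top 250 movies and saves each movie's title, link, rating, and other details locally. The details are shown as a list on a web page, from which visitors can jump straight to Douban to view a movie. The ratings are turned into a rating chart and the movie data into a word cloud, both shown on the site. There are five pages in total, all linked to one another.

    [Figure: project flowchart]

    Data scraping: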

    # -*- coding: utf-8 -*-
    # @Time : 2022/1/11 22:39
    # @Author : lj
    # @File : spider.PY
    import bs4  # HTML parsing: split the page into elements
    import re   # regular expressions: extract the fields from the text
    import urllib.request,urllib.error  # request a URL and fetch the page
    import xlwt  # Excel output
    import sqlite3  # SQLite database output
    import time
    # Main function
    def main():
        # Call the functions
        url = "https://movie.douban.com/top250?start="
        datalist1 = allData(url)
        # savepath = "豆瓣电影top250.xls"
        # savedata(datalist1,savepath)
        dbpath = "move.db"
        savedatasql(datalist1,dbpath)
    # Regular expressions for the fields to extract
    linkpattern = re.compile(r'<a href="(.*?)/">')
    # Poster image URL
    imagepattern = re.compile(r'<img.*?src="(.*?)"',re.S)  # re.S lets . match newlines as well
    # Movie title
    namepattern = re.compile(r'<span class="title">(.*)</span>')
    # Rating
    gradepattern = re.compile(r'<span class="rating_num" property="v:average">(.*)</span>')
    # Number of ratings
    peoplepattern = re.compile(r'<span>(\d*)人评价</span>')  # (\d*) captures zero or more digits
    # One-line summary
    thinkpattern = re.compile(r'<span class="inq">(.*)</span>')
    # Other details of the movie
    contentpattern = re.compile(r'<p class="">(.*?)</p>',re.S)  # re.S lets . match newlines as well
    
    # 1. Fetch a single page
    def getData(url1):
        head = {
            "User-Agent":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0"}
        request = urllib.request.Request(url1,headers=head)
        html = ""
        try:
            response = urllib.request.urlopen(request)
            html = response.read().decode("utf-8")
            # print(html)
        except urllib.error.URLError as e:
            if hasattr(e,"code"):
                print(e.code)
            if hasattr(e,"reason"):
                print(e.reason)
        return html  # hand the page source back to the caller
    # 2. Fetch every page and extract the fields
    def allData(url):
        datalist = []
        for i in range(0, 10):  # half-open range: 10 requests, 25 movies per page
            url1 = url + str(i * 25)
            html = getData(url1)  # source of the fetched page
            time.sleep(1)
            # Parse page by page
            soup = bs4.BeautifulSoup(html,"html.parser")  # build the parse tree
            for item in soup.find_all('div',class_="item"):  # every <div class="item">; note the trailing underscore in class_
                data = []
                item = str(item)
                # Link to the movie's detail page
                link = re.findall(linkpattern,item)[0]
                data.append(link)
                # Poster image
                image = re.findall(imagepattern,item)[0]
                data.append(image)
                # Movie title
                name = re.findall(namepattern,item)
                if(len(name)==2):
                    chinaname = name[0]
                    data.append(chinaname)
                    outername = name[1].replace("/","")  # str.replace removes the "/" separator
                    data.append(outername)
                else:
                    data.append(name[0])
                    data.append('  ')  # leave the foreign title blank
                # Rating
                grade = re.findall(gradepattern,item)[0]
                data.append(grade)
                # Number of ratings
                people = re.findall(peoplepattern, item)[0]
                data.append(people)
                # One-line summary
                think = re.findall(thinkpattern, item)
                if len(think) != 0:
                    think = think[0].replace("。","")
                    data.append(think)
                else:
                    data.append("  ")
                # Other details of the movie
                content = re.findall(contentpattern, item)[0]
                content = re.sub(r'<br(\s+)?/>(\s+)?'," ",content)  # replace <br/> tags and the whitespace after them
                content = re.sub('/'," ",content)
                data.append(content.strip())  # strip leading and trailing whitespace
                datalist.append(data)
        return datalist
    # 3. Save the data to Excel
    # def savedata(datalist1,savepath):
    #     workplace = xlwt.Workbook(encoding="utf-8",style_compression=0)  # style_compression: style compression setting
    #     worksheet = workplace.add_sheet("豆瓣电影top250",cell_overwrite_ok=True)  # cell_overwrite_ok: allow cells to be overwritten
    #     col = ('电影详情链接','电影图片链接','影片中文名','影片外文名','评分','评价人数','概况','影片内容')
    #     for i in range(0,8):
    #         worksheet.write(0,i,col[i])
    #     for i in range(0,250):
    #         print("打印了%d条" %(i+1))
    #         databuffer = datalist1[i]
    #         for j in range(0,8):
    #             worksheet.write(i+1,j,databuffer[j])
    #     workplace.save(savepath)  # save the workbook
    # 3. Save the data to the database
    def savedatasql(datalist1,dbpath):  # dbpath: path of the SQLite database file
        init_db(dbpath)
        conn = sqlite3.connect(dbpath)
        cur = conn.cursor()  # cursor used to execute SQL statements
        sql = '''
                insert into move250(
                move_link,img_link,move_chinaname,move_foriername,grade,numbers,introduction,content)
                values (?,?,?,?,?,?,?,?)'''
        # Write every record in datalist1; the ? placeholders let sqlite3 handle
        # the quoting, so the strings need no manual escaping
        for data in datalist1:
            cur.execute(sql, data)
        conn.commit()
        cur.close()
        conn.close()
        print("write successful")
    
    
    # Create the database table
    def init_db(dbpath):
    
        sql = '''
            create table if not exists move250
            (
            id integer primary key autoincrement,
            move_link text,
            img_link text,
            move_chinaname varchar,
            move_foriername varchar,
            grade numeric,
            numbers numeric,
            introduction text,
            content text
            )
        '''
        c = sqlite3.connect(dbpath)
        buffer = c.cursor()  # get a cursor
        buffer.execute(sql)  # execute the SQL statement
        c.commit()   # commit the change to the database
        c.close()   # close the connection
        print("database create successful")
    if __name__ == "__main__":
        # Run the crawler
        main()
        print("爬取完毕!")
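
    The crawler above pulls each field out of an item's HTML with a regular expression and takes element `[0]` of the result, which raises `IndexError` when a field is missing. The sketch below shows the same extraction idea on a made-up fragment with a small fallback helper. The fragment, the helper `first_match`, and the pattern variable names are assumptions for illustration and do not reproduce the real Douban markup.

```python
import re

# Hypothetical fragment shaped like one <div class="item"> block; not real page data
item = '''
<div class="item">
  <a href="https://example.com/subject/1/">
    <img alt="Example" src="https://example.com/p1.jpg"/>
  </a>
  <span class="title">Example Title</span>
  <span class="rating_num" property="v:average">9.7</span>
  <span>12345人评价</span>
</div>
'''

link_pattern = re.compile(r'<a href="(.*?)/">')
image_pattern = re.compile(r'<img.*?src="(.*?)"', re.S)
grade_pattern = re.compile(r'<span class="rating_num" property="v:average">(.*?)</span>')
people_pattern = re.compile(r'<span>(\d+)人评价</span>')


def first_match(pattern, text, default=''):
    """Return the first captured group, or a default when nothing matches."""
    found = pattern.findall(text)
    return found[0] if found else default


patterns = (link_pattern, image_pattern, grade_pattern, people_pattern)
record = [first_match(p, item) for p in patterns]
print(record)
```

    The non-greedy `(.*?)` stops at the first closing delimiter, so each group captures only the value itself rather than everything up to the last match on the page.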
    
    
    

    [Figure: database contents]

    [Figure: Excel data]

    Building the visualization

    Route mapping and page rendering

    import sqlite3
    
    from flask import Flask,render_template
    
    app = Flask(__name__)
    
    @app.route('/')
    def first():  # put application's code here
        return render_template("index-1.html")
    # Each function handles one route
    @app.route('/index')
    def first1():
        return render_template('index-1.html')
    
    @app.route('/movie')
    def mv():
        datalist =  []
        con = sqlite3.connect('move.db')
        cursor = con.cursor()
        sql = "select * from move250"
        data = cursor.execute(sql)
        for items in data:
            datalist.append(items)
        cursor.close()
        con.close()
    
        return render_template('about.html',mmm = datalist)
    
    @app.route('/score')
    def sc():
        x = []
        y = []
        con = sqlite3.connect('move.db')
        cursor = con.cursor()
        sql = "select grade,count(grade) from move250 group by grade"
        data1 = cursor.execute(sql)
        for ite in data1:
           x.append(ite[0])
           y.append(ite[1])
        cursor.close()
        con.close()
        return render_template('services.html',score = x,number = y)
    
    @app.route('/word')
    def wd():
        return render_template('projects-details.html')
    
    @app.route('/team')
    def te():
        return render_template('team.html')
    
    if __name__ == '__main__':
        app.run()
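
    The `/score` route above groups the movies by rating and hands two parallel lists to the template. As a minimal sketch, the query can be tried on its own against an in-memory database. The function name `grade_histogram`, the added `order by`, and the sample ratings are assumptions; only the table and column names come from the listing.

```python
import sqlite3


def grade_histogram(conn):
    """Return (grades, counts) as two parallel lists, ordered by grade."""
    rows = conn.execute(
        "select grade, count(grade) from move250 group by grade order by grade"
    ).fetchall()
    grades = [row[0] for row in rows]
    counts = [row[1] for row in rows]
    return grades, counts


# In-memory database with made-up ratings, only for illustration
conn = sqlite3.connect(':memory:')
conn.execute("create table move250 (id integer primary key autoincrement, grade numeric)")
conn.executemany("insert into move250 (grade) values (?)", [(9.7,), (9.5,), (9.5,), (9.2,)])
grades, counts = grade_histogram(conn)
conn.close()

print(grades)
print(counts)
```

    The first list supplies the x axis categories and the second the bar heights, matching the `score` and `number` arguments passed to `render_template`.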
    

    Page source (there is too much of it, so only one page is shown here)

    [Figure: the rendered page]

    <!DOCTYPE html>
    <html lang="en">
    
    <head>
        <style>
            #container{
                border: 2px solid red;
                height: 1000px;
                width: 1200px;
                top: 50px;
                left: 100px;
            }
        </style>
        
        <!-- ========== Meta Tags ========== -->
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta name="viewport" content="width=device-width, initial-scale=1">
        <meta name="description" content="Buskey - Corporate Business Template">
    
        <!-- ========== Page Title ========== -->
        <title>Buskey - Corporate Business Template</title>
    
        <!-- ========== Start Stylesheet ========== -->
        <link href="../static/css/plugins.min.css" rel="stylesheet">
        <link href="../static/css/flaticon-business-set.css" rel="stylesheet">
        <link href="../static/css/style.css" rel="stylesheet">
        <link href="../static/css/responsive.css" rel="stylesheet">
        <!-- ========== End Stylesheet ========== -->
    
        <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
        <!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
        <!--[if lt IE 9]>
    
    
     <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
        <![endif]-->
    
        <!-- ========== Google Fonts ========== -->
        <link href="../static/css/css.css" rel="stylesheet">
        <link href="../static/css/css1.css" rel="stylesheet">
    
    </head>
    
    <body>
    
        <!-- Preloader Start -->
        <div class="se-pre-con"></div>
        <!-- Preloader Ends -->
    
        <!-- Start Header Top 
        ============================================= -->
        <div class="top-bar-area bg-theme text-light">
            <div class="container">
                <div class="row">
                    <div class="col-md-9">
                        <div class="info box">
                            <ul>
                                <li>
                                    <div class="icon">
                                        <i class="fas fa-map-marker-alt"></i>
                                    </div>
                                    <div class="info">
                                        <p>
                                            china
                                        </p>
                                    </div>
                                </li>
                                <li>
                                    <div class="icon">
                                        <i class="fas fa-envelope-open"></i>
                                    </div>
                                    <div class="info">
                                        <p>
                                            1751108164@qq.com
                                        </p>
                                    </div>
                                </li>
                                <li>
                                    <div class="icon">
                                        <i class="fas fa-mobile-alt"></i>
                                    </div>
                                    <div class="info">
                                        <p>
                                            +123 456 7890
                                        </p>
                                    </div>
                                </li>
                            </ul>
                        </div>
                    </div>
                    <div class="topbar-social col-md-3">
    
                    </div>
                </div>
            </div>
        </div>
        <!-- End Header Top -->
    
        <!-- Header 
        ============================================= -->
        <header>
    
            <!-- Start Navigation -->
            <nav class="navbar navbar-default navbar-sticky bootsnav">
    
                <div class="container">
    
                    <!-- Start Header Navigation -->
                    <div class="navbar-header">
                        <button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-menu">
                            <i class="fa fa-bars"></i>
                        </button>
                        <a class="navbar-brand" href="index.html">
                            <img src="../static/picture/logo-light.png" class="logo logo-display" alt="Logo">
                        </a>
                    </div>
                    <!-- End Header Navigation -->
    
                    <!-- Collect the nav links, forms, and other content for toggling -->
                    <div class="collapse navbar-collapse" id="navbar-menu">
                        <ul class="nav navbar-nav navbar-right" data-in="#" data-out="#">
                            <li class="dropdown">
                                <a href="index-1.html" class="dropdown-toggle active" data-toggle="dropdown">首页</a>
    
                            </li>
                            <li>
                                <a href="about.html">电影列表</a>
                            </li>
                            <li>
                                <a href="">评分</a>
                            </li>
                            <li class="dropdown">
                                <a href="projects-details.html" class="dropdown-toggle" data-toggle="dropdown">词云</a>
    
                            </li>
                            <li class="dropdown">
                                <a href="team.html" class="dropdown-toggle" data-toggle="dropdown">团队</a>
    
                            </li>
                        </ul>
                    </div><!-- /.navbar-collapse -->
                </div>
    
            </nav>
            <!-- End Navigation -->
    
        </header>
        <!-- End Header -->
    
        <!-- Start Breadcrumb
        ============================================= -->
        <div class="breadcrumb-area shadow dark bg-fixed text-center padding-xl text-light" style="background-image: url(../static/image/21.jpg);">
            <div class="container">
                <div class="row">
                    <div class="col-md-6 col-sm-6 text-left">
                        <h1>影线评分</h1>
                    </div>
                    <div class="col-md-6 col-sm-6 text-right">
    
                    </div>
                </div>
            </div>
        </div>
        <!-- End Breadcrumb -->
    
        <!-- Start Fun Factor -->
    
        <!-- Start Fun Factor -->
    <div class="fun-factor-area default-padding text-center bg-fixed shadow theme-hard parallax parralax-shadow" data-parallax="scroll" style="background-image: url(../static/image/12.jpg);">
            <div class="container">
                <div class="row">
                    <div class="col-md-12">
                        <a href="about.html">
                        <div class="col-md-3 col-sm-6 item">
                            <div class="fun-fact">
                                <i class="flaticon-world-map"></i>
                                <div class="timer" data-to="250" data-speed="5000"></div>
                                <span class="medium">经典电影</span>
                            </div>
                        </div>
                            </a>
                        <a href="services.html">
                        <div class="col-md-3 col-sm-6 item">
                            <div class="fun-fact">
                                <i class="flaticon-gears"></i>
                                <div class="timer" data-to="1" data-speed="5000"></div>
                                <span class="medium">评分统计</span>
                            </div>
                        </div>
                            </a>
                        <a href="projects-details.html">
                        <div class="col-md-3 col-sm-6 item">
                            <div class="fun-fact">
                                <i class="flaticon-id-card"></i>
                                <div class="timer" data-to="5693" data-speed="5000"></div>
                                <span class="medium">高频词汇</span>
                            </div>
                        </div>
                            </a>
                        <a href="team.html ">
                        <div class="col-md-3 col-sm-6 item">
                            <div class="fun-fact">
                                <i class="flaticon-id"></i>
                                <div class="timer" data-to="5" data-speed="5000"></div>
                                <span class="medium">专业团队</span>
                            </div>
                        </div>
                            </a>
                    </div>
                </div>
            </div>
        </div>
        <!-- End Fun Factor -->
        <!-- Start Services
        ============================================= -->
        <div class="carousel-services-area bg-gray">
            <div class="container-box oh">
                <div class="carousel-service-items owl-carousel owl-theme">
                    <div id="container" ></div>
                </div>
            </div>
        </div>
        <!-- End Services -->
    
        <!-- jQuery Frameworks
        ============================================= -->
    
         <!-- echarts 图表
        ============================================= -->
        <script src="../static/js/plugins.min.js"></script>
        <script src="../static/js/main.js"></script>
            <script type="text/javascript" src="echarts.min.js"></script>
            <!-- Uncomment this line if you want to dataTool extension
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts@5/dist/extension/dataTool.min.js"></script>
            -->
            <!-- Uncomment this line if you want to use gl extension
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts-gl@2/dist/echarts-gl.min.js"></script>
            -->
            <!-- Uncomment this line if you want to echarts-stat extension
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts-stat@latest/dist/ecStat.min.js"></script>
            -->
            <!-- Uncomment this line if you want to use map
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts@5/map/js/china.js"></script>
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts@5/map/js/world.js"></script>
            -->
            <!-- Uncomment these two lines if you want to use bmap extension
            <script type="text/javascript" src="https://api.map.baidu.com/api?v=2.0&ak=<Your Key Here>"></script>
            <script type="text/javascript" src="https://cdn.jsdelivr.net/npm/echarts@{{version}}/dist/extension/bmap.min.js"></script>
            -->
    
            <script type="text/javascript">
    var dom = document.getElementById("container");
    var myChart = echarts.init(dom);
    var app = {};
    
    var option;
    
    option = {
      xAxis: {
        type: 'category',
        data: [8.3,8.4,8.5,8.6,8.7,8.8,8.9,9,9.1,9.2,9.3,9.4,9.5,9.6,9.7]
      },
      yAxis: {
        type: 'value'
      },
      series: [
        {
          data: [1,3,9,23,40,40,39,15,28,17,20,7,4,3,1,],
          type: 'bar',
          showBackground: true,
          backgroundStyle: {
            color: 'rgba(180, 180, 180, 0.2)'
          }
        }
      ]
    };
    
    option && myChart.setOption(option);
            </script>
    
    </body>
    </html>
    

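    Many free web page templates are available; pick one you like, modify it, and insert your charts.

    Generating the word cloud image (this project builds it from the movie introductions)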

    # -*- coding: utf-8 -*-
    # @Time : 2022/2/2 12:53
    # @Author : lj
    # @File : testcloud.PY
    import jieba
    from matplotlib import pyplot as plt
    from wordcloud import WordCloud
    from PIL import Image
    import numpy as np
    import sqlite3
    # Overall flow: 1. segment the text with jieba, 2. generate the word cloud
    # Load the data
    con = sqlite3.connect('move.db')
    cur = con.cursor()
    sql = 'select introduction from move250'
    data = cur.execute(sql)
    text = ""
    for item in data:
        text = text + item[0]
    # print(text)
    cur.close()
    con.close()
    # Segment the text into words
    cut = jieba.cut(text)
    r = ' '.join(cut)
    
    print(len(r))
    
    # Load the mask image
    img = Image.open(r'.\templates\TEST\img\img/ll.jpg')
    img_arrays = np.array(img)  # convert the image to an array
    
    
    # Build the word cloud object
    wc =WordCloud(
        background_color='white',  # background color of the output image
         mask=img_arrays,    # mask built from the image loaded above
        font_path = 'STXINWEI.TTF'  # font file (needed to render Chinese text)
    )
    # Feed the segmented text to the word cloud object
    wc.generate(r)
    
    
    # Draw the image; plt is the alias of matplotlib.pyplot
    fig = plt.figure(1)  # create figure 1
    plt.imshow(wc)   # render the word cloud
    plt.axis('off')  # hide both axes
    
    # plt.show()  # display the generated word cloud
    
    # Write the word cloud to a file; dpi sets the resolution
    plt.savefig(r'.\templates\TEST\img\img/wordcloud2.jpg',dpi=800)
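
    A word cloud sizes each word by how often it occurs in the segmented text. The sketch below shows that counting step in isolation with `collections.Counter`. It is an illustration under assumptions rather than the wordcloud library's own processing, which also applies stop words and other normalization; the function name `top_words` and the sample text are made up.

```python
from collections import Counter


def top_words(segmented_text, n=3, min_length=2):
    """Count space-separated tokens and return the n most common ones."""
    tokens = [t for t in segmented_text.split() if len(t) >= min_length]
    return Counter(tokens).most_common(n)


# Made-up, already segmented text standing in for the variable r in the listing
sample = "人生 的 希望 人生 的 自由 希望 人生"
result = top_words(sample)
print(result)
```

    Dropping single-character tokens, as `min_length=2` does here, is one simple way to keep particles such as 的 from dominating the picture.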
    

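    Word cloud result:

    [Figure: the generated word cloud]

    Place the finished image in a suitable spot on the web page.

    [Figure: the word cloud shown on the web page]

    With that, the project is complete.

    Open issues: 1. After scraping, the data stored in the database cannot be displayed on the web page.

    2. Navigating between pages works when a page is opened directly, but after reaching a page through the server's port and path, the links to the other pages no longer work. Corrections are welcome.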
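
    One plausible explanation for the second issue, offered as an assumption rather than a confirmed diagnosis: the navigation links in the template point at template file names such as `about.html`, while the Flask application only registers routes such as `/movie`. The sketch below resolves a link the way a browser would and checks it against the routes from the listing; the local server address is assumed.

```python
from urllib.parse import urljoin, urlparse

# Routes registered by the Flask application in this article
ROUTES = {'/', '/index', '/movie', '/score', '/word', '/team'}


def resolves_to_route(current_url, href):
    """Resolve href relative to the current page and check whether a route matches its path."""
    path = urlparse(urljoin(current_url, href)).path
    return path, path in ROUTES


# Assumed local address of the development server
page = 'http://127.0.0.1:5000/score'

file_link = resolves_to_route(page, 'about.html')
route_link = resolves_to_route(page, '/movie')
print(file_link)
print(route_link)
```

    Under this reading, pointing the links at the route paths (for example `/movie` instead of `about.html`) would let the pages reach each other when served by Flask.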
