• Big Data

    2012-04-30 09:32:52
    An introduction to big data and the latest big data technologies.
  • Big data

    2017-12-17 23:04:00
    Big data is data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy. There are three dimensions to big data known as Volume, Variety and Velocity.

    Lately, the term "big data" tends to refer to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that’s not the most relevant characteristic of this new data ecosystem." Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on." Scientists, business executives, practitioners of medicine, advertising and governments alike regularly meet difficulties with large data sets in areas including Internet search, fintech, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology and environmental research.

    Data sets grow rapidly, in part because they are increasingly gathered by cheap and numerous information-sensing Internet of things devices such as mobile devices, aerial (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers and wireless sensor networks. The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s;[8] as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data are generated. By 2025, IDC predicts there will be 163 zettabytes of data. One question for large enterprises is determining who should own big-data initiatives that affect the entire organization.
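To make the "doubled every 40 months" figure concrete, here is a minimal arithmetic sketch. It assumes steady exponential growth, which is a simplification not stated in the text, and the function name `growth_factor` is chosen for illustration only.

```python
def growth_factor(months: float, doubling_period_months: float = 40.0) -> float:
    """Factor by which capacity grows over `months`, assuming steady doubling."""
    return 2.0 ** (months / doubling_period_months)


# Ten years is 120 months, i.e. three doubling periods of 40 months each.
ten_year_factor = growth_factor(120)

# Yearly growth rate implied by a 40-month doubling time (assumption: smooth growth).
annual_rate = growth_factor(12) - 1.0

print(f"growth over 10 years: x{ten_year_factor:.0f}")
print(f"implied annual growth: {annual_rate:.1%}")
```

Under that assumption, a 40-month doubling time corresponds to an eightfold increase per decade, or roughly 23% growth per year.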

    Relational database management systems and desktop statistics- and visualization-packages often have difficulty handling big data. The work may require "massively parallel software running on tens, hundreds, or even thousands of servers". What counts as "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."


  • bigdata

    2012-04-24 22:10:08



  • BIGDATA source code

    2021-03-16 22:40:37
  • bigdata source code

    2021-03-16 17:36:26
  • bigdata notes

    2017-05-22 11:03:05
  • BigData documentation

    2017-05-18 06:48:17
  • Big data: the bigdata learning path
  • nutn_bigdata: repository for the Nunn Bigdata course
  • Alex Gorelik - The Enterprise Big Data Lake_ Delivering the Promise of Big Data and Data Science-O’Reilly Media (2019)
  • bigdata: source code for the article "Classification Models for BigData"
  • bigdata notes 1

    2017-05-22 11:05:29
  • BigData_SidePro_New
  • BigData documentation notes

    2017-05-18 08:42:43
  • bigdata_interview: interview summary
  • BigData_DataFactory source code

    2021-04-20 05:23:35
  • sql on big data

    2019-10-14 17:21:40
    sql on big data
  • Big Data Analytics

    2019-01-30 04:31:55
    Big Data Analytics - A handy reference guide for data analysts and data scientists to help to obtain value from big data analytics using Spark on Hadoop clusters
  • Big Data Retriever

    2017-10-28 10:34:50
    Big Data Retriever
  • ogg big data

    2019-01-16 17:00:15
    ogg for bigdata: used to configure OGG for Big Data so that the generated files can be placed in an HDFS directory

    2011-11-30 00:05:07
    The stream-based model inverts the traditional data management model by assuming users to be passive and the data management system to be active. 4. The people at Baidu have a great deal of charm; I am referring to teacher Yang Dong. baidu...


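    1. Hadoop is really popular. I only encountered this distributed framework in my first year of graduate school and have not used it during my current internship, which is a pity. Hadoop mainly consists of HDFS, MapReduce and HBase. HDFS is an open-source implementation of the Google File System (GFS). MapReduce is an open-source implementation of Google MapReduce. HBase is an open-source implementation of Google BigTable. In my first year, Professor Wu, who taught system architecture, kept recommending MapReduce. I never tried it myself, but I feel that the inverted index matches its idea extremely well. Later on I would like to look at HyperTable. One more point: Hadoop's efficiency is only so-so, and it is mainly used for offline data processing.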

    2. NoSQL is good. Luckily, I used MongoDB when building a search engine in my first year of graduate school. I will keep digging deeper into MongoDB.

    3. Stream computing: I have not worked with it closely. An excerpt taken from the web: the stream-based model inverts the traditional data management model by assuming users to be passive and the data management system to be active.
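The remark in item 1 that an inverted index matches the MapReduce idea can be sketched in a few lines. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop's API; the function names and the sample documents are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc_id: str, text: str) -> Iterator[Tuple[str, str]]:
    """Map: emit one (word, doc_id) pair per distinct word in a document."""
    for word in set(text.lower().split()):
        yield word, doc_id


def shuffle(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for word, doc_id in pairs:
        grouped[word].append(doc_id)
    return grouped


def reduce_phase(word: str, doc_ids: List[str]) -> Tuple[str, List[str]]:
    """Reduce: turn the grouped doc ids into a sorted posting list."""
    return word, sorted(set(doc_ids))


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, List[str]]:
    pairs = (pair for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, ids) for word, ids in grouped.items())


# Hypothetical sample documents, for illustration only.
docs = {
    "d1": "hadoop stores data in hdfs",
    "d2": "mapreduce processes data",
}
index = build_inverted_index(docs)
print(index["data"])
```

In a real cluster the map and reduce functions would run on many machines and the shuffle would be handled by the framework. The sketch only shows why the word-to-documents grouping falls out naturally from the key-value model.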




  • Big Data Visualization

    2018-08-15 17:52:47
    The target audience of this book is data analysts and those with at least a basic knowledge of big data analysis who now want to learn interesting approaches to big data visualization in order to ...
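  • Tianchi - BigData: this repository hosts some of the code from my earlier participation in the Tianchi big data competitions. For more about the competitions, you are welcome to visit my blog, Snoopy_Yuan's blog - Tianchi competitions, or PnYuan - Homepages - Tianchi competitions. here is a repository for my code during ...
  • Data Quality And Trust In Big Data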


