  • Deploying a Hadoop HA cluster on three machines (CentOS 6.5)

    A note before we start:

            Please credit the source when reposting: @http://blog.csdn.net/gamer_gyt, written by Thinkagmer

            I had previously deployed a Hadoop cluster on my own computer, but without any HA configuration. This time I migrated the cluster to PC servers, and the catch is that there are only three machines while I still wanted HA. The servers run CentOS 6.5; my original plan was to run VMs on top of them and deploy the HA cluster inside the VMs, but that did not pan out in testing, so I gave it up and decided to deploy the HA cluster directly on the three machines.

             For pseudo-distributed Hadoop deployment, see: link
             For single-node Hadoop deployment, see: link
             For distributed deployment of Zookeeper, Hive and HBase, see: link
             For distributed deployment of Spark, Sqoop and Mahout, see: link

             The steps are the same as deploying a plain Hadoop cluster (see that post); the only addition is the HA-related configuration, recorded below.

             For background on the HA architecture, see this post: Hadoop fault tolerance - from the single point of failure in 1.x to HA and HDFS Federation in 2.x


    I: Architecture Overview

         IP                            hostname               role

        192.168.132.27       master1                  active master

        192.168.132.28       master2                  standby (backup) master

        192.168.132.29       slaver1                    slave node


        The three-node Zookeeper ensemble is deployed on these same three machines.


    II: Deploying Zookeeper

           Hadoop HA relies on ZK to switch the active NameNode, so the Zookeeper ensemble has to be up and running before Hadoop HA is deployed. For the deployment itself, see: link
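           For reference, a minimal zoo.cfg for a three-node ensemble on these hosts might look like the sketch below (dataDir and the timing values are assumptions; adjust them to your installation, and remember that each node needs a myid file in dataDir containing its server number):

    # conf/zoo.cfg -- minimal three-node ensemble (paths/timeouts are assumptions)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/opt/bigdata/zookeeper/data
    clientPort=2181
    server.1=master1:2888:3888
    server.2=master2:2888:3888
    server.3=slaver1:2888:3888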


    III: Deploying HA

           1: Configuration files

           Apart from the four configuration files mapred-site.xml, core-site.xml, hdfs-site.xml and yarn-site.xml, everything is the same as a regular Hadoop cluster deployment and is not repeated here; see: link

           mapred-site.xml:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

           core-site.xml:

    <configuration>
      <!-- Logical name (nameservice) of the HDFS cluster -->
      <property>    
          <name>fs.defaultFS</name>    
          <value>hdfs://master</value>    
          <!-- In 1.x this was fs.default.name. With HA this is the nameservice name, with no port; it must match dfs.nameservices in hdfs-site.xml -->  
      </property>
     
      <property>  
        <name>hadoop.tmp.dir</name>  
        <value>/opt/bigdata/hadoop/tmp</value>  
        <!-- Hadoop temporary directory -->
      </property>   
    
      <!-- Only needed when HA is configured -->
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>master1:2181,master2:2181,slaver1:2181</value>
        <!-- IP/host of each ZK node plus the client port; the port must match clientPort in zoo.cfg! -->
      </property>
    </configuration>

           hdfs-site.xml:

    <configuration>
    <property>  
        <name>dfs.replication</name>  
        <value>2</value>  
      </property>  
      <property>  
        <name>dfs.namenode.name.dir</name>  
        <value>file:///opt/bigdata/hadoop/dfs/name</value>  
      </property>  
      <property>  
        <name>dfs.datanode.data.dir</name>  
        <value>file:///opt/bigdata/hadoop/dfs/data</value>  
      </property>  
      <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
        <!-- Enable WebHDFS (REST API) on the NN and DNs; optional -->
      </property>
    
      <!-- The following properties are needed for HA -->
      <property>
        <name>dfs.nameservices</name>
        <value>master</value>
        <!-- Logical name of the HDFS cluster; must match fs.defaultFS in core-site.xml and is referenced by the properties below -->
      </property>
    
      <property>
        <name>dfs.ha.namenodes.master</name>
        <value>nn1,nn2</value>
        <!-- The nameservice "master" has two NameNodes, nn1 and nn2; these are logical names, any distinct values will do -->
      </property>
    
      <property>
        <name>dfs.namenode.rpc-address.master.nn1</name>
        <value>master1:9000</value>
        <!-- RPC address of nn1 -->
      </property>
    
      <property>
        <name>dfs.namenode.rpc-address.master.nn2</name>
        <value>master2:9000</value>
        <!-- RPC address of nn2 -->
      </property>
    
      <property>
        <name>dfs.namenode.http-address.master.nn1</name>
        <value>master1:50070</value>
        <!-- HTTP address of nn1 -->
      </property>
      <property>
        <name>dfs.namenode.http-address.master.nn2</name>
        <value>master2:50070</value>
        <!-- HTTP address of nn2 -->
      </property>
    
      <property>
        <name>dfs.namenode.servicerpc-address.master.nn1</name>
        <value>master1:53310</value>
      </property>
    
      <property>
        <name>dfs.namenode.servicerpc-address.master.nn2</name>
        <value>master2:53310</value>
      </property>
    
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master1:8485;master2:8485;slaver1:8485/master</value>
        <!-- Where the NameNode shared edit log is stored on the JournalNodes -->
      </property> 
    
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/bigdata/hadoop/dfs/jndata</value>
        <!-- Local directory where each JournalNode stores its data -->
      </property>
    
      <property>
        <name>dfs.ha.automatic-failover.enabled</name>  
        <value>true</value>
        <!-- Enable automatic NameNode failover -->
      </property>
    
      <property>
        <name>dfs.client.failover.proxy.provider.master</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        <!-- Proxy provider clients use to find the active NameNode -->
      </property>
    
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
          sshfence
          shell(/bin/true)
        </value>
        <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
      </property>
    
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
        <!-- sshfence requires passwordless SSH -->
      </property>
    
      <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>3000</value>
        <!-- sshfence timeout (ms) -->
      </property>
    
    </configuration>
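           As a quick sanity check after editing (a hedged aside, not a required step), the resolved values can be read back with hdfs getconf, for example:

    bin/hdfs getconf -confKey dfs.nameservices            # expect: master
    bin/hdfs getconf -confKey dfs.ha.namenodes.master     # expect: nn1,nn2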

           yarn-site.xml:

    <configuration>
    
    <!-- Site specific YARN configuration properties -->
    
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    <!-- Enable ResourceManager HA -->
      </property>
      
      <property>
    <!-- Enable automatic failover (default: false) -->
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>
    
      <property>
    <!-- Use the embedded elector, together with ZKRMStateStore -->
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
      </property>
     
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    <!-- Cluster id of the RM pair -->
      </property>
    
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    <!-- Logical names of the RMs -->
      </property>
     
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>master1</value>
    <!-- Host of rm1 -->
      </property>
      
      <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>master2</value>
    <!-- Host of rm2 -->
      </property>
    
      <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>     
    <!-- On the active master write rm1 here; on the standby master write rm2, otherwise RM HA will misbehave -->
        <description>If we want to launch more than one RM in single node, we need this configuration</description>
      </property> 
    
      <property>  
        <name>yarn.resourcemanager.recovery.enabled</name>  
        <value>true</value>  
      </property>  
    
      <property>  
        <name>yarn.resourcemanager.store.class</name>  
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>  
      </property>    
    
      <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>master1:2181,master2:2181,slaver1:2181</value>
    <!-- ZK ensemble address -->
      </property>
    
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    
    </configuration>
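           Since yarn.resourcemanager.ha.id has to differ between the two masters, that one value must be changed after copying the configuration to master2. A hedged one-liner sketch (the config path is an assumption; editing the file by hand works just as well):

    # on master2 only: switch the RM id from rm1 to rm2 (path is an assumption)
    sed -i 's#<value>rm1</value>#<value>rm2</value>#' /opt/bigdata/hadoop/etc/hadoop/yarn-site.xml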

           2: Start the services and test automatic NameNode failover

           PS: Pay close attention to the startup order, otherwise all kinds of errors will appear (learned the hard way)

           Start Zookeeper on every machine: bin/zkServer.sh start

           Format the HA state in Zookeeper (run on any one of the master nodes): bin/hdfs zkfc -formatZK

           Start the JournalNode on every machine: sbin/hadoop-daemon.sh start journalnode (if this is skipped, the HDFS format below will fail; this process only needs to be started manually for the format, later service startups do not need it done by hand)

           Format HDFS (on master1): bin/hadoop namenode -format

           Seeing "Exiting with status 0" at the end of the output means the format succeeded

           

           Start the services on master1: sbin/start-dfs.sh and sbin/start-yarn.sh
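           Note: in many Hadoop 2.x releases start-yarn.sh only launches the ResourceManager on the node it is run from, so if jps later shows no ResourceManager on master2 it can be started there by hand, e.g.:

    sbin/yarn-daemon.sh start resourcemanager     # run on master2 if no ResourceManager appears there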

           Run jps to check the processes (on master1, master2 and slaver1):

                

           The web UI of master1 (192.168.132.27) shows the following:

           

             On master2, sync the metadata from the active NN: bin/hdfs namenode -bootstrapStandby

             Start the standby NN (on master2): sbin/hadoop-daemon.sh start namenode

             Run jps (on master2):

             

             Web access:



             Test switching between the active and standby NN: kill the active NameNode process with kill namenode_id

             Refresh the web UI of master2 again; the failover happens automatically:
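             The NameNode states can also be checked from the command line with hdfs haadmin (nn1/nn2 are the logical names configured above):

    bin/hdfs haadmin -getServiceState nn2     # should report "active" after the failover
    bin/hdfs haadmin -getServiceState nn1     # unreachable while the killed NameNode is down; "standby" once it is restarted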

        


             3: Test automatic ResourceManager failover

                  Port 8088 on the active master shows:

        

                Port 8088 on the standby master:

          

           Kill the ResourceManager on the active master and visit port 8088 on the standby master again
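           The ResourceManager states can likewise be checked with yarn rmadmin (rm1/rm2 as configured above):

    bin/yarn rmadmin -getServiceState rm2     # should report "active" after the failover
    bin/yarn rmadmin -getServiceState rm1     # unreachable while the killed ResourceManager is down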

           

            OK, all done!

    IV: Problems Encountered

           1: NameNode format fails

                Error: failed on connection exception: java.net.ConnectException: Connection refused

                Fix: start the Zookeeper ensemble first, then start the JournalNode processes with sbin/hadoop-daemon.sh start journalnode, and only then run the format

                Reference for this error: http://blog.csdn.net/u014729236/article/details/44944773
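                In other words, the order that worked here was roughly:

    # on every node
    bin/zkServer.sh start                          # Zookeeper first
    sbin/hadoop-daemon.sh start journalnode        # then the JournalNodes
    # then, on master1 only
    bin/hadoop namenode -format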

         2: The web UI shows 0 live nodes

           

            Fix: comment out the two original lines in the hosts file on each machine
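            For example, a hedged sketch of what /etc/hosts might look like after the change (the exact original lines vary from system to system):

    # /etc/hosts -- the two default mappings are commented out so the hostnames resolve to the real IPs
    #127.0.0.1   localhost localhost.localdomain
    #::1         localhost localhost.localdomain
    192.168.132.27 master1
    192.168.132.28 master2
    192.168.132.29 slaver1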

            

         3: The NameNode and ResourceManager on master2 fail to start

              The log shows the following error
    2016-08-30 06:10:57,558 INFO org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException
    java.net.BindException: Port in use: master1:8088
            at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:919)
            at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:856)
            at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:274)
            at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:974)
            at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1074)
            at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
            at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1208)
    Caused by: java.net.BindException: Cannot assign requested address
            at sun.nio.ch.Net.bind0(Native Method)
            at sun.nio.ch.Net.bind(Net.java:444)
            at sun.nio.ch.Net.bind(Net.java:436)
            at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
            at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
            at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
            at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:914)
            ... 6 more
              The address cannot be bound: the ResourceManager on master2 is still configured as rm1 and tries to use master1:8088. Change yarn-site.xml on master2 to
     <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm2</value>
        <description>If we want to launch more than one RM in single node, we need this configuration</description>
      </property>
             After this change, starting again works fine.

          4: The NameNode does not fail over automatically

               The fencing method used during automatic failover is controlled by dfs.ha.fencing.methods in hdfs-site.xml; with sshfence alone the failover did not happen automatically here, so it can be changed to
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/bin/true)</value>
        <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
      </property>
     

    V: Summary

           I ran into a lot of problems during this setup and consulted a lot of references. It often looks as if others breezed through it while you hit error after error, but they too went through plenty of debugging to get their results. So don't lose heart: keep reading the logs as you configure, every error shows up there, and you will get it working.
  • Deploying Hadoop

    2015-03-27 11:41:48

    Big data has been all the rage these past two years. Although my own work does not overlap with it much, I used some company resources to learn about it, including deploying Hadoop and Spark and running simple examples. Everything below was carried out on company servers. Deploying Hadoop roughly involves the following steps:

    1. Install the OS (CentOS 7.0)
    2. Set up passwordless SSH so the master can control the slaves without a password
    3. Install the JDK
    4. Install and configure Hadoop
    5. Set up the firewall ports (a port-opening sketch follows below)
    Note that the commands changed quite a bit in CentOS 7.0; the main changes used here are for starting and stopping services, for example:
         systemctl stop firewalld.service      # stop the firewall
         systemctl disable firewalld.service   # do not start it at boot
         systemctl start sshd.service 
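    If the firewall is kept enabled rather than disabled, the relevant Hadoop ports would need to be opened instead; a hedged firewalld sketch (the port numbers are common Hadoop 2.x defaults, adjust to your configuration):

    firewall-cmd --permanent --add-port=9000/tcp      # HDFS NameNode RPC
    firewall-cmd --permanent --add-port=50070/tcp     # NameNode web UI
    firewall-cmd --permanent --add-port=8088/tcp      # ResourceManager web UI
    firewall-cmd --reload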
    The detailed process follows the official cluster setup guide:
    hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
     
    Configure /etc/hosts on the master
    192.168.0.7 mesos-master hadoop-master
    192.168.0.20 mesos-slave-1 hadoop-slave-1
    192.168.0.8 mesos-slave-2 hadoop-slave-2
     
     

    Setup passphraseless ssh

    Now check that you can ssh to the localhost without a passphrase:

      $ ssh localhost

    If you cannot ssh to localhost without a passphrase, execute the following commands:

      $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
      $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys


    Next, generate a key pair on the master and set up passwordless SSH login.

    The steps are as follows (a consolidated command sketch follows this list):

    1. Enter the .ssh folder

    2. Run ssh-keygen -t rsa and press Enter through all the prompts (this generates the key pair)

    3. Append id_rsa.pub to the authorized keys (cat id_rsa.pub >> authorized_keys)

    4. Restart the SSH service for the change to take effect
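    Put together, the steps above amount to something like:

    cd ~/.ssh
    ssh-keygen -t rsa                      # press Enter through all prompts
    cat id_rsa.pub >> authorized_keys
    systemctl restart sshd.service         # restart SSH so the change takes effect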


    3.3 Copy the generated authorized_keys file into the same folder on the two slave hosts:
     
    scp authorized_keys slave1:~/.ssh/
    scp authorized_keys slave2:~/.ssh/
     

    Hadoop configuration is driven by two types of important configuration files:

    • Read-only default configuration - core-default.xml, hdfs-default.xml, yarn-default.xml and mapred-default.xml.
    • Site-specific configuration - conf/core-site.xml, conf/hdfs-site.xml, conf/yarn-site.xml and conf/mapred-site.xml.
     

    Configuration

    Use the following:

    etc/hadoop/core-site.xml:

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>

    etc/hadoop/hdfs-site.xml:

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
    </configuration>
    
    
    
    
    
    

    Hadoop Startup

    To start a Hadoop cluster you will need to start both the HDFS and YARN cluster.

    Format a new distributed filesystem:

    $ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>

    Start the HDFS with the following command, run on the designated NameNode:

    $ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode

    Run a script to start DataNodes on all slaves:

    $ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

    Start the YARN with the following command, run on the designated ResourceManager:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

    Run a script to start NodeManagers on all slaves:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

    Start a standalone WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start proxyserver --config $HADOOP_CONF_DIR

    Start the MapReduce JobHistory Server with the following command, run on the designated server:

    $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

    Hadoop Shutdown

    Stop the NameNode with the following command, run on the designated NameNode:

    $ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode

    Run a script to stop DataNodes on all slaves:

    $ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode

    Stop the ResourceManager with the following command, run on the designated ResourceManager:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager

    Run a script to stop NodeManagers on all slaves:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager

    Stop the WebAppProxy server. If multiple servers are used with load balancing it should be run on each of them:

    $ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh stop proxyserver --config $HADOOP_CONF_DIR

    Stop the MapReduce JobHistory Server with the following command, run on the designated server:

    $ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR

    Operating the Hadoop Cluster

    Once all the necessary configuration is complete, distribute the files to the HADOOP_CONF_DIR directory on all the machines.
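    One way to push the configuration directory out to every node, as a hedged sketch (the hostnames are the examples used earlier):

    for host in hadoop-slave-1 hadoop-slave-2; do
        rsync -av "$HADOOP_CONF_DIR"/ "$host:$HADOOP_CONF_DIR/"
    done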

    This section also describes the various Unix users who should be starting the various components and uses the same Unix accounts and groups used previously:

     

    Start and stop HDFS cluster-wide:

    hadoop-2.2.0$ sbin/start-dfs.sh

    hadoop-2.2.0$ sbin/stop-dfs.sh
     
     
    Start and stop a single DataNode:
     
    ./sbin/hadoop-daemon.sh start datanode
     
    ./sbin/hadoop-daemon.sh stop datanode
     
     

    Hadoop Startup

    To start a Hadoop cluster you will need to start both the HDFS and YARN cluster.

    Format a new distributed filesystem as hdfs:

    [hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>

    Start the HDFS with the following command, run on the designated NameNode as hdfs:

    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode

    Run a script to start DataNodes on all slaves as root with a special environment variable HADOOP_SECURE_DN_USER set to hdfs:

    [root]$ HADOOP_SECURE_DN_USER=hdfs $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode

    Start the YARN with the following command, run on the designated ResourceManager as yarn:

    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager

    Run a script to start NodeManagers on all slaves as yarn:

    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

    Start a standalone WebAppProxy server. Run on the WebAppProxy server as yarn. If multiple servers are used with load balancing it should be run on each of them:

    [yarn]$ $HADOOP_YARN_HOME/bin/yarn start proxyserver --config $HADOOP_CONF_DIR

    Start the MapReduce JobHistory Server with the following command, run on the designated server as mapred:

    [mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

    Hadoop Shutdown

    Stop the NameNode with the following command, run on the designated NameNode as hdfs:

    [hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode

    Run a script to stop DataNodes on all slaves as root:

    [root]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode

    Stop the ResourceManager with the following command, run on the designated ResourceManager as yarn:

    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager

    Run a script to stop NodeManagers on all slaves as yarn:

    [yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager

    Stop the WebAppProxy server. Run on the WebAppProxy server as yarn. If multiple servers are used with load balancing it should be run on each of them:

    [yarn]$ $HADOOP_YARN_HOME/bin/yarn stop proxyserver --config $HADOOP_CONF_DIR

    Stop the MapReduce JobHistory Server with the following command, run on the designated server as mapred:

    [mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
    
    
    
    

    Web Interfaces

    Once the Hadoop cluster is up and running check the web-ui of the components as described below:

    Daemon                          Web Interface           Notes
    NameNode                        http://nn_host:port/    Default HTTP port is 50070.
    ResourceManager                 http://rm_host:port/    Default HTTP port is 8088.
    MapReduce JobHistory Server     http://jhs_host:port/   Default HTTP port is 19888.
    
    

    Delete the directory specified by hadoop.tmp.dir.

    (3) Re-run the command: hadoop namenode -format

  • Deploying Hadoop on a single Tencent Cloud server

    For my graduation project I used a single Tencent Cloud server to deploy Hadoop for development.

    OS: CentOS Linux release 7.7.1908 (Core)

    Hadoop version: hadoop-3.1.2.tar.gz

    1. First of all we need a Java environment, Java 1.8 (compatible up through Java 10). Java also has to be added to the environment variables.

    [root@shengxi ~]# java -version
    openjdk version "1.8.0_222"
    OpenJDK Runtime Environment (build 1.8.0_222-b10)
    OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
    
    [root@shengxi ~]# vim /etc/profile
    
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222/
    export JRE_HOME=$JAVA_HOME/jre  
    export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
    export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
    

    2. Configure a virtual hostname for the machine. Since this is a cloud server, Hadoop should not be configured against 127.0.0.1; use the machine's own IP instead (note: the internal IP, not the public one).

    # check the IP address
    [root@shengxi ~]# ifconfig -a
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.17.x.x  netmask 255.255.240.0  broadcast 172.17.15.255
            ether 52:54:00:8a:fa:12  txqueuelen 1000  (Ethernet)
            RX packets 1767653  bytes 1932785383 (1.8 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 775608  bytes 93677869 (89.3 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            loop  txqueuelen 1  (Local Loopback)
            RX packets 2  bytes 276 (276.0 B)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 2  bytes 276 (276.0 B)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    

    The 172.17.x.x address on eth0 is my internal IP.

    [root@shengxi ~]# vim /etc/hosts
    172.17.x.x shengxi
    172.17.x.x hadoop
    127.0.0.1 localhost.localdomain localhost
    127.0.0.1 localhost4.localdomain4 localhost4
    
    ::1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    

     Add a line of the form "ip hostname" to set up a virtual hostname; here "hadoop" is configured as the virtual hostname used for Hadoop development.

    3. Add a hadoop user. Hadoop started as root throws some errors, and paths do not always resolve as expected when accessing it, so I simply add a hadoop user with root privileges. (Alternatively, the user's uid in /etc/passwd can be changed to 0, i.e. root privileges.)

    # add the user
    [root@shengxi ~]# adduser hadoop
    # set its password
    [root@shengxi ~]# passwd hadoop
    Changing password for user hadoop.
    # enter the new password twice
    New password: 
    Retype new password: 
    passwd: all authentication tokens updated successfully.

         Edit /etc/sudoers and add the line "hadoop    ALL=(ALL)     NOPASSWD:ALL" so that the hadoop user does not have to enter a password when using root privileges.

    ## Allow root to run any commands anywhere 
    root	ALL=(ALL) 	ALL
    hadoop	ALL=(ALL) 	NOPASSWD:ALL

    3. Install SSH. Since this is a cloud server it already ships with SSH, so there is nothing to install; just set up passwordless login (a sketch follows).
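    A hedged sketch of the passwordless login setup for the hadoop user on this single machine (ssh-copy-id is one convenient option; appending the public key to ~/.ssh/authorized_keys by hand works too):

    su - hadoop
    ssh-keygen -t rsa                # press Enter through the prompts
    ssh-copy-id hadoop@localhost     # or: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh localhost                    # should log in without a password
    ssh hadoop                       # same check against the virtual hostname configured above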

    4. Download the tarball and extract it.

    # download the tarball
    [root@shengxi ~]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz
    # extract it
    [root@shengxi ~]# tar -zxvf hadoop-3.1.2.tar.gz 
    

    5. For easier management, move the hadoop folder to a path you are comfortable with, then change the owner of the folder to hadoop.

    [root@shengxi ~]# mv hadoop-3.1.2 /usr/local/
    [root@shengxi ~]# cd /usr/local/
    [root@shengxi local]# ls
    bin  games         include  lib64    qcloud  share  yd.socket.server
    etc  hadoop-3.1.2  lib      libexec  sbin    src
    [root@shengxi local]# 
    

    Change the owning user and group of the files

    [root@shengxi local]# chown -R hadoop:root hadoop-3.1.2/
    [root@shengxi local]# ls    -l  hadoop-3.1.2/
    total 204
    drwxrwxrwx 2 hadoop root   4096 Jan 29  2019 bin
    drwxrwxrwx 3 hadoop root   4096 Jan 29  2019 etc
    drwxrwxrwx 2 hadoop root   4096 Jan 29  2019 include
    drwxrwxrwx 3 hadoop root   4096 Jan 29  2019 lib
    drwxrwxrwx 4 hadoop root   4096 Jan 29  2019 libexec
    -rwxrwxrwx 1 hadoop root 147145 Jan 23  2019 LICENSE.txt
    -rwxrwxrwx 1 hadoop root  21867 Jan 23  2019 NOTICE.txt
    -rwxrwxrwx 1 hadoop root   1366 Jan 23  2019 README.txt
    drwxrwxrwx 3 hadoop root   4096 Jan 29  2019 sbin
    drwxrwxrwx 4 hadoop root   4096 Jan 29  2019 share
    

    Configure the environment variables

    [hadoop@shengxi ~]$ vim /etc/profile
    
    # Hadoop environment variables
    export HADOOP_HOME=/usr/local/hadoop-3.1.2
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_YARN_HOME=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
    
    
    [hadoop@shengxi ~]$ source /etc/profile
    
    

    6. Test whether the standalone Hadoop works.

        * Test 1: hadoop version

    [hadoop@shengxi ~]$ source /etc/profile
    [hadoop@shengxi ~]$ hadoop version
    Hadoop 3.1.2
    Source code repository https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a
    Compiled by sunilg on 2019-01-29T01:39Z
    Compiled with protoc 2.5.0
    From source with checksum 64b8bdd4ca6e77cce75a93eb09ab2a9
    This command was run using /usr/local/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar
    [hadoop@shengxi ~]$ 
    

       ** Test 2: use MapReduce to count word occurrences.

        Create an input folder under the home directory and write a few files into it. I wrote three txt files with some words repeated across them.

    [hadoop@shengxi ~]$ mkdir input
    [hadoop@shengxi ~]$ cd input
    # create and edit three files
    [hadoop@shengxi input]$ vim f1.txt
    [hadoop@shengxi input]$ vim f2.txt
    [hadoop@shengxi input]$ vim f3.txt
    [hadoop@shengxi input]$ ll
    total 12
    -rw-r--r-- 1 root root 11 Oct 13 13:59 f1.txt
    -rw-r--r-- 1 root root 25 Oct 13 14:01 f2.txt
    -rw-r--r-- 1 root root 19 Oct 13 14:01 f3.txt
    
    # run the job. Note: do not create the output folder in advance; if output already exists, change the output path or delete the folder
    hadoop jar /usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount input output
    2019-10-13 14:09:28,762 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
    2019-10-13 14:09:28,879 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
    2019-10-13 14:09:28,879 INFO impl.MetricsSystemImpl: JobTracker metrics system started
    2019-10-13 14:09:29,089 INFO input.FileInputFormat: Total input files to process : 3
    2019-10-13 14:09:29,116 INFO mapreduce.JobSubmitter: number of splits:3
    2019-10-13 14:09:29,345 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local696442689_0001
    2019-10-13 14:09:29,346 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2019-10-13 14:09:29,569 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
    2019-10-13 14:09:29,570 INFO mapreduce.Job: Running job: job_local696442689_0001
    2019-10-13 14:09:29,575 INFO mapred.LocalJobRunner: OutputCommitter set in config null
    2019-10-13 14:09:29,583 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
    2019-10-13 14:09:29,583 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
    2019-10-13 14:09:29,583 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
    2019-10-13 14:09:29,650 INFO mapred.LocalJobRunner: Waiting for map tasks
    2019-10-13 14:09:29,650 INFO mapred.LocalJobRunner: Starting task: attempt_local696442689_0001_m_000000_0
    2019-10-13 14:09:29,672 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
    2019-10-13 14:09:29,672 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
    2019-10-13 14:09:29,691 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    2019-10-13 14:09:29,695 INFO mapred.MapTask: Processing split: file:/home/hadoop/input/f2.txt:0+25
    2019-10-13 14:09:29,802 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    2019-10-13 14:09:29,802 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    2019-10-13 14:09:29,802 INFO mapred.MapTask: soft limit at 83886080
    2019-10-13 14:09:29,802 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    2019-10-13 14:09:29,803 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    2019-10-13 14:09:29,810 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    2019-10-13 14:09:29,815 INFO mapred.LocalJobRunner: 
    2019-10-13 14:09:29,815 INFO mapred.MapTask: Starting flush of map output
    2019-10-13 14:09:29,815 INFO mapred.MapTask: Spilling map output
    2019-10-13 14:09:29,815 INFO mapred.MapTask: bufstart = 0; bufend = 42; bufvoid = 104857600
    2019-10-13 14:09:29,816 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214384(104857536); length = 13/6553600
    2019-10-13 14:09:29,827 INFO mapred.MapTask: Finished spill 0
    2019-10-13 14:09:29,835 INFO mapred.Task: Task:attempt_local696442689_0001_m_000000_0 is done. And is in the process of committing
    2019-10-13 14:09:29,846 INFO mapred.LocalJobRunner: map
    2019-10-13 14:09:29,846 INFO mapred.Task: Task 'attempt_local696442689_0001_m_000000_0' done.
    2019-10-13 14:09:29,853 INFO mapred.Task: Final Counters for attempt_local696442689_0001_m_000000_0: Counters: 18
    	File System Counters
    		FILE: Number of bytes read=316771
    		FILE: Number of bytes written=815692
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    	Map-Reduce Framework
    		Map input records=1
    		Map output records=4
    		Map output bytes=42
    		Map output materialized bytes=56
    		Input split bytes=95
    		Combine input records=4
    		Combine output records=4
    		Spilled Records=4
    		Failed Shuffles=0
    		Merged Map outputs=0
    		GC time elapsed (ms)=22
    		Total committed heap usage (bytes)=135335936
    	File Input Format Counters 
    		Bytes Read=25
    2019-10-13 14:09:29,853 INFO mapred.LocalJobRunner: Finishing task: attempt_local696442689_0001_m_000000_0
    2019-10-13 14:09:29,853 INFO mapred.LocalJobRunner: Starting task: attempt_local696442689_0001_m_000001_0
    2019-10-13 14:09:29,858 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
    2019-10-13 14:09:29,859 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
    2019-10-13 14:09:29,859 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    2019-10-13 14:09:29,860 INFO mapred.MapTask: Processing split: file:/home/hadoop/input/f3.txt:0+19
    2019-10-13 14:09:29,906 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    2019-10-13 14:09:29,906 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    2019-10-13 14:09:29,906 INFO mapred.MapTask: soft limit at 83886080
    2019-10-13 14:09:29,906 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    2019-10-13 14:09:29,906 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    2019-10-13 14:09:29,908 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    2019-10-13 14:09:29,909 INFO mapred.LocalJobRunner: 
    2019-10-13 14:09:29,909 INFO mapred.MapTask: Starting flush of map output
    2019-10-13 14:09:29,909 INFO mapred.MapTask: Spilling map output
    2019-10-13 14:09:29,909 INFO mapred.MapTask: bufstart = 0; bufend = 32; bufvoid = 104857600
    2019-10-13 14:09:29,909 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214388(104857552); length = 9/6553600
    2019-10-13 14:09:29,910 INFO mapred.MapTask: Finished spill 0
    2019-10-13 14:09:29,923 INFO mapred.Task: Task:attempt_local696442689_0001_m_000001_0 is done. And is in the process of committing
    2019-10-13 14:09:29,924 INFO mapred.LocalJobRunner: map
    2019-10-13 14:09:29,924 INFO mapred.Task: Task 'attempt_local696442689_0001_m_000001_0' done.
    2019-10-13 14:09:29,925 INFO mapred.Task: Final Counters for attempt_local696442689_0001_m_000001_0: Counters: 18
    	File System Counters
    		FILE: Number of bytes read=317094
    		FILE: Number of bytes written=815768
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    	Map-Reduce Framework
    		Map input records=1
    		Map output records=3
    		Map output bytes=32
    		Map output materialized bytes=44
    		Input split bytes=95
    		Combine input records=3
    		Combine output records=3
    		Spilled Records=3
    		Failed Shuffles=0
    		Merged Map outputs=0
    		GC time elapsed (ms)=21
    		Total committed heap usage (bytes)=182521856
    	File Input Format Counters 
    		Bytes Read=19
    2019-10-13 14:09:29,925 INFO mapred.LocalJobRunner: Finishing task: attempt_local696442689_0001_m_000001_0
    2019-10-13 14:09:29,925 INFO mapred.LocalJobRunner: Starting task: attempt_local696442689_0001_m_000002_0
    2019-10-13 14:09:29,935 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
    2019-10-13 14:09:29,935 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
    2019-10-13 14:09:29,935 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    2019-10-13 14:09:29,936 INFO mapred.MapTask: Processing split: file:/home/hadoop/input/f1.txt:0+11
    2019-10-13 14:09:29,979 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
    2019-10-13 14:09:29,979 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
    2019-10-13 14:09:29,979 INFO mapred.MapTask: soft limit at 83886080
    2019-10-13 14:09:29,979 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
    2019-10-13 14:09:29,979 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
    2019-10-13 14:09:29,981 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
    2019-10-13 14:09:29,982 INFO mapred.LocalJobRunner: 
    2019-10-13 14:09:29,982 INFO mapred.MapTask: Starting flush of map output
    2019-10-13 14:09:29,983 INFO mapred.MapTask: Spilling map output
    2019-10-13 14:09:29,983 INFO mapred.MapTask: bufstart = 0; bufend = 20; bufvoid = 104857600
    2019-10-13 14:09:29,983 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214392(104857568); length = 5/6553600
    2019-10-13 14:09:29,984 INFO mapred.MapTask: Finished spill 0
    2019-10-13 14:09:29,996 INFO mapred.Task: Task:attempt_local696442689_0001_m_000002_0 is done. And is in the process of committing
    2019-10-13 14:09:30,000 INFO mapred.LocalJobRunner: map
    2019-10-13 14:09:30,000 INFO mapred.Task: Task 'attempt_local696442689_0001_m_000002_0' done.
    2019-10-13 14:09:30,001 INFO mapred.Task: Final Counters for attempt_local696442689_0001_m_000002_0: Counters: 18
    	File System Counters
    		FILE: Number of bytes read=317409
    		FILE: Number of bytes written=815830
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    	Map-Reduce Framework
    		Map input records=1
    		Map output records=2
    		Map output bytes=20
    		Map output materialized bytes=30
    		Input split bytes=95
    		Combine input records=2
    		Combine output records=2
    		Spilled Records=2
    		Failed Shuffles=0
    		Merged Map outputs=0
    		GC time elapsed (ms)=25
    		Total committed heap usage (bytes)=168112128
    	File Input Format Counters 
    		Bytes Read=11
    2019-10-13 14:09:30,001 INFO mapred.LocalJobRunner: Finishing task: attempt_local696442689_0001_m_000002_0
    2019-10-13 14:09:30,001 INFO mapred.LocalJobRunner: map task executor complete.
    2019-10-13 14:09:30,007 INFO mapred.LocalJobRunner: Waiting for reduce tasks
    2019-10-13 14:09:30,007 INFO mapred.LocalJobRunner: Starting task: attempt_local696442689_0001_r_000000_0
    2019-10-13 14:09:30,034 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
    2019-10-13 14:09:30,034 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
    2019-10-13 14:09:30,035 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
    2019-10-13 14:09:30,037 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@17c3dca2
    2019-10-13 14:09:30,038 WARN impl.MetricsSystemImpl: JobTracker metrics system already initialized!
    2019-10-13 14:09:30,064 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=326402048, maxSingleShuffleLimit=81600512, mergeThreshold=215425360, ioSortFactor=10, memToMemMergeOutputsThreshold=10
    2019-10-13 14:09:30,078 INFO reduce.EventFetcher: attempt_local696442689_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
    2019-10-13 14:09:30,103 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local696442689_0001_m_000001_0 decomp: 40 len: 44 to MEMORY
    2019-10-13 14:09:30,115 INFO reduce.InMemoryMapOutput: Read 40 bytes from map-output for attempt_local696442689_0001_m_000001_0
    2019-10-13 14:09:30,116 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 40, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->40
    2019-10-13 14:09:30,118 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local696442689_0001_m_000002_0 decomp: 26 len: 30 to MEMORY
    2019-10-13 14:09:30,120 WARN io.ReadaheadPool: Failed readahead on ifile
    EBADF: Bad file descriptor
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:270)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:147)
    	at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:208)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    2019-10-13 14:09:30,122 INFO reduce.InMemoryMapOutput: Read 26 bytes from map-output for attempt_local696442689_0001_m_000002_0
    2019-10-13 14:09:30,122 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 26, inMemoryMapOutputs.size() -> 2, commitMemory -> 40, usedMemory ->66
    2019-10-13 14:09:30,124 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local696442689_0001_m_000000_0 decomp: 52 len: 56 to MEMORY
    2019-10-13 14:09:30,125 WARN io.ReadaheadPool: Failed readahead on ifile
    EBADF: Bad file descriptor
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:270)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:147)
    	at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:208)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    2019-10-13 14:09:30,125 INFO reduce.InMemoryMapOutput: Read 52 bytes from map-output for attempt_local696442689_0001_m_000000_0
    2019-10-13 14:09:30,125 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 52, inMemoryMapOutputs.size() -> 3, commitMemory -> 66, usedMemory ->118
    2019-10-13 14:09:30,126 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
    2019-10-13 14:09:30,127 INFO mapred.LocalJobRunner: 3 / 3 copied.
    2019-10-13 14:09:30,127 INFO reduce.MergeManagerImpl: finalMerge called with 3 in-memory map-outputs and 0 on-disk map-outputs
    2019-10-13 14:09:30,131 INFO mapred.Merger: Merging 3 sorted segments
    2019-10-13 14:09:30,131 INFO mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 91 bytes
    2019-10-13 14:09:30,132 INFO reduce.MergeManagerImpl: Merged 3 segments, 118 bytes to disk to satisfy reduce memory limit
    2019-10-13 14:09:30,132 INFO reduce.MergeManagerImpl: Merging 1 files, 118 bytes from disk
    2019-10-13 14:09:30,133 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
    2019-10-13 14:09:30,133 INFO mapred.Merger: Merging 1 sorted segments
    2019-10-13 14:09:30,137 WARN io.ReadaheadPool: Failed readahead on ifile
    EBADF: Bad file descriptor
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posix_fadvise(Native Method)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX.posixFadviseIfPossible(NativeIO.java:270)
    	at org.apache.hadoop.io.nativeio.NativeIO$POSIX$CacheManipulator.posixFadviseIfPossible(NativeIO.java:147)
    	at org.apache.hadoop.io.ReadaheadPool$ReadaheadRequestImpl.run(ReadaheadPool.java:208)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    2019-10-13 14:09:30,138 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 106 bytes
    2019-10-13 14:09:30,138 INFO mapred.LocalJobRunner: 3 / 3 copied.
    2019-10-13 14:09:30,140 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
    2019-10-13 14:09:30,141 INFO mapred.Task: Task:attempt_local696442689_0001_r_000000_0 is done. And is in the process of committing
    2019-10-13 14:09:30,142 INFO mapred.LocalJobRunner: 3 / 3 copied.
    2019-10-13 14:09:30,142 INFO mapred.Task: Task attempt_local696442689_0001_r_000000_0 is allowed to commit now
    2019-10-13 14:09:30,143 INFO output.FileOutputCommitter: Saved output of task 'attempt_local696442689_0001_r_000000_0' to file:/home/hadoop/output
    2019-10-13 14:09:30,147 INFO mapred.LocalJobRunner: reduce > reduce
    2019-10-13 14:09:30,147 INFO mapred.Task: Task 'attempt_local696442689_0001_r_000000_0' done.
    2019-10-13 14:09:30,148 INFO mapred.Task: Final Counters for attempt_local696442689_0001_r_000000_0: Counters: 24
    	File System Counters
    		FILE: Number of bytes read=317753
    		FILE: Number of bytes written=816010
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    	Map-Reduce Framework
    		Combine input records=0
    		Combine output records=0
    		Reduce input groups=6
    		Reduce shuffle bytes=130
    		Reduce input records=9
    		Reduce output records=6
    		Spilled Records=9
    		Shuffled Maps =3
    		Failed Shuffles=0
    		Merged Map outputs=3
    		GC time elapsed (ms)=0
    		Total committed heap usage (bytes)=168112128
    	Shuffle Errors
    		BAD_ID=0
    		CONNECTION=0
    		IO_ERROR=0
    		WRONG_LENGTH=0
    		WRONG_MAP=0
    		WRONG_REDUCE=0
    	File Output Format Counters 
    		Bytes Written=62
    2019-10-13 14:09:30,148 INFO mapred.LocalJobRunner: Finishing task: attempt_local696442689_0001_r_000000_0
    2019-10-13 14:09:30,151 INFO mapred.LocalJobRunner: reduce task executor complete.
    2019-10-13 14:09:30,574 INFO mapreduce.Job: Job job_local696442689_0001 running in uber mode : false
    2019-10-13 14:09:30,575 INFO mapreduce.Job:  map 100% reduce 100%
    2019-10-13 14:09:30,576 INFO mapreduce.Job: Job job_local696442689_0001 completed successfully
    2019-10-13 14:09:30,597 INFO mapreduce.Job: Counters: 30
    	File System Counters
    		FILE: Number of bytes read=1269027
    		FILE: Number of bytes written=3263300
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    	Map-Reduce Framework
    		Map input records=3
    		Map output records=9
    		Map output bytes=94
    		Map output materialized bytes=130
    		Input split bytes=285
    		Combine input records=9
    		Combine output records=9
    		Reduce input groups=6
    		Reduce shuffle bytes=130
    		Reduce input records=9
    		Reduce output records=6
    		Spilled Records=18
    		Shuffled Maps =3
    		Failed Shuffles=0
    		Merged Map outputs=3
    		GC time elapsed (ms)=68
    		Total committed heap usage (bytes)=654082048
    	Shuffle Errors
    		BAD_ID=0
    		CONNECTION=0
    		IO_ERROR=0
    		WRONG_LENGTH=0
    		WRONG_MAP=0
    		WRONG_REDUCE=0
    	File Input Format Counters 
    		Bytes Read=55
    	File Output Format Counters 
    		Bytes Written=62
    [hadoop@shengxi ~]$ 
    

        Check the result:

    [hadoop@shengxi home]$ cd hadoop/
    [hadoop@shengxi ~]$ ll
    total 8
    drwxrwxrwx 2 root   root   4096 Oct 13 14:00 input
    drwxr-xr-x 2 hadoop hadoop 4096 Oct 13 14:09 output
    [hadoop@shengxi ~]$ cd output/
    [hadoop@shengxi output]$ ll
    total 4
    -rw-r--r-- 1 hadoop hadoop 50 Oct 13 14:09 part-r-00000
    -rw-r--r-- 1 hadoop hadoop  0 Oct 13 14:09 _SUCCESS
    [hadoop@shengxi output]$ cat part-r-00000 
    dfads	1
    dfjlaskd	1
    hello	3
    ldlkjfh	2
    my	1
    world	1
    [hadoop@shengxi output]$ 
    

    At this point the standalone Hadoop installation is done. Next comes the pseudo-distributed deployment.

    7. Pseudo-distributed deployment

        (1) Modify the various configuration files under /usr/local/hadoop/etc/hadoop

        (2) Add JAVA_HOME to hadoop-env.sh, yarn-env.sh and mapred-env.sh. I simply cat the modified parts of my scripts below.

    [hadoop@shengxi hadoop]$ cat hadoop-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-1.el7_7.x86_64
    #
    
    
    [hadoop@shengxi hadoop]$ cat mapred-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-1.el7_7.x86_64
    
    
    [hadoop@shengxi hadoop]$ cat yarn-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-1.el7_7.x86_64
    

        (3) Modify core-site.xml.

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <!-- HDFS URI -->
        <!-- NameNode RPC address; the default port is 8020 -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop:9000</value>
        </property>
        <!-- User name used when browsing data from the web UI -->
        <property>
            <name>hadoop.http.staticuser.user</name>
            <value>hadoop</value>
        </property>
        <!-- Directory for Hadoop temporary files -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/hadoop/data/tmp</value>
        </property>
        <!-- Trash retention time (seconds) -->
        <property>
          <name>fs.trash.interval</name>
          <value>7200</value>
        </property>
    </configuration>
    
    

        (4) Create the corresponding folders

    # create the folders one by one
    [hadoop@shengxi hadoop-3.1.2]$ mkdir data
    [hadoop@shengxi hadoop-3.1.2]$ cd data
    [hadoop@shengxi data]$ mkdir tmp
    [hadoop@shengxi data]$ mkdir namenode
    [hadoop@shengxi data]$ mkdir datanode
    [hadoop@shengxi data]$ cd ../
    [hadoop@shengxi hadoop-3.1.2]$ chmod -R 777 data/
    [hadoop@shengxi hadoop-3.1.2]$ 
    

        (5) Modify hdfs-site.xml. The replication factor is normally 3, but since this is a pseudo-distributed setup a single replica is enough.

    <configuration>
        <!-- Whether to enforce permission checks -->
        <property>
            <name>dfs.permissions.enabled</name>
            <value>false</value>
        </property>
        <!-- Replication factor -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
            <!-- NameNode metadata directory -->
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>/usr/local/hadoop/data/namenode</value>
        </property>
            <!-- DataNode data directory -->
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>/usr/local/hadoop/data/datanode</value>
        </property>
    </configuration>

        (6) Modify mapred-site.xml

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <!-- Run MapReduce on YARN -->
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>yarn.app.mapreduce.am.env</name>
            <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
        <property>
            <name>mapreduce.map.env</name>
            <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
        <property>
            <name>mapreduce.reduce.env</name>
            <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
        </property>
    	<property>
    		<name>mapreduce.map.memory.mb</name>
    		<value>2048</value>
    	</property>
    	 <!-- JobHistory server address -->
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>hadoop:10020</value>
        </property>
        <!-- JobHistory web UI address -->
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>hadoop:19888</value>
        </property>
    </configuration>
    
    

        (7) Modify yarn-site.xml

    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
        <!-- Which host runs the ResourceManager; "hadoop" here is the virtual hostname configured earlier -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>hadoop</value>
        </property>
        <!-- Run the MapReduce shuffle service inside the NodeManager -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <!-- Enable log aggregation -->
        <property>
            <name>yarn.log-aggregation-enable</name>
            <value>true</value>
        </property>
     <!-- Log retention time (7 days) -->
        <property>
            <name>yarn.log-aggregation.retain-seconds</name>
            <value>604800</value>
        </property>
    </configuration>
    
    

    At this point the configuration is finished; all that remains is to format Hadoop.

    hadoop namenode -format
    [root@shengxi bin]# hadoop namenode -format
    WARNING: Use of this script to execute namenode is deprecated.
    WARNING: Attempting to execute replacement "hdfs namenode" instead.
    
    2019-10-13 15:21:29,211 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = shengxi/172.17.0.15
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 3.1.2
    STARTUP_MSG:   classpath = /usr/local/hadoop-3.1.2/etc/hadoop:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerby-xdr-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-server-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-collections-3.2.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-net-3.6.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-beanutils-1.9.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jsr305-3.0.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/token-provider-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jersey-json-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/zookeeper-3.4.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-databind-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jsr311-api-1.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/httpclient-4.5.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-compress-1.18.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jul-to-slf4j-1.7.25.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/accessors-smart-1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-lang3-3.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jsch-0.1.54.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-security-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-xml-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-simplekdc-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-admin-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerby-util-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-identity-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/json-smart-2.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-common-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/snappy-java-1.0.5.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-webapp-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-client-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/avro-1.7.7.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jersey-core-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/nimbus-jose-jwt-4.41.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/javax.servlet-api-3.1.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-http-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-server-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerby-asn1-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-annotations-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-crypto-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/netty-3.10.5.Final.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jcip-annotations-1.0-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/slf4j-api-1.7.25.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib
/hadoop-annotations-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/metrics-core-3.2.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-servlet-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-io-2.5.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/woodstox-core-5.0.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jaxb-api-2.2.11.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jersey-server-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/curator-client-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerby-config-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/re2j-1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/asm-5.0.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-configuration2-2.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerby-pkix-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/curator-framework-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/audience-annotations-0.5.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/commons-codec-1.11.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-util-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/stax2-api-3.1.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/gson-2.2.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jetty-io-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/httpcore-4.4.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/curator-recipes-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jersey-servlet-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/hadoop-auth-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-util-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/kerb-core-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/hadoop-kms-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/hadoop-common-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/common/hadoop-nfs-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-xdr-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-server-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-collections-3.2.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/okhttp-2.7.5.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-net-3.6.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-beanutils-1.9.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr305-3.0.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/token-provider-1.0.1.jar:/usr/local/hadoop-3.1.2/s
hare/hadoop/hdfs/lib/jersey-json-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/zookeeper-3.4.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-databind-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jsr311-api-1.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/httpclient-4.5.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-ajax-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-compress-1.18.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/paranamer-2.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/accessors-smart-1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang3-3.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jsch-0.1.54.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-security-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-xml-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jettison-1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-all-4.0.52.Final.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-simplekdc-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-admin-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-util-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-identity-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/json-smart-2.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-common-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/snappy-java-1.0.5.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-webapp-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-client-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/avro-1.7.7.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-core-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/nimbus-jose-jwt-4.41.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-jaxrs-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/javax.servlet-api-3.1.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-http-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-server-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-asn1-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-annotations-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-crypto-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/netty-3.10.5.Final.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jcip-annotations-1.0-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/okio-1.6.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-annotations-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-xc-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/json-simple-1.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-servlet-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-io-2.5.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/woodstox-core-5.0.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-api-2.2.11.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/
local/hadoop-3.1.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-server-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-client-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-config-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/re2j-1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/asm-5.0.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-configuration2-2.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-math3-3.1.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerby-pkix-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-framework-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/audience-annotations-0.5.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/commons-codec-1.11.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-util-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/stax2-api-3.1.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/gson-2.2.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jetty-io-9.3.24.v20180605.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/httpcore-4.4.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/curator-recipes-2.13.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jersey-servlet-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/hadoop-auth-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-util-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/kerb-core-1.0.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/lib/htrace-core4-4.1.0-incubating.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-native-client-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-rbf-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-client-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-httpfs-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-nfs-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/hdfs/hadoop-hdfs-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-app-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.1.2-tests.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-nativetask-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-uploader-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-core-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-common-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-client-
hs-plugins-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/guice-servlet-4.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/java-util-1.9.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-module-jaxb-annotations-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/snakeyaml-1.16.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/fst-2.50.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-json-provider-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/HikariCP-java7-2.4.12.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/guice-4.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/objenesis-1.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-guice-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/metrics-core-3.2.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/mssql-jdbc-6.2.1.jre7.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/jersey-client-1.19.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/jackson-jaxrs-base-2.7.8.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/dnsjava-2.1.7.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/swagger-annotations-1.5.4.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/json-io-2.5.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/ehcache-3.3.1.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/lib/geronimo-jcache_1.0_spec-1.0-alpha-1.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-registry-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-tests-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-timeline-pluginstorage-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-client-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-api-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-common-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-services-core-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-router-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-common-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-api-3.1.2.jar:/usr/local/hadoop-3.1.2/share/hadoop/yarn/hadoop-yarn-server-sharedcachemanager-3.1.2.jar
    STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a; compiled by 'sunilg' on 2019-01-29T01:39Z
    STARTUP_MSG:   java = 1.8.0_222
    ************************************************************/
    2019-10-13 15:21:29,232 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    2019-10-13 15:21:29,421 INFO namenode.NameNode: createNameNode [-format]
    2019-10-13 15:21:30,291 INFO common.Util: Assuming 'file' scheme for path /usr/local/hadoop/data/namenode in configuration.
    2019-10-13 15:21:30,291 INFO common.Util: Assuming 'file' scheme for path /usr/local/hadoop/data/namenode in configuration.
    Formatting using clusterid: CID-d1d9f073-058a-4ff6-9edb-abf48551e43c
    2019-10-13 15:21:30,345 INFO namenode.FSEditLog: Edit logging is async:true
    2019-10-13 15:21:30,362 INFO namenode.FSNamesystem: KeyProvider: null
    2019-10-13 15:21:30,363 INFO namenode.FSNamesystem: fsLock is fair: true
    2019-10-13 15:21:30,365 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
    2019-10-13 15:21:30,373 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
    2019-10-13 15:21:30,374 INFO namenode.FSNamesystem: supergroup          = supergroup
    2019-10-13 15:21:30,374 INFO namenode.FSNamesystem: isPermissionEnabled = false
    2019-10-13 15:21:30,374 INFO namenode.FSNamesystem: HA Enabled: false
    2019-10-13 15:21:30,434 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
    2019-10-13 15:21:30,448 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
    2019-10-13 15:21:30,448 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
    2019-10-13 15:21:30,454 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
    2019-10-13 15:21:30,454 INFO blockmanagement.BlockManager: The block deletion will start around 2019 Oct 13 15:21:30
    2019-10-13 15:21:30,456 INFO util.GSet: Computing capacity for map BlocksMap
    2019-10-13 15:21:30,458 INFO util.GSet: VM type       = 64-bit
    2019-10-13 15:21:30,459 INFO util.GSet: 2.0% max memory 444.7 MB = 8.9 MB
    2019-10-13 15:21:30,459 INFO util.GSet: capacity      = 2^20 = 1048576 entries
    2019-10-13 15:21:30,469 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
    2019-10-13 15:21:30,483 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManager: defaultReplication         = 1
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManager: maxReplication             = 512
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManager: minReplication             = 1
    2019-10-13 15:21:30,483 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
    2019-10-13 15:21:30,484 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
    2019-10-13 15:21:30,484 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
    2019-10-13 15:21:30,484 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
    2019-10-13 15:21:30,538 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
    2019-10-13 15:21:30,552 INFO util.GSet: Computing capacity for map INodeMap
    2019-10-13 15:21:30,552 INFO util.GSet: VM type       = 64-bit
    2019-10-13 15:21:30,552 INFO util.GSet: 1.0% max memory 444.7 MB = 4.4 MB
    2019-10-13 15:21:30,552 INFO util.GSet: capacity      = 2^19 = 524288 entries
    2019-10-13 15:21:30,565 INFO namenode.FSDirectory: ACLs enabled? false
    2019-10-13 15:21:30,566 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
    2019-10-13 15:21:30,566 INFO namenode.FSDirectory: XAttrs enabled? true
    2019-10-13 15:21:30,566 INFO namenode.NameNode: Caching file names occurring more than 10 times
    2019-10-13 15:21:30,571 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
    2019-10-13 15:21:30,573 INFO snapshot.SnapshotManager: SkipList is disabled
    2019-10-13 15:21:30,580 INFO util.GSet: Computing capacity for map cachedBlocks
    2019-10-13 15:21:30,580 INFO util.GSet: VM type       = 64-bit
    2019-10-13 15:21:30,580 INFO util.GSet: 0.25% max memory 444.7 MB = 1.1 MB
    2019-10-13 15:21:30,580 INFO util.GSet: capacity      = 2^17 = 131072 entries
    2019-10-13 15:21:30,587 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
    2019-10-13 15:21:30,587 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
    2019-10-13 15:21:30,587 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
    2019-10-13 15:21:30,596 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
    2019-10-13 15:21:30,596 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
    2019-10-13 15:21:30,598 INFO util.GSet: Computing capacity for map NameNodeRetryCache
    2019-10-13 15:21:30,598 INFO util.GSet: VM type       = 64-bit
    2019-10-13 15:21:30,598 INFO util.GSet: 0.029999999329447746% max memory 444.7 MB = 136.6 KB
    2019-10-13 15:21:30,598 INFO util.GSet: capacity      = 2^14 = 16384 entries
    2019-10-13 15:21:30,640 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1558567234-172.17.0.15-1570951290630
    2019-10-13 15:21:30,682 INFO common.Storage: Storage directory /usr/local/hadoop/data/namenode has been successfully formatted.
    2019-10-13 15:21:30,690 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
    2019-10-13 15:21:30,792 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 of size 391 bytes saved in 0 seconds .
    2019-10-13 15:21:30,813 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    2019-10-13 15:21:30,819 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at shengxi/172.17.0.15
    ************************************************************/
    [root@shengxi bin]# 
    

     The key part is the passage below: "successfully formatted." and "txid >= 0"; some versions also report "status = 0".

    2019-10-13 15:21:30,682 INFO common.Storage: Storage directory /usr/local/hadoop/data/namenode has been successfully formatted.
    2019-10-13 15:21:30,690 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 using no compression
    2019-10-13 15:21:30,792 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 of size 391 bytes saved in 0 seconds .
    2019-10-13 15:21:30,813 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
    
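     Besides reading the log, the freshly formatted storage directory can be inspected directly. A minimal sanity check, assuming the namenode directory used in this walkthrough (/usr/local/hadoop/data/namenode):

    # the VERSION file records the namespaceID / clusterID written by the format step
    cat /usr/local/hadoop/data/namenode/current/VERSION
    # fsimage_0000000000000000000, its .md5 file and seen_txid should also be present
    ls /usr/local/hadoop/data/namenode/current/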

        (8) Start the daemons. Note that the commands differ between 2.x and 3.x.

               *   For 2.x:

    sbin/hadoop-daemon.sh start namenode
    
    sbin/hadoop-daemon.sh start datanode
    
    sbin/yarn-daemon.sh start resourcemanager
    
    sbin/yarn-daemon.sh start nodemanager
    
    sbin/mr-jobhistory-daemon.sh start historyserver
    

            *     For 3.x, the start commands are:

    hdfs --daemon start namenode
    
    hdfs --daemon start datanode
    
    yarn --daemon start resourcemanager
    
    yarn --daemon start nodemanager
    
    yarn --daemon start timelineserver
    

     The result is as follows:

    [root@shengxi sbin]# hdfs --daemon start namenode
    [root@shengxi sbin]# hdfs --daemon start datanode
    [root@shengxi sbin]# yarn --daemon start resourcemanager
    WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
    [root@shengxi sbin]# yarn --daemon start nodemanager
    WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR.
    [root@shengxi sbin]# yarn --daemon start timelineserver
    WARNING: YARN_CONF_DIR has been replaced by HADOOP_CONF_DIR. Using value of YARN_CONF_DIR

    Verify with jps:

    [root@shengxi hadoop-3.1.2]# jps
    721 DataNode
    610 NameNode
    1268 ApplicationHistoryServer
    1111 NodeManager
    844 ResourceManager
    1293 Jps
    [root@shengxi hadoop-3.1.2]# 
    
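    As an aside, on 3.x the per-daemon commands above can also be replaced by the cluster start scripts. A minimal sketch, assuming the install path from this article; when running as root, hadoop-env.sh typically also needs the *_USER variables shown in the comments (an assumption about this particular setup, not from the original text):

    # export HDFS_NAMENODE_USER=root
    # export HDFS_DATANODE_USER=root
    # export HDFS_SECONDARYNAMENODE_USER=root
    # export YARN_RESOURCEMANAGER_USER=root
    # export YARN_NODEMANAGER_USER=root
    /usr/local/hadoop-3.1.2/sbin/start-dfs.sh
    /usr/local/hadoop-3.1.2/sbin/start-yarn.sh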

    Check the web UIs (note: the corresponding ports must be opened in the cloud console's security rules):

    Service | 2.x port | 3.x port
    NameNode RPC | 8020 | 9820
    NameNode HTTP UI | 50070 | 9870
    Secondary NameNode HTTP UI | 50090 | 9868
    DataNode data transfer | 50010 | 9866
    DataNode IPC | 50020 | 9867
    DataNode HTTP UI | 50075 | 9864
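    A quick way to confirm from the server itself that the UIs are actually listening (ports taken from the table above; 8088 is the ResourceManager UI that also appears in the job output later):

    curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9870/   # NameNode HTTP UI (3.x)
    curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088/   # ResourceManager HTTP UI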

     

    Copy the input folder from the single-node test into the distributed file system. The real on-disk path is the one configured in hdfs-site.xml, i.e. /usr/local/hadoop/data/datanode.

    Create an input directory via the web UI and upload the files from the input folder in the home directory into it (the same can be done from the command line, as shown after the listing below).

    [root@shengxi ~]#  hdfs dfs -put input/* /input
    [root@shengxi ~]# 
    
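    The upload can also be done entirely from the command line instead of the web UI; a short sketch:

    hdfs dfs -mkdir -p /input        # create the target directory in HDFS
    hdfs dfs -put input/* /input     # upload the local test files
    hdfs dfs -ls /input              # confirm they arrived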

    Now run the test with the command:

    hadoop jar /usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /input /output
    

    The run proceeds as follows:

    [root@shengxi ~]# hadoop jar /usr/local/hadoop-3.1.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar wordcount /input /output
    2019-10-13 15:49:46,920 INFO client.RMProxy: Connecting to ResourceManager at hadoop/172.17.0.15:8032
    2019-10-13 15:49:47,690 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1570951999899_0002
    2019-10-13 15:49:48,063 INFO input.FileInputFormat: Total input files to process : 3
    2019-10-13 15:49:48,961 INFO mapreduce.JobSubmitter: number of splits:3
    2019-10-13 15:49:49,708 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1570951999899_0002
    2019-10-13 15:49:49,710 INFO mapreduce.JobSubmitter: Executing with tokens: []
    2019-10-13 15:49:49,967 INFO conf.Configuration: resource-types.xml not found
    2019-10-13 15:49:49,968 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
    2019-10-13 15:49:50,048 INFO impl.YarnClientImpl: Submitted application application_1570951999899_0002
    2019-10-13 15:49:50,109 INFO mapreduce.Job: The url to track the job: http://hadoop:8088/proxy/application_1570951999899_0002/
    2019-10-13 15:49:50,110 INFO mapreduce.Job: Running job: job_1570951999899_0002
    2019-10-13 15:49:58,477 INFO mapreduce.Job: Job job_1570951999899_0002 running in uber mode : false
    2019-10-13 15:49:58,479 INFO mapreduce.Job:  map 0% reduce 0%
    2019-10-13 15:50:13,719 INFO mapreduce.Job:  map 100% reduce 0%
    2019-10-13 15:50:21,799 INFO mapreduce.Job:  map 100% reduce 100%
    2019-10-13 15:50:23,828 INFO mapreduce.Job: Job job_1570951999899_0002 completed successfully
    2019-10-13 15:50:23,930 INFO mapreduce.Job: Counters: 53
    	File System Counters
    		FILE: Number of bytes read=118
    		FILE: Number of bytes written=864161
    		FILE: Number of read operations=0
    		FILE: Number of large read operations=0
    		FILE: Number of write operations=0
    		HDFS: Number of bytes read=343
    		HDFS: Number of bytes written=50
    		HDFS: Number of read operations=14
    		HDFS: Number of large read operations=0
    		HDFS: Number of write operations=2
    	Job Counters 
    		Launched map tasks=3
    		Launched reduce tasks=1
    		Data-local map tasks=3
    		Total time spent by all maps in occupied slots (ms)=75502
    		Total time spent by all reduces in occupied slots (ms)=5829
    		Total time spent by all map tasks (ms)=37751
    		Total time spent by all reduce tasks (ms)=5829
    		Total vcore-milliseconds taken by all map tasks=37751
    		Total vcore-milliseconds taken by all reduce tasks=5829
    		Total megabyte-milliseconds taken by all map tasks=77314048
    		Total megabyte-milliseconds taken by all reduce tasks=5968896
    	Map-Reduce Framework
    		Map input records=3
    		Map output records=9
    		Map output bytes=94
    		Map output materialized bytes=130
    		Input split bytes=288
    		Combine input records=9
    		Combine output records=9
    		Reduce input groups=6
    		Reduce shuffle bytes=130
    		Reduce input records=9
    		Reduce output records=6
    		Spilled Records=18
    		Shuffled Maps =3
    		Failed Shuffles=0
    		Merged Map outputs=3
    		GC time elapsed (ms)=832
    		CPU time spent (ms)=1980
    		Physical memory (bytes) snapshot=716304384
    		Virtual memory (bytes) snapshot=13735567360
    		Total committed heap usage (bytes)=436482048
    		Peak Map Physical memory (bytes)=205828096
    		Peak Map Virtual memory (bytes)=3649929216
    		Peak Reduce Physical memory (bytes)=106946560
    		Peak Reduce Virtual memory (bytes)=2791301120
    	Shuffle Errors
    		BAD_ID=0
    		CONNECTION=0
    		IO_ERROR=0
    		WRONG_LENGTH=0
    		WRONG_MAP=0
    		WRONG_REDUCE=0
    	File Input Format Counters 
    		Bytes Read=55
    	File Output Format Counters 
    		Bytes Written=50
    [root@shengxi ~]# 
    

    View the result from the command line:

    hdfs dfs -cat /output/*
    # the result is:
    [root@shengxi ~]# hdfs dfs -cat /output/*
    dfads	1
    dfjlaskd	1
    hello	3
    ldlkjfh	2
    my	1
    world	1
    [root@shengxi ~]# 
    

    The result is exactly the same as the standalone run.
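    The output can also be copied out of HDFS to the local file system; a small sketch (part-r-00000 is the conventional name of the reducer output file):

    hdfs dfs -get /output ./wordcount-output
    cat ./wordcount-output/part-r-00000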

     

    If the result is also visible in the web UI, the deployment has succeeded.

     

  • Recently I needed to set up a Hadoop test cluster at work, so I used Docker to deploy the cluster quickly. 0. Preface: there are already plenty of tutorials online, but they contain quite a few pitfalls, so this records my own installation process. Goal: use Docker to build a three-node cluster with one master and two slaves...
  • This part builds on "Deploying a HADOOP cluster on Alibaba Cloud ECS (1): setting up a fully distributed Hadoop cluster". Local mode requires a MySQL database to store the data. 1 Environment: one Alibaba Cloud ECS server: master; OS: CentOS 7.3; Hadoop: hadoop...
        

    This part builds on "Deploying a HADOOP cluster on Alibaba Cloud ECS (1): setting up a fully distributed Hadoop cluster".

     

    Local mode requires a MySQL database to store the metastore data.

    1 Environment

    2 Install MySQL

    See: MySQL installation (Linux, Ubuntu).

    3 Download Hive

    Download apache-hive-2.3.6-bin.tar.gz and extract it to a suitable location; here it is extracted under:

    /usr/local

    Rename the extracted directory to hive:

    cd /usr/local
    mv apache-hive-2.3.6-bin/ hive/

    4 Add Hive environment variables

    Append the following to /etc/profile:

    export HIVE_HOME=/usr/local/hive
    export PATH=$PATH:$HIVE_HOME/bin

    Reload the environment:

    source /etc/profile
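    A quick check that the new variables are visible in the current shell:

    echo $HIVE_HOME    # should print /usr/local/hive
    which hive         # should resolve to /usr/local/hive/bin/hive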

    5 Edit the Hive configuration

    cd $HIVE_HOME/conf
    # create a new hive-site.xml, or copy the existing hive-default.xml.template as a starting point
    vim hive-site.xml

    The configuration can follow the example below:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>password</value>
        <description>password to use against metastore database</description>
      </property>
    </configuration>

    6 Configure MySQL

    6.1 Add the MySQL connector for Hive

    Download mysql-connector-java-5.1.47.tar.gz and place the connector jar under $HIVE_HOME/lib:

    tar -zxvf mysql-connector-java-5.1.47.tar.gz
    # copy the connector jar itself into lib: Hive only picks up jars directly under $HIVE_HOME/lib
    cp mysql-connector-java-5.1.47/*.jar $HIVE_HOME/lib/

    6.2 Start MySQL and log in to the MySQL shell

    service mysql start
    mysql -uroot -p

    6.3 Configure MySQL to allow Hive to connect:

    # grant the root user all privileges on every database and table
    grant all on *.* to root@localhost;
    # reload MySQL's privilege tables
    flush privileges;
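    Since the hive-site.xml above connects through the host name master rather than localhost, the grant may also need to cover that host. A hedged sketch in MySQL 5.x syntax (the password value here is an assumption, not from the original article):

    grant all on *.* to 'root'@'%' identified by 'password';
    flush privileges;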

    7 Start Hive

    Before starting Hive, make sure HDFS and YARN are already running.

    start-all.sh
    hive

    When starting Hive you may hit the error "Hive metastore database is not initialized"; the fix is given here.
    The usual cause is that Hive or MySQL was installed before and then reinstalled, leaving versions and configuration inconsistent. The fix is to use the schematool utility: Hive ships an offline tool for manipulating the Hive Metastore schema, called schematool. It can initialize the metastore schema for the current Hive version and can also upgrade the schema from an older version to a newer one. So, to resolve the error above, run the following command in a terminal:

    schematool -dbType mysql -initSchema
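    To confirm the schema was created, schematool can also report the current metastore state (an optional check, not part of the original article):

    schematool -dbType mysql -info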

    After it finishes, start Hive again and it should work normally.

    The above is taken from http://dblab.xmu.edu.cn/blog/1080-2/

    After entering Hive's interactive shell, typing show databases initially displays:

    hive> show databases;
    OK
    default
    Time taken: 7.312 seconds, Fetched: 1 row(s)
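    As a simple smoke test, a throwaway database and table can be created (the names below are only examples):

    hive> create database test_db;
    hive> use test_db;
    hive> create table t1 (id int, name string);
    hive> show tables;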

    8 Installation complete

     

    The "Deploying a HADOOP cluster on Alibaba Cloud ECS" series:

  • Deploying Hadoop 2.6 on an Alibaba Cloud CentOS 7 server ------------------------ core-site.xml <configuration> <!-- address that the HDFS master (namenode) listens on --> <property> <...
  • This part builds on "Deploying a HADOOP cluster on Alibaba Cloud ECS (1): setting up a fully distributed Hadoop cluster". 1 Environment: one Alibaba Cloud ECS server: master; OS: CentOS 7.3; Hadoop: hadoop-2.7.3.tar.gz; Java: jdk-8u77-linux-x64....
  • 1. Install the Java environment (JDK)  2. Configure the hosts file ... after editing, save and exit, then do the same on the other two servers.  3. Set up SSH: generate the node's private and public keys and append the contents of the public key file (id_rsa.pub) to autho...
  • Two Alibaba Cloud ECS servers running CentOS 7.3; Hadoop: hadoop-2.7.3.tar.gz; Java: jdk-8u77-linux-x64.tar.gz; hostname and IP configuration: to change the host name on CentOS 7 you can simply run 'hostnamectl set-hostname <hostname>'...
  • Fully distributed Hadoop deployment

    2018-08-04 11:01:00
    Fully distributed Hadoop deployment. Analysis: 1) prepare 3 client machines (disable the firewall, set static IPs and host names) 2) install ... scp: 1) scp copies data between servers. 2) Hands-on case: (1) copy /opt/module and /opt/softwa... from hadoop101
  • This part builds on "Deploying a HADOOP cluster on Alibaba Cloud ECS (1): setting up a fully distributed Hadoop cluster" and adds one more datanode. 1 Node environment: 1.1 Environment: servers: three Alibaba Cloud ECS servers: master, slave1, slave...
  • Deploying a hadoop cluster

    2015-11-06 11:29:01
    Installing fully distributed Hadoop. 1.1 Preparation. 1.1.1 Planning: this example uses six servers (CentOS 6.5 64-bit), planned as follows: IP address / host name / processes or role: 192.168.40.30 master.dbq168.com NameNode, Jo...
  • Deploying a high-performance Hadoop 3.0 cluster, fully distributed mode: the Hadoop daemons run on a cluster built from several hosts, with different nodes playing different roles; in real-world development this mode is normally used to build enterprise-grade Hadoop systems. In a Hadoop environment, all...
  • Deploying a high-performance Hadoop cluster

    2017-11-23 21:01:00
    Deploying a high-performance Hadoop cluster. Server overview: 1) What is Hadoop? Hadoop was created by Lucene founder Doug Cutting as an open-source take on Google's designs: a distributed file system plus a base framework for analyzing and computing over massive data, including the MapReduce programs, the hdfs system, and so on...
  • Commonly used ports such as 3306, 50070, 8088 and 8032 can be opened only to the other hosts in the cluster via priority and port rules, or you can open all ports to the other cluster hosts once and for all. About a week after I got my cloud host it was broken into; afterwards I carefully removed the malicious scripts...
  • Batch-deploying a Hadoop cluster environment (1)

    1,000+ reads 2016-02-16 13:43:42
    Preface: cloud computing is hugely popular, and I have been working on a big-data project with a professor since my sophomore year (two years ago), so my interest started early. As my knowledge grew, virtual machines were no longer enough, so this time I deployed Hadoop on real servers as a production environment. The setup is finished, hence this tutorial. Don't...
  • For work I had to build a hadoop+zookeeper+hbase+storm+kafka cluster, so I prepared three servers (one with 8 cores, 32 GB RAM and a 300 GB disk as master; one with 8 cores, 16 GB RAM and a 300 GB disk as slave01; one with 8 cores, 16 GB RAM and a 500 GB disk as slave02; all with Internet access), ...
  • Cloud computing and Hadoop: quickly deploying a Hadoop cluster. Cloud computing has become more and more popular lately and is seen as the new trend in IT. It can be roughly defined as using scalable computing resources provided as a service outside your own environment, paid per use, and accessed over the Internet as the "cloud...
  • How to deploy a hadoop cluster

    2013-10-29 17:20:00
    Suppose we have three servers whose roles are divided as follows: ... Next we deploy the hadoop cluster according to this layout. 1: install the JDK, download and extract it. vi /etc/profile  JAVA_HOME=/usr/java/jdk1.6.0_29 CLASS_PATH=$JA...
  • In most online Hadoop cluster guides, the namenode and secondary namenode are deployed on the same server. To reduce risk in a large cluster, it is better to put the two on different servers. 3. Steps: to achieve this you need to edit conf/...
  • Deploying a fully distributed hadoop cluster in Docker

    1,000+ reads 2018-06-15 18:53:57
    Running three virtual machines eats a lot of a laptop's memory, since VMs are memory-hungry, whereas Docker is a lightweight OS-level virtualization technology and a Docker container uses very little memory. Below are the detailed steps for deploying distributed hadoop (I deployed on an Alibaba Cloud server, ...
  • A Hadoop test cluster (with the ecosystem components) is usually very time-consuming to deploy on a personal laptop. Cloudera provides a quick-deployment Docker image that can spin up a test cluster fast. The test environment is an Ubuntu server. Part 1: preparing the Docker image. First, on the local machine...
