    Steps to create an Ubuntu virtual machine and install Hadoop

    (When first learning Hadoop you may have to reinstall the virtual machine several times, so I am writing the steps down for my own reference.)

    Create a new virtual machine with a custom installation: two processors with two cores each. If you will use it often, give it as much disk space as possible.

    When the Software Updater offers package updates, accept them right away.

    The system defaults to English. First, adjust the system time:

    1. Check the current time status
    Check the current time status with: timedatectl status
    2. Change the time zone
    All time zone names are stored under the /usr/share/zoneinfo directory.
    Run timedatectl set-timezone "Asia/Shanghai" to set the time zone to Shanghai.
    Done~
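
    Put together, the commands above look like this (a minimal sketch; list-timezones is only used to confirm the zone name, and set-timezone may need sudo depending on the account):

    timedatectl status                               # show the current time and time zone
    timedatectl list-timezones | grep Shanghai       # confirm "Asia/Shanghai" exists
    timedatectl set-timezone "Asia/Shanghai"         # switch to the Shanghai time zone
    timedatectl status                               # verify the change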

    Change the system language to Chinese:
    This is what it looks like before the change.
    Click Manage Installed Languages.
    Click Install Language and select Chinese (simplified).
    After the language pack finishes downloading, restart the machine.
    After that the system language is Chinese~

    Update apt: sudo apt-get update
    Install vim: sudo apt-get install vim

    Install SSH:
    Ubuntu comes with the SSH client already installed; the SSH server still needs to be installed:
    sudo apt-get install openssh-server

    After installation, you can log in to this machine with: ssh localhost
    Each login asks for a password, however, so next set up passwordless login:
    First exit the SSH session you just opened.

    $ exit
    $ cd ~/.ssh/
    $ ssh-keygen -t rsa
    $ cat ./id_rsa.pub >> ./authorized_keys
    

    Once this is done, ssh localhost logs you in directly without a password.
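
    If the login still asks for a password, the usual cause is directory permissions; a hedged fix (standard OpenSSH requirements, not part of the original steps):

    chmod 700 ~/.ssh                    # the .ssh directory must not be group/world writable
    chmod 600 ~/.ssh/authorized_keys    # the key file must be readable only by the owner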

    Install the JDK:
    The JDK 1.8 package jdk-8u162-linux-x64.tar.gz is shared on Baidu Cloud (extraction code: 6kmf). Download the compressed file jdk-8u162-linux-x64.tar.gz into the virtual machine.

    cd /usr/lib
    sudo mkdir jvm     # create /usr/lib/jvm to hold the JDK files
    cd ~               # go to the hadoop user's home directory
    cd Downloads       # case sensitive; the JDK package jdk-8u162-linux-x64.tar.gz was uploaded here via FTP earlier
    sudo tar -zxvf ./jdk-8u162-linux-x64.tar.gz -C /usr/lib/jvm    # extract the JDK into /usr/lib/jvm
    

    Next, run the following commands to set the environment variables:

    cd ~
    vim ~/.bashrc
    

    The commands above open the hadoop user's environment variable configuration file in the vim editor. Add the following lines at the beginning of the file:

    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_162
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
    

    Save the .bashrc file and quit vim. Then run the following command so that the .bashrc settings take effect immediately:
    source ~/.bashrc (shell command)
    Now check whether the installation succeeded with:
    java -version
    Install Hadoop:
    Hadoop 3 can be downloaded from http://mirror.bit.edu.cn/apache/hadoop/common/ or http://mirrors.cnnic.cn/apache/hadoop/common/. In general, download the latest stable release, i.e. the file named hadoop-3.x.y.tar.gz under "stable"; this archive is already compiled. The other archive, whose name contains "src", is the Hadoop source code and must be compiled before it can be used.

    We choose to install Hadoop into /usr/local/:

    sudo tar -zxf ~/下载/hadoop-3.2.1.tar.gz -C /usr/local    # extract into /usr/local
    cd /usr/local/
    sudo mv ./hadoop-3.2.1/ ./hadoop      # rename the directory to hadoop
    sudo chown -R hadoop ./hadoop         # change the owner to the hadoop user
    

    Hadoop can be used as soon as it is unpacked. Run the following commands to check whether Hadoop is usable; on success they print the Hadoop version information:

    cd /usr/local/hadoop
    ./bin/hadoop version
    

    Modify the configuration files:
    Run: gedit ./etc/hadoop/core-site.xml

    <configuration>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/usr/local/hadoop/tmp</value>
            <description>Abase for other temporary directories.</description>
        </property>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost:9000</value>
        </property>
    </configuration>
    

    Run: gedit ./etc/hadoop/hdfs-site.xml

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
    </configuration>
    

    After the configuration is done, format the NameNode:
    Run: ./bin/hdfs namenode -format
    With that, the Hadoop installation is roughly complete.

    In the Hadoop directory, run the start-up command: ./sbin/start-dfs.sh
    Use the jps command to check whether start-up succeeded; if it did, four entries appear in the output: Jps, SecondaryNameNode, NameNode and DataNode.
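
    While HDFS is running, a quick smoke test can confirm that the filesystem accepts reads and writes (a minimal sketch run from /usr/local/hadoop; the HDFS paths are just examples):

    ./bin/hdfs dfs -mkdir -p /user/hadoop/input                  # create a directory in HDFS
    ./bin/hdfs dfs -put ./etc/hadoop/*.xml /user/hadoop/input    # upload the config files as sample data
    ./bin/hdfs dfs -ls /user/hadoop/input                        # list the uploaded files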
    To stop Hadoop, run: ./sbin/stop-dfs.sh

    Some of the commands above are run in the ~ directory and some in /usr/local/hadoop; please judge for yourself which applies!
    (Much of this content is taken from Lin Ziyu's Hadoop installation tutorial: http://dblab.xmu.edu.cn/blog/install-hadoop/ ; it will be removed on request.)


    Hadoop -- organizing the steps for installing Hadoop in a virtual machine on a Mac

    This post gives only the detailed steps, not detailed explanations.

    Configure a static IP for the VMware Fusion virtual machine

    ###### run on the Mac to look up the NAT network settings
    [tonerMac-Pro:~ toner]$  cd /Library/Preferences/VMware\ Fusion/vmnet8
    [tonerMac-Pro:~ toner]$  cat dhcpd.conf
        subnet 192.168.162.0 netmask 255.255.255.0 {
          range 192.168.162.128 192.168.162.254;
          option broadcast-address 192.168.162.255;
          option domain-name-servers 192.168.162.2;
          option domain-name localdomain;
          default-lease-time 1800;                # default is 30 minutes
          max-lease-time 7200;                    # default is 2 hours
          option netbios-name-servers 192.168.162.2;
          option routers 192.168.162.2;
        }
        host vmnet8 {
          hardware ethernet 00:50:56:C0:00:08;
          fixed-address 192.168.162.1;
          option domain-name-servers 0.0.0.0;
          option domain-name "";
          option routers 0.0.0.0;
        }
    [tonerMac-Pro:~ toner]$  cat nat.conf
    		# NAT gateway address
        ip = 192.168.162.2
        netmask = 255.255.255.0
    ################# DNS servers to use
    168.126.63.1
    114.114.114.114
    8.8.8.8
    
    ####### run inside the virtual machine to configure the interface
    [root@localhost ~]  cd /etc/sysconfig/network-scripts
    [root@localhost network-scripts]	vi ifcfg-ens33
        TYPE=Ethernet
        PROXY_METHOD=none
        BROWSER_ONLY=no
        DEFROUTE=yes
        IPV4_FAILURE_FATAL=no
        IPV6INIT=yes
        IPV6_AUTOCONF=yes
        IPV6_DEFROUTE=yes
        IPV6_FAILURE_FATAL=no
        IPV6_ADDR_GEN_MODE=stable-privacy
        NAME=ens33
        UUID=cc1832be-068a-4d8b-b16f-7d25aa346a94
        DEVICE=ens33
        #change BOOTPROTO from dhcp to static
        BOOTPROTO=static
        #change ONBOOT from no to yes
        ONBOOT=yes
        #add the static IP address
        IPADDR=192.168.162.200
        #add the gateway
        GATEWAY=192.168.162.2
        #add the netmask
        NETMASK=255.255.255.0
        #add the DNS servers
        DNS1=168.126.63.1
        DNS2=114.114.114.114
        DNS3=8.8.8.8
    [root@localhost network-scripts]  service network restart
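    # a hedged verification, not in the original: confirm the static address and DNS work
    [root@localhost network-scripts]  ip addr show ens33          # should now show 192.168.162.200
    [root@localhost network-scripts]  ping -c 3 192.168.162.2     # the NAT gateway answers
    [root@localhost network-scripts]  ping -c 3 www.apache.org    # name resolution through the configured DNS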
    

    Passwordless SSH login

    ###### run on the Mac
    # run ssh-keygen and press Enter at every prompt to generate this machine's public key, id_rsa.pub
    [tonerMac-Pro:~ toner]$  ssh-keygen  
    [tonerMac-Pro:~ toner]$  cat .ssh/id_rsa.pub  
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQ.........2axF toner@tonerMac-Pro.local
    
    ######## shortcut login configuration
    [tonerMac-Pro:~ toner]$  cd .ssh/
    [tonerMac-Pro:~ toner]$  vi config
            #custom alias
            Host            hadoop1
            #replace with your ssh server's IP or domain name
            HostName        192.168.162.200
            #ssh server port, 22 by default
            Port            22
            #ssh server user name
            User            root
            #the private key matching the public key generated in the first step
            IdentityFile    ~/.ssh/id_rsa
    #passwordless shortcut login
    [tonerMac-Pro:~ toner]$  ssh hadoop1
    
    ###### run inside the virtual machine
    # set up passwordless login, or create the .ssh directory if it does not exist yet
    [root@localhost ~]  ssh-keygen
    [root@localhost ~]  cd .ssh/
    # paste the contents of the Mac's id_rsa.pub into this file; the Mac can then log in to the VM without a password
    [root@localhost ~]  vi authorized_keys
    

    JDK 1.8 installation

    • Create the /usr/local/java directory
    [root@localhost ~] mkdir /usr/local/java
    [root@localhost ~] cd /usr/local/java

    • Upload and extract the JDK
    [tonerMac-Pro:~ toner] scp jdk-8u211-linux-x64.tar.gz hadoop1:/usr/local/java/
    [root@localhost ~] tar -zxvf jdk-8u211-linux-x64.tar.gz
    
    • Configure environment variables
    [root@localhost ~] vi /etc/profile
    # add the following Java environment variables
    # note: the tarball extracts to a directory named jdk1.8.0_211; rename it to jdk1.8 or point JAVA_HOME at the actual directory
    export JAVA_HOME=/usr/local/java/jdk1.8
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:$CLASSPATH
    export JAVA_PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin
    export PATH=$PATH:${JAVA_PATH}
    
    [root@localhost ~] source /etc/profile
    [root@localhost java] java -version
    java version "1.8.0_211"
    Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
    

    Change the hostname

    [root@localhost ~]  hostnamectl set-hostname hadoop1
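    # hedged addition, not in the original: so that the name hadoop1 (used by the ssh config and the web UIs below)
    # also resolves from the Mac, map it to the static IP in the Mac's hosts file
    [tonerMac-Pro:~ toner]$  sudo sh -c 'echo "192.168.162.200  hadoop1" >> /etc/hosts'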
    

    Install and configure Hadoop

    Modify the configuration files

    Path: ./etc/hadoop

    1) hadoop-env.sh

    Change the JDK installation directory, as sketched below.
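
    A minimal sketch of that change, assuming the JAVA_HOME configured in /etc/profile above (adjust the path to wherever your JDK actually lives):

    # etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/local/java/jdk1.8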

    2)core-site.xml

    <property>
      <name>fs.defaultFS</name>
      <!-- host and port of the active namenode; a ZooKeeper-based HA address can also be used here -->
      <value>hdfs://SY-0217:8020</value>
      <description>The name of the default file system.  A URI whose
      scheme and authority determine the FileSystem implementation.  The
      uri's scheme determines the config property (fs.SCHEME.impl) naming
      the FileSystem implementation class.  The uri's authority is used to
      determine the host, port, etc. for a filesystem.</description>
    </property>
    
    
    

    3) mapred-site.xml

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      <description>The runtime framework for executing MapReduce jobs.
      Can be one of local, classic or yarn.
      </description>
    </property>
    
    <!-- jobhistory properties -->
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>SY-0355:10020</value>
      <description>MapReduce JobHistory Server IPC host:port</description>
    </property>
    
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>SY-0355:19888</value>
      <description>MapReduce JobHistory Server Web UI host:port</description>
    </property>
    

    4)hdfs-site.xml

    <!-- important: the custom nameservice name -->
    <property>
      <name>dfs.nameservices</name>
      <value>hadoop-test</value>
      <description>
        Comma-separated list of nameservices.
      </description>
    </property>
    
    <!-- logical names of the namenodes within the nameservice -->
    <property>
      <name>dfs.ha.namenodes.hadoop-test</name>
      <value>nn1,nn2</value>
      <description>
        The prefix for a given nameservice, contains a comma-separated
        list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
      </description>
    </property>
    
    <!-- host and port for namenode 1 -->
    <property>
      <name>dfs.namenode.rpc-address.hadoop-test.nn1</name>
      <value>SY-0217:8020</value>
      <description>
        RPC address for nomenode1 of hadoop-test
      </description>
    </property>
    
    <!-- host and port for namenode 2 -->
    <property>
      <name>dfs.namenode.rpc-address.hadoop-test.nn2</name>
      <value>SY-0355:8020</value>
      <description>
        RPC address for nomenode2 of hadoop-test
      </description>
    </property>
    
    <property>
      <name>dfs.namenode.http-address.hadoop-test.nn1</name>
      <value>SY-0217:50070</value>
      <description>
        The address and the base port where the dfs namenode1 web ui will listen on.
      </description>
    </property>
    
    <property>
      <name>dfs.namenode.http-address.hadoop-test.nn2</name>
      <value>SY-0355:50070</value>
      <description>
        The address and the base port where the dfs namenode2 web ui will listen on.
      </description>
    </property>
    
    <!-- where the namenode stores its metadata -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///home/dongxicheng/hadoop/hdfs/name</value>
      <description>Determines where on the local filesystem the DFS name node
          should store the name table(fsimage).  If this is a comma-delimited list
          of directories then the name table is replicated in all of the
          directories, for redundancy. </description>
    </property>
    
    <!-- important: the shared edits (JournalNode) directory -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://SY-0355:8485;SY-0225:8485;SY-0226:8485/hadoop-demo</value>
      <description>A directory on shared storage between the multiple namenodes
      in an HA cluster. This directory will be written by the active and read
      by the standby in order to keep the namespaces synchronized. This directory
      does not need to be listed in dfs.namenode.edits.dir above. It should be
      left empty in a non-HA cluster.
      </description>
    </property>
    
    <!-- datanode data directory -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///home/dongxicheng/hadoop/hdfs/data</value>
      <description>Determines where on the local filesystem an DFS data node
      should store its blocks.  If this is a comma-delimited
      list of directories, then data will be stored in all named
      directories, typically on different devices.
      Directories that do not exist are ignored.
      </description>
    </property>
    
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>false</value>
      <description>
        Whether automatic failover is enabled. See the HDFS High
        Availability documentation for details on automatic HA
        configuration.
      </description>
    </property>
    
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/home/dongxicheng/hadoop/hdfs/journal/</value>
    </property>
    

    5)yarn-site.xml

    <!-- resourcemanager address -->
    <property>
        <description>The hostname of the RM.</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>SY-0217</value>
      </property>    
      
      <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
      </property>
    
      <property>
        <description>The address of the scheduler interface.</description>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
      </property>
    
      <property>
        <description>The http address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
      </property>
    
      <property>
        <description>The https adddress of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.https.address</name>
        <value>${yarn.resourcemanager.hostname}:8090</value>
      </property>
    
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
      </property>
    
      <property>
        <description>The address of the RM admin interface.</description>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
      </property>
    
      <property>
        <description>The class to use as the resource scheduler.</description>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
      </property>
    
      <property>
        <description>fair-scheduler conf location</description>
        <name>yarn.scheduler.fair.allocation.file</name>
        <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
      </property>
    
      <property>
        <description>List of directories to store localized files in. An 
          application's localized file directory will be found in:
          ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
          Individual containers' work directories, called container_${contid}, will
          be subdirectories of this.
       </description>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/dongxicheng/hadoop/yarn/local</value>
      </property>
    
      <property>
        <description>Whether to enable log aggregation</description>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
    
      <property>
        <description>Where to aggregate logs to.</description>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
      </property>
    
      <property>
        <description>Amount of physical memory, in MB, that can be allocated 
        for containers.</description>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>30720</value>
      </property>
    
      <property>
        <description>Number of CPU cores that can be allocated 
        for containers.</description>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>12</value>
      </property>
    
      <property>
        <description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    

    6)slaves

    hadoop2
    hadoop3
    

    7)fairscheduler.xml

    <queue name="infrastructure">
        <minResources>102400 mb, 50 vcores </minResources>
        <maxResources>153600 mb, 100 vcores </maxResources>
        <maxRunningApps>200</maxRunningApps>
        <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
        <weight>1.0</weight>
        <aclSubmitApps>root,yarn,search,hdfs</aclSubmitApps>
      </queue>
    
       <queue name="tool">
          <minResources>102400 mb, 30 vcores</minResources>
          <maxResources>153600 mb, 50 vcores</maxResources>
       </queue>
    
       <queue name="sentiment">
          <minResources>102400 mb, 30 vcores</minResources>
          <maxResources>153600 mb, 50 vcores</maxResources>
       </queue>
    

    8) Configure environment variables

    # vim ~/.bashrc  
    
    export HADOOP_HOME=/home/hadoop
    export HADOOP_INSTALL=$HADOOP_HOME 
    export HADOOP_MAPRED_HOME=$HADOOP_HOME 
    export HADOOP_HDFS_HOME=$HADOOP_HOME 
    export HADOOP_COMMON_HOME=$HADOOP_HOME 
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop  
    export YARN_HOME=$HADOOP_HOME 
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop  
    
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin 
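
    # after saving the file, reload it so the variables take effect (a standard step assumed here, not in the original)
    source ~/.bashrc
    hadoop version    # should print the Hadoop version if PATH and HADOOP_HOME are correct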
    

    9) Disable the firewall from starting at boot

    systemctl disable firewalld.service 
    

    10) Start Hadoop

    ### format the NameNode
    hdfs namenode -format
    ### start HDFS
    start-dfs.sh
    ### open the web UI
    http://hadoop1:50070
    ### stop HDFS
    stop-dfs.sh
    ### start YARN
    start-yarn.sh
    ### open the web UI
    http://hadoop1:8088
    ### stop YARN
    stop-yarn.sh
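    # hedged check, not in the original: after start-dfs.sh and start-yarn.sh, run jps on each host
    # to see which daemons are up (the exact list depends on which roles the host runs)
    jps
    # master host, typically: NameNode, ResourceManager, Jps
    # worker hosts, typically: DataNode, NodeManager, Jps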
    

    Putting this together took effort; please notify me before reposting.
    Reference: https://cloud.tencent.com/developer/article/1191526

    Steps to install Hadoop under Linux

     

    The installation manual below was written against the first release of Hadoop, so it does not quite match current Hadoop.

    I. Preparation

    Download Hadoop: http://hadoop.apache.org/core/releases.html

    http://hadoop.apache.org/common/releases.html

    http://www.apache.org/dyn/closer.cgi/hadoop/core/

    http://labs.xiaonei.com/apache-mirror/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz

    http://labs.xiaonei.com/apache-mirror/hadoop/

    II. Hardware environment

    There are 3 machines in total, all running CentOS; the Java version is jdk1.6.0.

    III. Install Java 6

    sudo apt-get install sun-java6-jdk

     

     

    Open /etc/environment and add the following (the CLASSPATH entries are separated by an English colon; remember that on Windows the separator is an English semicolon):

    CLASSPATH=.:/usr/local/java/lib

    JAVA_HOME=/usr/local/java

    IV. Configure the hosts table

    [root@hadoop ~]# vi /etc/hosts

    127.0.0.1 localhost

    192.168.13.100 namenode

    192.168.13.108 datanode1

    192.168.13.110 datanode2

     

     

    [root@test ~]# vi /etc/hosts

    127.0.0.1 localhost

    192.168.13.100 namenode

    192.168.13.108 datanode1

     

     

    [root@test2 ~]# vi /etc/hosts

    127.0.0.1 localhost

    192.168.13.100 namenode

    192.168.13.110 datanode2

    Add the hadoop user and group

    addgroup hadoop

    adduser hadoop

    usermod -a -G hadoop hadoop

    passwd hadoop

     

     

    Configure SSH:

     

     

    On the server:

    su hadoop

    ssh-keygen -t rsa

    cp id_rsa.pub authorized_keys

     

     

    On the client:

    chmod 700 /home/hadoop

    chmod 755 /home/hadoop/.ssh

    su hadoop

    cd /home

    mkdir .ssh

     

     

    On the server:

    chmod 644 /home/hadoop/.ssh/authorized_keys

    scp authorized_keys datanode1:/home/hadoop/.ssh/

    scp authorized_keys datanode2:/home/hadoop/.ssh/

     

     

    ssh datanode1

    ssh datanode2

     

     

    If SSH is configured correctly, a prompt like the following appears:

    The authenticity of host [dbrg-2] can't be established.

    Key fingerprint is 1024 5f:a0:0b:65:d3:82:df:ab:44:62:6d:98:9c:fe:e9:52.

    Are you sure you want to continue connecting (yes/no)?

    OpenSSH is telling you that it does not recognize this host; don't worry, this happens the first time you log in to it. Type "yes" and the host's identification will be added to the ~/.ssh/known_hosts file, so the prompt will not appear the second time you connect to this host.

     

     

     

     

    And don't forget to test SSH to the local machine itself: ssh dbrg-1

     

     

     

     

     

    mkdir /home/hadoop/HadoopInstall

    tar -zxvf hadoop-0.20.1.tar.gz -C /home/hadoop/HadoopInstall/

    cd /home/hadoop/HadoopInstall/

    ln -s hadoop-0.20.1 hadoop

     

     

    export JAVA_HOME=/usr/local/java

    export CLASSPATH=.:/usr/local/java/lib

    export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop

    export HADOOP_CONF_DIR=/home/hadoop/hadoop-conf

    export PATH=$HADOOP_HOME/bin:$PATH

     

     

    cd $HADOOP_HOME/conf/

    mkdir /home/hadoop/hadoop-conf

    cp hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml masters slaves /home/hadoop/hadoop-conf

     

     

    vi $HADOOP_CONF_DIR/hadoop-env.sh

     

     

     

     

    # The java implementation to use. Required. -- change this to your own JDK installation directory
    export JAVA_HOME=/usr/local/java

    export HADOOP_CLASSPATH=.:/usr/local/java/lib
    # The maximum amount of heap to use, in MB. Default is 1000. -- adjust this according to your machine's memory
    export HADOOP_HEAPSIZE=200

     

     

    vi /home/hadoop/.bashrc

    export JAVA_HOME=/usr/local/java

    export CLASSPATH=.:/usr/local/java/lib

    export HADOOP_HOME=/home/hadoop/HadoopInstall/hadoop

    export HADOOP_CONF_DIR=/home/hadoop/hadoop-conf

    export PATH=$HADOOP_HOME/bin:$PATH

     

     

     

     

     

     

    Configuration

     

     

    namenode

     

     

    #vi $HADOOP_CONF_DIR/slaves

    192.168.13.108

    192.168.13.110

     

     

    #vi $HADOOP_CONF_DIR/core-site.xml

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <!-- Put site-specific property overrides in this file. -->

     

    <configuration>

    <property>

    <name>fs.default.name</name>

    <value>hdfs://192.168.13.100:9000</value>

    </property>

    </configuration>

     

     

    #vi $HADOOP_CONF_DIR/hdfs-site.xml

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <!-- Put site-specific property overrides in this file. -->

     

    <configuration>

    <property>

    <name>dfs.replication</name>

    <value>3</value>

    <description>Default block replication.

    The actual number of replications can be specified when the file is created.

    The default is used if replication is not specified in create time.

    </description>

    </property>

    </configuration>

     

     

     

     

    #vi $HADOOP_CONF_DIR/mapred-site.xml

     

     

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <!-- Put site-specific property overrides in this file. -->

     

    <configuration>

    <property>

    <name>mapred.job.tracker</name>

    <value>192.168.13.100:11000</value>

    </property>

    </configuration>


    The configuration files on the slaves are as follows (hdfs-site.xml does not need to be configured):

    [root@test12 conf]# cat core-site.xml

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

     

    <!-- Put site-specific property overrides in this file. -->

     

     

    <configuration>

    <property>

    <name>fs.default.name</name>

    <value>hdfs://namenode:9000</value>

    </property>

    </configuration>

     

     

    [root@test12 conf]# cat mapred-site.xml

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

     

    <!-- Put site-specific property overrides in this file. -->

     

     

    <configuration>

    <property>

    <name>mapred.job.tracker</name>

    <value>namenode:11000</value>

    </property>

    </configuration>

     

     

     

     

     

     

    Start up

    export PATH=$HADOOP_HOME/bin:$PATH

     

     

    hadoop namenode -format

    start-all.sh

    To stop: stop-all.sh

     

     

    Create a danchentest directory on HDFS and upload a file into it:

    $HADOOP_HOME/bin/hadoop fs -mkdir danchentest

    $HADOOP_HOME/bin/hadoop fs -put $HADOOP_HOME/README.txt danchentest

     

     

    cd $HADOOP_HOME

    hadoop jar hadoop-0.20.1-examples.jar wordcount /user/hadoop/danchentest/README.txt output1

    09/12/21 18:31:44 INFO input.FileInputFormat: Total input paths to process : 1

    09/12/21 18:31:45 INFO mapred.JobClient: Running job: job_200912211824_0002

    09/12/21 18:31:46 INFO mapred.JobClient: map 0% reduce 0%

    09/12/21 18:31:53 INFO mapred.JobClient: map 100% reduce 0%

    09/12/21 18:32:05 INFO mapred.JobClient: map 100% reduce 100%

    09/12/21 18:32:07 INFO mapred.JobClient: Job complete: job_200912211824_0002

    09/12/21 18:32:07 INFO mapred.JobClient: Counters: 17

    09/12/21 18:32:07 INFO mapred.JobClient: Job Counters

    09/12/21 18:32:07 INFO mapred.JobClient: Launched reduce tasks=1

     

     

    View the output file, which lives on HDFS:

    [root@test11 hadoop]# hadoop fs -ls output1

    Found 2 items

    drwxr-xr-x - root supergroup 0 2009-09-30 16:01 /user/root/output1/_logs

    -rw-r--r-- 3 root supergroup 1306 2009-09-30 16:01 /user/root/output1/part-r-00000

     

     

    [root@test11 hadoop]# hadoop fs -cat output1/part-r-00000

    (BIS), 1

    (ECCN) 1

     

     

    To check the status of HDFS, open http://192.168.13.100:50070/dfshealth.jsp in a browser; for MapReduce job information,
    open http://192.168.13.100:50030/jobtracker.jsp. The listings above are what you see directly on the command line.

     

     

     

     

    If you see "08/01/25 16:31:40 INFO ipc.Client: Retrying connect to server: foo.bar.com/1.1.1.1:53567. Already tried 1 time(s).",

    the cause is that the NameNode has not been formatted: run hadoop namenode -format.

    Steps for installing Hadoop on Linux


    Preparation

    A new user named hadoop needs to be created, as follows:

    1. Log in to the Linux system as root and run the following command in a shell terminal to create the hadoop user:

    useradd -m hadoop -s /bin/bash

    2. Set a password for the hadoop user with:

    passwd hadoop

    3. Give the hadoop user administrator privileges:

    visudo

    In the opened file, find the line "root    ALL=(ALL)       ALL" and add a line with "hadoop    ALL=(ALL)       ALL" directly below it (see the sketch after this paragraph). Then press the Esc key, type ":wq" and press Enter to save and exit.
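
    The resulting sudoers fragment looks like this (exactly the two lines described above; the whitespace between columns is flexible):

    root    ALL=(ALL)       ALL
    hadoop  ALL=(ALL)       ALL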

    I. Download Hadoop

    First download the Hadoop package; the download address is http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz. Version hadoop-2.7.3 is used as the example.

    II. Install

    1. Put the downloaded package into the /usr/local directory and extract it:

    tar zxvf hadoop-2.7.3.tar.gz

    mv hadoop-2.7.3 hadoop  //rename the directory

    At this point the standalone Hadoop configuration is complete.
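
    A quick way to confirm the unpacked copy works (a hedged check assuming the layout above; it should report version 2.7.3):

    cd /usr/local/hadoop
    ./bin/hadoop version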

