  • Hadoop has been installed on Linux; how should it be used, how does Hadoop actually work, and how can file writes and reads be tested in standalone mode?
  • I ran into problems with Hadoop's official tools (detailed in the appendix), so I wrote my own test case for stress testing ...

    I ran into some problems with Hadoop's official tools (detailed in the appendix), so I wrote my own test case for stress testing.

    First, add the lombok, hadoop-hdfs and hadoop-common dependencies to the pom file:

    <properties>
        <lombok.version>1.16.20</lombok.version>
        <hadoop.version>2.7.7</hadoop.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <version>${lombok.version}</version>
            <scope>provided</scope>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
    </dependencies>
    

    Below is the test case I wrote (the settings that need configuring are marked in comments; adjust the paths for your own environment). After compiling, copy it to a Linux server and run it with:

    java -cp './*' hadoop.HLT
    

    Here -cp './*' puts the jars the test case needs on the classpath: package the project, then copy the jars from its lib directory (together with the project jar itself) into the working directory.
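    As a convenience, the dependency jars can be collected with Maven before everything is copied to the test server. This is only a sketch of one way to do it (the helper steps, target host and directory below are placeholders, not from the original post); it assumes a standard Maven project layout and uses the maven-dependency-plugin's copy-dependencies goal.

    # Build the project and gather its runtime dependencies under target/dependency.
    mvn -q clean package dependency:copy-dependencies

    # Copy the project jar and its dependencies to the test server, then run the test there.
    scp target/*.jar target/dependency/*.jar root@<test-server>:/root/hlt/
    ssh root@<test-server> 'cd /root/hlt && java -cp "./*" hadoop.HLT'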

    import lombok.SneakyThrows;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    
    /**
     * @ClassName HadoopLinuxTest
     * @Description TODO hadoop I/O test for linux
     * @Author KING
     * @Date 2019/4/1 17:26
     * @Version 1.2.1
     **/
    public class HLT implements Runnable
    {
        // local path of the test file to upload
        private static final String src = "/root/32KB" ;
        // hadoop namenode IP
        private static final String remote = "hdfs://192.168.1.191:9000/";
        // file-number offset (for resuming an interrupted upload)
        private static final int Offset = 0;
        // total number of files to upload
        private static final int TOTAL_FILES = 100000000;
        // number of upload threads
        private static final int TOTAL_THREADS = 200 ;
    
        private String dst ;
        private int FileStartNum;
        private int FileEndNum;
        private Configuration conf;
        private CountDownLatch Latch;
    
        public HLT(String dst, int fileStartNum, int fileEndNum, Configuration conf, CountDownLatch latch) {
            this.dst = dst;
            FileStartNum = fileStartNum;
            FileEndNum = fileEndNum;
            this.conf = conf;
            Latch = latch;
        }
    
        @SneakyThrows
        public static void main(String[] args) {
            ExecutorService ThreadPool = Executors.newFixedThreadPool(TOTAL_THREADS);
            CountDownLatch Latch = new CountDownLatch(TOTAL_THREADS);
            Configuration conf = new Configuration() ;
            conf.set("fs.defaultFS", "hdfs://192.168.1.191:9000"); // "hdfs://master:9000"
            int SingleFileNum = TOTAL_FILES/TOTAL_THREADS;
            File file = new File("/root/record.txt");
            FileOutputStream fileOut = new FileOutputStream(file,true);
            Long a = System.currentTimeMillis();
            String st ="start time sec :" + a +"\n";
            fileOut.write(st.getBytes());
    
            for (int i=0;i<TOTAL_THREADS;i++){
                int FileStartNum = i * SingleFileNum + Offset;
                int FileEndNum = (i+1) * SingleFileNum;
                // HDFS caps the number of entries per directory, so each thread uploads into its own directory
                String dst =remote + "linux_test"+i+"/";
                HLT hlt = new HLT(dst, FileStartNum, FileEndNum, conf ,Latch);
                ThreadPool.execute(hlt);
            }
            Latch.await();
    
            Long b = System.currentTimeMillis();
            String et ="end  time  sec :" + b +"\n";
            fileOut.write(et.getBytes());
            String allt = "all time :" + ((b-a)/1000) + "s" +"\n";
            fileOut.write(allt.getBytes());
            System.out.println("total time: " + ((b-a)/1000) + "s");
    
            ThreadPool.shutdown();
        }
    
        @Override
        public void run() {
            // enable if hadoop.home.dir is not set; point it at your own hadoop root
    //        System.setProperty("hadoop.home.dir", "/root/hadoop-2.7.1");
            // enable if running as a user other than root
    //        System.setProperty("HADOOP_USER_NAME", "root");
            for (int n = FileStartNum; n < FileEndNum ; n++){
                putToHDFS(src, dst + n, conf);
            }
            Latch.countDown();
        }
    
        /** Upload one file to HDFS.
         * @param src full local path of the source file
         * @param dst full remote destination path
         * @param conf hadoop configuration
         * @return boolean whether the upload succeeded
         */
        public static boolean  putToHDFS(String src , String dst , Configuration conf){
            Path dstPath = new Path(dst) ;
            try{
                FileSystem hdfs = dstPath.getFileSystem(conf) ;
                hdfs.copyFromLocalFile(false, new Path(src), dstPath) ;
            }
            catch(IOException ie){
                ie.printStackTrace() ;
                return false ;
            }
            return true ;
        }
    }
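    The per-directory limit mentioned in the comment above corresponds to the NameNode setting dfs.namenode.fs-limits.max-directory-items (1,048,576 entries per directory by default in Hadoop 2.x). As a hedged aside, not part of the original post, the configured value can be checked with:

    # Print the per-directory entry limit as seen by the client-side configuration.
    hdfs getconf -confKey dfs.namenode.fs-limits.max-directory-items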
    

    Test environment parameters

    Disk parameters
    - Disk model: ST1000NM0018-2F2130
    - Disk type: SATA, 7200 RPM (as specified; matched in testing)
    - Vendor: DELL
    - Capacity per disk: 1 TB
    - Firmware version: EA04
    - Disk placement: SPU: 2 disks; DSU2624 (DSU-1:1:2): 3 disks; DSU2624 (DSU-1:2:1): 3 disks

    Hadoop environment
    - Number of Hadoop nodes: 8
    - VM CPU/memory per node: 16 threads / 64 GB RAM
    - Front-end network: 10GE (the client is on 10GE, connected to the 10GE network through a switch)
    - Back-end network: 10GE (same as above)
    - Other: stock Hadoop 2.7.7

    Client parameters
    - Operating system: CentOS Linux release 7.6.1810 (Core)
    - Memory: 8 GB
    - CPU: Intel® Xeon® CPU E5-2609 v3 @ 1.90GHz (dual CPU, 6 cores / 12 threads per CPU)
    - Bandwidth to hadoop: 10 Gb/s
    - Concurrent threads: see the test parameters
    - File size: 32 KB
    - File count: 100,000,000

    Test results

    Uploading 100,000,000 32 KB files with 200 concurrent threads:

    Total operations:     100,000,000 files
    Total bytes:          2.9795 TB
    Avg response time:    1.08 ms
    Avg processing time:  2.08 ms
    Throughput:           481.25 op/s
    Bandwidth:            15.03 MB/s
    Total time:           207792.21 s ≈ 58 h
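    The reported figures are internally consistent; as a quick cross-check (simple arithmetic added here, not part of the original results):

    # Cross-check the throughput and bandwidth implied by the totals above.
    awk 'BEGIN {
      files = 100000000; secs = 207792.21; kb = 32
      printf "ops/s    : %.2f\n", files / secs                    # ~481.2
      printf "MB/s     : %.2f\n", files * kb / 1024 / secs        # ~15.0
      printf "total TB : %.4f\n", files * kb / 1024 / 1024 / 1024 # ~2.98
    }'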

    Appendix

    Problems with the official tools

    1. Painfully slow

    hadoop jar /home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.7-tests.jar TestDFSIO -write -nrFiles 10000 -size 32KB
    

    This is the official TestDFSIO approach: snail-slow, throws plenty of exceptions, and comes nowhere near HDFS's performance limits.

    2. TestDFSIO only measures a single node's performance: all the files are written on the node that runs TestDFSIO, so it is a single-node test, not a stress test of the cluster.

    3. The teragen tool

    hadoop jar /home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar teragen -Dmapred.map.tasks=100000000 -Ddfs.block.size=134217728 32 /terasort/output
    

    Although the command above sets the task count with -Dmapred.map.tasks=100000000, in practice that parameter has no effect here: no matter what it is set to, teragen only creates one file. After reading the source code and the official documentation, it turns out teragen exists only to generate input for the terasort benchmark; it is not a stress-testing tool at all. (The online blog posts on this run deep.)
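    For reference, teragen's intended use takes just two positional arguments, the number of 100-byte rows and the output directory. The invocation below is illustrative only (the row count and paths are placeholders); it generates roughly 1 GB of terasort input rather than many small files:

    hadoop jar /home/hadoop/hadoop-2.7.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar \
      teragen 10000000 /terasort/input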

  • Hadoop testing: HDFS benchmarks

    1,000+ views, 2019-01-14 21:12:02

    Hadoop benchmark testing

     

    Data nodes: 3 × (8 cores / 32 GB RAM), /chunkdata01: 1.2 TB

    HDFS capacity: 3.46 TB

    YARN: memory 24 GB = 8 GB × 3, vcores 18 = 6 × 3

    Write test:

    TestDFSIO

    • First run: failed

    Write 30 × 100 GB files (single replica); execution time 11:06-11:56 (failed).

    hadoop jar hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0-tests.jar TestDFSIO -write -nrFiles 30 -fileSize 100000

     

    The job ran for about 50 minutes and then its tasks were marked UNASSIGNED, meaning they were stuck at the task-creation stage.

    Checking YARN showed that all three NodeManagers were already on the UNHEALTHY list.

    Checking the disks on the three data nodes showed that /chunkdata01, the mount point holding the HDFS data and the YARN job logs, had reached 91% usage.

    By default, YARN marks a disk as bad when its utilization reaches 90% or its free space drops to the minimum of 0 MB; when the number of bad disks on a host reaches 1/4, that host's NodeManager goes UNHEALTHY and no more tasks are assigned to it.

    This run met those conditions, so all three nodes became UNHEALTHY and the test failed.

    Fix: set yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage to 99.0

    and yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb to 2048.
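    A quick way to reproduce this diagnosis from the command line (a sketch, not from the original post; the mount point is the one described above):

    # UNHEALTHY NodeManagers and their health reports show up in the node list.
    yarn node -list -all

    # Check utilization of the mount point holding HDFS data and YARN job logs.
    df -h /chunkdata01

    # The two yarn-site.xml properties above take effect after the NodeManagers are restarted.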

     

    • Second run: succeeded

    Write 10 × 100 GB files (single replica); execution time 13:05-13:45, about 40 minutes.

    hadoop jar hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 100000

    Storage under /benchmarks can be seen to reach 2.9 TB (3 replicas).

     

    Read test:

    Read the 10 × 100 GB files; execution time 17:48-18:00, about 11 minutes.

    hadoop jar hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 100000

    Test results: (result figures not captured in this excerpt)
  • Hadoop benchmark testing

    1,000+ views, 2019-06-27 16:11:14

    Cluster performance testing with the benchmark tool packages that ship with Hadoop; the test platform is Hadoop 2.6 on CDH 5.16.

    Tools directory: /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/

    This post uses the widely used benchmarks TestDFSIO, mrbench, nnbench, Terasort and sort.

     

    hadoop-mapreduce-client-jobclient-tests.jar

    Running it without arguments prints the list of example programs:

    hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar

    An example program must be given as the first argument.
    Valid program names are:
      DFSCIOTest: Distributed i/o benchmark of libhdfs.
      DistributedFSCheck: Distributed checkup of the file system consistency.
      JHLogAnalyzer: Job History Log analyzer.
      MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
      SliveTest: HDFS Stress Test and Live Data Verification.
      TestDFSIO: Distributed i/o benchmark.
      fail: a job that always fails
      filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed),            Text(Input|Output)Format (compressed and uncompressed)
      largesorter: Large-Sort tester
      loadgen: Generic map/reduce load generator
      mapredtest: A map/reduce test check.
      minicluster: Single process HDFS and MR cluster.
      mrbench: A map/reduce benchmark that can create many small jobs
      nnbench: A benchmark that stresses the namenode.
      sleep: A job that sleeps at each map and reduce task.
      testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
      testfilesystem: A test for FileSystem read/write.
      testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
      testsequencefile: A test for flat files of binary key value pairs.
      testsequencefileinputformat: A test for sequence file input format.
      testtextinputformat: A test for text input format.
      threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill

     

    1. TestDFSIO

    TestDFSIO measures HDFS I/O performance. It runs a MapReduce job to perform the reads and writes concurrently: each map task reads or writes one file, the map output is used to collect the per-file statistics, and the reduce accumulates the statistics and produces the summary.

    Usage:

    hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    TestDFSIO

    TestDFSIO.1.7
    Usage: TestDFSIO [genericOptions] -read [-random | -backward | -skip [-skipSize Size]] | -write | -append | -clean [-compression codecClassName] [-nrFiles N] [-size Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes]

     

    1. HDFS write performance
    Test: write 10 × 128 MB files to the HDFS cluster:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    TestDFSIO \
    -write \
    -nrFiles 10 \
    -size 128MB \
    -resFile /tmp/TestDFSIO_results.log

    Note: because the job is run on Hadoop as the hdfs user, the local result-log path can be left unspecified, but the command must then be run from a directory the hdfs user can write to, and the log is created in that working directory; otherwise, specify the path explicitly.


     

    View the results:

    cat /tmp/TestDFSIO_results.log

    ----- TestDFSIO ----- : write
               Date & time: Thu Jun 27 13:46:41 CST 2019
           Number of files: 10
    Total MBytes processed: 1280.0
         Throughput mb/sec: 16.125374788984352
    Average IO rate mb/sec: 17.224742889404297
     IO rate std deviation: 4.657439940376364
        Test exec time sec: 28.751

     

    2. HDFS read performance
    Test: read the 10 × 128 MB files back from the HDFS cluster:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    TestDFSIO \
    -read \
    -nrFiles 10 \
    -size 128MB \
    -resFile /tmp/TestDFSIO_results.log


    3. Clean up the test data

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    TestDFSIO -clean

    19/06/27 13:57:21 INFO fs.TestDFSIO: TestDFSIO.1.7
    19/06/27 13:57:21 INFO fs.TestDFSIO: nrFiles = 1
    19/06/27 13:57:21 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
    19/06/27 13:57:21 INFO fs.TestDFSIO: bufferSize = 1000000
    19/06/27 13:57:21 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
    19/06/27 13:57:22 INFO fs.TestDFSIO: Cleaning up test files

     

    2. nnbench

    nnbench stresses the NameNode: it generates a large number of HDFS-related requests to put the NameNode under heavy load, and can simulate creating, reading, renaming and deleting files on HDFS.


    Usage:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    nnbench -help

    NameNode Benchmark 0.4
    Usage: nnbench <options>
    Options:
            -operation <Available operations are create_write open_read rename delete. This option is mandatory>
             * NOTE: The open_read, rename and delete operations assume that the files they operate on, are already available. The create_write operation must be run before running the other operations.
            -maps <number of maps. default is 1. This is not mandatory>
            -reduces <number of reduces. default is 1. This is not mandatory>
            -startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time>. default is launch time + 2 mins. This is not mandatory 
            -blockSize <Block size in bytes. default is 1. This is not mandatory>
            -bytesToWrite <Bytes to write. default is 0. This is not mandatory>
            -bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
            -numberOfFiles <number of files to create. default is 1. This is not mandatory>
            -replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
            -baseDir <base DFS path. default is /becnhmarks/NNBench. This is not mandatory>
            -readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
            -help: Display the help statement


    Test: use 10 mappers and 5 reducers to create 1000 files:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar nnbench \
    -operation create_write \
    -maps 10 \
    -reduces 5 \
    -blockSize 1 \
    -bytesToWrite 0 \
    -numberOfFiles 1000 \
    -replicationFactorPerFile 3 \
    -readFileAfterOpen true \
    -baseDir /benchmarks/NNBench-`hostname`

    Results stored on HDFS:

     

    3. mrbench

    mrbench runs a small job repeatedly, to check whether small jobs run repeatably and efficiently on the cluster.

    Usage:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    mrbench -help

    MRBenchmark.0.0.2
    Usage: mrbench [-baseDir <base DFS path for output/input, default is /benchmarks/MRBench>] [-jar <local path to job jar file containing Mapper and Reducer implementations, default is current jar file>] [-numRuns <number of times to run the job, default is 1>] [-maps <number of maps for each run, default is 2>] [-reduces <number of reduces for each run, default is 1>] [-inputLines <number of input lines to generate, default is 1>] [-inputType <type of input to generate, one of ascending (default), descending, random>] [-verbose]

    Test: run one job 50 times:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar \
    mrbench \
    -numRuns 50 \
    -maps 10 \
    -reduces 5 \
    -inputLines 10 \
    -inputType descending

     

     

    hadoop-mapreduce/hadoop-mapreduce-examples.jar

    An example program must be given as the first argument.
    Valid program names are:
      aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
      aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
      bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
      dbcount: An example job that count the pageview counts from a database.
      distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
      grep: A map/reduce program that counts the matches of a regex in the input.
      join: A job that effects a join over sorted, equally partitioned datasets
      multifilewc: A job that counts words from several files.
      pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
      pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
      randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
      randomwriter: A map/reduce program that writes 10GB of random data per node.
      secondarysort: An example defining a secondary sort to the reduce.
      sort: A map/reduce program that sorts the data written by the random writer.
      sudoku: A sudoku solver.
      teragen: Generate data for the terasort
      terasort: Run the terasort
      teravalidate: Checking results of terasort
      wordcount: A map/reduce program that counts the words in the input files.
      wordmean: A map/reduce program that counts the average length of the words in the input files.
      wordmedian: A map/reduce program that counts the median length of the words in the input files.
      wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

    4. Terasort

    Terasort is an effective sorting benchmark for Hadoop. Using the TeraSort program that ships with Hadoop, you can measure how different numbers of map and reduce tasks affect Hadoop's performance. The input data is generated by the bundled teragen program, at 1 GB and 10 GB in these experiments.

    A TeraSort test takes three steps:
    1. TeraGen generates random data
    2. TeraSort sorts the data
    3. TeraValidate verifies that TeraSort's output is sorted; if a problem is detected, the out-of-order keys are written to an output directory

     

    1. TeraGen generates random data and writes it to /tmp/examples/terasort-input

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    teragen 10000000 /tmp/examples/terasort-input

     

    Data on HDFS:

    2. TeraSort sorts the data and writes the result to /tmp/examples/terasort-output

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    terasort /tmp/examples/terasort-input /tmp/examples/terasort-output

    Data on HDFS:

     

    3. TeraValidate verifies the output; if a problem is detected, the out-of-order keys are written to /tmp/examples/terasort-validate

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    teravalidate /tmp/examples/terasort-output /tmp/examples/terasort-validate

    Results on HDFS:

     

    5. The sort program is another commonly used way to benchmark MapReduce

    1. randomwriter generates random data: each node runs 10 map tasks, and each map produces roughly 1 GB of random binary data

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    randomwriter /tmp/examples/random-data
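    The "10 maps per node, about 1 GB per map" defaults can be overridden through the example's configuration properties. The property names below are the ones used by the Hadoop 2.x RandomWriter example; treat this as an illustration and verify the names against your distribution:

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    randomwriter \
    -Dmapreduce.randomwriter.mapsperhost=2 \
    -Dmapreduce.randomwriter.bytespermap=134217728 \
    /tmp/examples/random-data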

    2. sort sorts the data

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    sort /tmp/examples/random-data /tmp/examples/sorted-data

    3. testmapredsort verifies that the data really is sorted

    sudo -uhdfs hadoop jar \
    /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
    testmapredsort \
    -sortInput /tmp/examples/random-data \
    -sortOutput /tmp/examples/sorted-data

     

  • A Hadoop test environment in Taiwan

    1,000+ views, 2010-04-09 11:08:00

    URL: http://hadoop.nchc.org.tw

    Apply for a username and password yourself.

    ssh hadoop.nchc.org.tw -l h655
  • Hadoop functional testing

    1,000+ views, 2019-04-14 14:10:07
    Chapter 6 covers Hadoop's functional testing; this section lists the related subsections. 6. Functional tests   6.1 Upload a file   6.2 Download a file (to be continued) Quick links: previous post, index, next post. Cloud data platform - hadoop cluster ...
  • hadoop testing

    1,000+ views, 2013-11-18 09:25:27
    31. The NameNode web UI port is 50030; it is a web service started via jetty. ( ) 32. The HADOOP_HEAPSIZE environment variable sets the memory for all Hadoop daemon threads; it defaults to 200 GB. ( ) 33. When a DataNode first joins...
  • [Hadoop] Hadoop unit testing with MRUnit

    1,000+ views, 2016-12-08 20:12:17
    1. Set up the development environment: download the latest MRUnit jar from https://repository.apache.org/content/repositories/releases/org/apache/mrunit/mrunit/; for example, if you are using Hadoop 1.0.3 you need mrunit-x.x.x-incubating...
  • Basic hadoop test methods

    1,000+ views, 2017-12-07 15:53:46
    While working with Hadoop recently I summarized some basic functionality and some basic performance tests, and I am recording them here in the hope that they help others. The output is long, so only the commands are given; you can run them yourself to see the full output. Different Hadoop versions ship with...
  • Setting up a hadoop test environment (1)

    1,000+ views, 2017-04-23 15:41:41
    Install hadoop in a virtual machine, disable the firewall and configure the IP. My environment: Windows 8, VMware 11, CentOS. 1. Set the VM network mode to host-only and find the vmnetcfg file in the VM directory. 2. Configure the Windows connection. 3. Modify the Linux system...
  • hadoop benchmarks

    1,000+ views, 2011-09-09 15:24:18
    Hadoop cluster benchmarking. 1. Test conditions: start benchmarking immediately after the cluster is fully installed and configured; no other tasks should be running on the cluster during the benchmark. 2. Test goals: 1. Disk failure, the most common fault in a new system, can be exposed by running intensive I/O benchmarks...
  • HADOOP read/write performance testing

    1,000+ views, 2017-06-11 13:05:44
    Read/write performance can be tested with the tools that ship with Hadoop.
  • Running the hadoop benchmarks

    10,000+ views, 2011-11-03 11:14:59
    Because new servers had to be purchased for a hadoop cluster and their performance in a hadoop environment needed testing, I organized the test cases that ship with the cluster: bin/hadoop jar hadoop-*test*.jar. Running the command above gives hadoop-*test*...
  • hadoop performance testing

    1,000+ views, 2015-06-16 16:49:18
    1. Write performance test. (1) If necessary, delete historical data first: $hadoop jar /home/hadoop/hadoop/share/hadoop/mapreduce2/hadoop-mapreduce-client-jobclient-2.3.0-cdh5.1.2-tests.jar TestDFSIO -clean
  • Hadoop platform testing

    1,000+ views, 2014-06-09 13:08:16
    (1) Format Hadoop with: Hadoop Namenode –format. Common pitfall: version differences may produce a "Hadoop: unknown command" error or a missing-NameNode problem; fix: hadoop namenode -format. (2) Start the whole cluster with: ...
  • Hadoop testing: running a simple distributed program

    1,000+ views, 2015-06-13 09:38:14
    (1) First go to the Hadoop install directory; it already ships with some small programs, all packaged in hadoop-examples-1.2.1.jar, and the next step is how to run that jar. (2) On the command line, simply typing hadoop shows the usage of the various commands...
  • Hadoop benchmark tools

    1,000+ views, 2016-07-10 15:52:38
    To understand the system more fully, find its bottlenecks and improve its performance, I decided to start from testing and learn Hadoop's main testing methods. This post has two parts: the first records how to use the test tools that ship with Hadoop...
  • 1. First make sure Hadoop is installed on the Linux server (installation guide: http://blog.csdn.net/sunweijm/article/details/78399726). 2. Write a MapReduce demo in IDEA. 2.1 Create a Maven project in IDEA named WordCount. 2.2 Configure Project Settings...
  • Hadoop disk quota testing

    1,000+ views, 2015-08-19 11:04:58
    Hadoop disk quota testing
  • hadoop installation testing

    1,000+ views, 2011-11-08 12:14:18
    Operating system: Ubuntu 11.04 desktop. ... passwd root ... Download hadoop: hadoop-0.20.204.0.tar.gz. Install the JDK: install Java 6 by copying the JDK into /usr/local and installing it with: sudo sh jdk-
  • Setting up a Hadoop 2.2.0 test environment

    1,000+ views, 2014-03-25 22:50:40
    Introduction: on 64-bit Ubuntu, use VirtualBox to create two nodes and build a Hadoop 2.2.0 test and development environment.
  • Hadoop cluster performance testing

    1,000+ views, 2016-04-13 10:26:20
    Preface and test method: a quick ad-hoc test, mainly to measure the cluster's I/O. ... hadoop benchmark. System-level tests on the cluster nodes: writing a 40 GB block took 49 seconds, disk write I/O 873 MB/s, read I/O 1022.49 MB/s, point-to-point network I/O roughly 110 MB/
  • Hadoop word-count test

    1,000+ views, 2016-04-28 10:12:52
    After configuring Hadoop successfully in the previous post, this post is the test. 1. Create a words file in the home/data directory with the contents: hello a / hello b. 2. Go to the Hadoop root directory and use bin/hadoop fs -help to see the help for HDFS file operations; use bin/hadoop fs ...
  • Hadoop

    10,000+ views, 2019-09-16 22:44:08
    Hadoop overview. The source of Hadoop's ideas: Google, the first company to run into big-data computation problems. OpenStack: the data and computing challenges NASA faced. - How to store huge numbers of web pages - search algorithms. The key techniques and ideas Google gave us (Google's three papers): - GFS file storage - Map-...
  • Hadoop testing (under cygwin + Windows)

    1,000+ views, 2008-07-24 10:34:00
    Hadoop test: 1. Install cygwin. 2. Download hadoop. 3. Configure ssh (see "Cygwin + OpenSSH FOR Windows installation and configuration"; CygWin is an excellent tool for bringing free Linux software into Windows, URL: ...
  • A look inside Hadoop's existing test frameworks

    1,000+ views, 2012-10-12 18:32:19
    A look inside Hadoop's existing test frameworks. Background: from the first day we started using hadoop, we have never stopped developing Hadoop's own features and fixing Hadoop's own bugs. This development model has continued for several years, and one thing it reveals is that for us...
  • Using LoadRunner over an ssh connection to invoke the Hadoop test jars for benchmarking may look like a bit of a trick with limited practical value, but the demonstration shows how much vitality LoadRunner has, and in Linux system testing and open-source technology testing it also...