MR程序的组件combiner怎么使用

这篇文章主要介绍“MR程序的组件combiner怎么使用”，在日常操作中，相信很多人在MR程序的组件combiner怎么使用问题上存在疑惑，小编查阅了各式资料，整理出简单好用的操作方法，希望对大家解答”MR程序的组件combiner怎么使用”的疑惑有所帮助！接下来，请跟着小编一起来学习吧！

创新互联服务项目包括黄平网站建设、黄平网站制作、黄平网页制作以及黄平网络营销策划等。多年来，我们专注于互联网行业，利用自身积累的技术优势、行业经验、深度合作伙伴关系等，向广大中小型企业、政府机构等提供互联网行业的解决方案，黄平网站推广取得了明显的社会效益与经济效益。目前，我们服务的客户以成都为中心已经辐射到黄平省份的部分城市，未来相信会继续扩大服务区域并继续获得客户的支持与信任！

用一句简单的话语描述combiner组件作用：降低map任务输出,减少reduce任务数量,从而降低网络负载

工作机制：

Map任务允许在提交给Reduce任务之前在本地执行一次汇总的操作，那就是combiner组件，combiner组件的行为模式和Reduce一样，都是接收key/values，产生key/value输出

MR程序的组件combiner怎么使用

注意：

1、combiner的输出是reduce的输入

2、如果combiner是可插拔的，那么combiner绝不能改变最终结果

3、combiner是一个优化组件,但是并不是所有地方都能用到,所以combiner只能用于reduce的输入、输出key/value类型完全一致且不影响最终结果的场景。

例子：WordCount程序中,通过统计每一个单词出现的次数,我们可以首先通过Map任务本地进行一次汇总(Combiner)，然后将汇总的结果交给Reduce，完成各个Map任务存在相同KEY的数据进行一次总的汇总，图：

MR程序的组件combiner怎么使用

Combiner代码：

Combiner类，直接打开Combiner类源码是直接继承Reducer类，所以我们直接继承Reducer类即可，最终在提交时指定咱们定义的Combiner类即可

package com.itheima.hadoop.mapreduce.combiner;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountCombiner extends
        Reducer {

    @Override
    protected void reduce(Text key, Iterable values, Context context)
            throws IOException, InterruptedException {
        long count = 0 ;
        for (LongWritable value : values) {
            count += value.get();
        }
        context.write(key, new LongWritable(count));
    }

}

Mapper类：

package com.itheima.hadoop.mapreduce.mapper;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountCombinerMapper extends
        Mapper {

    public void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
        
        String line = value.toString(); //获取一行数据
        String[] words = line.split(" "); //获取各个单词
        for (String word : words) {
            // 将每一个单词写出去
            context.write(new Text(word), new LongWritable(1));
        }
        
        
        
    }

}

驱动类：

package com.itheima.hadoop.drivers;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

import com.itheima.hadoop.mapreduce.combiner.WordCountCombiner;
import com.itheima.hadoop.mapreduce.mapper.WordCountCombinerMapper;

public class WordCountCombinerDriver extends Configured implements Tool{

    @Override
    public int run(String[] args) throws Exception {
        /**
         * 提交五重奏：
         * 1、产生作业
         * 2、指定MAP/REDUCE
         * 3、指定MAPREDUCE输出数据类型
         * 4、指定路径
         * 5、提交作业
         */
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCountCombinerDriver.class);
        job.setMapperClass(WordCountCombinerMapper.class);
        
        /***此处中间小插曲：combiner组件***/
        job.setCombinerClass(WordCountCombiner.class);
        /***此处中间小插曲：combiner组件***/
        
        //reduce逻辑和combiner逻辑一致且combiner又是reduce的子类
        job.setReducerClass(WordCountCombiner.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(LongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

}

主类：

package com.itheima.hadoop.runner;

import org.apache.hadoop.util.ToolRunner;

import com.itheima.hadoop.drivers.WordCountCombinerDriver;

public class WordCountCombinerRunner {

    public static void main(String[] args) throws Exception {
        
        int res = ToolRunner.run(new WordCountCombinerDriver(), args);
        System.exit(res);
    }
}

运行结果：

MR程序的组件combiner怎么使用

到此，关于“MR程序的组件combiner怎么使用”的学习就结束了，希望能够解决大家的疑惑。理论与实践的搭配能更好的帮助大家学习，快去试试吧！若想继续学习更多相关知识，请继续关注创新互联网站，小编会继续努力为大家带来更多实用的文章！

分享名称：MR程序的组件combiner怎么使用
本文地址：http://cdxtjz.cn/article/jhcjoi.html

MR程序的组件combiner怎么使用

其他资讯