This article walks through how to implement a simple MapReduce job in Java. The example is small and the explanation is easy to follow, so let's dig in.
Input file contents:
a a1
b b2
c c3
d d4
a a1
b b2
c c3
d d4
Output:
a a1|0 a1|20
b b2|5 b2|25
c c3|10 c3|30
d d4|15 d4|35
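The numbers after each "|" are the byte offsets at which the corresponding input lines start. Each input line here ("a", a tab, "a1", a newline) is 5 bytes, so the offsets advance by 5: 0, 5, 10, ..., 35. A quick standalone check (hypothetical helper, not part of the job) makes this concrete:

```java
public class OffsetCheck {
    public static void main(String[] args) {
        // The eight input lines from the example, tab-separated.
        String[] lines = {"a\ta1", "b\tb2", "c\tc3", "d\td4",
                          "a\ta1", "b\tb2", "c\tc3", "d\td4"};
        long offset = 0;
        for (String line : lines) {
            System.out.println(offset + "\t" + line);
            // Advance by the line's byte length plus 1 for the '\n'.
            offset += line.getBytes().length + 1;
        }
    }
}
```

Running this prints offsets 0, 5, 10, 15, 20, 25, 30, 35, matching the numbers in the output above.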
Code:
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: the input key is the byte offset of the line in the file,
    // the input value is the whole line.
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] oriSegs = value.toString().split("\t");
            // Emit the first column as the key, and "<second column>|<byte offset>" as the value.
            String str = oriSegs[1] + "|" + key;
            context.write(new Text(oriSegs[0]), new Text(str));
        }
    }

    // Reducer: join all values for a key with tabs.
    public static class IntSumReducer extends Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            StringBuilder out = new StringBuilder();
            for (Text val : values) {
                if (out.length() > 0) {
                    out.append('\t');
                }
                out.append(val.toString());
            }
            context.write(key, new Text(out.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapred.job.queue.name", "platform");
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setNumReduceTasks(1); // set reducer number
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Compile: make.sh packages the classes into a jar file

javac -classpath /home/hadoop/hadoop-0.20.2-cdh4u0/hadoop-core-0.20.2-cdh4u0.jar:/home/hadoop/hadoop-0.20.2-cdh4u0/lib/commons-cli-1.2.jar -d wordcount_class WordCount.java
jar -cvf WordCount.jar -C wordcount_class/ .
Run the MapReduce job: exec.sh

IN=/user/zhumingliang/tanx_rtb_account/input
OUT=/user/zhumingliang/tanx_rtb_account/output/test
hadoop jar WordCount.jar WordCount $IN $OUT
Notes:
When the input is a file, the mapper's input key is the byte offset at which the line starts within the file, and the mapper's input value is the entire line.
The reducer's input key is the mapper's output key, and the reducer's input values are the mapper's output values, grouped by key.
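To see how these pieces fit together without a cluster, here is a minimal, framework-free sketch of the same map → group-by-key → reduce flow in plain Java (hypothetical class name; the real job relies on Hadoop's shuffle to do the grouping):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MiniMapReduce {
    public static void main(String[] args) {
        // A small subset of the example input.
        String[] lines = {"a\ta1", "b\tb2", "a\ta1", "b\tb2"};
        // "Shuffle": collect mapper outputs grouped and sorted by key.
        Map<String, List<String>> shuffled = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            String[] segs = line.split("\t");
            // Map phase: emit (segs[0], segs[1] + "|" + byte offset), as the mapper does.
            shuffled.computeIfAbsent(segs[0], k -> new ArrayList<>())
                    .add(segs[1] + "|" + offset);
            offset += line.getBytes().length + 1; // +1 for '\n'
        }
        // Reduce phase: join all values for a key with tabs.
        for (Map.Entry<String, List<String>> e : shuffled.entrySet()) {
            System.out.println(e.getKey() + "\t" + String.join("\t", e.getValue()));
        }
    }
}
```

This prints "a	a1|0	a1|10" and "b	b2|5	b2|15", the same shape as the job's output: one line per key, with all of that key's values tab-joined.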
Thanks for reading! That covers how to implement a MapReduce job in Java. After working through this article you should have a more concrete understanding of the topic; the details are best verified by running the example yourself.