3. Install the Hadoop development plugin.
4. Start Eclipse and open the Map/Reduce perspective:
【Window】->【Open Perspective】->【Other...】->【Map/Reduce】->【OK】
5. Open the Map/Reduce Locations view:
【Window】->【Show View】->【Other...】->【MapReduce Tools】->【Map/Reduce Locations】->【OK】
6. Add a Hadoop location:
Under Advanced parameters, change:
hadoop.tmp.dir=/home/xsj/hadoop/hadoop-xsj
mapred.child.java.opts=-Xmx512m
Note: changing hadoop.tmp.dir also affects mapred.local.dir, mapred.system.dir, mapred.temp.dir, fs.s3.buffer.dir, fs.checkpoint.dir, fs.checkpoint.edits.dir, dfs.name.dir, dfs.name.edits.dir, dfs.data.dir, and similar properties, so restart Eclipse after changing it.
The mapred.child.java.opts entry may not exist at first; you can add it later. The 512 here matches the memory of my Ubuntu virtual machine; adjust it to your own setup.
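These two Advanced parameters correspond to ordinary Hadoop configuration properties. As a sketch (matching the values above; adjust paths and heap size to your machine), the equivalent entries in conf/core-site.xml and conf/mapred-site.xml would look like:

```xml
<!-- core-site.xml -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/xsj/hadoop/hadoop-xsj</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```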
After these changes you should see the HDFS view below:
7. Add input text files:
$ ./hadoop fs -mkdir /user/xsj/input
$ ./hadoop fs -put ./*.txt /user/xsj/input
8. Create a Map/Reduce Project:
【File】->【New】->【Project...】->【Map/Reduce】->【Map/Reduce Project】->【Project name: WordCount】->【Configure Hadoop install directory...】->【Hadoop installation directory: /home/xsj/hadoop/hadoop-0.20.2】->【Apply】->【OK】->【Next】->【Allow output folders for source folders】->【Finish】
9. Create the WordCount class:
【WordCount】->【src】->【New】->【Class】->【Package: org.apache.hadoop.examples】->【Name: WordCount】->【Finish】
Add or write the source code; it ships with Hadoop at:
/home/xsj/hadoop/hadoop-0.20.2/src/examples/org/apache/hadoop/examples/WordCount.java
package org.apache.hadoop.examples;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
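To see what the Mapper and Reducer above actually compute, the same tokenize-then-sum logic can be sketched in plain Java with no Hadoop dependency (the class and method names here are illustrative, not part of the Hadoop API):

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {
    // Mirrors TokenizerMapper + IntSumReducer: tokenize each input line
    // on whitespace, then sum the occurrences of each word.
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line); // same tokenizer as the Mapper
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum); // the Reducer's sum
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count(new String[] {"hello world", "hello hadoop"});
        System.out.println(c); // {hadoop=1, hello=2, world=1}
    }
}
```

The real job distributes exactly this computation: each map task emits (word, 1) pairs, the combiner and reducer add them up per word.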
10. Configure the run parameters:
【Run】->【Run Configurations】->【Java Application】->【Word Count】->【Arguments】->【Program arguments: hdfs://localhost:9000/user/xsj/input/* hdfs://localhost:9000/user/xsj/output】->【VM arguments: -Xms512m -Xmx512m】->【Apply】->【Close】->【Run】->【Run As】->【Run On Hadoop】
11. Run:
Normal console output:
12/06/01 10:23:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/06/01 10:23:33 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/06/01 10:23:36 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.JobClient: Running job: job_local_0001
12/06/01 10:23:37 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:23:40 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:23:40 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:23:44 INFO mapred.JobClient:  map 0% reduce 0%
12/06/01 10:23:52 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:23:59 INFO mapred.LocalJobRunner:
12/06/01 10:23:59 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:00 INFO mapred.JobClient:  map 100% reduce 0%
12/06/01 10:24:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/06/01 10:24:03 INFO mapred.LocalJobRunner:
12/06/01 10:24:03 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
12/06/01 10:24:04 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:24:08 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:24:08 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:24:10 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:24:10 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:11 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
12/06/01 10:24:12 INFO mapred.LocalJobRunner:
12/06/01 10:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
12/06/01 10:24:14 INFO mapred.LocalJobRunner:
12/06/01 10:24:16 INFO mapred.Merger: Merging 2 sorted segments
12/06/01 10:24:17 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 77 bytes
12/06/01 10:24:17 INFO mapred.LocalJobRunner:
12/06/01 10:24:20 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/06/01 10:24:20 INFO mapred.LocalJobRunner:
12/06/01 10:24:20 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/06/01 10:24:21 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost:9000/user/xsj/output
12/06/01 10:24:21 INFO mapred.LocalJobRunner: reduce > reduce
12/06/01 10:24:21 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
12/06/01 10:24:22 INFO mapred.JobClient:  map 100% reduce 100%
12/06/01 10:24:22 INFO mapred.JobClient: Job complete: job_local_0001
12/06/01 10:24:22 INFO mapred.JobClient: Counters: 14
12/06/01 10:24:22 INFO mapred.JobClient:   FileSystemCounters
12/06/01 10:24:22 INFO mapred.JobClient:     FILE_BYTES_READ=50488
12/06/01 10:24:22 INFO mapred.JobClient:     HDFS_BYTES_READ=120
12/06/01 10:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=102748
12/06/01 10:24:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=41
12/06/01 10:24:22 INFO mapred.JobClient:   Map-Reduce Framework
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce input groups=5
12/06/01 10:24:22 INFO mapred.JobClient:     Combine output records=6
12/06/01 10:24:22 INFO mapred.JobClient:     Map input records=4
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce output records=5
12/06/01 10:24:22 INFO mapred.JobClient:     Spilled Records=12
12/06/01 10:24:22 INFO mapred.JobClient:     Map output bytes=81
12/06/01 10:24:22 INFO mapred.JobClient:     Combine input records=8
12/06/01 10:24:22 INFO mapred.JobClient:     Map output records=8
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce input records=6
12. View the results:
The output is written to /user/xsj/output/part-r-00000:
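Each line of part-r-00000 is a word and its count separated by a tab, as produced by the default TextOutputFormat. A minimal plain-Java sketch for reading one such line back (the sample line is hypothetical, not the actual job output):

```java
public class OutputLineParser {
    // Splits one "word\tcount" line into its key and value parts.
    static String[] parse(String line) {
        int tab = line.indexOf('\t');
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        String[] kv = parse("hello\t2");
        System.out.println(kv[0] + " -> " + kv[1]); // hello -> 2
    }
}
```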