Experiment environment:
Operating system: Ubuntu 16.04 LTS
Hadoop version: Hadoop 2.7.1
1. Installing the Java environment
dblab@dblab-VirtualBox:/$ sudo apt-get install default-jre default-jdk
dblab@dblab-VirtualBox:/$ vim ~/.bashrc # add the following line to the file
export JAVA_HOME=/usr/lib/jvm/default-java
dblab@dblab-VirtualBox:/$ source ~/.bashrc # apply the new variable to the current shell
dblab@dblab-VirtualBox:/$ echo $JAVA_HOME
/usr/lib/jvm/default-java
dblab@dblab-VirtualBox:/$ java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
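The ~/.bashrc edit above can also be done without opening an editor. The following is a minimal sketch, not part of the original steps: the helper name add_export_once is hypothetical, and it assumes the same JAVA_HOME path as above. It appends the line only when that exact line is not already in the file, so running it twice does not duplicate the entry.

```shell
# Hypothetical helper: append a line to an rc file unless that exact line
# is already present in it.
add_export_once() {
    rc_file="$1"
    line="$2"
    touch "$rc_file"    # make sure the file exists before searching it
    # -x: match whole lines only, -F: treat the text as a fixed string
    if ! grep -qxF -- "$line" "$rc_file"; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

A possible call would be add_export_once ~/.bashrc 'export JAVA_HOME=/usr/lib/jvm/default-java', followed by source ~/.bashrc as before.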
2. Installing Hadoop
dblab@dblab-VirtualBox:/$ cd ~/下载 # 下载 is the Downloads folder on a Chinese-locale desktop
dblab@dblab-VirtualBox:~/下载$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
dblab@dblab-VirtualBox:~/下载$ sudo tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local
dblab@dblab-VirtualBox:~/下载$ cd /usr/local
dblab@dblab-VirtualBox:/usr/local$ sudo mv ./hadoop-2.7.1/ ./hadoop # rename the directory
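dblab@dblab-VirtualBox:/usr/local$ sudo chown -R hadoop ./hadoop/ # change the owner to the user hadoop (use your own login name if no such user exists)
dblab@dblab-VirtualBox:/usr/local$ cd ./hadoop
dblab@dblab-VirtualBox:/usr/local/hadoop$ ./bin/hadoop version # show version information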
Hadoop 2.7.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 15ecc87ccf4a0228f35af08fc56de536e6ce657a
Compiled by jenkins on 2015-06-29T06:04Z
Compiled with protoc 2.5.0
From source with checksum fc0a1a23fc1868e4d5ee7fa2b28a58a
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar
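It is easy to end up with a different release than the one you meant to install, so it can help to compare the version in the tarball name with the one reported by ./bin/hadoop version. The sketch below is only an illustration: both function names are hypothetical, and it assumes the tarball is named hadoop-<version>.tar.gz and that the first line of the version output has the form "Hadoop <version>", as in the transcript above.

```shell
# Hypothetical helper: extract the version from a tarball name such as
# hadoop-2.7.1.tar.gz (a leading directory path is ignored).
version_from_tarball() {
    name=$(basename "$1")
    name=${name#hadoop-}              # drop the "hadoop-" prefix
    printf '%s\n' "${name%.tar.gz}"   # drop the ".tar.gz" suffix
}

# Hypothetical helper: read the output of `hadoop version` on stdin and
# print the version number from its first line.
version_from_banner() {
    awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}
```

For example, ./bin/hadoop version | version_from_banner could be compared with version_from_tarball ~/下载/hadoop-2.7.1.tar.gz; the two strings should be equal.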
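3. Hadoop standalone configuration
Hadoop runs in non-distributed (local) mode by default, so it works without any further configuration.
To get a feel for Hadoop, run one of the many examples it ships with. Running the examples jar without arguments lists them: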
dblab@dblab-VirtualBox:/usr/local/hadoop$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
$ cd /usr/local/hadoop
$ sudo mkdir ./input
$ sudo cp ./etc/hadoop/*.xml ./input # use the configuration files as input
$ ./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep ./input ./output 'dfs[a-z.]+'
$ cat ./output/* # view the result
1 dfsadmin
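To see what the grep example computes, the same count can be approximated with ordinary shell tools. This is a rough sketch for comparison only, not how Hadoop implements the job: the function name count_matches is hypothetical, and it assumes GNU grep (for the -o option) and matches that contain no whitespace. It prints each distinct match with its number of occurrences, most frequent first.

```shell
# Hypothetical helper: count every distinct regex match in the files of a
# directory and print "count<TAB>match", highest count first.
count_matches() {
    input_dir="$1"
    regex="$2"
    # -o: print only the matched text, -h: no file names, -E: extended regex
    grep -ohE -- "$regex" "$input_dir"/* \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $1 "\t" $2 }'
}
```

Calling count_matches ./input 'dfs[a-z.]+' in /usr/local/hadoop should give a result comparable to the contents of ./output.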
Original article by 奋斗. If you repost it, please credit the source: https://blog.ytso.com/193887.html