I am using Ubuntu 12. First, get the following software and tools ready (all links point to Sina Weipan).
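· VMWare Workstation (free download from the official site)
· ubuntu-12.04.1-desktop-i386.iso
· jdk-7u7-windows-i586.rar
· The instructor stressed repeatedly that hadoop versions differ a lot from one another, so beginners had better use the same version as the instructor, namely hadoop-0.20.2.tar.gz
· WinSCP (the one I used), PuTTY or SecureCRT, to transfer the jdk and hadoop to ubuntu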
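Install Ubuntu
There is basically nothing worth noting here. After installation mine booted into command-line mode by default; startx enters the graphical (GUI) mode.
In Ubuntu you can adjust the display resolution to make the GUI a comfortable size, and searching for "terminal" opens the command-line tool. ctrl+alt+f1~f6 switches to the text consoles, and in command-line mode alt + left/right arrow switches between them.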
Configure the network (not a required step for installing hadoop)
Some friends in the chat group use bridged mode, which requires everything to be on the same network segment, so we took the chance to play with the network settings (note: I do not think this is a required step for installing hadoop). Thanks to network-manager, Ubuntu gets online without any setup; open settings > network to see the network configuration, but it is based on DHCP. The IP I set through sudo vi /etc/network/interfaces was changed back by network-manager after a reboot. This article mentions that the two methods conflict with each other and explains how to handle it; I simply took the crude route and uninstalled it with sudo apt-get autoremove network-manager --purge.
autoremove: removes packages that were installed automatically to satisfy dependencies and are no longer needed; the --purge option makes apt-get remove their config files as well.
Steps: configure static IP > DNS > host name > hosts
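Configure a static IP
VM > settings > network shows that I am using VMWare's default NAT mode (explained here as: with NAT the virtual machine and the host machine can ping each other, while other hosts cannot ping the virtual machine). With this mode the HOST and the VM indeed do not need IPs in the same network segment, yet they can still ping each other.
If you are interested in how the three modes differ, search for "VMWare bridged, NAT, Host Only differences". In VMWare Workstation, menu > Edit > Virtual Network Editor shows that NAT uses VMnet8, one of the two virtual network adapters that VMWare creates automatically when it is installed.
Click NAT Settings to see the following information:
Gateway: 192.168.221.2
IP range: 192.168.221.128~254
Subnet mask: 255.255.255.0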
:sudo vi /etc/network/interfaces
(For vi/vim, see the vim editor chapter of VBird's "鸟哥的 Linux 私房菜")
auto lo #localhost
iface lo inet loopback #this part configures localhost/127.0.0.1 and can be kept
#add eth0, the configuration of network card 0
auto eth0
iface eth0 inet static #static ip
address 192.168.221.130
netmask 255.255.255.0
gateway 192.168.221.2
dns-nameservers 192.168.221.2 8.8.8.8
#dns-search test.com newly learned: by default this automatically appends .test.com to host names
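A static address only works if it sits in the same network as the NAT gateway. As a minimal sketch (the helper name same_subnet24 is made up for illustration, and it assumes the 255.255.255.0 mask used above), the check can be scripted like this:

```shell
# Hypothetical helper, not part of any tool: succeeds when two IPv4
# addresses share their first three octets, i.e. belong to the same /24
# network (netmask 255.255.255.0, as configured above).
same_subnet24() {
    # ${1%.*} strips the last octet, e.g. 192.168.221.130 -> 192.168.221
    [ "${1%.*}" = "${2%.*}" ]
}

# Example with the address and gateway from this article.
if same_subnet24 192.168.221.130 192.168.221.2; then
    echo "address and gateway are in the same /24"
fi
```

For any other netmask the comparison would have to be done on the masked bits rather than on whole octets.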
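Restart the network
:sudo /etc/init.d/networking restart #eth0 is only established after a restart
:whereis ifup #...
:sudo /sbin/ifup eth0 #after editing eth0 by hand, eth0 must be brought up for the change to take effect, as this article explains
:sudo /sbin/ifdown eth0
:sudo /etc/init.d/networking restart #restart again
:ifconfig #check the IP; shows the eth0 information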
#Configure DNS
:sudo vi /etc/resolv.conf
Add the following, which includes Google's public DNS:
nameserver 192.168.221.2
nameserver 8.8.8.8
network-manager overwrites this file, so the latter has to be knocked out
:sudo apt-get autoremove network-manager --purge
#Configure HOST
:sudo vi /etc/hosts
Add
192.168.221.130 h1
192.168.221.141 h2
192.168.221.142 h3
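Since the same three entries have to match on every node, it can help to check what a given hosts file actually maps a name to. A small sketch, where lookup_host is a hypothetical helper written for this article rather than an existing command:

```shell
# Hypothetical helper: print the address that a hosts-format file maps
# a name to (first match wins), or nothing if the name is not listed.
# Usage: lookup_host FILE NAME
lookup_host() {
    awk -v name="$2" '
        $1 !~ /^#/ {                      # skip comment lines
            for (i = 2; i <= NF; i++)     # fields after the address are names
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}
```

On h1, lookup_host /etc/hosts h2 should then print the address entered above for h2; an empty result means the entry is missing.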
#Configure host name
:whereis hostname
:sudo vi /etc/hostname
Write h1
Run
:sudo hostname h1
The network is now configured. If you are not cloning, run through the same steps on all three servers (sore hands); I suggest copying /etc/hosts over with scp.
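Create a dedicated user for hadoop
Create a dedicated user for operating hadoop. The same user must later be created on the other cluster nodes too, so that the node servers can connect to one another over SSH using these users and their RSA public keys.
(I got burned badly here: useradd and adduser are two different commands and are used differently; this article explains it fairly clearly.)
What I used was
:sudo useradd hadoop_admin
:sudo passwd hadoop_admin
After logging in with it, I found there was no home information (useradd does not create a home directory unless -m is given); the prompt showed
$:
Then I switched back to root and took it upon myself to create the /home/hadoop_admin directory (so only root had permission on it).
The first problem I noticed was that generating the rsa ssh key reported no write permission on the directory.
I looked up related material, listed the user's permissions on home, and found the owner was root.
Continuing,
I found the permission was 0, which means the user had been created improperly. A group member told me to set the permissions by hand with chmod (using sudo chown -R hadoop_admin /home/hadoop_admin, which is also what useradd requires). I found that too troublesome, looked it up, and decided to recreate the user (surely not acceptable in real IT operations =O=).
:sudo deluser hadoop_admin
:sudo adduser hadoop_admin --home /home/hadoop_admin --uid 545
Now it works normally.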
1. Create the user
:sudo adduser hadoop_admin --home /home/hadoop_admin --uid 545
2. Add the user to the list of users allowed to run sudo
:sudo vi /etc/sudoers
Add the following to the file
3. Generate an SSH KEY for the user (covered below)
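Install SSH and generate the RSA KEY
1. Install OpenSSH
Background: for debian packages and apt-get, see here
:sudo apt-get install openssh-server
Once this finishes, ssh should in theory be running. You can now transfer files with WinSCP in explorer mode; copy both the JDK and HADOOP over.
It is worth looking at the ssh configuration, which helps in understanding the password-less SSH public-key connections between node servers below. For a complete beginner like me, the whereis command is extremely handy..
Because installing hadoop keeps asking whether to add the host to known_hosts, this line becomes interesting.
Ubuntu/debian turns on HashKnownHosts yes by default in the ssh client configuration (/etc/ssh/ssh_config), so host names are stored hashed in known_hosts, and the first ssh to each hostname asks whether to add it to the known_hosts file. Further reading on OpenSSH.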
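2. Generate the private and public key files for hadoop_admin
#log in as hadoop_admin and switch to the ~/ home directory
:cd ~/
:ssh-keygen -t rsa #generate SSH keys with the RSA algorithm; -t sets the algorithm type
This automatically creates the .ssh folder in the user's home directory with two files, id_rsa (private key) and id_rsa.pub (public key).
:cd ~/.ssh
:cp id_rsa.pub authorized_keys #as learned above, authorized_keys holds the public keys that SSH recognizes and lets through automatically; in my experiments every key string ended with login_name@hostname
(other users' public keys can be dropped in here too)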
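Because this step has to be repeated on every node, a non-interactive variant may be handy. The sketch below assumes an empty passphrase is acceptable for this practice cluster; ensure_rsa_key is a hypothetical helper name, while -q, -t, -N and -f are standard ssh-keygen options.

```shell
# Hypothetical helper: create an RSA key pair without prompting, unless
# the private key file already exists (so an existing key is never
# overwritten). Usage: ensure_rsa_key [PATH_TO_PRIVATE_KEY]
ensure_rsa_key() {
    key=${1:-$HOME/.ssh/id_rsa}
    if [ -f "$key" ]; then
        echo "exists: $key"
        return 0
    fi
    mkdir -p "$(dirname "$key")"
    chmod 700 "$(dirname "$key")"
    # -N '' sets an empty passphrase, -f names the output file, -q is quiet
    ssh-keygen -q -t rsa -N '' -f "$key"
}
```

Called with no argument as hadoop_admin, it works on ~/.ssh/id_rsa, the same file the interactive command above produces.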
Install JDK
I went back and forth over several installation methods: searching for JDK in Ubuntu Software Center, which installed OpenJDK, and editing the debian source list to run sudo apt-get install java-6-sun. Neither worked well. The simplest method is to download Sun's jdk -> extract -> set JAVA_HOME.
1. Get the JDK file ready
As covered above: the download location, and copying the file into the VM system over ssh
2. Install the JDK
I installed it under /usr/lib/jvm/jdk1.7.0_21 (best to keep this directory identical on all servers, otherwise it will drive you crazy~)
:sudo tar xvf ~/Downloads/[jdk].tar.gz -C /usr/lib/jvm
:cd /usr/lib/jvm
:ls
Go in and take a look
3. Set JAVA_HOME and the related variables
:sudo vi /etc/profile
#add the following to set the environment variables
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_21
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH:$JRE_HOME/lib
#run this to make it take effect
:source /etc/profile
#run this to verify
:cd $JAVA_HOME
#if it lands in the right directory, the setup is done
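Being able to cd into $JAVA_HOME only proves that the directory exists. A slightly stricter check, sketched here with a hypothetical helper name, also looks for the java and jps executables that the later steps rely on:

```shell
# Hypothetical helper: succeed only if DIR looks like a JDK root, i.e.
# both bin/java and bin/jps exist and are executable.
# Usage: check_jdk_home DIR
check_jdk_home() {
    [ -x "$1/bin/java" ] && [ -x "$1/bin/jps" ]
}
```

After source /etc/profile, running check_jdk_home "$JAVA_HOME" && "$JAVA_HOME/bin/java" -version should print the JDK version if the path was typed correctly.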
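Install hadoop
1. Get the hadoop file ready
As covered above, transfer hadoop-0.20.2 to the target machine over ssh
2. Install hadoop
Extract it into hadoop_admin's directory (Q: does it have to be this directory?) ->
:sudo tar xvf [path to hadoop.tar.gz] -C /home/hadoop_admin/
3. Configure hadoop
There is a lot to learn about configuration; what follows is the simplest possible setup... I think I will need until next week to really understand it... There are explanations of some basic properties here; below I type them in by hand myself to strengthen memory and understanding.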
a. Set the HADOOP_HOME environment variable for convenience
:sudo vi /etc/profile
export HADOOP_HOME=/home/hadoop_admin/hadoop-0.20.2
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_21
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH:$JRE_HOME/lib:$HADOOP_HOME/bin
:source /etc/profile #run this to make it take effect
:cd $HADOOP_HOME
:cd conf/
:ls
b. Set the jdk path: add JAVA_HOME to the environment configuration
:sudo vi hadoop-env.sh #add JAVA_HOME to hadoop-env.sh
If you do not remember the JDK path, you can run
:echo $JAVA_HOME
c. core-site.xml
Set the HDFS path of the name node. fs.default.name: the URI of the cluster's name node (protocol hdfs, host name/IP, port number); every machine in the cluster needs to know the name node information.
<configuration>
<property><name>fs.default.name</name><value>hdfs://h1:9000</value></property>
</configuration>
d. hdfs-site.xml
Set the storage paths for the name node's file system data and the number of copies (replication). To be honest, since I have not actually used hadoop yet, I have no real understanding of the namenode and datanode directory settings or of replication; I can only copy the example for now and will update this part later.
<property><name>dfs.name.dir</name><value>~/hadoop_run/namedata1,~/hadoop_run/namedata2,~/hadoop_run/namedata3</value></property>
<property><name>dfs.data.dir</name><value>~/hadoop-0.20.2/data</value></property>
<property><name>dfs.replication</name><value>3</value></property>
e. mapred-site.xml
mapred: the map-reduce jobtracker information
<property><name>mapred.job.tracker</name><value>h1:9001</value></property>
f. masters
Add the master node information, here h1
g. slaves
Add the slave node information, here h2 and h3
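The masters and slaves files are plain text with one host name per line. As a sketch (write_node_lists is a hypothetical helper; the conf directory is assumed to be $HADOOP_HOME/conf as above), they could be written in one go instead of being edited by hand:

```shell
# Hypothetical helper: write conf/masters and conf/slaves, one host
# name per line. Usage: write_node_lists CONF_DIR MASTER SLAVE...
write_node_lists() {
    conf_dir=$1
    master=$2
    shift 2
    printf '%s\n' "$master" > "$conf_dir/masters"
    # printf repeats the format for every remaining argument
    printf '%s\n' "$@" > "$conf_dir/slaves"
}
```

For the cluster in this article the call would be write_node_lists "$HADOOP_HOME/conf" h1 h2 h3.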
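4. Configure the h2 and h3 node servers
What a long journey. I installed h2 and h3 from scratch in VMWare and repeated all of the environment setup above as a second round of practice, rather than using clone mode to copy the image. This exposed a lot of problems, such as the jdk and hadoop install directories differing (purely spelling mistakes and the like), which made later file changes exhausting~ So beginners like me had better keep everything identical, and the name of the operating user, such as hadoop_admin, is best kept the same too.
4.1 Install and configure the h2 and h3 node servers
Repeat creating the hadoop_admin user, installing ssh and generating the key, then stop there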
4.2 Import the public keys of h2 and h3 into h1's authorized_keys, to enable password-less SSH file transfer later
The method is to first transfer the h2 and h3 files to a directory on h1 with scp (secure copy)
On h2: sudo scp ~/.ssh/id_rsa.pub hadoop_admin@h1:~/h2pub
On h3: sudo scp ~/.ssh/id_rsa.pub hadoop_admin@h1:~/h3pub
On h1
:sudo cat ~/.ssh/id_rsa.pub ~/h2pub ~/h3pub > ~/.ssh/authorized_keys #concatenate my own public key with those of h2 and h3
:sudo scp ~/.ssh/authorized_keys hadoop_admin@h2:~/.ssh/authorized_keys #well, then copy it back (Q: do the slaves need it?)
:sudo scp ~/.ssh/authorized_keys hadoop_admin@h3:~/.ssh/authorized_keys
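If the merge is repeated, for example after regenerating a key, plain concatenation can leave duplicate lines. A minimal sketch of a repeatable merge follows; merge_pubkeys is a hypothetical helper, and the chmod reflects the common advice that sshd may ignore an authorized_keys file that other users can write to.

```shell
# Hypothetical helper: merge several public key files into one
# authorized_keys file, dropping duplicate lines, then restrict the
# result to its owner. Usage: merge_pubkeys OUT_FILE PUB_FILE...
merge_pubkeys() {
    out=$1
    shift
    # Write to a temporary file first so that OUT_FILE may itself be
    # listed among the inputs without being truncated mid-read.
    sort -u "$@" > "$out.tmp" && mv "$out.tmp" "$out" && chmod 600 "$out"
}
```

On h1 this would be merge_pubkeys ~/.ssh/authorized_keys ~/.ssh/id_rsa.pub ~/h2pub ~/h3pub. The key order changes because of the sort, which does not matter for authorized_keys.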
4.3 Install JDK and HADOOP onto h2 and h3 directly from h1
a. Install the jdk
:sudo scp -r $JAVA_HOME hadoop_admin@h2:/usr/lib/jvm
:sudo scp -r $JAVA_HOME hadoop_admin@h3:/usr/lib/jvm
If /etc/profile is the same, throw it over as well..
:sudo scp /etc/profile h2:/etc/profile
:sudo scp /etc/profile h3:/etc/profile
b. Install hadoop
:sudo scp -r $HADOOP_HOME hadoop_admin@h2:~/hadoop-0.20.2
:sudo scp -r $HADOOP_HOME hadoop_admin@h3:~/hadoop-0.20.2
c. If /etc/hosts is the same, send it over too..
:sudo scp /etc/hosts h2:/etc/hosts
:sudo scp /etc/hosts h3:/etc/hosts
Check the steps above: if the machines can all ping one another and ssh [hostname] connects between all of them without a password, then the three servers should be fully configured, and hadoop itself needs no extra configuration.
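The ping and ssh checks can be run in a loop. The following is only a sketch: check_nodes is a hypothetical helper, -W is the reply timeout of the Linux ping, and BatchMode=yes makes ssh fail instead of prompting, so a host that still asks for a password (or whose key has not been accepted yet) shows up as a failure.

```shell
# Hypothetical helper: for each host, check that it answers one ping
# and that ssh can run a trivial command without any prompt.
# Returns non-zero if any host fails. Usage: check_nodes HOST...
check_nodes() {
    failed=0
    for host in "$@"; do
        if ! ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "$host: ping failed"
            failed=1
        elif ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null >/dev/null 2>&1; then
            echo "$host: ssh failed or asked for a password"
            failed=1
        else
            echo "$host: ok"
        fi
    done
    return "$failed"
}
```

Run check_nodes h1 h2 h3 as hadoop_admin on each of the three machines in turn.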
5. Format the name node
arr.. what exactly does this thing do? I was curious and searched for it directly; someone has actually read the source code. TBD; I will look at it when I dig deeper later.
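In hadoop 0.20.x the format step is run as bin/hadoop namenode -format on the master. Since formatting wipes the name node metadata, a guarded wrapper may be safer. This is a sketch under the assumption that an already formatted name directory contains a current/ subdirectory; format_namenode_once is a hypothetical name.

```shell
# Hypothetical helper: format the name node only if NAME_DIR (one of
# the dfs.name.dir directories) does not look formatted yet.
# Assumption: a formatted name directory contains a "current" folder.
# Usage: format_namenode_once NAME_DIR
format_namenode_once() {
    name_dir=$1
    if [ -d "$name_dir/current" ]; then
        echo "already formatted: $name_dir"
        return 0
    fi
    "$HADOOP_HOME/bin/hadoop" namenode -format
}
```

It would be run once on h1, as hadoop_admin, before starting the cluster for the first time.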
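6. Start hadoop
In theory, if java home, the user and its permissions, host, IP, password-less ssh and so on are all configured correctly, you can sit back and wait for the result here (in practice, though, there were plenty of problems... all kinds of careless configuration mistakes)
:sudo $HADOOP_HOME/bin/start-all.sh
At this step there should be no errors such as permission denied or file or directory not exists; seeing a shiny "started successfully" means startup went through without obstacles.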
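7. Verify that it worked
a. Processes are normal
:sudo $JAVA_HOME/bin/jps
name node: 4 processes
data node: 3 processes
b. http://localhost:50030
c. http://localhost:50070
OYEAH! At least on the surface everything looks good. If you see this, you have successfully installed a fully distributed hadoop cluster! The follow-up work will be more complex; look forward to it!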
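The process check in step a can also be scripted instead of counting lines by eye. The sketch below assumes the usual 0.20.x layout started by start-all.sh: NameNode, SecondaryNameNode and JobTracker on the master, DataNode and TaskTracker on each slave (plus Jps itself, which gives the 4 and 3 above). missing_daemons is a hypothetical helper.

```shell
# Hypothetical helper: given the text printed by jps ("PID Name" per
# line) and a list of expected daemon names, print each expected name
# that is not running. Usage: missing_daemons JPS_OUTPUT NAME...
missing_daemons() {
    jps_output=$1
    shift
    for name in "$@"; do
        # awk exits 0 only if some line has NAME as its second field
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
        fi
    done
}
```

On h1 this would be missing_daemons "$("$JAVA_HOME/bin/jps")" NameNode SecondaryNameNode JobTracker, and on h2 or h3 the names would be DataNode TaskTracker. No output means every expected daemon is up.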
May I keep running and never back down. So far I have been working on .Net B/S and C/S enterprise application development