Common exceptions during Hadoop installation and how to fix them

2014-04-01 00:00:00 Source: 中存储网

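While learning Hadoop you run into all sorts of problems: Hadoop fails to start, the cluster does not work properly, error messages keep popping up, and so on. This article collects several common problems and how to handle them; hopefully it is useful.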

Hadoop fails to start (1)

After running  $ bin/start-all.sh, Hadoop does not start.

Exception 1

Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.

localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:214)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:135)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:119)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:481)


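Fix: this happens when the NameNode address is not configured, so fs.default.name (fs.defaultFS) falls back to its default, file:///. In version 0.21.0 the properties below go in conf/mapred-site.xml; in earlier versions they go in conf/core-site.xml. In 0.20.2, setting them in mapred-site.xml has no effect; only core-site.xml works.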

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

<property>

<name>mapred.job.tracker</name>

<value>hdfs://localhost:9001</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>
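
To see which file actually defines fs.default.name, a small shell helper can pull a property value out of a *-site.xml file. This is only a sketch: the function name hadoop_conf_get is hypothetical, and the awk matching assumes that each <name> and <value> element sits on a single line, as in the listing above.

```shell
# hadoop_conf_get FILE NAME
# Hypothetical helper: prints the <value> of property NAME from a Hadoop
# *-site.xml file. Assumes each <name>/<value> element fits on one line.
hadoop_conf_get() {
  awk -v want="$2" '
    /<name>/ {
      n = $0
      sub(/.*<name>[ \t]*/, "", n)
      sub(/[ \t]*<\/name>.*/, "", n)
    }
    /<value>/ {
      v = $0
      sub(/.*<value>[ \t]*/, "", v)
      sub(/[ \t]*<\/value>.*/, "", v)
      if (n == want) { print v; exit }
    }
  ' "$1"
}

# Example (the conf/ paths are assumptions about your layout):
#   for f in conf/core-site.xml conf/mapred-site.xml; do
#     echo "$f: $(hadoop_conf_get "$f" fs.default.name)"
#   done
```

An empty result for every file means the property is not set anywhere, which matches the "file:/// has no authority" message.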

Hadoop fails to start (2)

Exception 2

starting namenode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-namenode-aist.out

localhost: starting datanode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-datanode-aist.out

localhost: starting secondarynamenode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-secondarynamenode-aist.out

localhost: Exception in thread "main" java.lang.NullPointerException

localhost:      at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:134)

localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:156)

localhost:      at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:160)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:131)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:115)

localhost:      at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:469)

starting jobtracker, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-jobtracker-aist.out

localhost: starting tasktracker, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-tasktracker-aist.out

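Fix: this too happens when the NameNode address is not configured. In version 0.21.0 the properties below go in conf/mapred-site.xml; in earlier versions they go in conf/core-site.xml. In 0.20.2, setting them in mapred-site.xml has no effect; only core-site.xml works.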

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

<property>

<name>mapred.job.tracker</name>

<value>hdfs://localhost:9001</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>
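
The startup output for this exception names one .out file per daemon. When several daemons are started at once, it can help to list only the files that recorded a failure. A minimal sketch, assuming the exceptions printed on the console also end up in those .out files; the function name failed_out_logs is hypothetical.

```shell
# failed_out_logs DIR
# Hypothetical helper: lists the hadoop-*.out files under DIR that mention
# an exception or an error. Prints nothing if no file matches.
failed_out_logs() {
  grep -lE 'Exception|Error' "$1"/hadoop-*.out 2>/dev/null
}

# Example, using the log directory from the output above:
#   failed_out_logs /home/xixitie/hadoop/logs
```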

Hadoop fails to start (3)

Exception 3

starting namenode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-namenode-aist.out

localhost: starting datanode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-datanode-aist.out

localhost: Error: JAVA_HOME is not set.

localhost: starting secondarynamenode, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-secondarynamenode-aist.out

localhost: Error: JAVA_HOME is not set.

starting jobtracker, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-jobtracker-aist.out

localhost: starting tasktracker, logging to /home/xixitie/hadoop/bin/../logs/hadoop-root-tasktracker-aist.out

localhost: Error: JAVA_HOME is not set.

Fix:

Set the JDK environment variables in $hadoop/conf/hadoop-env.sh:

JAVA_HOME=/home/xixitie/jdk

CLASSPATH=$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export JAVA_HOME CLASSPATH
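
Before restarting, it is worth confirming that the path really points at a JDK. The check below is a sketch under that assumption; check_java_home is a hypothetical name, and it only tests for an executable bin/java under the given directory.

```shell
# check_java_home DIR
# Hypothetical helper: prints "ok" and returns 0 if DIR/bin/java exists and
# is executable; otherwise prints a short reason and returns 1.
check_java_home() {
  if [ -z "$1" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$1/bin/java" ]; then
    echo "no executable java under $1/bin"
    return 1
  fi
  echo "ok"
}

# Example: load the settings, then check them.
#   . "$hadoop/conf/hadoop-env.sh"
#   check_java_home "$JAVA_HOME"
```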




Hadoop fails to start (4)

Exception 4: mapred-site.xml uses localhost:9000 rather than hdfs://localhost:9000

The messages look like this:

11/04/20 23:33:25 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

11/04/20 23:33:25 WARN fs.FileSystem: "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.

11/04/20 23:33:25 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

11/04/20 23:33:25 WARN fs.FileSystem: "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.

11/04/20 23:33:25 WARN fs.FileSystem: "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.

Fix:

In mapred-site.xml, use hdfs://localhost:9000 rather than localhost:9000:

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

<property>

<name>mapred.job.tracker</name>

<value>hdfs://localhost:9001</value>

</property>
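
The rule behind the warning is simply "add the hdfs:// scheme if none is present". As an illustration only, the hypothetical function below applies that rule to a single value; it is not Hadoop's own logic, and it leaves any value that already has a scheme untouched.

```shell
# normalize_fs_name VALUE
# Hypothetical helper: prefixes VALUE with hdfs:// when it has no scheme,
# so "localhost:9000" becomes "hdfs://localhost:9000".
normalize_fs_name() {
  case "$1" in
    *://*) printf '%s\n' "$1" ;;        # already has a scheme, keep as is
    *)     printf 'hdfs://%s\n' "$1" ;; # bare host:port, add the scheme
  esac
}

# Example:
#   normalize_fs_name localhost:9000
```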

Hadoop fails to start (5)

Exception 5: fixing "no namenode to stop"

The messages look like this:

11/04/20 21:48:50 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 0 time(s).

11/04/20 21:48:51 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 1 time(s).

11/04/20 21:48:52 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 2 time(s).

11/04/20 21:48:53 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 3 time(s).

11/04/20 21:48:54 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 4 time(s).

11/04/20 21:48:55 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 5 time(s).

11/04/20 21:48:56 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 6 time(s).

11/04/20 21:48:57 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 7 time(s).

11/04/20 21:48:58 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 8 time(s).

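Fix:

This problem occurs because the namenode never started, which is also why the stop script reports "no namenode to stop". Data left over from an earlier run may be interfering with the namenode.

You need to reformat it (this erases the existing HDFS metadata):

$ bin/hadoop namenode -format

and then run

$ bin/start-all.sh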
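
To confirm that the namenode really came up this time, the JDK's jps tool lists running Java processes as "<pid> <main class>" lines. The sketch below filters such output for one daemon; daemon_listed is a hypothetical name, and it reads the listing from standard input so that it can be tried without a running cluster.

```shell
# daemon_listed NAME
# Hypothetical helper: reads jps-style lines ("<pid> <main class>") on
# stdin and returns 0 if a process whose main class is exactly NAME is
# listed, 1 otherwise.
daemon_listed() {
  awk -v d="$1" '$2 == d { found = 1 } END { exit found ? 0 : 1 }'
}

# Example:
#   jps | daemon_listed NameNode || echo "NameNode is not running, check its log"
```

The exact match on the second column keeps SecondaryNameNode from being mistaken for NameNode.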

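Hadoop fails to start (6)

Exception 6: fixing "no datanode to stop"

Sometimes the stored data gets into an inconsistent state and the datanode can no longer start.

Reformatting with hadoop namenode -format alone does not help, because the files under /tmp are not cleared.

The files in /tmp/hadoop* have to be removed as well.

Steps:

1. Delete /tmp in HDFS

hadoop  fs -rmr /tmp

2. Stop Hadoop

stop-all.sh

3. Delete /tmp/hadoop*

rm -rf /tmp/hadoop*

4. Format Hadoop

hadoop namenode -format

5. Start Hadoop

start-all.sh

After this, the datanode starts again.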
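
Because these steps delete data, it can be useful to wrap them so that they are printed first and executed only on request. The wrapper below is a sketch of that idea; the names step and reset_datanode_state and the DRY_RUN variable are all hypothetical, and the commands are exactly the five listed above.

```shell
# step CMD [ARGS...]
# Hypothetical helper: prints the command, and runs it only when DRY_RUN=0.
# The default (DRY_RUN unset or 1) is to print without running anything.
step() {
  echo "+ $*"
  [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# Runs the five steps above in order.
reset_datanode_state() {
  step hadoop fs -rmr /tmp
  step stop-all.sh
  step rm -rf /tmp/hadoop*
  step hadoop namenode -format
  step start-all.sh
}

# Example:
#   reset_datanode_state              # print the commands only
#   DRY_RUN=0 reset_datanode_state    # actually run them
```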

++++++++++++++++++++++++++++++++++++++++

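For server maintenance, one machine in the Hadoop cluster had to be taken out of service, so I logged on to that server to stop its datanode and tasktracker and ran:
bin/hadoop-daemon.sh stop datanode
To my surprise it printed:
no datanode to stop
Yet the process list showed that the datanode and tasktracker were still running. I tried several times with the same result, and finally tried stopping it with the namenode-side command:
bin/stop-dfs.sh
It still printed:
no datanode to stop
As a last resort I used brute force and killed the processes with kill -9.
Once the Hadoop processes had been killed, bin/hadoop-daemon.sh worked normally again. Have other Hadoop users run into this?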

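But I could not leave it at that. I searched online and found nothing satisfactory, so I had to read the code myself.
Reading hadoop-daemon.sh, I found that the script stops Hadoop services by way of pid files. My cluster used the default configuration, which puts the pid files under /tmp. Comparing the process IDs in the Hadoop pid files under /tmp with the IDs reported by ps ax, I found that they did not match, and that was the root cause.
So, time to update the Hadoop configuration.
In hadoop-env.sh, set: export HADOOP_PID_DIR=<Hadoop installation path>
Then, using the pids of the Hadoop processes running on the cluster, create the corresponding pid files under the Hadoop installation path:
hadoop-<user running Hadoop>-datanode.pid
hadoop-<user running Hadoop>-tasktracker.pid
hadoop-<user running Hadoop>-namenode.pid
hadoop-<user running Hadoop>-jobtracker.pid
hadoop-<user running Hadoop>-secondarynamenode.pid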
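
The comparison between a pid file and the real process ID can be scripted. The sketch below builds the pid file name from the pattern listed above and reports whether its content matches a given pid; the function names pid_file_path and pid_file_status are hypothetical, and the real pid is assumed to come from a tool such as jps or ps.

```shell
# pid_file_path DIR USER DAEMON
# Hypothetical helper: prints DIR/hadoop-USER-DAEMON.pid.
pid_file_path() {
  printf '%s/hadoop-%s-%s.pid\n' "$1" "$2" "$3"
}

# pid_file_status FILE PID
# Hypothetical helper: prints "missing" if FILE does not exist, "match" if
# it contains PID (ignoring whitespace), and "mismatch" otherwise.
pid_file_status() {
  if [ ! -f "$1" ]; then
    echo "missing"
  elif [ "$(tr -d '[:space:]' < "$1")" = "$2" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Example: rewrite the datanode pid file if it is missing or stale.
#   pid=$(jps | awk '$2 == "DataNode" { print $1 }')
#   f=$(pid_file_path "$HADOOP_PID_DIR" "$USER" datanode)
#   [ "$(pid_file_status "$f" "$pid")" = "match" ] || echo "$pid" > "$f"
```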