Category: Big Data
2013-01-29 01:11:35
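First of all, I really want to thank the experts on the Google impala-user group for taking the time to point me in the right direction. To get to the point: I only recently started working with Impala. I spent a while reading its source code and then moved on to debugging, where I hit a problem almost immediately. I was debugging on top of a source tree that had already been built successfully. Because the whole project is managed with cmake, the build already produces debuggable executables under ${IMPALA_HOME}/be/build/debug, so debugging comes down to the following steps: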
1) Start impalad.
2) Attach gdb to the running process (gdb attach <pid of impalad>) to enter the debugging environment.
3) In another terminal, run the impala-shell.sh script.
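Step 2 could be wrapped in a small shell helper along these lines. This is only a sketch: the function name attach_impalad is made up for illustration, and it assumes that pidof is available and that exactly one impalad process is running on the host.

```shell
#!/bin/sh
# Hypothetical helper (not part of Impala): attach gdb to a running impalad.
# Assumes exactly one impalad process exists, so pidof prints a single PID.
attach_impalad() {
    # pidof exits non-zero when no matching process is found.
    pid=$(pidof impalad) || { echo "impalad is not running" >&2; return 1; }
    # -q suppresses the gdb banner, -p attaches to the given PID.
    gdb -q -p "$pid"
}
```

After sourcing the file, call attach_impalad in the terminal you want to debug in, then start impala-shell.sh in a second terminal as in step 3.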
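However, when I followed this gdb procedure, the shell connected to impalad fine, but as soon as I executed a SQL query the debugger side reported the following errors: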
[impala@hadoop-01 service]$ gdb -q impalad
Reading symbols from /usr/local/src/impala/be/build/debug/service/impalad...done.
(gdb) set args -use_statestore=false -nn=hadoop-01.localdomain -nn_port=8030
(gdb) b main
Breakpoint 1 at 0x9adba1: file /usr/local/src/impala/be/src/service/impalad-main.cc, line 71.
(gdb) r
Starting program: /usr/local/src/impala/be/build/debug/service/impalad -use_statestore=false -nn=hadoop-01.localdomain -nn_port=8030
[Thread debugging using libthread_db enabled]
Breakpoint 1, main (argc=4, argv=0x7fffffffbc48) at /usr/local/src/impala/be/src/service/impalad-main.cc:71
71 InitDaemon(argc, argv);
Missing separate debuginfos, use: debuginfo-install boost-date-time-1.41.0-11.el6_1.2.x86_64 boost-regex-1.41.0-11.el6_1.2.x86_64 boost-thread-1.41.0-11.el6_1.2.x86_64 bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.80.el6_3.6.x86_64 libevent-1.4.13-4.el6.x86_64 libgcc-4.4.6-4.el6.x86_64 libicu-4.2.1-9.1.el6_2.x86_64 libstdc++-4.4.6-4.el6.x86_64 zlib-1.2.3-27.el6.x86_64
(gdb) n
73 LlvmCodeGen::InitializeLlvm();
(gdb) n
76 if (!FLAGS_principal.empty()) {
(gdb) c
Continuing.
[New Thread 0x7ffff2dd8700 (LWP 7856)]
[New Thread 0x7ffff2cd7700 (LWP 7857)]
[New Thread 0x7ffff2a3a700 (LWP 7858)]
[New Thread 0x7ffff2939700 (LWP 7859)]
[New Thread 0x7ffff2838700 (LWP 7860)]
[New Thread 0x7fffec8a6700 (LWP 7861)]
[New Thread 0x7fffec7a5700 (LWP 7862)]
[New Thread 0x7fffec6a4700 (LWP 7863)]
[New Thread 0x7fffec5a3700 (LWP 7864)]
[New Thread 0x7fffec4a2700 (LWP 7865)]
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff3277068 in ?? ()
(gdb) l
71 InitDaemon(argc, argv);
72
73 LlvmCodeGen::InitializeLlvm();
74
75 // Enable Kerberos security if requested.
76 if (!FLAGS_principal.empty()) {
77 EXIT_IF_ERROR(InitKerberos("Impalad"));
78 }
79
80 JniUtil::InitLibhdfs();
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6e72f2e in InterpreterRuntime::newarray(JavaThread*, BasicType, int) () from /usr/java/default/jre/lib/amd64/server/libjvm.so
(gdb) c
Continuing.
[New Thread 0x7fffebc5d700 (LWP 7874)]
[New Thread 0x7fffeba45700 (LWP 7875)]
[New Thread 0x7fffeb044700 (LWP 7876)]
13/01/28 09:44:17 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /usr/local/src/impala/fe/src/test/resources/hive-default.xml
13/01/28 09:44:18 WARN conf.Configuration: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
13/01/28 09:44:18 WARN conf.Configuration: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
13/01/28 09:44:18 WARN conf.Configuration: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
13/01/28 09:44:18 WARN conf.Configuration: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
13/01/28 09:44:18 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/01/28 09:44:18 WARN conf.Configuration: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffec7a5700 (LWP 7862)]
0x00007ffff6ce74e3 in ciMethod::method_data() () from /usr/java/default/jre/lib/amd64/server/libjvm.so

The error is actually quite clear: while the debugged process was handling the query request from the shell, SIGSEGV was raised inside the libjvm.so shared library and gdb reported it as a segmentation fault. At the time I was completely baffled and assumed it was a JDK problem. After asking on the Google discussion group I got a fix, which I am sharing here for anyone who runs into the same problem:
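When you debug the Impala backend (be) with gdb, the JVM raises a large number of segmentation faults. We have to tell gdb explicitly to ignore them, that is, to pass them through so that Java handles them. All it takes is one extra command before run: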
(gdb) handle SIGSEGV nostop noprint pass
This keeps gdb from stopping on every segmentation fault, which is what left the query on the shell side hanging without ever returning a result.
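To avoid forgetting the command, it can also be passed on the gdb command line with -ex, which gdb executes after loading the program and before showing its prompt. A minimal sketch, assuming a hypothetical wrapper name gdb_jvm_friendly:

```shell
#!/bin/sh
# Hypothetical wrapper (the name is illustrative): start a program under gdb
# with SIGSEGV handling preset, so signals raised inside the JVM are passed
# straight to the process instead of stopping the debugger.
# $1: path to the binary; all remaining arguments go to the program itself.
gdb_jvm_friendly() {
    # -ex runs one gdb command at startup; --args must come last because
    # everything after it is treated as the program and its arguments.
    gdb -q -ex 'handle SIGSEGV nostop noprint pass' --args "$@"
}
```

For example, gdb_jvm_friendly ./impalad -use_statestore=false -nn=hadoop-01.localdomain -nn_port=8030, followed by r at the (gdb) prompt. The same handle line could instead be placed in ~/.gdbinit if it should apply to every session.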
Once again, thanks to the experts who share their experience. My respects to you all!
loogn_qiang 2013-05-26 14:37:57
cuidong008: Isn't SIGSEGV a real error code? Why can it simply be ignored? Please advise. My QQ: 350639746
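During debugging, the JVM raises a whole series of segmentation faults. These are supposed to be handled by the JVM itself, but gdb intercepts them first, so debugging cannot continue. Telling gdb to ignore them really just means letting the JVM handle them on its own.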