Spark Shell因为Scala编译器原因不能正常启动怎么解决 spark无法启动,Hadoop可以正常启动,该怎么解决

Spark Shell\u56e0\u4e3aScala\u7f16\u8bd1\u5668\u539f\u56e0\u4e0d\u80fd\u6b63\u5e38\u542f\u52a8\u600e\u4e48\u89e3\u51b3

\u4f60\u597d\uff0c
\u9519\u8bef\u539f\u56e0\uff0c\u4e0elibgcj.so.10\u6709\u5173\uff0c\u53ef\u80fd\u662fJava\u73af\u5883\u6ca1\u6709\u914d\u7f6e\u6b63\u786e\uff0c\u5728conf/spark-env.sh\u4e2d\u6dfb\u52a0\u4e00\u884c\uff1aexport JAVA_HOME=/usr/java/latest\u89e3\u51b3\u95ee\u9898\uff01

\u9519\u8bef\u539f\u56e0\uff0c\u4e0elibgcj.so.10\u6709\u5173\uff0c\u53ef\u80fd\u662fJava\u73af\u5883\u6ca1\u6709\u914d\u7f6e\u6b63\u786e\uff0c\u5728conf/spark-env.sh\u4e2d\u6dfb\u52a0\u4e00\u884c\uff1aexport JAVA_HOME=/usr/java/latest\u89e3\u51b3\u95ee\u9898\uff01

Spark Shell由于Scala编译器原因不能正常启动

使用SBT安装完成Spark后,可以运行示例,但是尝试运行spark-shell就会报错:

D:\Scala\spark\bin\spark-shell.cmd
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J:
Found binding in
[jar:file:/D:/Scala/spark/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/D:/Scala/spark/tools/target/scala-2.10/spark-tools-assembly-0.9.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/03 20:40:43 INFO HttpServer: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/04/03 20:40:43 INFO HttpServer: Starting HTTP Server

Failed to initialize compiler: object scala.runtime in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programatically, settings.usejavacp.value = true.
14/04/03
20:40:44 WARN SparkILoop$SparkILoopInterpreter: Warning: compiler
accessed before init set up.  Assuming no postInit code.

Failed to initialize compiler: object scala.runtime in compiler mirror not found.
** Note that as of 2.8 scala does not assume use of the java classpath.
** For the old behavior pass -usejavacp to scala, or if using a Settings
** object programatically, settings.usejavacp.value = true.
Failed to initialize compiler: object scala.runtime in compiler mirror not found.
        at scala.Predef$.assert(Predef.scala:179)
       
at
org.apache.spark.repl.SparkIMain.initializeSynchronous(SparkIMain.scala:197)
       
at
org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:919)
       
at
org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:876)
       
at
org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:876)
       
at
scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
       
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:876)
       
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:968)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)

Google
之还是不求解。只是在SBT的网站上看到Q&A里面有个问题提到了:http://www.scala-sbt.org/release
/docs/faq#how-do-i-use-the-scala-interpreter-in-my-code。这里说代码中怎么修改设置。显然不
适合我。

继续求解。注意到错误提示是在2.8以后才有的,原因是有一个关于编译器解释权Classpath的提议被接受了:Default compiler/interpreter classpath in a managed environment。

继续在Google中找,有一篇论文吸引了我的注意:Object Scala Found。里面终于找到一个办法:



However, a working command can be recovered, like so:
$ jrunscript -Djava.class.path=scala-library.jar -Dscala.usejavacp=true -classpath scala-compiler.jar -l scala



于是修改一下\bin\spark-class2.cmd:

rem Set JAVA_OPTS to be able to load native libraries and to set heap size
set
JAVA_OPTS=%OUR_JAVA_OPTS% -Djava.library.path=%SPARK_LIBRARY_PATH%
-Dscala.usejavacp=true -Xms%SPARK_MEM% -Xmx%SPARK_MEM%
rem Attention: when changing the way the JAVA_OPTS are assembled, the change must be reflected in ExecutorRunner.scala!

标红的部分即是心添加的一个参数。再次运行\bin\spark-shell.cmd:

D:>D:\Scala\spark\bin\spark-shell.cmd
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J:
Found binding in
[jar:file:/D:/Scala/spark/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/D:/Scala/spark/tools/target/scala-2.10/spark-tools-assembly-0.9.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/03 22:18:41 INFO HttpServer: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/04/03 22:18:41 INFO HttpServer: Starting HTTP Server
Welcome to
     

____             
__
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.9.0
      /_/

Using Scala version 2.10.3 (Java HotSpot(TM) Client VM, Java 1.6.0_10)
Type in expressions to have them evaluated.
Type :help for more information.
14/04/03 22:19:12 INFO Slf4jLogger: Slf4jLogger started
14/04/03 22:19:13 INFO Remoting: Starting remoting
14/04/03 22:19:16 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@Choco-PC:5960]
14/04/03 22:19:16 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@Choco-PC:5960]
14/04/03 22:19:16 INFO SparkEnv: Registering BlockManagerMaster
14/04/03
22:19:17 INFO DiskBlockManager: Created local directory at
C:\Users\Choco\AppData\Local\Temp\spark-local-20140403221917-7172
14/04/03 22:19:17 INFO MemoryStore: MemoryStore started with capacity 304.8 MB.
14/04/03 22:19:18 INFO ConnectionManager: Bound socket to port 5963 with id = ConnectionManagerId(Choco-PC,5963)
14/04/03 22:19:18 INFO BlockManagerMaster: Trying to register BlockManager
14/04/03 22:19:18 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager Choco-PC:5963 with 304.8 MB RAM
14/04/03 22:19:18 INFO BlockManagerMaster: Registered BlockManager
14/04/03 22:19:18 INFO HttpServer: Starting HTTP Server
14/04/03 22:19:18 INFO HttpBroadcast: Broadcast server started at http://192.168.1.100:5964
14/04/03 22:19:18 INFO SparkEnv: Registering MapOutputTracker
14/04/03
22:19:18 INFO HttpFileServer: HTTP File server directory is
C:\Users\Choco\AppData\Local\Temp\spark-e122cfe9-2d62-4a47-920c-96b54e4658f6
14/04/03 22:19:18 INFO HttpServer: Starting HTTP Server
14/04/03 22:19:22 INFO SparkUI: Started Spark Web UI at http://Choco-PC:4040
14/04/03 22:19:22 INFO Executor: Using REPL class URI: http://192.168.1.100:5947
Created spark context..
Spark context available as sc.

scala> :quit
Stopping spark context.
14/04/03 23:05:21 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/04/03 23:05:21 INFO ConnectionManager: Selector thread was interrupted!
14/04/03 23:05:21 INFO ConnectionManager: ConnectionManager stopped
14/04/03 23:05:21 INFO MemoryStore: MemoryStore cleared
14/04/03 23:05:21 INFO BlockManager: BlockManager stopped
14/04/03 23:05:21 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/04/03 23:05:21 INFO BlockManagerMaster: BlockManagerMaster stopped
14/04/03 23:05:21 INFO SparkContext: Successfully stopped SparkContext
14/04/03 23:05:21 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/04/03
23:05:21 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon
shut down; proceeding with flushing remote transports.

Good。浏览器打开http://Choco-PC:4040,就可以看到Spark的状态、环境、执行者等信息了。

这个Fix可能只是适用与我的情况。如果还有问题可以再找找相关的资料。

期间还碰到不能找到文件的错误。最后发现是JAVA_HOME设置没有对。如果你碰到问题了,可以打开脚本的回显,然后找找原因。

有一篇论文吸引了我的注意:Object Scala Found。里面终于找到一个办法:



However, a working command can be recovered, like so:
$ jrunscript -Djava.class.path=scala-library.jar -Dscala.usejavacp=true -classpath scala-compiler.jar -l scala



于是修改一下\bin\spark-class2.cmd:

rem Set JAVA_OPTS to be able to load native libraries and to set heap size
set JAVA_OPTS=%OUR_JAVA_OPTS% -Djava.library.path=%SPARK_LIBRARY_PATH% -Dscala.usejavacp=true -Xms%SPARK_MEM% -Xmx%SPARK_MEM%
rem Attention: when changing the way the JAVA_OPTS are assembled, the change must be reflected in ExecutorRunner.scala!

标红的部分即是心添加的一个参数。再次运行\bin\spark-shell.cmd:

D:>D:\Scala\spark\bin\spark-shell.cmd
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/Scala/spark/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/Scala/spark/tools/target/scala-2.10/spark-tools-assembly-0.9.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/04/03 22:18:41 INFO HttpServer: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/04/03 22:18:41 INFO HttpServer: Starting HTTP Server
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 0.9.0
/_/

Using Scala version 2.10.3 (Java HotSpot(TM) Client VM, Java 1.6.0_10)
Type in expressions to have them evaluated.
Type :help for more information.
14/04/03 22:19:12 INFO Slf4jLogger: Slf4jLogger started
14/04/03 22:19:13 INFO Remoting: Starting remoting
14/04/03 22:19:16 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@Choco-PC:5960]
14/04/03 22:19:16 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@Choco-PC:5960]
14/04/03 22:19:16 INFO SparkEnv: Registering BlockManagerMaster
14/04/03 22:19:17 INFO DiskBlockManager: Created local directory at C:\Users\Choco\AppData\Local\Temp\spark-local-20140403221917-7172
14/04/03 22:19:17 INFO MemoryStore: MemoryStore started with capacity 304.8 MB.
14/04/03 22:19:18 INFO ConnectionManager: Bound socket to port 5963 with id = ConnectionManagerId(Choco-PC,5963)
14/04/03 22:19:18 INFO BlockManagerMaster: Trying to register BlockManager
14/04/03 22:19:18 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager Choco-PC:5963 with 304.8 MB RAM
14/04/03 22:19:18 INFO BlockManagerMaster: Registered BlockManager
14/04/03 22:19:18 INFO HttpServer: Starting HTTP Server
14/04/03 22:19:18 INFO HttpBroadcast: Broadcast server started at http://192.168.1.100:5964
14/04/03 22:19:18 INFO SparkEnv: Registering MapOutputTracker
14/04/03 22:19:18 INFO HttpFileServer: HTTP File server directory is C:\Users\Choco\AppData\Local\Temp\spark-e122cfe9-2d62-4a47-920c-96b54e4658f6
14/04/03 22:19:18 INFO HttpServer: Starting HTTP Server
14/04/03 22:19:22 INFO SparkUI: Started Spark Web UI at http://Choco-PC:4040
14/04/03 22:19:22 INFO Executor: Using REPL class URI: http://192.168.1.100:5947
Created spark context..
Spark context available as sc.

scala> :quit
Stopping spark context.
14/04/03 23:05:21 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
14/04/03 23:05:21 INFO ConnectionManager: Selector thread was interrupted!
14/04/03 23:05:21 INFO ConnectionManager: ConnectionManager stopped
14/04/03 23:05:21 INFO MemoryStore: MemoryStore cleared
14/04/03 23:05:21 INFO BlockManager: BlockManager stopped
14/04/03 23:05:21 INFO BlockManagerMasterActor: Stopping BlockManagerMaster
14/04/03 23:05:21 INFO BlockManagerMaster: BlockManagerMaster stopped
14/04/03 23:05:21 INFO SparkContext: Successfully stopped SparkContext
14/04/03 23:05:21 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
14/04/03 23:05:21 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.

扩展阅读:国外spark实践视频 ... spark shell脚本 ... 外国spark实践网站入口 ... 代号spark ... 中国spark实践网站视频 ... spark shell运行sql ... spark rdd ... shell -gt ... 启动spark shell命令 ...

本站交流只代表网友个人观点,与本站立场无关
欢迎反馈与建议,请联系电邮
2024© 车视网