  • PySpark error log

    2020-09-11 06:55:00 ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    
    

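    The output above is the main part of a log file of more than 2,000 lines; the rest of the file just repeats these same errors.
    The key error messages are the following: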

    ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
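
    Reducing a long, repetitive log to these key lines can be done with a short script. The following is only a minimal sketch, not the method used for this post: the function name `key_error_lines` and the filtering rules are assumptions, chosen to match the log format shown above (lines containing `ERROR`, exception class lines, and `Caused by:` lines, with stack frames skipped and the leading timestamp stripped before deduplication).

```python
import re

# Matches a leading "YYYY-MM-DD HH:MM:SS " timestamp, as in the log above.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")

# Matches a line that starts with a qualified exception class name,
# e.g. "py4j.Py4JException: ...".
EXCEPTION = re.compile(r"^[\w.$]+(Exception|Error)\b")


def key_error_lines(log_lines):
    """Return the distinct ERROR, exception and 'Caused by:' lines,
    in the order they first appear (hypothetical helper)."""
    seen = set()
    result = []
    for raw in log_lines:
        line = raw.strip()
        # Skip stack frames ("at ...") and truncation markers ("... 25 more").
        if line.startswith("at ") or line.startswith("..."):
            continue
        # Drop the timestamp so the same message logged at different
        # times is counted only once.
        normalized = TIMESTAMP.sub("", line)
        is_key = (
            normalized.startswith("ERROR ")
            or normalized.startswith("Caused by:")
            or EXCEPTION.match(normalized) is not None
        )
        if is_key and normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result
```

    Passing an open file object works, since the function only iterates over lines. Note that the `JobScheduler` lines contain a batch timestamp (such as `1599762320000 ms.2`), so failures of different batches would still show up as separate entries.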
    

    1. Error analysis

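    The error started to appear after the PySpark program had been running for several hours. The stack trace shows the JVM failing inside py4j.CallbackClient while calling back into the Python foreachRDD function, with the socket connection refused, which is why Spark reports that the Python process is probably dead and stops the StreamingContext.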

  • Original post: https://www.cnblogs.com/leimu/p/13650837.html