zoukankan      html  css  js  c++  java
  • pyspark 错误记录

    2020-09-11 06:55:00 ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    
    

    超过2000行的日志文件中主要部分,其它都是在重复显示这些错误日志。
    其中,下面为主要的错误信息。

    ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
    

    一、错误分析

    在程序pysoark程序运行几个小时之后,开始显示这个错误。

  • 相关阅读:
    jquery 序列化form表单
    nginx for windows 安装
    nodejs idea 创建项目 (一)
    spring 配置 shiro rememberMe
    idea 2018 解决 双击shift 弹出 search everywhere 搜索框的方法
    redis 在windows 集群
    spring IOC控制反转和DI依赖注入
    redis 的安装
    shiro 通过jdbc连接数据库
    handlebars的用法
  • 原文地址:https://www.cnblogs.com/leimu/p/13650837.html
Copyright © 2011-2022 走看看