  • PySpark error log

    2020-09-11 06:55:00 ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:257)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
            at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
            at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
            at com.sun.proxy.$Proxy17.call(Unknown Source)
            at org.apache.spark.streaming.api.python.TransformFunction.callPythonTransformFunction(PythonDStream.scala:92)
            at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:78)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:179)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:51)
            at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:50)
            at scala.util.Try$.apply(Try.scala:192)
            at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:257)
            at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
            at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:256)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at java.net.Socket.connect(Socket.java:538)
            at java.net.Socket.<init>(Socket.java:434)
            at java.net.Socket.<init>(Socket.java:244)
            at javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
            at py4j.CallbackConnection.start(CallbackConnection.java:226)
            at py4j.CallbackClient.getConnection(CallbackClient.java:238)
            at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
            ... 25 more
    
    

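    The above is the main part of a log file of more than 2,000 lines; the rest of the file just repeats these same errors.
    The key error messages are the following.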

    ERROR JobScheduler:91 - Error running job streaming job 1599762320000 ms.2
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
    2020-09-11 06:55:00 ERROR PythonDStream$$anon$1:91 - Cannot connect to Python process. It's probably dead. Stopping StreamingContext.
    py4j.Py4JException: Error while obtaining a new communication channel
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
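When a log of this size is mostly the same trace repeated, it can help to collapse it to its distinct key lines and count them. The following is a minimal sketch, not part of the original troubleshooting: the function name `summarize_errors` and the matching rules (keep `ERROR` lines, exception lines, and `Caused by:` lines; skip stack frames; strip the leading timestamp) are assumptions based only on the log format shown above.

```python
import re
from collections import Counter

# Leading "YYYY-MM-DD HH:MM:SS " timestamp, as seen in the log above.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+")


def summarize_errors(lines):
    """Count distinct key error lines, ignoring stack frames and timestamps."""
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        # Skip stack frames ("at ...") and elided frames ("... 25 more").
        if line.startswith("at ") or line.startswith("..."):
            continue
        # Drop the timestamp so repeated errors collapse into one key.
        line = TIMESTAMP.sub("", line)
        if (
            line.startswith("ERROR ")
            or line.startswith("Caused by:")
            or "Exception:" in line
        ):
            counts[line] += 1
    return counts
```

To use it on a real file, pass the open file object, for example `summarize_errors(open("driver.log", encoding="utf-8"))`, where `driver.log` is a hypothetical path, and print `counts.most_common()`.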
    

    1. Error analysis

    The error started appearing after the PySpark program had been running for several hours.
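The root cause in the trace is `java.net.ConnectException: Connection refused`, raised while the JVM was opening a socket back to the Python side, and Spark's own message guesses that the Python process is "probably dead". The log does not say why the Python side went away. The sketch below only illustrates the symptom at the socket level, using plain Python sockets rather than Py4J: once a listener is gone, a new connection to its port is refused. The function name and the use of a throwaway local port are assumptions made for illustration.

```python
import socket


def connect_after_listener_exits():
    """Open a local listener, close it, then try to connect to its old port."""
    # Bind to port 0 so the OS picks a free port for the stand-in listener.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    # Closing the listener stands in for the listening process dying.
    listener.close()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(("127.0.0.1", port))
    except ConnectionRefusedError:
        # Same condition the JVM reports as java.net.ConnectException.
        return "refused"
    finally:
        client.close()
    return "connected"
```

If the real job behaves the same way, the useful next step is to find out what happened to the Python driver process around the first failure time (for example, whether it exited or was killed), since the JVM-side trace only shows the failed connection.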

  • Original article: https://www.cnblogs.com/leimu/p/13650837.html