Stage Id ▾ | Description | Submitted | Duration | Tasks: Succeeded/Total | Input | Output | Shuffle Read | Shuffle Write |
---|---|---|---|---|---|---|---|---|
3 | collect at <ipython-input-22-ce2c7a41cb8f>:1 | Unknown | Unknown | 0/2 | | | | |

The call site the UI shows for stage 3 under **+details**:

```
org.apache.spark.rdd.RDD.collect(RDD.scala:1029)
org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:180)
org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:282)
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
py4j.commands.CallCommand.execute(CallCommand.java:79)
py4j.GatewayConnection.run(GatewayConnection.java:238)
java.lang.Thread.run(Thread.java:750)
```
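Stage 3 is the result stage of a `collect()`. Its Submitted and Duration columns read Unknown because the stage never started: Spark cancels downstream stages once an upstream shuffle stage fails. A minimal sketch of notebook cells that would produce this stage pair (the RDD contents and variable names are illustrative assumptions, not taken from the UI output):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stage-pair-demo").getOrCreate()
sc = spark.sparkContext

# Hypothetical (key, value) RDD with 2 partitions, matching "Tasks: 0/2" above.
pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)], 2)

# reduceByKey introduces a shuffle boundary: it runs as the map-side
# stage (stage 2 below), and collect() runs as the result stage (stage 3).
counts = pairs.reduceByKey(lambda x, y: x + y)
print(counts.collect())  # [('a', 4), ('b', 2)] once the job succeeds
```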
Stage Id ▾ | Description | Submitted | Duration | Tasks: Succeeded/Total | Input | Output | Shuffle Read | Shuffle Write | Failure Reason |
---|---|---|---|---|---|---|---|---|---|
2 | reduceByKey at <ipython-input-21-27fe18271c16>:1 | 2025/05/20 19:52:45 | 0.3 s | 0/2 (7 failed) (1 killed: Stage cancelled) | 828.0 B | | | | Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times (full traceback below) |

The call site the UI shows for stage 2 under **+details**:

```
org.apache.spark.rdd.RDD.<init>(RDD.scala:110)
org.apache.spark.api.python.PairwiseRDD.<init>(PythonRDD.scala:111)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:238)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:238)
java.lang.Thread.run(Thread.java:750)
```

The full failure reason for stage 2:

```
Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 16) (w02.itversity.com executor 1): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/worker.py", line 604, in main
    process()
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/worker.py", line 594, in process
    out_iter = func(split_index, iterator)
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/rdd.py", line 2916, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/rdd.py", line 2916, in pipeline_func
    return func(split, prev_func(split, iterator))
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/rdd.py", line 418, in func
    return f(iterator)
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/rdd.py", line 2144, in combineLocally
    merger.mergeValues(iterator)
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/shuffle.py", line 240, in mergeValues
    for k, v in iterator:
  File "/opt/spark-3.1.2-bin-hadoop3.2/python/pyspark/util.py", line 73, in wrapper
    return f(*args, **kwargs)
  File "<ipython-input-18-d4364c452db2>", line 1, in <lambda>
TypeError: 'str' object is not callable

	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:517)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:652)
	at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:635)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:470)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:1209)
	at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:1215)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:132)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
```
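The root cause is the last Python frame: the lambda defined in notebook cell 18 raised `TypeError: 'str' object is not callable` inside `combineLocally`/`mergeValues`, i.e. during the map-side combine of `reduceByKey`. Spark retried task 0 four times (attempts 0.0 through 0.3) before aborting the job, consistent with the 7 failed and 1 killed attempts in the table. The usual trigger is a name inside the lambda being bound to a string rather than a function. The sketch below is a hedged reconstruction of that mistake (the name `size` and the data are assumptions, not recovered from cell 18):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("str-not-callable-demo").getOrCreate()
sc = spark.sparkContext

words = sc.parallelize(["spark", "ui", "spark"], 2)

# Buggy: a name referenced inside the lambda was rebound to a string.
size = "size"
pairs = words.map(lambda w: (w, size(w)))  # lazy, so no error is raised here
# pairs.reduceByKey(lambda x, y: x + y).collect()
# => PythonException on the executors: TypeError: 'str' object is not callable

# Fix: rebind the name to a callable and rebuild the pipeline.
size = len
pairs = words.map(lambda w: (w, size(w)))
pairs.reduceByKey(lambda x, y: x + y).collect()  # e.g. [('spark', 10), ('ui', 2)]
```

Because transformations are lazy, the bad binding is only exercised when an action runs the tasks on the executors; the driver-side cells appear to succeed, and the failure surfaces in the Spark UI as shown above.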