Spark Streaming UpdateStateByKey
I am running Spark Streaming 24x7 and using the updateStateByKey function to save computed historical data, as in the NetworkWordCount example.
I tried to stream a file with 3 lakh (300,000) records, sleeping for 1 second after every 1500 records. I am using 3 workers.
Over a period of time the updateStateByKey state keeps growing, and then the program throws the following exceptions:

ERROR Executor: Exception in task ID 1635
java.lang.ArrayIndexOutOfBoundsException: 3
14/10/23 21:20:43 ERROR TaskSetManager: Task 29170.0:2 failed 1 times; aborting job
14/10/23 21:20:43 ERROR DiskBlockManager: Exception while deleting local spark dir: /var/folders/3j/9hjkw0890sx_qg9yvzlvg64cf5626b/t/spark-local-20141023204346-b232
java.io.IOException: Failed to delete: /var/folders/3j/9hjkw0890sx_qg9yvzlvg64cf5626b/t/spark-local-20141023204346-b232/24
14/10/23 21:20:43 ERROR Executor: Exception in task ID 8037
java.io.FileNotFoundException: /var/folders/3j/9hjkw0890sx_qg9yvzlvg64cf5626b/t/spark-local-20141023204346-b232/22/shuffle_81_0_1 (No such file or directory)
at java.io.FileOutputStream.open(Native Method)

How do I handle this? My guess is that the updateStateByKey state should be reset periodically, since it is growing at a rapid rate. Please share an example of when and how to reset the updateStateByKey state. Or is there some other problem? Please shed some light.
Any help is much appreciated. Thanks for your time.
Did you set a checkpoint directory? For example: ssc.checkpoint("path to checkpoint")
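
On the "when and how to reset" part of the question: nothing in the logs confirms that state growth is what caused these exceptions, so treat the following as a sketch, not a diagnosis. With updateStateByKey, a key is dropped from the state when the update function returns None for it, which is the usual way to stop the state from growing forever. The example below uses the PySpark form of the API. The names `update_count`, `build_stream`, `IDLE_TIMEOUT_SECONDS`, the 10-minute timeout, the 1-second batch interval, and the host and port defaults are all assumptions made for illustration.

```python
import time

# Hypothetical idle timeout: forget a key that has not been seen for this long.
IDLE_TIMEOUT_SECONDS = 600


def update_count(new_values, state, now=None):
    """Update function in the (new_values, last_state) shape that
    PySpark's updateStateByKey calls.

    new_values: counts for this key in the current batch (may be empty)
    state: previous (total, last_seen) tuple, or None for a new key
    Returning None removes the key from the state.
    The `now` argument exists only so the function can be tested.
    """
    now = time.time() if now is None else now
    if new_values:
        total = (state[0] if state else 0) + sum(new_values)
        return (total, now)
    # No new data for this key in this batch: expire it if it has been idle too long.
    if state is None or now - state[1] > IDLE_TIMEOUT_SECONDS:
        return None
    return state


def build_stream(sc, checkpoint_dir, host="localhost", port=9999):
    """Wire the update function into a word count over a socket stream.

    `sc` is an existing SparkContext. pyspark is imported here so that
    update_count can be exercised without a Spark installation.
    """
    from pyspark.streaming import StreamingContext

    ssc = StreamingContext(sc, 1)      # 1-second batches (assumed)
    ssc.checkpoint(checkpoint_dir)     # stateful operations need a checkpoint dir
    counts = (ssc.socketTextStream(host, port)
              .flatMap(lambda line: line.split(" "))
              .map(lambda word: (word, 1))
              .updateStateByKey(update_count))
    counts.pprint()
    return ssc
```

Calling `build_stream(sc, "path to checkpoint").start()` would start the job. Whether expiring idle keys is the right policy depends on what the historical data is used for; if every key must be kept forever, the state will keep growing no matter how the update function is written.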
spark-streaming