"Requested dask.distributed scheduler but no Client active." RuntimeError for larger computations
- Installing a new environment:
mamba create -n myenv climix
- Activating the environment and running:
climix -e -x tn10p /nobackup/rossby27/users/sm_carni/data/tmp/data_files/tasmin_EUR-11_MPI-M-MPI-ESM-LR_rcp85_r2i1p1_MPI-CSC-REMO2009_v1_day_20060101-20101231.nc /nobackup/rossby27/users/sm_carni/data/tmp/data_files/tasmin_EUR-11_MPI-M-MPI-ESM-LR_rcp85_r2i1p1_MPI-CSC-REMO2009_v1_day_20110101-20151231.nc -r 2007/2009
Returns the following RuntimeError and saves no result:
101637ms:main.py:main() INFO:root:Calculation took 94.1128 seconds.
2023-05-08 12:44:25,748 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33317 -> tcp://127.0.0.1:34451
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33317 remote=tcp://127.0.0.1:45308>: Stream is closed
2023-05-08 12:44:25,749 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:33317 -> tcp://127.0.0.1:46206
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:33317 remote=tcp://127.0.0.1:46116>: Stream is closed
2023-05-08 12:44:25,793 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36001 -> tcp://127.0.0.1:34451
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36001 remote=tcp://127.0.0.1:33512>: Stream is closed
2023-05-08 12:44:25,795 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36001 -> tcp://127.0.0.1:46206
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36001 remote=tcp://127.0.0.1:34318>: Stream is closed
2023-05-08 12:44:25,800 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44554 -> tcp://127.0.0.1:34451
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:44554 remote=tcp://127.0.0.1:39332>: Stream is closed
2023-05-08 12:44:25,801 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:44554 -> tcp://127.0.0.1:46206
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:44554 remote=tcp://127.0.0.1:40084>: Stream is closed
2023-05-08 12:44:25,832 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36220 -> tcp://127.0.0.1:34451
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36220 remote=tcp://127.0.0.1:43522>: Stream is closed
2023-05-08 12:44:25,833 - distributed.worker - ERROR - failed during get data with tcp://127.0.0.1:36220 -> tcp://127.0.0.1:46206
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 225, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/worker.py", line 1787, in get_data
response = await comm.read(deserializers=serializers)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 241, in read
convert_stream_closed_error(self, e)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/comm/tcp.py", line 144, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) local=tcp://127.0.0.1:36220 remote=tcp://127.0.0.1:43490>: Stream is closed
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-conda/bin/climix", line 10, in <module>
sys.exit(main())
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/climix/main.py", line 353, in main
do_main(
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/climix/main.py", line 325, in do_main
save(
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/climix/datahandling.py", line 371, in save
result.data = r.result()
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/distributed/client.py", line 317, in result
raise exc.with_traceback(tb)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/optimization.py", line 990, in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/core.py", line 149, in get
result = _execute_task(task, cache)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/utils.py", line 73, in apply
return func(*args, **kwargs)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/array/chunk.py", line 225, in argtopk
if abs(k) >= a.shape[axis]:
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/array/core.py", line 1868, in __bool__
return bool(self.compute())
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/base.py", line 314, in compute
(result,) = compute(self, traverse=False, **kwargs)
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/base.py", line 587, in compute
schedule = get_scheduler(
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/base.py", line 1400, in get_scheduler
return get_scheduler(scheduler=config.get("scheduler", None))
File "/home/sm_carni/.conda/envs/climix-conda/lib/python3.10/site-packages/dask/base.py", line 1375, in get_scheduler
raise RuntimeError(
RuntimeError: Requested dask.distributed scheduler but no Client active.
- Running another smaller index:
climix -e -x tn /nobackup/rossby27/users/sm_carni/data/tmp/data_files/tasmin_EUR-11_MPI-M-MPI-ESM-LR_rcp85_r2i1p1_MPI-CSC-REMO2009_v1_day_20060101-20101231.nc /nobackup/rossby27/users/sm_carni/data/tmp/data_files/tasmin_EUR-11_MPI-M-MPI-ESM-LR_rcp85_r2i1p1_MPI-CSC-REMO2009_v1_day_20110101-20151231.nc
Returns no error.
- Downgrading dask with:
mamba install dask==2023.4.0
solves this error, but results in a different error when running a simpler index. Running:
climix -e -x txx /home/rossby/data_lib/esgf/cordex/output/EUR-11/SMHI/NCC-NorESM1-M/rcp85/r1i1p1/RCA4/v1/day/tasmax/latest/tasmax_EUR-11_NCC-NorESM1-M_rcp85_r1i1p1_SMHI-RCA4_v1_day_20060101-20101231.nc /home/rossby/data_lib/esgf/cordex/output/EUR-11/SMHI/NCC-NorESM1-M/rcp85/r1i1p1/RCA4/v1/day/tasmax/latest/tasmax_EUR-11_NCC-NorESM1-M_rcp85_r1i1p1_SMHI-RCA4_v1_day_20110101-20151231.nc
gives:
INFO:distributed.scheduler:Lost all workers
INFO:distributed.batched:Batched Comm Closed <TCP (closed) Scheduler connection to worker local=tcp://127.0.0.1:36766 remote=tcp://127.0.0.1:40786>
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/distributed/batched.py", line 115, in _background_send
nbytes = yield coro
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
value = future.result()
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/distributed/comm/tcp.py", line 269, in write
raise CommClosedError()
distributed.comm.core.CommClosedError
INFO:distributed.batched:Batched Comm Closed <TCP (closed) Scheduler connection to worker local=tcp://127.0.0.1:36766 remote=tcp://127.0.0.1:40776>
Traceback (most recent call last):
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/distributed/batched.py", line 115, in _background_send
nbytes = yield coro
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/tornado/gen.py", line 767, in run
value = future.result()
File "/home/sm_carni/.conda/envs/climix-latest/lib/python3.10/site-packages/distributed/comm/tcp.py", line 269, in write
raise CommClosedError()
distributed.comm.core.CommClosedError
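
Since the behaviour changes with the dask version, it may help to record the installed package versions in each environment (the tracebacks above come from the `climix-conda` and `climix-latest` environments). A minimal sketch, assuming Python 3.8 or newer; the package list and the helper name `installed_versions` are illustrative choices, not part of climix:

```python
from importlib.metadata import PackageNotFoundError, version

# Packages that show up in the tracebacks above.
# This list is an assumption for illustration; extend it as needed.
PACKAGES = ("climix", "dask", "distributed", "tornado")


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the active environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this once in each environment and comparing the output should show which versions differ between the failing and the working setup.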