Improve Dask usage for large jobs
As we tackle larger datasets, we can make better use of Dask's features to keep runs robust against likely stumbling blocks such as file system delays and slow cluster startup.
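
As a rough illustration of the kind of features meant here, the sketch below uses two `dask.distributed` calls: `Client.wait_for_workers` to avoid submitting work before the cluster has finished starting, and the `retries` argument of `Client.map` to re-run tasks that fail transiently. The names `flaky_read` and `run_job`, the file names, and the timeout and retry values are hypothetical placeholders, not taken from our code base.

```python
import random

from dask.distributed import Client, LocalCluster


def flaky_read(path, fail_rate=0.0):
    """Hypothetical stand-in for a read that can fail on a slow file system."""
    if random.random() < fail_rate:
        raise OSError(f"timed out reading {path}")
    return len(path)  # placeholder for the real payload


def run_job(client, paths, min_workers=2, startup_timeout=60, retries=3):
    """Wait for the cluster to come up, then map over paths with retries."""
    # Block until at least `min_workers` workers have joined; raises a
    # timeout error if that takes longer than `startup_timeout` seconds.
    client.wait_for_workers(n_workers=min_workers, timeout=startup_timeout)
    # Each task is re-run up to `retries` times if it raises an exception.
    futures = client.map(flaky_read, paths, retries=retries)
    return client.gather(futures)


if __name__ == "__main__":
    # In-process cluster for demonstration only; a real job would connect
    # to whatever cluster the deployment provides.
    with LocalCluster(n_workers=2, processes=False,
                      dashboard_address=None) as cluster:
        with Client(cluster) as client:
            print(run_job(client, ["a.nc", "bb.nc"]))
```

Whether these particular settings suit our jobs is an open question; the point is only that startup waits and retries can be handled by Dask itself rather than by ad hoc code around it.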