diff --git a/source/running/main.rst b/source/running/main.rst
index 4a955ab32341ff5b5f7a45c4570d7d4b9aba5477..744cc39f904f946da78d0d33aa3421dca58d1318 100644
--- a/source/running/main.rst
+++ b/source/running/main.rst
@@ -169,10 +169,10 @@ Running batch jobs from ScriptEngine
 ScriptEngine can send jobs to the SLURM batch system when the
-``scriptengine-tasks-hpc package`` is installed, which is done automatically if
+``scriptengine-tasks-hpc`` package is installed, which is done automatically if
 the ``environment.yml`` file has been used to create the Python virtual
-environment, as described in :ref:`creating_virtual_environment`.
-Here is an example of the ``hpc.slurm.sbatch`` task in ``example.yml``:
+environment, as described in :ref:`creating_virtual_environment`. Here is an
+example of using the ``hpc.slurm.sbatch`` task:
 .. code-block:: yaml+jinja
@@ -206,6 +206,11 @@ again, but do nothing because it already runs in a batch job.
 Then, the next task (``base.echo``) would be executed, writing the message to
 standard output in the batch job.
+Note that in the default runscript examples, submitting the job to SLURM is done
+behind the scenes in ``scriptlib/submit.yml``. The actual configuration for the
+batch job, such as account, allocated resources, etc., is set according to
+the chosen launch option, as described below.
 Launch options
@@ -216,6 +221,7 @@ model run once the jobs is executed by the batch system:
 * SLURM heterogeneous jobs (``slurm-hetjob``)
 * SLURM multiple program configuration and ``taskset`` process/thread pinning
+* SLURM wrapper with taskset and node groups (``slurm-wrapper-taskset``)
 * SLURM job with generic shell script template (``slurm-shell``)
 Each option has advantages and disadvantages and they come also with different
@@ -303,6 +309,169 @@ in many cases, because the remaining nodes are used exclusively for one
 component each.
+SLURM wrapper and taskset
+This launch option uses the SLURM ``srun`` command together with
+* a HOSTFILE created on-the-fly
+* a wrapper created on-the-fly, which uses
+* the ``taskset`` command to set the CPU affinity of MPI processes, OpenMP threads
+  and hyperthreads
+The ``slurm-wrapper-taskset`` option is configured per node. Instead of choosing
+the total number of tasks or nodes dedicated to each component, you specify the
+number of MPI processes for each component that will execute on each computing
+node. To avoid repeating the same node configuration over and over again, the
+configuration is structured in groups, each representing a set of nodes with the
+same configuration.
+The following simple example assumes a computer platform that has 128 cores per
+compute node, such as the ECMWF HPC2020 system. Three nodes are
+allocated to run a model configuration with four components: XIOS (1 process),
+OpenIFS (250 processes), NEMO (132 processes) and the Runoff-mapper (1 process):
+.. code-block:: yaml
+  platform:
+    cpus_per_node: 128
+  job:
+    launch:
+      method: slurm-wrapper-taskset
+    groups:
+      - {nodes: 1, xios: 1, oifs: 126, rnfm: 1}
+      - {nodes: 2, oifs: 62, nemo: 66}
+Two groups are defined in this example: the first comprising **one** node
+(running XIOS, OpenIFS and the Runoff-mapper), and the second group with **two**
+nodes running OpenIFS and NEMO.
+.. note:: The ``platform.cpus_per_node`` parameter and the ``job.*`` parameters
+  do not have to be defined in the same file, as suggested in the simple
+  example. In fact, the ``platform.*`` parameters are usually defined in the
+  platform configuration file, while ``job.*`` is usually found in the
+  experiment configuration.
+A second example illustrates the use of hybrid parallelization (MPI+OpenMP) for
+OpenIFS. The number of MPI tasks per node reflects that each process will be
+using more than one core:
+.. code-block:: yaml
+  platform:
+    cpus_per_node: 128
+  job:
+    launch:
+      method: slurm-wrapper-taskset
+    oifs:
+      omp_num_threads: 2
+      omp_stacksize: "64M"
+    groups:
+      - {nodes: 1, xios: 1, oifs: 63, rnfm: 1}
+      - {nodes: 2, oifs: 64}
+      - {nodes: 2, oifs: 31, nemo: 66}
+Note the configuration of ``job.oifs.omp_num_threads`` and
+``job.oifs.omp_stacksize``, which set the OpenMP environment for OpenIFS. The
+example uses the same number of MPI ranks for XIOS, NEMO and the Runoff-mapper
+as before, and 253 MPI ranks for OpenIFS. However, each OpenIFS MPI rank now
+has two OpenMP threads, which results in 506 cores being used for OpenIFS.
+.. caution:: The ``omp_stacksize`` parameter is needed on some platforms in
+  order to avoid errors when there is too little stack memory for OpenMP threads
+  (see `OpenMP documentation
+  <https://www.openmp.org/spec-html/5.0/openmpse54.html>`_). However, the
+  example (and in particular the value of 64MB) should not be seen as a general
+  recommendation for all platforms.
+Overall, the ``slurm-wrapper-taskset`` launch method allows compute nodes to be
+shared flexibly and in a controlled way between |ece4| components, which is
+useful to avoid idle cores. It can also help to decrease the computational costs
+of configurations involving components with high memory requirements, by
+allowing them to share nodes with components that need less memory.
+Optional configuration
+Some special configuration parameters may be required for the
+``slurm-wrapper-taskset`` launcher on some machines.
+.. hint:: Do not use these special parameters unless you need to!
+The first special parameter is ``platform.mpi_rank_env_var``:
+.. code-block:: yaml
+  platform:
+    mpi_rank_env_var: SLURM_PROCID
+This is the name of an environment variable that must contain the MPI rank for
+each task at runtime. The default value is ``SLURM_PROCID``, which should work
+for SLURM when using the ``srun`` command. Other possible choices that work for
+some platforms are ``PMI_RANK`` or ``PMIX_RANK``.
+Another special parameter is ``platform.shell``:
+.. code-block:: yaml
+  platform:
+    shell: "/usr/bin/env bash"
+It determines the shell that the wrapper script runs under. It must be
+configured if the given default value is not valid for your platform.
+Implementation of Hyper-threading
+The implementation of Hyper-threading in this launch method is restricted to
+OpenMP programs (only available for OpenIFS for now). It assumes that CPU
+numbers ``i`` and ``i + platform.cpus_per_node`` correspond to the same physical
+core. When the ``job.oifs.use_hyperthreads`` option is enabled, both CPUs ``i``
+and ``i + platform.cpus_per_node`` are bound for the execution of that
+component. In this case, the number of OpenMP threads executing that component
+is twice the value given in ``job.oifs.omp_num_threads``. The following example
+would configure OpenIFS to execute using 4 threads in the [0..127] range:
+.. code-block:: yaml
+  platform:
+    cpus_per_node: 128
+  job:
+    oifs:
+      omp_num_threads: 4
+      omp_stacksize: "64M"
+      use_hyperthreads: false
+while the following example would result in 8 OpenIFS threads, with 4 of them in the [0..127]
+range, and the others in [128..255]:
+.. code-block:: yaml
+  platform:
+    cpus_per_node: 128
+  job:
+    oifs:
+      omp_num_threads: 4
+      omp_stacksize: "64M"
+      use_hyperthreads: true
+There is also the possibility of using all 256 logical CPUs in the node to
+run more MPI tasks, as in the following example. In this case, the
+``job.oifs.use_hyperthreads`` option must be disabled for every component (it is
+disabled by default):
+.. code-block:: yaml
+  platform:
+    cpus_per_node: 256
+  job:
+    oifs:
+      use_hyperthreads: false
 SLURM shell template
@@ -339,12 +508,18 @@ shared between OpenIFS and the Runoff-mapper.
         ntasks: 127
         ntasks_per_node: 127
         omp_num_threads: 1
+        omp_stacksize: "64M"
         ntasks: 127
         ntasks_per_node: 127
         ntasks: 1
         ntasks_per_node: 1
+    slurm:
+      sbatch:
+        opts:
+          hint: nomultithread
     # remaining configuration same as for slurm-hetjob