Running FitMultiCell pipeline in an HPC infrastructure

To run the pipeline in an HPC infrastructure, you first need to install pyABC, fitmulticell, Morpheus, and a Redis server on the cluster.

Install fitmulticell

To install fitmulticell on an HPC infrastructure, install it in your user domain, which does not require root privileges. Please be aware that on such systems, users usually don’t have root rights. For instructions, see: https://gitlab.com/fitmulticell/fit/-/blob/master/doc/install.rst#install-as-user-into-your-home-directory-recommended

Install pyABC

A similar installation procedure should be followed to install pyABC: it should also be installed in the user domain. For instructions, see: https://pyabc.readthedocs.io/en/latest/installation.html#install-as-user-into-your-home-directory-recommended

Install Morpheus

Installing Morpheus is the tricky part. Detailed information on how to install Morpheus on your machine is available at: https://morpheus.gitlab.io/. Before starting the Morpheus installation, make sure that you install or load all dependencies. The list of dependencies is available here: https://gitlab.com/morpheus.lab/morpheus#install. To check the available modules on your system, you can use module avail.

Please be aware that we will mainly work with the CLI version of Morpheus, so there is no need to install the GUI version. Be sure to replace make && sudo make install with make && make install, since you usually don’t have root privileges.

After the installation completes, we can use the Morpheus executable, which is usually located at ~/morpheus/build/morpheus/morpheus
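Before wiring the executable into a fitting script, it can help to confirm that the build actually produced it. The following is a minimal sketch, assuming the default location mentioned above; the helper name morpheus_executable is hypothetical and not part of fitmulticell or Morpheus.

```python
import os
from pathlib import Path


def morpheus_executable(path="~/morpheus/build/morpheus/morpheus"):
    """Return the expanded path if it is an executable file, else None.

    The default path is the usual build location; adjust it if you
    built Morpheus somewhere else.
    """
    candidate = Path(path).expanduser()
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return candidate
    return None


# Prints the path if the executable was found, otherwise None.
print(morpheus_executable())
```

If the function returns None, the build most likely failed or was placed in a different directory.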

Install Redis

To run pyABC on a distributed system, we need a Redis server. It might already be installed on your cluster by the IT team, so check that first. You can list the available modules on your system with module avail. If Redis is available, you can load it, e.g. module load Redis.

Nonetheless, there is a chance that Redis is not yet installed on the system. Detailed guides on how to install, set up, and use a Redis server are provided at the following links: https://pyabc.readthedocs.io/en/latest/sampler.html#how-to-set-up-a-redis-based-distributed-cluster, https://redis.io/topics/quickstart.
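Once a Redis server has been started, a quick way to see whether other nodes can reach it is to try opening a TCP connection to its host and port. The sketch below uses only the Python standard library; the function name redis_port_open is hypothetical, and the host and port shown are placeholders (6379 is the Redis default port). Note that it only checks that something is listening on the port, not that the listener is really Redis.

```python
import socket


def redis_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Placeholder values; replace them with the node and port of your server.
print(redis_port_open("localhost", 6379))
```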

Running fitmulticell

After installing all required components of the pipeline, you can start using it on your cluster.

Running fitmulticell on clusters

To facilitate deployment of the fitmulticell pipeline on large clusters, we created a set of batch scripts, available here: https://gitlab.com/fitmulticell/fit/-/tree/develop/scripts/allocate_batch. These scripts dynamically allocate master and worker nodes and enable communication between them.

To start a parallel pyABC run, run submit_job.sh with the following parameters:

  • port number

  • number of nodes

  • queue name

  • job time

  • CPUSPERTASK

  • python script

The submit_job.sh script is the main one; it invokes the following scripts:

  • load_module.sh: this script loads all required modules.

  • submit_redis.sh: this script starts the Redis server and returns the IP address of the node hosting it.

  • submit_python.sh: this script submits the Python task.

  • submit_worker.sh: this script starts the workers that simulate the model.

To kill all running jobs, run kill_all.sh.

Please note that you need to change the Python script so that it takes two arguments (the port_number and the host_id). This can be done using the argparse package.
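A minimal sketch of such argument handling with argparse is shown here. The flag names --port and --ip are assumptions made for illustration; check submit_python.sh to see how the port number and host ID are actually passed to your script, and adapt the names accordingly.

```python
import argparse


def parse_args(argv=None):
    """Parse the Redis connection details handed over by the batch scripts."""
    parser = argparse.ArgumentParser(
        description="Connection details of the Redis server."
    )
    # Flag names are hypothetical; match them to what submit_python.sh passes.
    parser.add_argument("--port", type=int, required=True,
                        help="port number of the Redis server")
    parser.add_argument("--ip", type=str, required=True,
                        help="host ID (IP address) of the node hosting Redis")
    return parser.parse_args(argv)


# Example with explicit values. In the real script, call parse_args() without
# arguments so that the values are read from the command line.
args = parse_args(["--port", "6379", "--ip", "10.0.0.1"])
print(args.ip, args.port)
```

The parsed values can then be passed on when creating the pyABC Redis sampler, for example as the host and port arguments of RedisEvalParallelSampler.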