FDTD - Using Multiple Processors on Cluster

Hi,
I’m trying to run some FDTD simulations on a high-performance computing cluster but I’m running into some difficulties. I want to run a distributed computing job, but I can’t get Lumerical to use more than one processor. Could anybody tell me what I’m doing wrong?

Our cluster uses Slurm as a job manager, so I’ve created a bash script (attached) for submitting my job. In the script, I do the normal setup where I set a number of nodes and a number of processes per node (as well as RAM allocations) and then I point Lumerical to the MPICH2 Nemesis MPI and run the job.

Whenever the job runs, the cluster allocates the correct number of nodes, but when I check the Lumerical log files (one is attached) they say that only 1 CPU is used and the simulation is 1x1x1. It runs fine and gives me a file, but it defeats the purpose of using the HPC cluster if I’m using 1 CPU.

Is there something I have to allocate in the resources before saving the .fsp file? Or another command I have to add to the Slurm bash? Or is this a question for our HPC support staff?

Thanks,
Brandon

SubmitScript.txt (914 Bytes) sweep_1_p0-edited.log-1.txt (5.2 KB)

@hacha

Welcome to Lumerical Knowledge Exchange.

You are not using MPI

fdtd-engine-mpich2nem  /home/<USER>/LumericalTesting/LumericalForCluster_sweep/*.fsp

This command starts “fdtd-engine-mpich2nem” directly, without an MPI launcher such as mpiexec, so only a single engine process runs. That is why the log reports 1 CPU.

Run with MPI on cluster

You will need to run the simulation job with MPI to be able to distribute across multiple nodes.

  • Typically your IT team will already have configured an MPI for running jobs on your cluster.
  • Check whether the MPI your cluster uses is supported by Lumerical.
  • Ask your IT team for the exact command that launches an MPI job under your cluster’s job scheduler, so that the simulation is distributed across the nodes requested in your submission script.
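Before asking IT, it can help to see which MPI launchers are visible from inside a job. The following is a minimal sketch and not an official Lumerical procedure: the helper name launcher_status is hypothetical, and it only looks the launcher names up on the PATH without running them.

```shell
# Report where a launcher lives on PATH, or "not found".
# launcher_status is a hypothetical helper name used for illustration.
launcher_status() {
  if command -v "$1" >/dev/null 2>&1; then
    command -v "$1"
  else
    echo "not found"
  fi
}

# Common launcher names; your cluster may provide only some of them.
for launcher in mpiexec mpirun srun; do
  echo "$launcher: $(launcher_status "$launcher")"
done
```

Running this after the module commands in your submission script shows whether loading the MPI module actually put a launcher on the PATH.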

Examples:

Using bundled mpich2:

With the default install path: “/opt/lumerical/v202/mpich2/...”, your script would be:

... (rest of your script)
# executable
module load lumerical

/opt/lumerical/v202/mpich2/nemesis/bin/mpiexec fdtd-engine-mpich2nem /home/<USER>/LumericalTesting/LumericalForCluster_sweep/*.fsp -logall -fullinfo

Using/Loading your cluster’s MPI:

If your cluster’s MPI is supported by Lumerical, load its module in the script before launching the engine:

... (rest of your script)
# executable
module load lumerical
module load mpi      # load your MPI module (the module name is site-specific)

mpiexec fdtd-engine-mpich2-lcl /home/<USER>/LumericalTesting/LumericalForCluster_sweep/*.fsp -logall -fullinfo
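Whether mpiexec picks up the Slurm allocation on its own depends on how the MPI was built, so some sites pass the process count explicitly. Here is a minimal sketch, assuming your mpiexec accepts the common -n option and that Slurm exports SLURM_NTASKS (or SLURM_JOB_NUM_NODES together with SLURM_NTASKS_PER_NODE). The helper names rank_count and build_cmd and the file name sim.fsp are hypothetical, and the launch line is only printed, not executed.

```shell
# Work out how many MPI processes the Slurm allocation provides.
# Falls back to 1 when no Slurm variables are set (e.g. outside a job).
rank_count() {
  if [ -n "${SLURM_NTASKS:-}" ]; then
    echo "$SLURM_NTASKS"
  elif [ -n "${SLURM_JOB_NUM_NODES:-}" ] && [ -n "${SLURM_NTASKS_PER_NODE:-}" ]; then
    echo $(( SLURM_JOB_NUM_NODES * SLURM_NTASKS_PER_NODE ))
  else
    echo 1
  fi
}

# Print (do not execute) the launch line for an engine ($1) and project file ($2).
build_cmd() {
  echo "mpiexec -n $(rank_count) $1 $2 -logall -fullinfo"
}

# sim.fsp is a placeholder project file name.
build_cmd fdtd-engine-mpich2-lcl sim.fsp
```

If the printed count is 1 inside a job that requested several nodes, the allocation is not reaching the launcher, which is worth raising with your HPC support staff.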

Note:

Use the FDTD engine executable that matches your supported MPI variant:

fdtd-engine-mpich2-lcl
fdtd-engine-impi-lcl
fdtd-engine-ompi-lcl
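If the same submission script is used with more than one MPI, the choice of executable can be made in one place. This is a minimal sketch using only the three executable names listed above. The function name engine_for_mpi and the short keys mpich2, impi and ompi are hypothetical labels taken from the executable names, not Lumerical options.

```shell
# Map an MPI variant label to the engine executable listed above.
# The labels are illustrative; they simply mirror the executable names.
engine_for_mpi() {
  case "$1" in
    mpich2) echo "fdtd-engine-mpich2-lcl" ;;
    impi)   echo "fdtd-engine-impi-lcl" ;;
    ompi)   echo "fdtd-engine-ompi-lcl" ;;
    *)      echo "unsupported MPI variant: $1" >&2; return 1 ;;
  esac
}

engine_for_mpi ompi
```

An unknown label returns a non-zero status, so the script can stop instead of launching the wrong engine.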

Running the submission script:

$ sbatch -N 4 --ntasks-per-node=4 submit-script.sh

Hope this helps.

Best,
Lito

Thank you so much @lyap! That all worked!

@hacha

You are welcome. Glad that the issue has been resolved. :+1:

Best,
Lito
