Running DEVICE or MODE Solutions simulations on a cluster - supported MPI versions


#1

Question
Is it possible to run DEVICE or MODE Solutions simulations on a Linux cluster that only supports OpenMPI? The “Matching MPI and solver versions” section of the following page suggests that the DEVICE and MODE Solutions solvers (FDE, EME, HEAT, CHARGE) are only compatible with MPICH2.
https://kb.lumerical.com/en/index.html?user_guide_run_linux_solver_command_line_mpi.html

Answer
A bit of background is required to answer this question. Lumerical’s software uses MPI for one or both of the following reasons:

  • Distributing a single job between multiple cores/nodes (FDTD and varFDTD solvers)
    The speed of FDTD and varFDTD simulations can be increased by dividing the simulation region into a number of sub-regions and then running the calculation for each sub-region on a different CPU core or a different computer. In this scenario, MPI is used to manage the communication between the sub-region processes. This ability to efficiently divide the simulation volume into sub-regions is a major strength of the FDTD method (see the launch sketch after this list).
    Many of Lumerical’s other solvers (FDE, EME, HEAT, CHARGE) are not compatible with this type of distributed computing, and therefore don’t require MPI for it. It is worth noting that these solvers can still take advantage of multiple cores in a CPU, but they use a different technique (multi-threading) that doesn’t require MPI. The multi-threading approach does not allow a single job to be distributed between multiple computers.
  • Starting jobs from the graphical design environment job manager
    By default, the Lumerical job manager uses MPI for all job launching (both on local and remote computers). For solvers like FDTD, this makes a lot of sense because MPI is already required for distributing the job between the CPU cores. MPI is also nice because it functions in a similar way on all operating systems. However, it is important to recognize that MPI is not the only way to launch jobs. For example, on Linux machines, we could have used ssh instead.

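To make the distinction concrete, the commands below sketch the difference between an MPI-distributed FDTD job and a multi-threaded solver job. The FDTD installation path, the process count and the hostfile name are assumptions for illustration only; adjust them to your own installation.

# Distributed FDTD job: mpiexec starts 8 fdtd-engine processes, each handling one sub-region
# (paths, process count and hostfile are examples only)
/opt/lumerical/fdtd/mpich2/nemesis/bin/mpiexec -n 8 -f $HOME/hostfile /opt/lumerical/fdtd/bin/fdtd-engine $HOME/temp/example.fsp

# Multi-threaded solver job: a single process that uses multiple cores on one machine, no MPI involved
/opt/lumerical/mode/bin/eme-engine $HOME/temp/example.lms
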
On Linux, Lumerical’s products use MPICH2 for the above tasks. MPICH2 is included with the product installers. However, Linux clusters typically have a version of MPI pre-installed, OpenMPI being a common example, and users of the cluster are encouraged to use the pre-installed version for optimal performance. Lumerical’s FDTD and varFDTD engines support many such MPI versions, as described in the “Matching MPI and solver versions” section of the following page.
https://kb.lumerical.com/en/index.html?user_guide_run_linux_solver_command_line_mpi.html
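As a rough sketch, assuming the cluster provides OpenMPI on the PATH and that the OpenMPI-compatible FDTD engine binary in your installation is named fdtd-engine-ompi-lcl (check the binary names in your own bin directory against the page above; both the name and the path here are assumptions), a distributed run could look like:

# Launch the OpenMPI-matched FDTD engine with the cluster's own mpirun
# (binary name, path and process count are assumptions; match them to your install)
mpirun -n 16 /opt/lumerical/fdtd/bin/fdtd-engine-ompi-lcl $HOME/temp/example.fsp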

Lumerical’s other solvers, including FDE, EME, HEAT and CHARGE, don’t support as many MPI variants. Fortunately, this is not a problem: as described above, these solvers don’t use MPI for distributed computing, so they don’t need it at all. When running these solvers on your cluster, they can be treated as simple, non-MPI applications. For example, rather than using an MPI-based command like:
/opt/lumerical/mode/mpich2/nemesis/bin/mpiexec /opt/lumerical/mode/bin/eme-engine $HOME/temp/example.lms

use a command like:
/opt/lumerical/mode/bin/eme-engine $HOME/temp/example.lms
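
On a batch-scheduled cluster this simply becomes a single-node, single-task job. As a minimal sketch for a SLURM system (scheduler directives, thread counts and partition settings will differ on your cluster and are only illustrative):

#!/bin/bash
#SBATCH --job-name=eme-example
#SBATCH --nodes=1              # non-MPI solver: one node only
#SBATCH --ntasks=1             # a single process...
#SBATCH --cpus-per-task=8      # ...that multi-threads across the cores of that node

# Run the solver directly, without mpiexec
/opt/lumerical/mode/bin/eme-engine $HOME/temp/example.lms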

