Sorry for my delayed reply.
I spent some time with my colleague assessing the best approach for your simulation. Since your device has finite dimensions (i.e. it is not infinite), using periodic BCs would not be a good idea.
Also, as you mentioned, the dipoles have different phases. Does this mean that you are looking for the coherent sum of the results? If so, you will need to run at least 50-100 simulations, each with a dipole ensemble, and then add the results coherently. Of course, this approach will be time-consuming.
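To make the distinction concrete, here is a toy Python sketch (not Lumerical script; the field values and function names are made up for illustration). A coherent sum adds the complex fields, including each dipole's phase, before taking the magnitude squared, whereas an incoherent sum adds the intensities directly, so the phases drop out.

```python
import cmath

def coherent_intensity(fields, phases):
    """|sum_k E_k * exp(i*phi_k)|^2: add complex fields first, then square."""
    total = sum(e * cmath.exp(1j * p) for e, p in zip(fields, phases))
    return abs(total) ** 2

def incoherent_intensity(fields):
    """sum_k |E_k|^2: add intensities; relative phases play no role."""
    return sum(abs(e) ** 2 for e in fields)

# Hypothetical complex field amplitudes at one monitor point, one per dipole.
fields = [1.0 + 0.0j, 0.5 + 0.5j, 0.0 - 0.8j]

print(coherent_intensity(fields, [0.0, 0.0, 0.0]))  # all dipoles in phase
print(incoherent_intensity(fields))                 # phases ignored
```

The two numbers differ because of the interference cross terms, which is why the coherent case cannot be built from single-dipole intensities alone.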
If the results are incoherent (for example, the dipoles emit at different frequencies), you can simulate one dipole at a time and then add the results incoherently. A good link that discusses these approaches is:
If the incoherent sum is what you want, I think we can simplify the simulations a lot. Let's assume that the grating structure is large enough that the dipole transmission to the left and right monitors is not sensitive to the dipole position along the x-direction (you can check this by placing the dipole at different locations along the x-axis). Also, from your dipole cloud, it looks like the dipoles are located only up to 1.5um below the substrate. So I think we need only four simulations, one for each of four dipole positions (shown by the blue stars in the screenshot below):
You can learn more about this approach by looking at OLED examples in our KB:
Please let me know how you want to treat your simulations, and I will be glad to help.