2 years ago

#64595

test-img

Nitin Malapally

Improving a clumsy automatic cuboid domain decomposition algorithm

Introduction

For my library, I need to split up a cuboid point lattice for assignment to multiple processing elements (PEs).

Example decomposition

As you can see, no points are shared between the PEs, and certain PEs (here 4-7) get only 2D 'slices'.

The interface accepts n1, n2, n3, P which are all integers. nx indicates the number of lattice points along direction x i.e. n1, n2 or n3. P indicates the number of PEs available.

Current problem formulation

Since the domain is a cuboid, I'm looking for a 3-part-factorization of P such that P = p1 * p2 * p3 and px is proportional to nx.

Limitations of problem formulation

  1. 3-part-factorization algorithm needs to be developed.
  2. Some sort of metric must be used to choose any one of the many 3-part-factorizations generated. In the example in the image, n1 = 3, n2 = 5, n3 = 4 and say, P = 54, we would have the 3-part-factorizations (27, 2, 1), (9, 3, 2), (6, 9, 1) and (6, 3, 3). Which one is here applicable? A simple magnitude-of-difference metric would choose (6, 3, 3). But even this has multiple permutations (3, 6, 3) and (3, 3, 6).
  3. If the chosen 3-part-factorization has a factor which is greater than the corresponding lattice size (nx) then we would have to truncate it, which means some PEs will do no work.
  4. If the chosen 3-part-factorization has a factor (perhaps after the truncation mentioned in the above point) which does not divide its corresponding lattice size, we have unequal work-sharing (although some PEs might be sleeping as seen in the above point).

How reader can help

  • How can we improve this inelegant algorithm? Could you suggest any improvement to any of the points in 'Limitations of problem formulation'?
  • Is there a better and more formalized approach (i.e. problem formulation)?
  • Links to papers/websites/articles/books would be greatly appreciated.

parallel-processing

mpi

factorization

mathematical-lattices

0 Answers

Your Answer

Accepted video resources