We propose three parallelized algorithms, BfsEnumP1-3, for enumerating tree-like chemical compounds; each is a simple modification of BfsSimEnum. Let N be the number of processors. In growing a family tree, BfsSimEnum adds atoms to a molecular tree in BFS order. BfsEnumP1-3 take a parameter d, grow the family tree up to depth d as BfsSimEnum does, and number the vertices (molecular trees) at depth d in BFS order. Figure 1 shows an example of the family tree for C2O2H2 with the numbers #0, ..., #3 at depth 2. All N processors independently construct the family tree up to depth d and assign the numbers one by one. Each vertex at depth d is assigned to exactly one processor, which generates its descendants, that is, the subtree of the family tree rooted at that vertex. However, we observe that the number of molecular trees generated among the descendants often differs from vertex to vertex. In the example of Figure 1, eight molecular trees are generated for vertex '#0', whereas only one is generated for '#1'. Hence, we develop three assignment methods in BfsEnumP1-3 to distribute the load evenly across the processors. BfsEnumP1-2 use static assignment methods, and BfsEnumP3 uses a dynamic method that depends on the computational environment during execution.
By modifying the previous sequential algorithm BfsSimEnum, we propose the following parallelized algorithm.
Input: numbers n_{l_i} of atoms for l_i (∈ Σ), division depth d, processor identifier p, number N of processors
Output: all molecular trees in normal form
BfsEnumP(p, N)
  c := 0
  for each l_j ∈ Σ such that val(l_j) > 1, n_{l_j} > 0 do
    T := a tree consisting of a root with l_j
    AddAtom(T, p, N)
end
AddAtom(T, p, N)
  if |T| = n_a then
    if T is in normal form then
      BfsMulEnum(T)
  else
    flag := true
    if |T| = d then
      flag := IsAssigned(c, p, N)
      c := c + 1
    if flag then
      v_k := the deepest rightmost vertex in T
      v_l := the deepest leftmost vertex in T
      if v_k and v_l are included in the same subtree then
        v_e := v_{l−1}
      else v_e := v_k
      for each v_i from parent(v_k) to v_e in BFS order do
        if degree(v_i) < val(l(v_i)) then
          for each l_j ∈ Σ such that val(l_j) > 1 do
            if num_T(l_j) < n_{l_j} and l_j does not violate left-heavy then
              T′ := T
              add an atom l_j as the last child of v_i in T′
              AddAtom(T′, p, N)
end
It should be noted that this pseudocode describes the common part of BfsEnumP1-3, and the function 'IsAssigned' implements the assignment method specific to each of BfsEnumP1-3. c is the identifier number of each vertex at depth d of the family tree. All processors execute the same algorithm, each with a distinct identifier p among the N processors, and BfsMulEnum(T) sequentially outputs molecular trees by adding multiplicity to edges of T if needed. Thus, the N processors together output all tree-like chemical compounds without redundancy.
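The division of work described above can be illustrated with a minimal sketch on a toy family tree. Everything below is an assumption made for illustration: `children`, `enumerate_part`, `round_robin`, and the tuple-based "trees" are hypothetical stand-ins, not BfsSimEnum or real molecular trees. The sketch only shows that when every processor replays the same traversal, counts the vertices at depth d with the same counter c, and descends only into the vertices assigned to it, the parts together cover the full enumeration exactly once.

```python
def children(node):
    """Hypothetical toy family tree: a node is a tuple of ints with uneven fan-out."""
    fan_out = 3 - (node[-1] if node else 0)
    return [node + (i,) for i in range(fan_out)]


def round_robin(c, p, n_procs):
    """Modulo rule, used here only as an example assignment."""
    return p == c % n_procs


def enumerate_part(p, n_procs, d, is_assigned, full_size=4):
    """Return the complete toy trees that processor p is responsible for."""
    c = 0          # identifier of the next vertex met at depth d
    found = []

    def grow(node):
        nonlocal c
        if len(node) == full_size:      # complete tree: output it
            found.append(node)
            return
        if len(node) == d:              # division depth: decide ownership
            mine = is_assigned(c, p, n_procs)
            c += 1
            if not mine:
                return                  # another processor expands this subtree
        for child in children(node):
            grow(child)

    grow(())
    return found


parts = [enumerate_part(p, 3, 2, round_robin) for p in range(3)]
print([len(part) for part in parts])    # per-processor load, generally uneven
```

Every processor repeats the cheap traversal down to depth d, so no communication is needed to agree on the numbering; the uneven part sizes are the imbalance that the three assignment methods try to reduce.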
BfsEnumP1
We define the assignment method of BfsEnumP1 as follows.
IsAssigned(c, p, N )
return p = c mod N
end
'IsAssigned' returns whether or not the processor with identifier p is assigned to vertex c. For instance, in the case of enumeration using 3 processors and division depth d = 2 for C2O2H2, vertices 0, 1, 2, 3 are assigned to processors 0, 1, 2, 0, respectively by BfsEnumP1 (see Figure 1).
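As a small check of the example above, the modulo rule can be written directly in Python. The helper `owner` is a hypothetical name introduced only to list which processor receives each vertex.

```python
def is_assigned(c: int, p: int, n_procs: int) -> bool:
    """BfsEnumP1 rule: processor p takes vertex c iff p = c mod N."""
    return p == c % n_procs


def owner(c: int, n_procs: int) -> int:
    """Hypothetical helper: the unique processor for which is_assigned holds."""
    return next(p for p in range(n_procs) if is_assigned(c, p, n_procs))


print([owner(c, 3) for c in range(4)])  # [0, 1, 2, 0]
```

The rule ignores how large each subtree is, so a processor that happens to receive a heavy vertex such as '#0' may finish much later than the others.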
BfsEnumP2
We define the assignment method of BfsEnumP2 as follows. First, we initialize weights w_i = 0 for i = 0, ..., N − 1.
IsAssigned(T, p, {w_i})
  i := argmin_{i = 0,...,N−1} w_i
  w_i := w_i + cost(T)
  return p = i
end
In BfsEnumP2, the number of molecular trees generated from T is estimated by cost(T), which is accumulated in w_i. The processor with the minimum w_i is selected to execute the enumeration from T. It should be noted that, as in BfsEnumP1, no communication between processors occurs during the construction of the family tree, and w_i is calculated independently in each processor. In this paper, we define cost(T) by
where v_k denotes the deepest rightmost vertex in T, and c_{l_i} is a non-negative constant for l_i, (c_C, c_N, c_O, c_H) = (1.4, 1.2, 1.0, 0.0). Here the valence of each atom is taken into account. cost(T) is large if the number of positions at which atoms can bond and/or the number of remaining atoms is large.
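The greedy rule of BfsEnumP2 can be sketched as follows. This is a minimal illustration under stated assumptions: the cost of each vertex is passed in as a plain number rather than computed from T, the class `LeastLoadedAssigner` and the function `owners` are hypothetical names, and ties in the argmin are broken toward the smallest processor index. In the usage line, the costs 8 and 1 for the first two vertices follow the counts quoted for '#0' and '#1'; the remaining two values are made up.

```python
class LeastLoadedAssigner:
    """Per-processor replica of the weights w_0, ..., w_{N-1}."""

    def __init__(self, n_procs: int):
        self.weights = [0.0] * n_procs

    def is_assigned(self, cost: float, p: int) -> bool:
        # i := argmin_i w_i (first minimum wins; an assumption for ties)
        i = min(range(len(self.weights)), key=self.weights.__getitem__)
        self.weights[i] += cost         # w_i := w_i + cost(T)
        return p == i


def owners(costs, n_procs):
    """Run one replica per processor and record who claims each vertex."""
    replicas = [LeastLoadedAssigner(n_procs) for _ in range(n_procs)]
    result = []
    for cost in costs:
        hits = [p for p in range(n_procs) if replicas[p].is_assigned(cost, p)]
        assert len(hits) == 1           # replicas stay in sync without messages
        result.append(hits[0])
    return result


print(owners([8, 1, 1, 1], 2))  # [0, 1, 1, 1]
```

Because every replica sees the same vertices in the same order and applies the same deterministic update, all processors agree on the owner of each vertex without exchanging messages. With these illustrative costs the loads are 8 and 3, whereas the modulo rule would give 9 and 2.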
BfsEnumP3
BfsEnumP3 requires an extra processor to manage the assignment; this manager receives requests from the other processors and replies to each with an assigned number. It should be noted that such a manager is not needed if we use shared memory. In this paper, we implement the algorithm using MPI (Message Passing Interface) to avoid inconsistency of cache memory. Processor p, in turn, receives an assigned number r from the manager and executes the enumeration from vertex r.
Finally, BfsEnumP3 in processor p sends an end-signal to the manager. Thus, we have the following pseudocode.
Manage(N)
  globalc := 0
  n_e := 0
  while n_e < N
    if receive a request from processor p then
      send globalc to p
      globalc := globalc + 1
    else if receive an end-signal from p then
      n_e := n_e + 1
end
IsAssigned(c, r, p)
  if c > r then
    send a request to the manager
    receive globalc as r
  return c = r
end
Here, r is initialized as some negative integer.
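The request/reply protocol can be tried out without MPI by simulating it with threads and queues. The sketch below is only an illustration under that substitution: the paper's implementation uses MPI, whereas here Python's `queue.Queue` stands in for message passing, and the names `manage`, `worker`, and `simulate`, as well as the message format, are hypothetical. Each worker simply walks over the vertex identifiers c = 0, 1, ... instead of building a family tree.

```python
import queue
import threading


def manage(n_workers, inbox, outboxes):
    """Manager: hand out 0, 1, 2, ... until every worker sent an end-signal."""
    global_c = 0
    n_e = 0
    while n_e < n_workers:
        kind, p = inbox.get()
        if kind == "request":
            outboxes[p].put(global_c)
            global_c += 1
        elif kind == "end":
            n_e += 1


def worker(p, n_vertices, inbox, outbox, taken):
    """Worker p: ask for a new number r whenever the counter c has passed r."""
    r = -1                              # r starts as a negative integer
    for c in range(n_vertices):
        if c > r:
            inbox.put(("request", p))
            r = outbox.get()
        if c == r:
            taken[p].append(c)          # here the subtree of vertex c would be expanded
    inbox.put(("end", p))


def simulate(n_workers, n_vertices):
    inbox = queue.Queue()
    outboxes = [queue.Queue() for _ in range(n_workers)]
    taken = [[] for _ in range(n_workers)]
    threads = [threading.Thread(target=manage, args=(n_workers, inbox, outboxes))]
    threads += [
        threading.Thread(target=worker, args=(p, n_vertices, inbox, outboxes[p], taken))
        for p in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return taken


print(simulate(3, 10))  # which worker takes which vertex depends on scheduling
```

Which worker ends up with which vertex varies from run to run, but every vertex is taken exactly once: the manager never hands out the same number twice, and a worker only asks for a new number after passing its previous one, so the number it receives is never smaller than its current c.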