We propose three parallelized algorithms, BfsEnumP1-3, for enumerating tree-like chemical compounds, each obtained by a simple modification of BfsSimEnum. Let *N* be the number of processors. When growing a family tree, BfsSimEnum adds atoms to a molecular tree in BFS order. BfsEnumP1-3 take a parameter *d*, grow the family tree up to depth *d* as BfsSimEnum does, and number the vertices (molecular trees) at depth *d* in BFS order. Figure 1 shows an example of the family tree for C_{2}O_{2}H_{2} with the numbers #0, ..., #3 at depth 2. All *N* processors independently construct the family tree up to depth *d* and assign the numbers one by one. Each vertex at depth *d* is assigned to exactly one processor, which generates its descendants, that is, the subtree of the family tree rooted at that vertex. However, the number of molecular trees among the descendants often differs from vertex to vertex. In the example of Figure 1, vertex '#0' yields eight molecular trees, whereas vertex '#1' yields only one. Hence, we develop three assignment methods in BfsEnumP1-3 to distribute the load evenly among the processors. BfsEnumP1 and BfsEnumP2 use static assignment methods, whereas BfsEnumP3 uses a dynamic method that depends on the computational environment during execution.

By modifying the previous sequential algorithm BfsSimEnum, we propose the following parallelized algorithm.

**Input**: numbers *n*_{l_i} of atoms for *l*_{i} (*∈* Σ), division depth *d*, processor identifier *p*, number *N* of processors, where

$$n_a := \sum_{\{l_i \in \Sigma \,\mid\, val(l_i) > 1\}} n_{l_i}, \qquad d < n_a$$

**Output**: all molecular trees in normal form

**BfsEnumP**(*p, N* )

*c* := 0

**for** each *l*_{j} *∈* Σ such that *val*(*l*_{j}) *>* 1 and *n*_{l_j} *>* 0 **do**

*T* := a tree consisting of a single root labeled *l*_{j}

AddAtom(*T , p, N* )

**end**

**AddAtom**(*T , p, N* )

**if** |*T*| = *n*_{a} **then**

**if** *T* is in normal form **then**

BfsMulEnum(*T* )

**else**

*flag* := *true*

**if** |*T*| = *d* **then**

*flag* := IsAssigned(*c, p, N*)

*c* := *c* + 1

**if** *flag* **then**

*v*_{k} := the deepest rightmost vertex in *T*

*v*_{l} := the deepest leftmost vertex in *T*

**if** *v*_{k} and *v*_{l} are included in the same subtree **then**

*v*_{e} := *v*_{l−1}

**else** *v*_{e} := *v*_{k}

**for** each *v*_{i} from *parent*(*v*_{k}) to *v*_{e} in BFS order **do**

**if** *degree*(*v*_{i}) *<* *val*(*l*(*v*_{i})) **then**

**for** each *l*_{j} ∈ Σ such that *val*(*l*_{j}) *>* 1 **do**

**if** *num*_{T}(*l*_{j}) *<* *n*_{l_j} **and** *l*_{j} does not violate left-heavy **then**

*T*′ := *T*

add an atom *l*_{j} as the last child of *v*_{i} in *T′*

AddAtom(*T′, p, N*)

**end**

Note that this pseudocode describes the part common to BfsEnumP1-3; the function 'IsAssigned' supplies the assignment method specific to each of BfsEnumP1-3. The counter *c* is the identifier of each vertex at depth *d* of the family tree. All processors execute the same algorithm, each with a distinct identifier *p* among the *N* processors, and BfsMulEnum(*T*) sequentially outputs molecular trees by adding multiplicity to the edges of *T* where needed. Thus, the *N* processors together output all tree-like chemical compounds without redundancy.
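The division scheme can be illustrated on a toy tree. The following is only a minimal sketch under stated assumptions: the growth rule `children`, the tuple encoding of vertices, and the names `enumerate_part` and `round_robin` are hypothetical stand-ins, and the normal-form test and BfsMulEnum are omitted. What it keeps from the pseudocode is the counter *c*, the division depth *d*, and the fact that every processor runs the same deterministic traversal and prunes the depth-*d* vertices it does not own.

```python
from typing import List, Tuple

Node = Tuple[int, ...]


def children(node: Node, max_size: int) -> List[Node]:
    """Hypothetical toy growth rule, chosen only to give uneven subtrees."""
    if len(node) >= max_size:
        return []
    return [node + (k,) for k in range(node[-1] + 1)]


def round_robin(c: int, p: int, n_procs: int) -> bool:
    """Placeholder assignment rule; any deterministic rule can be plugged in."""
    return p == c % n_procs


def enumerate_part(root, max_size, d, p, n_procs, is_assigned=round_robin):
    """Return the complete objects that processor p is responsible for."""
    counter = 0  # plays the role of c in the pseudocode
    out = []

    def add(node):
        nonlocal counter
        if len(node) == max_size:  # complete object: output it
            out.append(node)
            return
        if len(node) == d:  # division depth: number this vertex
            assigned = is_assigned(counter, p, n_procs)
            counter += 1
            if not assigned:
                return  # another processor expands this subtree
        for child in children(node, max_size):
            add(child)

    add(root)
    return out


ROOT, MAX_SIZE, D, N_PROCS = (2,), 4, 2, 3
parts = [enumerate_part(ROOT, MAX_SIZE, D, p, N_PROCS) for p in range(N_PROCS)]
everything = enumerate_part(ROOT, MAX_SIZE, D, 0, 1)  # single-processor run
```

In this toy instance the three processors receive 1, 3, and 6 of the 10 complete objects, and their outputs together match the single-processor run without duplicates. The uneven split is the kind of imbalance that the assignment methods below try to reduce.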

### BfsEnumP1

We define the assignment method of BfsEnumP1 as follows.

**IsAssigned**(*c, p, N* )

**return** *p* = *c* mod *N*

**end**

'IsAssigned' returns whether the processor with identifier *p* is assigned to vertex *c*. For instance, when enumerating C_{2}O_{2}H_{2} with 3 processors and division depth *d* = 2, BfsEnumP1 assigns vertices 0, 1, 2, 3 to processors 0, 1, 2, 0, respectively (see Figure 1).
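As a quick check of this example, the round-robin rule can be written directly in Python. The helper `owner` is a hypothetical convenience function that is not part of the algorithm; it only looks up which processor answers true for a given vertex.

```python
def is_assigned(c: int, p: int, n: int) -> bool:
    """BfsEnumP1 rule: vertex c belongs to processor p iff p = c mod n."""
    return p == c % n


def owner(c: int, n: int) -> int:
    """Hypothetical helper: the unique processor for which is_assigned holds."""
    return next(p for p in range(n) if is_assigned(c, p, n))


# Vertices #0..#3 at depth 2, three processors, as in the example above.
owners = [owner(c, 3) for c in range(4)]
```

Here `owners` evaluates to `[0, 1, 2, 0]`, matching the assignment stated above. Because the rule depends only on *c*, *p*, and *N*, each processor can evaluate it locally.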

### BfsEnumP2

We define the assignment method of BfsEnumP2 as follows. First, we initialize the weights *w*_{i} = 0 for *i* = 0, ..., *N* − 1.

**IsAssigned**(*T, p*, {*w*_{i}})

*i* := argmin_{*i* = 0,...,*N* − 1} *w*_{i}

*w*_{i} := *w*_{i} + *cost*(*T*)

**return** *p* = *i*

**end**

In BfsEnumP2, the number of molecular trees generated from *T* is estimated by *cost*(*T*), which is accumulated into *w*_{i}. The processor with the minimum *w*_{i} is selected to execute the enumeration from *T*. As in BfsEnumP1, no communication between processors occurs during the construction of the family tree, and the weights *w*_{i} are calculated independently in each processor. In this paper, we define *cost*(*T*) by

$$\sum_{v_i \in \{parent(v_k), \dots, v_k\}} \bigl(val(l(v_i)) - degree(v_i)\bigr) + \sum_{l_i \in \Sigma} c_{l_i} \bigl(n_{l_i} - num_T(l_i)\bigr),$$

where *v*_{k} denotes the deepest rightmost vertex in *T*, and *c*_{l_i} is a non-negative constant for *l*_{i}; we use (*c*_{C}, *c*_{N}, *c*_{O}, *c*_{H}) = (1.4, 1.2, 1.0, 0.0). In this way, the valence of each atom is taken into account. *cost*(*T*) is large if the number of positions at which atoms can still bond and/or the number of remaining atoms is large.
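The cost estimate and the greedy choice of the least-loaded processor can be sketched as follows. Several things here are assumptions made for illustration: the valences C = 4, N = 3, O = 2, H = 1, the representation of *T* by a list of (label, degree) pairs for the vertices from *parent*(*v*_{k}) to *v*_{k}, the tie-breaking rule (lowest index wins), and the costs 5.0 and 3.0, which are made-up values rather than results from the paper.

```python
VALENCE = {"C": 4, "N": 3, "O": 2, "H": 1}  # assumed standard valences
COEFF = {"C": 1.4, "N": 1.2, "O": 1.0, "H": 0.0}  # constants c_l from the text


def cost(frontier, used, available):
    """Estimate the work below T.

    frontier:  (label, degree) pairs for the vertices parent(v_k), ..., v_k
    used:      atoms already placed in T, per label
    available: total atoms n_l given as input, per label
    """
    free = sum(VALENCE[label] - degree for label, degree in frontier)
    remaining = sum(
        COEFF[label] * (available[label] - used.get(label, 0))
        for label in available
    )
    return free + remaining


def owners(costs, n_procs):
    """Greedy static assignment: each vertex goes to the lightest processor."""
    weights = [0.0] * n_procs
    result = []
    for c in costs:
        # Ties are broken by the lowest index (an assumption of this sketch).
        i = min(range(n_procs), key=weights.__getitem__)
        weights[i] += c
        result.append(i)
    return result


AVAILABLE = {"C": 2, "O": 2, "H": 2}  # C2O2H2
cost_cc = cost([("C", 1), ("C", 1)], {"C": 2}, AVAILABLE)  # two-atom tree C-C
cost_co = cost([("C", 1), ("O", 1)], {"C": 1, "O": 1}, AVAILABLE)  # tree C-O
plan = owners([cost_cc, cost_co, 5.0, 3.0], 2)
```

For the C-C tree, the first sum gives (4 − 1) + (4 − 1) = 6 and the second gives 1.0 · 2 = 2, so the cost is about 8.0; for C-O the cost is about 4 + 2.4 = 6.4. With two processors, the resulting plan is `[0, 1, 1, 0]`. Since every processor replays the same sequence of costs, all of them arrive at the same plan without exchanging messages.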

### BfsEnumP3

BfsEnumP3 requires an extra processor to manage the assignment; this manager receives requests from the other processors and replies to each with an assigned number. Note that such a manager is not needed if shared memory is used. In this paper, we implement the algorithm using MPI (Message Passing Interface) to avoid cache inconsistency. Each processor *p*, in turn, receives an assigned number *r* from the manager and executes the enumeration from vertex *r*.

Finally, BfsEnumP3 in processor *p* sends an end-signal to the manager. Thus, we have the following pseudocode.

**Manage**(*N* )

*globalc* := 0

*n*_{e} := 0

**while** *n*_{e} *<* *N*

**if** receive a request from processor *p* **then**

send *globalc* to *p*

*globalc* := *globalc* + 1

**else if** receive an end-signal from *p* **then**

*n*_{e} := *n*_{e} + 1

**end**

**IsAssigned**(*c, r, p*)

**if** *c* > *r* **then**

send a request to the manager

receive *globalc* as *r*

**return** *c* = *r*

**end**

Here, *r* is initialized to some negative integer.
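The request and reply protocol can be simulated in a single Python process. This is a sketch only: threads and `queue.Queue` objects stand in for the MPI processes and messages of the actual implementation, the depth-*d* vertices are reduced to the integers 0, ..., `n_vertices` − 1, and the names `manage`, `worker`, and `run` are hypothetical.

```python
import queue
import threading


def manage(n_workers, requests, replies):
    """Manager: hand out increasing numbers until every worker has finished."""
    global_c = 0
    n_end = 0
    while n_end < n_workers:
        kind, p = requests.get()
        if kind == "request":
            replies[p].put(global_c)
            global_c += 1
        else:  # end-signal
            n_end += 1


def worker(p, n_vertices, requests, replies, done):
    """Worker p: walk the depth-d vertices in order, expanding only its own."""
    r = -1  # some negative integer, as in the text
    for c in range(n_vertices):
        if c > r:  # the last assigned vertex is behind us: ask for a new one
            requests.put(("request", p))
            r = replies[p].get()
        if c == r:
            done[p].append(c)  # the subtree below vertex c would be expanded here
    requests.put(("end", p))


def run(n_workers, n_vertices):
    requests = queue.Queue()
    replies = [queue.Queue() for _ in range(n_workers)]
    done = [[] for _ in range(n_workers)]
    threads = [threading.Thread(target=manage, args=(n_workers, requests, replies))]
    threads += [
        threading.Thread(target=worker, args=(p, n_vertices, requests, replies, done))
        for p in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


done = run(3, 10)
```

Which worker handles which vertex depends on thread scheduling, so `done` can differ between runs. Every vertex is nevertheless handled by exactly one worker, because the manager never issues the same number twice and a worker skips every vertex whose number it does not hold.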