The GMW framework allows us to transform any computation expressed as a circuit into a privacy-preserving one secure against semi-honest adversaries. Therefore, we focus on designing a depth-optimized circuit with XOR, AND, and NOT gates that computes set-maximal matches. Then, using the generic protocol described in the previous section, we obtain a secure protocol for set-maximal matches that requires at most as many rounds of interaction as the depth of the circuit.
Problem definition
We give an efficient and depth-optimized algorithm for computing set-maximal matches. More specifically, the problem specification is as follows:
Input: A genomic database Y containing m sequences, each of size n, a query genomic sequence x of size n, and a threshold value t.
Output: A matrix M of size m×n such that the element M_{j,i} is equal to the length of the match between x and y_{j} ending at position i if the match is set-maximal with threshold t, and 0 otherwise.
We note that the output as described above leaks the positions of set-maximal matches. Because of the highly sensitive nature of genomic sequences, it is beneficial to hide even this information. Therefore, we slightly modify the above problem so that the list of set-maximal matches is revealed only after applying a random permutation to each row of the output.
Input: A genomic database Y containing m sequences, each of size n, a query genomic sequence x of size n, a threshold value t, and m permutations \((\pi _{k})_{k \in \{1,\dots, m\}}\).
Output: Let M be a matrix of size m×n such that the element M_{j,i} is equal to the length of the match between x and y_{j} ending at position i if the match is set-maximal with threshold t, and 0 otherwise. The output consists of the rows of M, each row k permuted according to π_{k}.
We observe that the output still reveals the lengths of set-maximal matches at each index. This information could also be hidden by applying a random permutation to all the entries of M, instead of to each row. However, this would reduce the applicability of the computation, since the per-row lengths appear central to certain applications.
In the secure protocol, the database Y and the permutations \((\pi _{k})_{k \in \{1,\dots, m\}}\) are the input of the server, the query x is the input of the client, and the threshold t is known to both parties. To avoid leakage to the server, the output of the protocol is selectively reconstructed only to the client.
Algorithm description
We describe our algorithm below; a sample execution is shown in Fig. 1.

1. We compute a matrix M of size m×n such that M_{j,i} is equal to the length of the match between the query x and the database entry y_{j} ending at position i (Fig. 1b).

2. We set M_{j,i} to 0 if the length of the match of x and y_{j} ending at position i is below the threshold t (Fig. 1c).

3. We compute a matrix L of size m×n such that L_{j,i} is 0 if there is a j′≠j such that \(\mathbf {M}_{j',i} > \mathbf {M}_{j,i}\), and 1 otherwise (Fig. 1d).

4. We compute a matrix K such that K_{j,i} is 0 if there is a j′ (possibly equal to j) such that \(\mathbf {L}_{j',i} = \mathbf {L}_{j',i+1} = 1\); namely, there exists a match that is extended to position i+1 (Fig. 1e).

5. We set M_{j,i}←M_{j,i}·K_{j,i}·L_{j,i} and permute each row M_{k} according to the permutation π_{k} (Fig. 1f).
After the steps described above, the matrix M contains the correct output:

The output M contains the correct length of matches computed in step 1.

The output M contains no matches of length less than t, since all such matches have been removed in step 2.

The output M contains no matches strictly contained in another match. If a match is contained in another, larger match, then either there is a match with an earlier starting point, a match with a later ending point, or both. In the first and third cases, this match is excluded in step 3, since there is another database entry with a larger value at the corresponding positions of M. In the second case, the match is excluded in step 4, since there is another extendable match at the corresponding positions.

The output M contains only locally maximal matches. After step 1, M contains the length of matches from their starting position, so it is not possible for a match to be extendable toward a previous position. Additionally, by step 4, M does not contain matches extendable toward the next position.
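Before turning to circuits, the five steps can be sanity-checked with a short plaintext sketch. This is not the secure protocol: there is no cryptography, the final permutation is omitted (it only hides positions), and the function name is ours.

```python
def set_maximal_matches(Y, x, t):
    """Plaintext reference for steps 1-5 (final permutation omitted)."""
    m, n = len(Y), len(x)
    # Step 1: M[j][i] = length of the match between x and Y[j] ending at i
    M = [[0] * n for _ in range(m)]
    for j in range(m):
        run = 0
        for i in range(n):
            run = run + 1 if x[i] == Y[j][i] else 0
            M[j][i] = run
    # Step 2: drop matches shorter than the threshold t
    M = [[v if v >= t else 0 for v in row] for row in M]
    # Step 3: L[j][i] = 1 iff no other entry has a longer match ending at i
    L = [[int(all(M[k][i] <= M[j][i] for k in range(m))) for i in range(n)]
         for j in range(m)]
    # Step 4: K[i] = 0 iff some maximal match at i extends to i + 1
    # (identical for every row j, so we keep a single vector)
    K = [int(not any(L[k][i] and i + 1 < n and L[k][i + 1] for k in range(m)))
         for i in range(n)]
    # Step 5: keep an entry only if it is maximal and not extendable
    return [[M[j][i] * K[i] * L[j][i] for i in range(n)] for j in range(m)]
```

For example, `set_maximal_matches(["GATC", "CATT"], "GATT", 2)` returns `[[0, 0, 3, 0], [0, 0, 0, 3]]`: each database entry retains exactly one set-maximal match of length 3.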
We now describe how to implement each of these steps using boolean circuits with AND, XOR, and NOT gates in a depth-optimized way.

Compute the length of matches: First, we compute an auxiliary matrix B^{(0)} such that \(\mathbf {B}^{(0)}_{j,i} = 1\) if x[i]=y_{j}[i]. Namely, \(\mathbf {B}^{(0)}_{j,i} = \lnot (\mathbf {x}[i] \oplus \mathbf {y}_{j}[i])\). We observe that B^{(0)} indicates whether a match has length at least one. Using B^{(0)}, we compute whether a match has length at least two by setting \(\mathbf {B}_{j,i}^{(1)} \leftarrow \mathbf {B}_{j,i}^{(0)} \wedge \mathbf {B}_{j,i-1}^{(0)}\). More generally, if B^{(k)} indicates whether a match has length at least 2^{k}, then it can be updated to indicate whether the length of a match is at least 2^{k+1} by setting \(\mathbf {B}_{j,i}^{(k+1)} \leftarrow \mathbf {B}_{j,i}^{(k)} \wedge \mathbf {B}_{j,i-2^{k}}^{(k)}\).
Concurrently, in each iteration we compute a bound on the length of each match by setting M^{(0)}=B^{(0)} and \(\mathbf {M}_{j,i}^{(k+1)} \leftarrow \mathbf {B}_{j,i}^{(k)} \cdot \mathbf {M}_{j,i-2^{k}}^{(k)} + \mathbf {M}_{j,i}^{(k)}\). We observe that this computation gives the actual length of a match if it is less than or equal to 2^{k} and returns the lower bound of 2^{k} otherwise. Hence, after ⌈log(n)⌉ iterations, we set M=M^{(⌈log(n)⌉)}.
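In plaintext, the doubling recursion looks as follows (out-of-range indices are treated as 0; the function name is ours). Invariant: after round k, M[i] = min(true match length ending at i, 2^k).

```python
import math

def match_lengths(x, y):
    """Match length ending at each position via the doubling recursion."""
    n = len(x)
    B = [int(a == b) for a, b in zip(x, y)]  # B^(0): match of length >= 1
    M = list(B)                              # M^(0): exact length if <= 1
    for k in range(math.ceil(math.log2(n))):
        s = 2 ** k
        # B^(k+1)[i] = B^(k)[i] AND B^(k)[i - 2^k]   (out of range -> 0)
        B_next = [B[i] & (B[i - s] if i >= s else 0) for i in range(n)]
        # M^(k+1)[i] = B^(k)[i] * M^(k)[i - 2^k] + M^(k)[i]
        M = [B[i] * (M[i - s] if i >= s else 0) + M[i] for i in range(n)]
        B = B_next
    return M
```

Here ordinary integer addition stands in for the depth-optimized adder used in the circuit.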
The addition is computed using a depth-optimized adder, whose AND-depth is proportional to the logarithm of the bit length of the numbers added [23]. Namely, the AND-depth of the adder is O(log(logn)). Overall, the AND-depth of the length computation is O(log(n) log(logn)).

Remove matches with length below the threshold: Let t_{k}≡t (mod 2^{k}) for \(k \in \left \{0, 1, \dots, \lceil \log (n)\rceil \right \}\) (so t_{0}=0) and let \((b_{\lceil \log (n)\rceil }, \dots, b_{1})\) be the bit decomposition of t, where b_{⌈log(n)⌉} is the most significant bit and b_{1} is the least significant bit. Let T^{(k)} be m×n matrices that indicate candidate matches; initially \(\mathbf {T}_{j,i}^{(0)} = 1\) for all i and j. As in the length computation, we define the auxiliary matrix B^{(0)} that initially indicates whether a database and a query position match or not. In each iteration, we update B^{(k)} as in the length computation; namely, we set \(\mathbf {B}_{j,i}^{(k)} \leftarrow \mathbf {B}_{j,i}^{(k-1)} \wedge \mathbf {B}_{j,i-2^{k-1}}^{(k-1)} \). In the k-th iteration, if b_{k}=1, we update \(\mathbf {T}^{(k)}_{j,i} \leftarrow \mathbf {B}_{j,i-t_{k-1}}^{(k-1)} \cdot \mathbf {T}_{j,i}^{(k-1)}\), and otherwise T^{(k)}=T^{(k−1)}. Finally, we set \(\mathbf {M}_{j,i} \leftarrow \mathbf {M}_{j,i} \cdot \mathbf {T}_{j,i}^{(\lceil \log (n) \rceil)}\).
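The thresholding recursion can be sketched in the same plaintext style (again, out-of-range indices are treated as 0, and `at_least_t` is a hypothetical name). The returned vector marks the positions i such that the t positions ending at i all match.

```python
def at_least_t(x, y, t):
    """Per position i: does a match of length >= t end at i?
    Splits the t preceding positions into power-of-two chunks,
    following the bit decomposition of t."""
    n = len(x)
    B = [int(a == b) for a, b in zip(x, y)]  # B^(k-1), starting at k = 1
    T = [1] * n                              # T^(0): every position a candidate
    t_prev = 0                               # t_{k-1} = t mod 2^(k-1)
    k = 1
    while t >> (k - 1):
        s = 1 << (k - 1)
        if (t >> (k - 1)) & 1:               # bit b_k of t
            # a chunk of length 2^(k-1) must end at position i - t_{k-1}
            T = [T[i] & (B[i - t_prev] if i >= t_prev else 0)
                 for i in range(n)]
            t_prev += s                      # t_k = t_{k-1} + 2^(k-1)
        # B^(k)[i] = B^(k-1)[i] AND B^(k-1)[i - 2^(k-1)]
        B = [B[i] & (B[i - s] if i >= s else 0) for i in range(n)]
        k += 1
    return T
```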
This computation intuitively splits the t positions preceding position i into parts whose sizes are increasing powers of two, and then iteratively checks whether each of these parts is a match. Splitting a number into increasing powers of two is equivalent to computing its bit decomposition. Even though the AND-depth of this step is O(log(n)), it can be performed in parallel to the length computation, where we also use the same auxiliary matrix B. Hence, this step does not increase the depth of the circuit.

Remove matches contained in other matches: We first compute the maximum length of a match for each position i across all the database entries, and then perform an equality check between M_{j,i} and the maximum for position i to compute each L_{j,i}. Each such maximum is computed using the D&C comparison circuit [24] in AND-depth O(log(logn) log(m)). The equality check can be implemented with a depth-optimized circuit of depth O(log(logn) log(m)), in which the comparison of the bits across the database entries is performed in parallel.

Remove extendable matches: A match between x and y_{j} is extendable at position i if B_{j,i+1}=1. So, in order to remove the extendable matches, we compute \(\mathbf {K}_{j,i}= \lnot \max _{j'}\left \{\mathbf {L}_{j',i} \wedge \mathbf {L}_{j',i+1}\right \}\), which indicates whether an extendable match exists. Finally, we update M_{j,i}←K_{j,i}∧L_{j,i}∧M_{j,i}. Using the D&C comparison, this operation requires AND-depth O(log(logn) log(m)).

Permute matches: Finally, to remove the information regarding the position of matches, we permute each row k of the matrix M according to a permutation π_{k}, which is given as an input. All permutations are performed in parallel and require AND-depth O(log(n)) using the Waksman permutation network [25].
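The per-position maximum used when removing contained matches is a tournament: pairs of values are compared in parallel, so m values need only ⌈log m⌉ rounds of comparisons. A plaintext sketch of this divide-and-conquer shape (the function name is ours, and Python's `max` stands in for the comparison circuit):

```python
def tournament_max(vals):
    """Maximum of a list via pairwise (tournament) comparisons:
    each while-iteration corresponds to one round of parallel
    comparisons, so the depth is ceil(log2(len(vals)))."""
    vals = list(vals)
    while len(vals) > 1:
        # compare disjoint pairs; all comparisons in one round are parallel
        nxt = [max(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:          # odd element gets a bye into the next round
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]
```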
The total AND-depth of the above circuit is O(log(n)+ log(logn) log(m)). Since in practice n ≫ m, the depth can be taken to be O(log(n)).
We now present an optimization that reduces the output size and the computational cost of the permutations. Even though it does not improve the asymptotic complexity, it noticeably improves the experimental efficiency of the protocol.
Output size reduction and efficient permutations: Before the permutation step, the above circuit outputs a matrix M that contains all set-maximal matches with threshold t. We note that the threshold t guarantees that no two set-maximal matches end at positions that are less than t apart. In other words, in each row of the matrix M there is at most one nonzero element in every t positions. By combining every t columns of M into a single column equal to their XOR, we reduce the size of the output by a factor of t without losing the desired information about set-maximal matches.
After the output reduction, we need to permute each row of a matrix of size m×n/t. Therefore, the Waksman permutation network has depth only O(log(n/t)).
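A plaintext sketch of this column-folding step (hypothetical function name; XOR is applied to the integer match lengths, which is lossless because each window of t columns holds at most one nonzero entry):

```python
import functools
import operator

def reduce_output(M, t):
    """Fold every t consecutive columns of each row into their XOR,
    shrinking an m x n matrix to roughly m x n/t."""
    return [[functools.reduce(operator.xor, row[i:i + t], 0)
             for i in range(0, len(row), t)]
            for row in M]
```

For example, `reduce_output([[0, 3, 0, 0, 0, 0, 4, 0]], 4)` returns `[[3, 4]]`: the two set-maximal match lengths survive, while their exact column positions are dropped.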
Experimental evaluation
We have implemented our protocol for secure computation of set-maximal matches using the ABY framework to evaluate its efficiency. We use the network configuration included in ABY for the communication between the two parties and the default method for precomputing multiplication triples via oblivious transfer. Then, we build the computation circuit gate by gate and let ABY handle the sharing procedures and the secure computation. The problem has three parameters: n, the size of the genomic sequences; m, the size of the database; and t, the threshold of set-maximal matches. We run a series of simulations on a single machine with 16 GB RAM and a quad-core 2.8 GHz CPU that simulates both the server and the client to evaluate the efficiency of the protocol with respect to these parameters.
In our implementation, the server's input is the database, which determines the parameters n and m, and the threshold value t. Similarly, the client's input is a query of size n, the database size m, and the threshold value t. We note that, because of the nature of the GMW protocol, both parties need to know the values of all three parameters n, m, and t. We evaluate how our protocol scales as n increases for database sizes m=10,100,1000 and thresholds t=1 and t=2^{k}, where k is the bit length of n. Even though the threshold value is an input to the protocol, we plot only two values for clarity of exposition. These two values represent the best and worst cases in terms of efficiency: the threshold value 2^{k} maximizes the efficiency gain from the output reduction optimization, whereas with threshold value t=1 this optimization is not in use, so this value corresponds to the worst-case running time. In Fig. 2, we observe that for large enough values of n there is indeed an improvement in both the running time and the depth due to the output reduction technique.
Figure 2a, c, and e show the running time of the protocol for three different values of m. For small n and m, the running time is dominated by the precomputation phase, but when m and n are large enough, the running time for a given m depends almost linearly on n. We observe that for small values of n there is almost no difference in the running time between the two threshold values. In this case, the output is reduced by only a small factor when t=2^{k}, whereas the precomputation time increases, since the computation circuit needs to include the output reduction. Hence, for small values of n, the output reduction optimization yields no efficiency improvement.
In Fig. 2b, d, and f, we notice that the depth depends mainly on log(n), as expected from the analysis in the previous section.