PDA

View Full Version : How to get optimal pairing?



Shaolin3a
11-07-2006, 08:14 AM
Here's what I'm trying to accomplish: I have two sets of numbers and and I need to pair the numbers from set A with numbers from set B in any way possible to get the lowest absolute difference in all the pairs combined.

ex. A[4, 10] and B[2, 10] : 4->2, 10->10 [diff: 2]
Wouldn't want this: 4->10, 2->10 [diff: 14]

I know that brute force won't cut it in this problem because getting about 15 numbers will already take too long. I tried thinking about how to do linear regression and maybe borrowing some ideas from that but nothing came of it so far. Also, thinking about whether it would ever be a good idea to pair sets of sequential numbers out of order as in:

ex. A[a, b] where a < b and B[i, j] where i < j

I don't see a situation where it would be advantageous to pair a->j and b->i. This out-of-order pairing would always yield an equal or higher difference.

Anyway, there must be some method to break it down but I've been thinking on it for several days and can't come up with anything yet. Maybe it's something obvious but someone please enlighten me!

Knacko
11-07-2006, 11:50 AM
Write a test case. Have it randomly select the numbers in each of the pairs, sort them sequentially and find the absolute difference between each pairing. If it doesn't find any cases where sequential pairing doesn't yield the smallest absolute difference, then you're fine.

Shaolin3a
11-08-2006, 09:53 AM
That would leave me very uncertain about my solution still since you can't exhaustively test all inputs but I suppose that's a good way to start. I'll do that.

jinu117
11-17-2006, 12:56 AM
Well, sounds like to me classic problem of sorting.. wouldn't binary tree with buckets with mapping between bucket space do this for you?

eshbach
11-17-2006, 02:50 AM
Jinu is right about the sorthing. All you have to do is sort both sets and the pairs will line up correctly. I threw together a quick example, but don't yell at me for the quality of this code, as it's stream-of-consciousness...

http://aaron.doomray.com/pictures/pairs.jpg

Shaolin3a
11-21-2006, 08:38 AM
Hey guys, thanks for the input! I finally figured it out. Sorry I didn't state the problem and specs all that well. I got the case where the sizes of the sets are equal which as both of you have said would just need to be sorted then matched correspondingly. But the real problem comes when the sets are of unequal sizes and I ended up implementing a dynamic programming algorithm to solve it. Thanks again by the way!

emboss
11-21-2006, 09:30 AM
The sorting method is provably optimal (assuming I haven't botched anything up):

Let an optimal pairing be given by pairs (a_i, b_i), ordered by increasing a_i. What we want to prove is that if the b's are not ordered in increasing order (ie: there exists an i, j such that a_j > a_i but b_j < b_i) then we can exchange b_i and b_j (improving the ordering of the b's) without increasing the total difference. Doing repeated exchanges will eventually result in the b's being ordered without increasing the total difference.

Assume that there exists an i, j such that a_j > a_i and b_j < b_i. Then a_j = a_i + x and b_j = b_i - y for x > 0 and y > 0. We need to prove that
|a_i - b_j| + |a_j - b_i| <= |a_i - b_i| + |a_j - b_j|
ie: |a_i - b_i + y| + |a_i - b_i + x| <= |a_i - b_i| + |a_i - b_i + x + y|

Setting u = a_i - b_j gives
|u + y| + |u + x| <= |u| + |u + x + y|. Due to the symmetry we can assume without a loss of generality that y > x. Then it's simply a case of working through the 5 ranges for u:
1) u >= 0
2) -x <= u < 0
3) -y <= u < -x
4) -(x + y) <= u < -y
5) u < -(x + y)

Considering each range:
1) u + y + u + x <= u + u + x + y
0 <= 0 (trivially true)
2) u + y + u + x <= -(u) + u + x + y
u <= -u (true as u < 0)
3) u + y + -(u + x) <= -(u) + u + x + y
-x <= x (true as x > 0)
4) -(u + y) + -(u + x) <= -(u) + u + x + y
-2(x + y) <= 2u (true as -(x + y) <= u)
5) -(u + y) + -(u + x) <= -(u) - (u + x + y)
0 <= 0 (trivially true)

Hence it is true for all u. Therefore, the pairing where both the a's and b's are in increasing order is optimal.

edit: Gnah, took too long ... a similar approach may work for the non-equal-length arrays as well.