X + Y sorting

In computer science, X + Y sorting is the problem of sorting pairs of numbers by their sum. Given two finite sets X and Y, both of the same length, the problem is to order all pairs (x, y) in the Cartesian product X × Y in numerical order by the value of x + y. The problem is attributed to Elwyn Berlekamp.[1][2]

Classical comparison sorting

Unsolved problem in computer science:
Is there an X + Y sorting algorithm faster than O(n^2 log n)?

This problem can be solved using a straightforward comparison sort on the Cartesian product of the two given sets. When the sets have size n, their product has size n^2, and the time for a comparison sorting algorithm is O(n^2 log(n^2)) = O(n^2 log n). This is the best upper bound known on the time for this problem. Whether X + Y sorting can be done in a more slowly growing time bound is an open problem.[3][2]
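
A minimal sketch of this naive approach in Python, assuming the inputs are given as lists (the function name and interface are illustrative, not taken from the sources):

    def sort_x_plus_y_naive(xs, ys):
        """Sort all pairs (x, y) from xs x ys by the value of x + y.

        With len(xs) == len(ys) == n there are n^2 pairs, and the
        comparison sort below takes O(n^2 log n^2) = O(n^2 log n) time.
        """
        pairs = [(x, y) for x in xs for y in ys]   # all n^2 pairs
        pairs.sort(key=lambda p: p[0] + p[1])      # ordinary comparison sort
        return pairs

    # Example: sort_x_plus_y_naive([1, 5, 3], [2, 0, 4])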

Harper et al. (1975) suggest separately sorting X and Y, and then constructing a two-dimensional matrix of the values of x + y that is sorted both by rows and by columns before using this partially-sorted data to complete the sort of X + Y. Although this can reduce the number of comparisons needed by a constant factor, compared to naive comparison sorting, they show that any comparison sorting algorithm that can work for arbitrary matrices that are sorted by rows and columns still requires Ω(n^2 log n) comparisons, so additional information about the set X + Y beyond this matrix ordering would be needed for any faster sorting algorithm.[1]
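
As an illustration of the matrix construction only (not of the full method of Harper et al.), sorting the two inputs first yields a matrix of sums that is nondecreasing along every row and every column; a sketch with illustrative names:

    def row_column_sorted_matrix(xs, ys):
        """Return M with M[i][j] = sorted(xs)[i] + sorted(ys)[j].

        Because both inputs are sorted first, every row and every column
        of M is nondecreasing -- the partially-sorted form of X + Y that
        the rest of the sort can then exploit.
        """
        xs_sorted = sorted(xs)
        ys_sorted = sorted(ys)
        return [[x + y for y in ys_sorted] for x in xs_sorted]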

Number of orderings

The numbers in the two input lists for the sorting problem can be interpreted as Cartesian coordinates of a point in the -dimensional space . If one partitions the space into cells, so the set has a fixed ordering within each cell, then the boundaries between these cells are hyperplanes defined by an equality of pairs . The number of hyperplanes that can be determined in this way is , and the number of cells that this number of hyperplanes can divide a space of dimension into is less than . Therefore, the set has at most different possible orderings.[1][4]
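
The cell count follows from the standard bound on the number of cells in an arrangement of hyperplanes; in LaTeX notation (a sketch of the counting step, not a verbatim derivation from the cited papers):

    % m hyperplanes divide R^d into at most \sum_{i=0}^{d} \binom{m}{i} cells;
    % here d = 2n and m = \Theta(n^4), so
    \#\{\text{cells}\} \;\le\; \sum_{i=0}^{2n} \binom{\Theta(n^4)}{i}
      \;\le\; (2n+1)\,\Theta(n^4)^{2n} \;=\; n^{O(n)} .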

Number of comparisons

The number of comparisons required to sort X + Y is certainly lower than for ordinary comparison sorting: Michael Fredman showed in 1976 that X + Y sorting can be done using only O(n^2) comparisons. More generally, he shows that any set of N elements, whose sorted ordering has already been restricted to a family Γ of orderings, can be sorted using log_2|Γ| + O(N) comparisons, by a form of binary insertion sort. For the X + Y sorting problem, N = n^2 and |Γ| = n^O(n), so log_2|Γ| = O(n log n) and Fredman's bound implies that only O(n^2) comparisons are needed. However, the time needed to decide which comparisons to perform may be significantly higher than the bound on the number of comparisons.[4] If only comparisons between elements of X + Y are allowed, then there is also a matching lower bound of Ω(n^2) on the number of comparisons needed.[4][5]
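
For contrast, sorting n^2 values with no special structure requires about the following number of comparisons (a standard information-theoretic estimate, written in LaTeX notation; it is not a figure taken from the cited papers):

    \log_2\!\big((n^2)!\big) \;=\; \Theta\!\left(n^2 \log n\right),

so Fredman's bound saves a logarithmic factor in the number of comparisons, though not, as far as is known, in the overall running time.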

The first actual algorithm that achieves both O(n^2) comparisons and O(n^2 log n) total complexity was published sixteen years later. The algorithm first recursively sorts the two sets X + X and Y + Y, and uses the equivalence between x_i + x_j ≤ x_k + x_l and x_i - x_l ≤ x_k - x_j to infer the sorted orderings of X - X and Y - Y without additional comparisons. Then, it merges the two sets X - X and Y - Y and uses the merged order and the equivalence between x_i + y_j ≤ x_k + y_l and x_i - x_k ≤ y_l - y_j to infer the sorted order of X + Y without additional comparisons. The part of the algorithm that recursively sorts X + X (or equivalently Y + Y) does so by splitting X into two equal sublists A and B, recursively sorting A + A and B + B, inferring the ordering on A + B as above, and merging the sorted results A + A, A + B, and B + B together.[6]
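
The key point is that once the differences are available in one merged sorted order, any comparison between two sums can be answered by comparing precomputed ranks, with no further comparisons of the input numbers. A sketch of that inference step only (all names are illustrative; in the algorithm the ranks come out of the recursion rather than from the direct sorting used here to make the example runnable):

    def merged_difference_ranks(xs, ys):
        """Illustration only: ranks of X - X and Y - Y in one merged order.

        Ties are broken so that an X - X difference precedes an equal
        Y - Y difference, which keeps the rank comparison below faithful
        to the underlying inequality even when the values are equal.
        """
        n, m = len(xs), len(ys)
        diffs = [(xs[i] - xs[k], 0, (i, k)) for i in range(n) for k in range(n)]
        diffs += [(ys[l] - ys[j], 1, (l, j)) for l in range(m) for j in range(m)]
        diffs.sort()
        x_rank, y_rank = {}, {}
        for rank, (_, which, key) in enumerate(diffs):
            (x_rank if which == 0 else y_rank)[key] = rank
        return x_rank, y_rank

    def sum_leq(x_rank, y_rank, i, j, k, l):
        """Decide whether x_i + y_j <= x_k + y_l using only the ranks.

        Since x_i + y_j <= x_k + y_l exactly when x_i - x_k <= y_l - y_j,
        the comparison reduces to comparing two positions in the merged order.
        """
        return x_rank[(i, k)] <= y_rank[(l, j)]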

Non-comparison-based algorithms

On a RAM machine with word size w and integer inputs 0 ≤ x, y < n = 2^w, the problem can be solved in O(n log n) operations by means of the fast Fourier transform.[1]
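
A sketch of one way to read this bound, assuming it refers to computing the multiset of sums by convolving the two count vectors (the function and its interface are illustrative, not from the cited paper; floating-point FFT is used for brevity, and an exact convolution would avoid rounding concerns):

    import numpy as np

    def sum_counts_via_fft(xs, ys, n):
        """counts[s] = number of pairs (x, y) in X x Y with x + y = s.

        Read in increasing index order, the counts array is a compressed
        sorted representation of X + Y.  The FFT-based convolution uses
        O(n log n) arithmetic operations; expanding the representation
        into all n^2 individual sums costs a further Theta(n^2) steps.
        """
        cx = np.bincount(xs, minlength=n).astype(float)
        cy = np.bincount(ys, minlength=n).astype(float)
        size = 2 * n                      # every sum lies in [0, 2n - 2]
        conv = np.fft.irfft(np.fft.rfft(cx, size) * np.fft.rfft(cy, size), size)
        return np.rint(conv[:2 * n - 1]).astype(int)

    # Example: sum_counts_via_fft([0, 2, 3], [1, 1, 4], n=8)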

Applications

Steven Skiena recounts a practical application in transit fare minimisation, an instance of the shortest path problem: given fares x and y for trips from departure A to some intermediate destination B and from B to final destination C, with only certain pairs of fares allowed to be combined, find the cheapest combined trip from A to C. Skiena's solution consists of sorting the pairs of fares as an instance of the X + Y sorting problem, and then testing the resulting pairs in this sorted order until finding one that is allowed. For this problem, one can use a priority queue of pairs, initialized to contain a single pair, the combination of the two cheapest fares. Then, when a pair (x, y) is found to be disallowed, two more pairs, formed by replacing either x or y with its successor in its sorted list of single-hop fares, can be added (if not already present) to the priority queue. In this way, each successive pair can be found in logarithmic time, and only the pairs up to the first allowable one need to be sorted.[3]
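
A sketch of this priority-queue search in Python, with illustrative names (allowed stands in for whatever feasibility test the application supplies):

    import heapq

    def cheapest_allowed_fare(xs, ys, allowed):
        """Return the cheapest allowed combination (x, y, x + y), or None.

        Pairs are generated lazily in nondecreasing order of x + y, so only
        the pairs up to the first allowed one are ever examined; each pop
        and push costs time logarithmic in the size of the queue.
        """
        xs, ys = sorted(xs), sorted(ys)
        heap = [(xs[0] + ys[0], 0, 0)]        # the single cheapest combination
        seen = {(0, 0)}
        while heap:
            total, i, j = heapq.heappop(heap)
            if allowed(xs[i], ys[j]):
                return xs[i], ys[j], total
            # Disallowed: enqueue the two successor pairs, if new and in range.
            for ni, nj in ((i + 1, j), (i, j + 1)):
                if ni < len(xs) and nj < len(ys) and (ni, nj) not in seen:
                    seen.add((ni, nj))
                    heapq.heappush(heap, (xs[ni] + ys[nj], ni, nj))
        return None

    # Example:
    # cheapest_allowed_fare([50, 90, 120], [40, 80], lambda x, y: (x, y) != (50, 40))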

Several other problems in computational geometry have equivalent or harder complexity to X + Y sorting, including constructing Minkowski sums of staircase polygons, finding the crossing points of an arrangement of lines in sorted order by their x-coordinates, listing pairs of points in sorted order by their distances, and testing whether one rectilinear polygon can be translated to fit within another.[7]

References

  1. Harper, L. H.; Payne, T. H.; Savage, J. E.; Straus, E. (1975). "Sorting X + Y". Communications of the ACM. 18 (6): 347–349. doi:10.1145/360825.360869.
  2. Demaine, Erik; Erickson, Jeff; O'Rourke, Joseph (20 August 2006). "Problem 41: Sorting X + Y (Pairwise Sums)". The Open Problems Project. Retrieved 23 September 2014.
  3. Skiena, Steven (2008). "4.4 War Story: Give me a Ticket on an Airplane". The Algorithm Design Manual (2nd ed.). Springer. pp. 118–120. doi:10.1007/978-1-84800-070-4_4.
  4. Fredman, Michael L. (1976). "How good is the information theory bound in sorting?". Theoretical Computer Science. 1 (4): 355–361. doi:10.1016/0304-3975(76)90078-5.
  5. Dietzfelbinger, Martin (1989). "Lower bounds for sorting of sums". Theoretical Computer Science. 66 (2): 137–155. doi:10.1016/0304-3975(89)90132-1. MR 1019082.
  6. Lambert, Jean-Luc (1992). "Sorting the sums (x_i + y_j) in O(n^2) comparisons". Theoretical Computer Science. 103 (1): 137–141. doi:10.1016/0304-3975(92)90089-X.
  7. Hernández Barrera, Antonio (1996). "Finding an o(n^2 log n) algorithm is sometimes hard" (PDF). Proceedings of the 8th Canadian Conference on Computational Geometry (CCCG'96). pp. 289–294.