Article

NP-Hardness of the Problem of Optimal Box Positioning

by Alexei V. Galatenko, Stepan A. Nersisyan * and Dmitriy N. Zhuk
Faculty of Mechanics and Mathematics, Lomonosov Moscow State University, Leninskie Gory 1, 119991 Moscow, Russia
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(8), 711; https://doi.org/10.3390/math7080711
Submission received: 12 June 2019 / Revised: 2 August 2019 / Accepted: 4 August 2019 / Published: 6 August 2019
(This article belongs to the Section Mathematics and Computer Science)

Abstract

We consider the problem of finding a position of a d-dimensional box with given edge lengths that maximizes the number of enclosed points of a given finite set $P \subset \mathbb{R}^d$, i.e., the problem of optimal box positioning. We prove that while this problem is solvable in polynomial time for fixed values of $d$, it is NP-hard in the general case. The proof is based on a polynomial reduction of the 3-CNF satisfiability problem to the considered problem.

1. Introduction

We consider the problem of optimal box positioning, that is, finding a position of a d-dimensional box with given edge lengths that maximizes the number of enclosed points of a given n-element set $P \subset \mathbb{R}^d$. In this paper, we prove that this problem is NP-hard when the integers $n$ and $d$ are not fixed but are treated as parameters of the problem.
The problem of optimal box positioning has wide applications in computational geometry, data mining, and pattern recognition (e.g., see [1,2,3]). In [4], the authors presented a clustering approach based on the greedy algorithm that finds an approximate solution of the optimal box positioning problem. The algorithm was inspired by the apparatus of maximum interval pattern concepts (see, e.g., [4,5]), a technique that allows one to select patterns from fuzzy contexts. This approach was successfully applied to the dataset of tactile images registered by the Medical Tactile Endosurgical Complex [6,7,8], which allows intraoperative tactile examination of tissues. Comparison of the proposed clustering approach with the conventional k-means clustering resulted in a statistically significant advantage of the proposed method over k-means in clustering quality. Note that the result proved in the present paper justifies developing algorithms to solve an approximate version of the optimal box positioning over the exact one.
The rest of the paper is organized as follows. In Section 2, we describe some known results. In Section 3, we introduce formal definitions and formulate the problem of optimal integer box positioning and the auxiliary problem of the existence of an integer m-box. In Section 4, we prove the NP-hardness of the problem of optimal box positioning. In Section 5, we summarize the results.

2. Previous Results

Although, to the best of our knowledge, no proof of the NP-hardness of the optimal box positioning problem has been published so far, some results are known for related problems. For example, in [3], Eckstein et al. considered a generalization of the problem of optimal box positioning: given two finite sets $P^+, P^- \subset \mathbb{R}^d$, find a box $B$ with arbitrary edge lengths such that
  • $B$ does not intersect $P^-$, and
  • the cardinality of $B \cap P^+$ is maximal over all boxes that satisfy the first condition.
The authors proved the NP-hardness of the problem by applying a polynomial reduction of the classical NP-hard problem of finding a maximum independent set of vertices in a graph (e.g., see [9]) to the considered problem.
Barbay et al. considered a weighted generalization of the problem: given a finite set $P \subset \mathbb{R}^d$ and a function $w : P \to \mathbb{R} \cup \{-\infty\}$, find a box $B$ with arbitrary edge lengths that maximizes the sum $\sum_{p \in B \cap P} w(p)$ [10]. This problem is also NP-hard since it generalizes the previous problem, in which points from $P^+$ have weight $+1$ and points from $P^-$ have weight $-\infty$.
De Figueiredo and da Fonseca considered the weighted problem of the optimal unit ball positioning with a non-negative weight function [11]. They obtained a lower bound $\Omega(n^d)$ under an additional restriction: an algorithm decides which operation to apply to a point based only on the coordinates of the point, ignoring its weight, so it processes input points in an order that does not depend on the weight function. Under this restriction, an algorithm must calculate the weight for each ball that is optimal for some weight function. Note that this restriction is not met in the unweighted version of the problem since we use the fact that the weight of each point is equal to $+1$. Also note that a unit box is a unit ball in the $L_\infty$ metric.

3. Formal Definitions

Definition 1.
A d-dimensional box with edge lengths $\delta_1, \delta_2, \dots, \delta_d$ is a Cartesian product of the intervals $[a_1, b_1] \times \dots \times [a_d, b_d]$, where $b_i - a_i = \delta_i$ ($i \in \{1, \dots, d\}$).
Furthermore, we consider only boxes with integer edge lengths and vertex coordinates, i.e., $\delta_i, a_i, b_i \in \mathbb{Z}$ ($i \in \{1, \dots, d\}$). We call such boxes integer boxes.
Definition 2.
The problem of optimal integer box positioning is defined as follows: find an integer box with given edge lengths that maximizes the number of enclosed points of a set $P = \{p_i\}_{i=1}^{n} \subset \mathbb{Z}^d$.
In Section 4, we obtain NP-hardness of the problem of optimal integer box positioning as a corollary of the theorem about NP-completeness of the problem of the existence of an integer m-box.
Definition 3.
The problem of the existence of an integer m-box is the problem of deciding whether there exists an integer box with given edge lengths that contains at least $m$ points from a set $P = \{p_i\}_{i=1}^{n} \subset \mathbb{Z}^d$.
In the general case, the parameters of both problems are the integers $n, d, \delta_1, \delta_2, \dots, \delta_d$ and the set $P$. The number $m$ is considered as a function of $n, d$ or as a constant.
It is easy to see that both problems belong to the P complexity class if the parameter $d$ is fixed. Indeed, without loss of generality, we can consider only boxes for which each $a_i$ is equal to the $i$-th coordinate of some point $p_{k_i}$ from the set $P$. So to solve the problem, it suffices to count the number of points in at most $n^d$ boxes. Since each count can be performed in $O(nd)$ operations, the total number of operations for solving the problem is $O(d n^{d+1})$, which is polynomial in $n$.
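The enumeration argument above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, boxes are taken to be closed, and a box is represented by its lower corner together with the edge lengths.

```python
from itertools import product

def count_enclosed(points, corner, deltas):
    """Count points p with corner[i] <= p[i] <= corner[i] + deltas[i] for every i."""
    return sum(
        all(a <= x <= a + dl for x, a, dl in zip(p, corner, deltas))
        for p in points
    )

def best_box_bruteforce(points, deltas):
    """Try every lower corner whose i-th coordinate is the i-th coordinate of some point."""
    d = len(deltas)
    # Candidate values for a_i: the distinct i-th coordinates occurring in the point set.
    candidates = [sorted({p[i] for p in points}) for i in range(d)]
    best_corner, best_count = None, -1
    for corner in product(*candidates):              # at most n**d corners
        c = count_enclosed(points, corner, deltas)   # O(n d) work per corner
        if c > best_count:
            best_corner, best_count = corner, c
    return best_corner, best_count

points = [(0, 0), (1, 1), (2, 2), (5, 5)]
print(best_box_bruteforce(points, deltas=(2, 2)))    # ((0, 0), 3)
```

The loop over `product(*candidates)` is what makes the running time exponential in the dimension: the sketch is polynomial in the number of points only while the dimension stays fixed.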
Definition 4.
The 3-CNF satisfiability problem is the problem of the existence of an assignment $(s_1, \dots, s_d) \in \{0, 1\}^d$ to the Boolean variables $x_1, \dots, x_d$ that turns the formula $\bigwedge_{i=1}^{n} (l_{i,1} \vee l_{i,2} \vee l_{i,3})$ in conjunctive normal form to 1 (here, $l_{i,j}$ denotes literals over variables from the set $\{x_1, \dots, x_d\}$). For further details, see e.g., [9].
Without loss of generality, assume that the variables of every disjunctive clause are distinct. Indeed, otherwise a clause is either identically equal to 1 (if it contains both a variable and its negation) or can be replaced with at most four clauses with the required property such that the conjunction of these clauses is identically equal to the initial clause.
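As a worked instance of this replacement (the particular clauses are illustrative and not taken from the paper; the sketch assumes that at least three variables are available), a clause with a repeated variable can be padded with variables that do not occur in it:

```latex
\begin{align*}
x_1 \vee x_1 \vee x_2 &\equiv (x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee x_2 \vee \neg x_3), \\
x_1 \vee x_1 \vee x_1 &\equiv (x_1 \vee x_2 \vee x_3) \wedge (x_1 \vee x_2 \vee \neg x_3)
                          \wedge (x_1 \vee \neg x_2 \vee x_3) \wedge (x_1 \vee \neg x_2 \vee \neg x_3).
\end{align*}
```

The first identity uses two clauses and the second uses four, which matches the bound of at most four clauses stated above.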
Cook’s theorem [12] states that the 3-CNF satisfiability problem is NP-complete. This fact will give ground for our proof of NP-hardness of the problem of the existence of an integer m-box.

4. NP-Hardness of the Problem of Optimal Box Positioning

Theorem 1.
The problem of the existence of an integer m-box belongs to the NP complexity class.
Proof. 
Suppose we have a certificate: a box $B$ which encloses at least $m$ points from the set $P$. Then the certificate validation can be performed by counting the cardinality of $B \cap P$, which can be done by iterating over the set $P$ and checking whether the current point lies in the box $B$. Since $P$ contains $n$ elements and each check can be done with $O(d)$ comparisons, counting the cardinality of $B \cap P$ will take $O(dn)$ operations, which is polynomial in the parameters $n, d$. □
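The certificate check described in this proof might look as follows in Python. The function name and the representation of the box by its lower corner and edge lengths are assumptions made for this sketch.

```python
def verify_certificate(points, corner, deltas, m):
    """Return True if the closed box with lower corner `corner` and edge
    lengths `deltas` encloses at least m of the given points."""
    enclosed = 0
    for p in points:                                   # n iterations
        # O(d) comparisons: a_i <= p_i <= a_i + delta_i for every coordinate i.
        if all(a <= x <= a + dl for x, a, dl in zip(p, corner, deltas)):
            enclosed += 1
    return enclosed >= m

points = [(0, 0), (1, 1), (5, 5)]
print(verify_certificate(points, corner=(0, 0), deltas=(1, 1), m=2))  # True
```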
Theorem 2.
The problem of the existence of an integer m-box is NP-hard.
Proof. 
We will prove this theorem by employing a polynomial reduction of the 3-CNF satisfiability problem (which is NP-hard [12]) to the problem of the existence of an integer m-box. Consider an arbitrary formula $F$ in conjunctive normal form with $d$ variables $x_1, \dots, x_d$ and $n$ disjunctive clauses $D_1, \dots, D_n$, each containing exactly 3 literals: $F = \bigwedge_{i=1}^{n} D_i$, where $D_i = l_{i,1} \vee l_{i,2} \vee l_{i,3}$; $l_{i,j}$ denotes a literal over one of the variables $x_1, \dots, x_d$.
We construct the set $P = \{p_i\} \subset \mathbb{Z}^d$ by the following procedure. Consider the disjunctive clause $D_i$ with variables $x_{i,1}, x_{i,2}, x_{i,3}$ and the set of its satisfying assignments $S_i = \{S_{i,j}\}$ over the variable set $\{x_{i,1}, x_{i,2}, x_{i,3}\}$. Since each disjunctive clause contains exactly 3 literals corresponding to distinct variables, it holds that $|S_i| = 7$. We map the pair $(D_i, S_{i,j})$ to the point $z_{i,j} \in \mathbb{Z}^d$ with coordinates $(z_1, \dots, z_d)$ by the following rule:
$$
z_l = \begin{cases}
0, & \text{if } x_l \in \{x_{i,1}, x_{i,2}, x_{i,3}\} \text{ and the value of } x_l \text{ in } S_{i,j} \text{ is } 0; \\
1, & \text{if } x_l \notin \{x_{i,1}, x_{i,2}, x_{i,3}\}; \\
2, & \text{if } x_l \in \{x_{i,1}, x_{i,2}, x_{i,3}\} \text{ and the value of } x_l \text{ in } S_{i,j} \text{ is } 1.
\end{cases}
$$
We define the set $P$ as the image of this map over all clauses $D_1, \dots, D_n$ and their sets of satisfying assignments $S_1, \dots, S_n$, so $|P| \le 7n$. For further convenience, we also introduce the sets $Z_i = \{z_{i,j}\}$, $i \in \{1, \dots, n\}$, as subsets of $P$ that consist of all points associated with $D_i$.
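The construction of the point set can be sketched in Python as shown below. The encoding of clauses is an assumption of this sketch and not part of the paper: each clause is a triple of DIMACS-style integers, where k stands for the variable with index k and -k for its negation, with 1-based indices and distinct variables within a clause. The function names are hypothetical.

```python
from itertools import product

def clause_points(clause, d):
    """Return the 7 points associated with one clause over d variables."""
    variables = [abs(lit) for lit in clause]
    points = []
    for values in product((0, 1), repeat=3):
        # Keep only the assignments that satisfy the clause (7 of the 8).
        if any((lit > 0) == bool(v) for lit, v in zip(clause, values)):
            z = [1] * d                      # variables absent from the clause map to 1
            for var, v in zip(variables, values):
                z[var - 1] = 2 * v           # value 0 maps to 0, value 1 maps to 2
            points.append(tuple(z))
    return points

def build_point_sets(clauses, d):
    """Return one list of 7 points per clause, in clause order."""
    return [clause_points(c, d) for c in clauses]

# Example formula: (x1 or x2 or not x3) and (not x1 or x3 or x4), with d = 4.
Z = build_point_sets([(1, 2, -3), (-1, 3, 4)], d=4)
print(len(Z[0]), Z[0][0])                    # 7 (0, 0, 0, 1)
```

Each clause contributes 7 points of dimension d, so the construction takes time polynomial in the number of clauses and the number of variables.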
To complete the proof of the theorem, we prove the following lemmas.
Lemma 1. 
In the above notation, for an arbitrary unit cube $C \subset \mathbb{Z}^d$ and for all $i \in \{1, \dots, n\}$, the intersection $C \cap Z_i$ contains zero points or one point.
Proof. 
Consider an arbitrary $i \in \{1, \dots, n\}$ and the points $Z_i = \{z_{i,j}\}_{j=1}^{7}$ associated with $D_i$. Since for any $j, k \in \{1, \dots, 7\}$, $j \ne k$, the satisfying assignments $S_{i,j}$ and $S_{i,k}$ are different, there exists $l \in \{1, \dots, d\}$ such that the values of the variable $x_l \in \{x_{i,1}, x_{i,2}, x_{i,3}\}$ in $S_{i,j}$ and $S_{i,k}$ are opposite. Hence, the $l$-th coordinates of $z_{i,j}$ and $z_{i,k}$ differ by 2 (one of these coordinates equals 0, and the other equals 2). Thus, the points $z_{i,j}$ and $z_{i,k}$ cannot belong to the same unit cube. □
Lemma 2. 
In the above notation, a formula $F$ is satisfiable if and only if there exists a unit cube $C \subset \mathbb{Z}^d$ such that $|C \cap P| = n$.
Proof. 
Let us first prove that if $F$ is satisfiable, then a cube $C$ with $|C \cap P| = n$ exists. Let $S = (s_1, \dots, s_d)$ be a satisfying assignment for $F$. We construct a subset $\tilde{P} \subset P$ consisting of the points that correspond to the satisfying assignments $S_{i,j}$ matching the satisfying assignment $S$. Since for each $i \in \{1, \dots, n\}$ there exists exactly one satisfying assignment $S_{i,j} \in S_i$ that matches $S$, we have $|\tilde{P}| = n$. Let $z = (z_1, \dots, z_d)$ be an arbitrary point in $\tilde{P}$ and $l \in \{1, \dots, d\}$. If $x_l$ does not occur in the respective clause, the value of $z_l$ will be equal to 1. Otherwise, the value of $z_l$ will be equal to $2 \cdot s_l$. This means that if $s_l = 0$, the value of $z_l$ will lie in the interval $[0, 1]$, and otherwise in the interval $[1, 2]$. Thus, the cube $C = [a_1, a_1 + 1] \times \dots \times [a_d, a_d + 1]$, where
$$
a_l = \begin{cases}
0, & \text{if } s_l = 0, \\
1, & \text{otherwise},
\end{cases}
$$
covers the n-element set $\tilde{P} \subset P$. Note that $\tilde{P}$ contains exactly one point corresponding to each clause, so according to Lemma 1, the cube $C$ has no common points with $P \setminus \tilde{P}$. Thus $|C \cap P| = n$.
Now we prove that if a unit cube with $|C \cap P| = n$ exists, then $F$ is satisfiable. Let $C$ be the specified unit cube. By Lemma 1, we conclude that $C \cap P$ contains exactly one point corresponding to each clause. Since each edge length of $C$ is equal to 1 and the cube vertex coordinates are integers, the list of $l$-th coordinates of the points from $C \cap P$ (for fixed $l \in \{1, \dots, d\}$) contains exactly one value from the set $\{0, 2\}$, and we denote this value by $2 \cdot s_l$. From the procedure of construction of the set $P$, we conclude that $S = (s_1, \dots, s_d)$ is a satisfying assignment for $F$. □
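The equivalence in Lemma 2 can be checked by brute force on small formulas. The Python sketch below is only an illustration under stated assumptions: clauses use the DIMACS-style integer encoding (k for a variable, -k for its negation, 1-based indices, distinct variables per clause), all function names are hypothetical, and the points are kept in a list with one entry per pair of a clause and a satisfying assignment, so points arising from different clauses are counted separately. Since all coordinates lie in {0, 1, 2}, the search is restricted to cubes whose lower corner has coordinates in {0, 1}.

```python
from itertools import product

def clause_points(clause, d):
    """The 7 points for one clause: 0 or 2 on its variables, 1 elsewhere."""
    points = []
    for values in product((0, 1), repeat=3):
        if any((lit > 0) == bool(v) for lit, v in zip(clause, values)):
            z = [1] * d
            for lit, v in zip(clause, values):
                z[abs(lit) - 1] = 2 * v
            points.append(tuple(z))
    return points

def satisfiable(clauses, d):
    """Brute-force satisfiability test over all 2**d assignments."""
    return any(
        all(any((lit > 0) == bool(s[abs(lit) - 1]) for lit in c) for c in clauses)
        for s in product((0, 1), repeat=d)
    )

def cube_with_n_points_exists(clauses, d):
    """Search unit cubes [a_1, a_1 + 1] x ... x [a_d, a_d + 1] with a_l in {0, 1}."""
    points = [z for c in clauses for z in clause_points(c, d)]
    n = len(clauses)
    for a in product((0, 1), repeat=d):
        inside = sum(all(al <= zl <= al + 1 for zl, al in zip(z, a)) for z in points)
        if inside == n:
            return True
    return False

sat_formula = [(1, 2, -3), (-1, 3, 4)]
# All 8 sign patterns over x1, x2, x3: an unsatisfiable 3-CNF formula.
unsat_formula = [(a * 1, b * 2, c * 3) for a, b, c in product((1, -1), repeat=3)]
print(satisfiable(sat_formula, 4), cube_with_n_points_exists(sat_formula, 4))      # True True
print(satisfiable(unsat_formula, 3), cube_with_n_points_exists(unsat_formula, 3))  # False False
```

Both searches are exponential in the number of variables, so the sketch is suitable only as a sanity check on toy inputs and says nothing about efficient algorithms.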
Lemmas 1 and 2 directly imply the following assertion.
Lemma 3. 
In the above notation, a formula $F$ is satisfiable if and only if there exists a unit m-cube for $m = n$ and the set $P$.
To complete the proof of Theorem 2, we consider the problem of the existence of an integer m-box (with $m$ equal to $n$) in d-dimensional space for a box with all edge lengths equal to 1 (i.e., for the unit cube) and the constructed set $P$. Lemma 3 states that $F$ is satisfiable if and only if there exists a unit cube that encloses $n$ points. This statement, in combination with the fact that the set $P$ can be constructed in time polynomial in $n$ and $d$, completes the proof of the theorem. □
Since the class of NP-complete problems is the intersection of the class NP and the class NP-hard, Theorems 1 and 2 immediately lead to the following theorem.
Theorem 3.
The problem of the existence of an integer m-box is NP-complete.
Now we are ready to prove the main theorem.
Theorem 4.
The problem of optimal integer box positioning is NP-hard.
Proof. 
This theorem is a trivial corollary of Theorem 3. Consider a set $P = \{p_i\}_{i=1}^{n} \subset \mathbb{Z}^d$. Then, finding the optimal position of an integer box $B$ with edge lengths $\delta_1, \delta_2, \dots, \delta_d$ immediately leads to an answer to the problem of the existence of an integer m-box (by simply counting the number of points in the found box in $O(nd)$ operations and comparing it with $m$), which is proved to be NP-complete. Thus, we made a polynomial reduction of the problem of the existence of an integer m-box to the problem of optimal integer box positioning. □
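The reduction in this proof amounts to one call to an optimization solver followed by a counting step. A minimal Python sketch, in which `optimal_corner` is a hypothetical oracle (not something provided by the paper) that returns the lower corner of an optimally positioned box:

```python
def exists_m_box(points, deltas, m, optimal_corner):
    """Decide the m-box existence problem using an optimization oracle.

    `optimal_corner(points, deltas)` is assumed to return the lower corner
    of a box with edge lengths `deltas` that encloses the maximum number of points.
    """
    corner = optimal_corner(points, deltas)
    # Counting step: O(n d) comparisons.
    enclosed = sum(
        all(a <= x <= a + dl for x, a, dl in zip(p, corner, deltas))
        for p in points
    )
    return enclosed >= m

# Toy oracle for a fixed instance where the corner (0, 0) is known to be optimal.
points = [(0, 0), (1, 1), (5, 5)]
toy_oracle = lambda pts, dls: (0, 0)
print(exists_m_box(points, (1, 1), 2, toy_oracle))   # True
```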
Note that the above proofs actually lead to stronger results, namely to NP-completeness of the problem of the existence of an integer unit m-cube and the NP-hardness of the problem of optimal integer unit cube positioning.
Corollary 1.
The problem of optimal integer box positioning with a set of prohibited points $P^-$ (i.e., the box should have an empty intersection with it) is NP-hard.
Proof. 
This statement immediately follows from the NP-hardness of the problem of optimal integer box positioning since the latter is a particular case of the considered problem with $P^- = \varnothing$. □
Corollary 2.
The weighted problem of optimal integer box positioning with the range of the weight function in $\mathbb{R} \cup \{-\infty\}$ is NP-hard.
Proof. 
This is also a corollary of the NP-hardness of the problem of optimal integer box positioning since we obtain an unweighted version of the problem by setting the weight function to + 1 for all points. □

5. Conclusions

The problem of optimal box positioning finds its applications in computer science, pattern recognition, and data analysis [1,2,3,4]. In this paper, we have proved that this problem is NP-hard.
On the one hand, this result means that algorithms based on exact optimal box positioning are in general inefficient for the analysis of high-dimensional data, so it makes sense to develop algorithms that look for an approximately optimal box position. An example of such an algorithm used for data clustering can be found in [4].
On the other hand, NP-hardness does not necessarily imply average-case hardness. For example, the canonical NP-complete problem of CNF satisfiability (the one used in the proof of Cook’s theorem about the existence of NP-complete problems, [12]) can be solved using an algorithm with polynomial average time [13]. Thus, the problem of estimation of average complexity for finding an optimal box position remains an interesting open challenge.

Author Contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

Funding

The research was supported by the Russian Science Foundation (project 16-11-00058 “The development of methods and algorithms for automated analysis of medical tactile information and classification of tactile images”).

Acknowledgments

The authors thank Vladimir V. Galatenko for valuable comments and discussions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Agarwal, P.K.; Hagerup, T.; Ray, R.; Sharir, M.; Smid, M.H.M.; Welzl, E. Translating a planar object to maximize point containment. In Algorithms—ESA 2002; Möhring, R., Raman, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2002; pp. 42–53.
  2. Lamdan, Y.; Schwartz, J.T.; Wolfson, H.J. Object recognition by affine invariant matching. In Proceedings of the CVPR ’88: The Computer Society Conference on Computer Vision and Pattern Recognition, Ann Arbor, MI, USA, 5–9 June 1988; IEEE: Ann Arbor, MI, USA, 1988; pp. 335–344.
  3. Eckstein, J.; Hammer, P.L.; Liu, Y.; Nediak, M.; Simeone, B. The maximum box problem and its application to data analysis. Comput. Optim. Appl. 2002, 23, 285–298.
  4. Nersisyan, S.A.; Pankratieva, V.V.; Staroverov, V.M.; Podolskii, V.E. A greedy clustering algorithm based on interval pattern concepts and the problem of optimal box positioning. J. Appl. Math. 2017.
  5. Ganter, B.; Kuznetsov, S.O. Pattern Structures and Their Projections. In Conceptual Structures: Broadening the Base. ICCS 2001; Delugach, H.S., Stumme, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2001; pp. 129–142.
  6. Barmin, V.; Sadovnichy, V.; Sokolov, M.; Pikin, O.; Amiraliev, A. An original device for intraoperative detection of small indeterminate nodules. Eur. J. Cardiothorac. Surg. 2014, 46, 1027–1031.
  7. Solodova, R.F.; Galatenko, V.V.; Nakashidze, E.R.; Andreytsev, I.L.; Galatenko, A.V.; Senchik, D.K.; Staroverov, V.M.; Podolskii, V.E.; Sokolov, M.E.; Sadovnichy, V.A. Instrumental tactile diagnostics in robot-assisted surgery. Med. Dev. 2016, 9, 377–382.
  8. Solodova, R.F.; Galatenko, V.V.; Nakashidze, E.R.; Shapovalyants, S.G.; Andreytsev, I.L.; Sokolov, M.E.; Podolskii, V.E. Instrumental mechanoreceptoric palpation in gastrointestinal surgery. Minim. Invasive Surg. 2017.
  9. Garey, M.K.; Johnson, D.S. Computers and Intractability, A Guide to the Theory of NP-Completeness; W.H. Freeman & Co.: New York, NY, USA, 1997.
  10. Barbay, J.; Chan, T.M.; Navarro, G.; Pérez-Lantero, P. Maximum-weight planar boxes in $O(n^2)$ time (and better). Inf. Process. Lett. 2014, 114, 437–445.
  11. De Figueiredo, C.M.; da Fonseca, G.D. Enclosing weighted points with an almost-unit ball. Inf. Process. Lett. 2009, 109, 1216–1221.
  12. Cook, S. The complexity of theorem-proving procedures. In STOC ’71 Proceedings of the Third Annual ACM Symposium on Theory of Computing; ACM: New York, NY, USA, 1971; pp. 151–158.
  13. Iwama, K. CNF satisfiability test by counting and polynomial average time. SIAM J. Comput. 1989, 18, 385–391.
