## Abstract

One of the most significant algorithmic challenges in the “big data era” is handling instances that are too large to be processed by a single machine. The common practice in this regard is to partition the massive problem instance into smaller ones and process each one of them separately. In some cases, the solutions for the smaller instances are later on assembled into a solution for the whole instance, but in many cases this last stage cannot be pursued (e.g., because it is too costly, because of locality issues, or due to privacy considerations). Motivated by this phenomenon, we consider the following natural combinatorial question: Given a bin-packing instance I (namely, a set of items with sizes in (0, 1] that should be packed into unit capacity bins) and a partition {I_{i}}_{i} of I into clusters, how large is the ratio Σ_{i} Opt(I_{i})/Opt(I), where Opt(J) denotes the optimal number of bins into which the items in J can be packed? In this paper, we investigate the supremum of this ratio over all instances I and partitions {I_{i}}_{i}, referred to as the bin-packing price of clustering (PoC). It is trivial to observe that if each cluster contains only one tiny item (and hence, Opt(I_{i}) = 1), then the PoC is unbounded. On the other hand, a relatively straightforward argument shows that under the constraint that Opt(I_{i}) ≥ 2, the PoC is 2. Our main challenge was to determine whether the PoC drops below 2 when Opt(I_{i}) > 2. In addition, one may hope that lim_{k→∞} PoC(k) = 1, where PoC(k) denotes the PoC under the restriction to clusters I_{i} with Opt(I_{i}) ≥ k. We resolve the former question affirmatively and the latter one negatively: Our main results are that PoC(k) ≤ 1.951 for any k ≥ 3 and lim_{k→∞} PoC(k) = 1.691... Moreover, the former bound cannot be significantly improved as PoC(3) > 1.933.
In addition to its immediate contribution to “big data” applications, this combinatorial result also turns out to be useful for an interesting online problem called bin-packing with delays.
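
To make the ratio Σ_{i} Opt(I_{i})/Opt(I) concrete, here is a minimal Python sketch that evaluates it on two toy instances. The function names `opt_bins` and `clustering_ratio` are hypothetical, the exact solver is a plain brute-force search that is only practical for a handful of items, and the two instances are illustrative choices rather than constructions taken from the paper.

```python
from fractions import Fraction


def opt_bins(items):
    """Exact minimum number of unit-capacity bins for a small list of sizes."""
    sizes = sorted(items, reverse=True)
    best = len(sizes)  # one bin per item is always feasible

    def place(idx, loads):
        nonlocal best
        if len(loads) >= best:
            return  # this branch cannot beat the best packing found so far
        if idx == len(sizes):
            best = len(loads)
            return
        tried = set()  # skip bins with identical load (symmetric branches)
        for b in range(len(loads)):
            if loads[b] + sizes[idx] <= 1 and loads[b] not in tried:
                tried.add(loads[b])
                loads[b] += sizes[idx]
                place(idx + 1, loads)
                loads[b] -= sizes[idx]
        loads.append(sizes[idx])  # or open a new bin for this item
        place(idx + 1, loads)
        loads.pop()

    place(0, [])
    return best


def clustering_ratio(clusters):
    """Sum of per-cluster optima divided by the optimum of the merged instance."""
    clustered = sum(opt_bins(c) for c in clusters)
    merged = opt_bins([x for c in clusters for x in c])
    return Fraction(clustered, merged)


# Five singleton clusters, each holding one item of size 1/5:
# every cluster needs its own bin, yet all five items fit into one bin.
singletons = [[Fraction(1, 5)] for _ in range(5)]
print(clustering_ratio(singletons))  # 5

# Four clusters, each with one item of size 1 and one of size 1/4,
# so Opt(I_i) = 2 per cluster, while the merged instance needs 5 bins.
pairs = [[Fraction(1), Fraction(1, 4)] for _ in range(4)]
print(clustering_ratio(pairs))  # 8/5
```

In the first instance the ratio equals the number of clusters, which is why it is unbounded when Opt(I_{i}) = 1 is allowed. In the second, m clusters of this shape give 2m/(m + 1) as long as the small items together fit into one bin, which stays below 2, consistent with the bound stated for Opt(I_{i}) ≥ 2. Sizes are `Fraction` values so that capacity checks are exact.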

Original language | English |
---|---|
Title of host publication | SPAA 2019 - Proceedings of the 31st ACM Symposium on Parallelism in Algorithms and Architectures |
Pages | 1-10 |
Number of pages | 10 |
ISBN (Electronic) | 9781450361842 |
State | Published - 17 Jun 2019 |
Event | 31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019 - Phoenix, United States. Duration: 22 Jun 2019 → 24 Jun 2019 |

### Publication series

Name | Annual ACM Symposium on Parallelism in Algorithms and Architectures |
---|---|

### Conference

Conference | 31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019 |
---|---|
Country/Territory | United States |
City | Phoenix |
Period | 22/06/19 → 24/06/19 |

## Keywords

- Bin packing
- Online algorithms
- Price of clustering

## ASJC Scopus subject areas

- Software
- Theoretical Computer Science
- Hardware and Architecture