Bee Identification Problem for DNA Strands

Johan Chrisnata, Han Mao Kiah, Alexander Vardy, Eitan Yaakobi

Research output: Contribution to journalArticlepeer-review

Abstract

Motivated by DNA-based applications, we generalize the bee identification problem proposed by Tandon et al. (2019). In this setup, we transmit all M codewords from a codebook over some channel and each codeword results in N noisy outputs. Then our task is to identify each codeword from this unordered set of MN noisy outputs. First, via a reduction to a minimum-cost flow problem on a related bipartite flow network called the input-output flow network, we show that the problem can be solved in O(M3) time in the worst case. Next, we consider the deletion and the insertion channels individually, and in both cases, we study the expected number of edges in their respective input-output networks. Specifically, we obtain closed expressions for this quantity for certain codebooks and when the codebook comprises all binary words, we show that this quantity is sub-quadratic when the deletion or insertion probability is less than 1/2. This then implies that the expected running time to perform joint decoding for this codebook is o(M3). For other codebooks, we develop methods to compute the expected number of edges efficiently. Finally, we adapt classical peeling-decoding techniques to reduce the number of nodes and edges in the input-output flow network.

Original languageEnglish
Pages (from-to)190-204
Number of pages15
JournalIEEE Journal on Selected Areas in Information Theory
Volume4
DOIs
StatePublished - 2023

Keywords

  • Bee identification problem
  • deletion channels
  • DNA-based data storage
  • multidraw channels

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Media Technology
  • Artificial Intelligence
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Bee Identification Problem for DNA Strands'. Together they form a unique fingerprint.

Cite this