MQF and buffered MQF: quotient filters for efficient storage of k-mers with their counts and metadata

  • Additional information
    • Publication information:
      eScholarship, University of California
    • Subject:
      2021
    • Collection:
      University of California: eScholarship
    • Abstract:
      Background: Specialized data structures are required for online algorithms to efficiently handle large sequencing datasets. The counting quotient filter (CQF), a compact hashtable, can efficiently store k-mers with a skewed distribution. Result: Here, we present the mixed-counters quotient filter (MQF) as a new variant of the CQF with novel counting and labeling systems. The new counting system adapts to a wider range of data distributions for increased space efficiency and is faster than the CQF for insertions and queries in most of the tested scenarios. A buffered version of the MQF can offload storage to disk, trading speed of insertions and queries for a significant memory reduction. The labeling system provides a flexible framework for assigning labels to member items while maintaining good data locality and a concise memory representation. These labels serve as a minimal perfect hash function but are ~tenfold faster than BBhash, with no need to re-analyze the original data for further insertions or deletions. Conclusions: The MQF is a flexible and efficient data structure that extends our ability to work with high-throughput sequencing data.
    • File Description:
      application/pdf
    • Relation:
      qt8z424070; https://escholarship.org/uc/item/8z424070
    • Rights:
      public
    • Accession number:
      edsbas.953015ED