Next Generation Indexing for Genomic Intervals

Yashar Deldjoo
2019-01-01

Abstract

One-dimensional intervals incremental inverted index (Di4) is a multi-resolution, single-dimension indexing framework for efficient, scalable, and extensible computation of genomic interval expressions. The framework has a tri-layer architecture: the semantic layer provides orthogonal and generic means (including support for user-defined functions) of sense-making and higher-level reasoning from region-based datasets; the logical layer provides building blocks for region calculus and topological relations between intervals; the physical layer abstracts from the persistence technology and makes the model adaptable to a variety of persistence technologies, spanning from small-scale (e.g., B+tree) to large-scale (e.g., LevelDB). The extensibility of Di4 to application scenarios is shown with an example of comparative evaluation of ChIP-seq and DNase-seq replicates. The performance of Di4 is benchmarked at small and large scale under common bioinformatics application scenarios. Di4 is freely available from https://genometric.github.io/Di4.
2019
Next Generation Indexing for Genomic Intervals / Jalili, Vahid; Matteucci, Matteo; Goecks, Jeremy; Deldjoo, Yashar; Ceri, Stefano. - In: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING. - ISSN 1041-4347. - PRINT. - 31:10(2019), pp. 2008-2021, art. no. 8468044. [10.1109/TKDE.2018.2871031]
Files in this item:
10.1109TKDE.2018.2871031_PostPrint.pdf (open access)

Description: Accepted version
Type: Post-print document
License: All rights reserved
Size: 5.71 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/196494