Tabix
Tabix is a free software utility for indexing TAB-delimited genome position files.[1] It is commonly used in bioinformatics analysis to index large genomic data files such as GFF, VCF,[2] or BED files for efficient data retrieval.[3] Tabix was developed by Heng Li and is distributed under the MIT license.[4][5]
Use
Tabix requires the input file to be position-sorted and compressed using BGZF. After indexing, Tabix can retrieve the data lines that overlap query regions specified in the format "chr:start-end". The index files have a .tbi or .csi extension.[5]
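The query semantics described above can be illustrated with a short sketch in Python. It is not Tabix's implementation: it scans every line, whereas Tabix uses the index to avoid a full scan. The function names `parse_region` and `query`, the sample records, and the column layout (GFF-style, with 1-based inclusive start and end in the fourth and fifth columns) are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split a 'chr:start-end' string into (sequence name, start, end)."""
    chrom, _, span = region.rpartition(":")
    start_text, _, end_text = span.partition("-")
    start, end = int(start_text), int(end_text)
    if not chrom or start > end:
        raise ValueError(f"malformed region: {region!r}")
    return chrom, start, end


def query(lines: Iterable[str], region: str,
          seq_col: int = 0, start_col: int = 3, end_col: int = 4) -> Iterator[str]:
    """Yield the TAB-delimited lines whose interval overlaps the region.

    Assumes 1-based, inclusive coordinates in both the data and the query.
    This is a linear scan for illustration, not an indexed lookup.
    """
    chrom, q_start, q_end = parse_region(region)
    for line in lines:
        if line.startswith("#"):  # skip comment and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[seq_col] != chrom:
            continue
        start, end = int(fields[start_col]), int(fields[end_col])
        # Two closed intervals overlap when each starts before the other ends.
        if start <= q_end and end >= q_start:
            yield line


# Hypothetical, position-sorted GFF-style records used only for this sketch.
RECORDS = [
    "chr1\tdemo\tgene\t100\t200\t.\t+\t.\tID=g1",
    "chr1\tdemo\tgene\t150\t300\t.\t-\t.\tID=g2",
    "chr1\tdemo\tgene\t400\t500\t.\t+\t.\tID=g3",
    "chr2\tdemo\tgene\t120\t180\t.\t+\t.\tID=g4",
]

if __name__ == "__main__":
    for hit in query(RECORDS, "chr1:180-250"):
        print(hit)
```

With these sample records, the query "chr1:180-250" yields the two chr1 lines tagged ID=g1 and ID=g2, since both intervals overlap positions 180 to 250.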
Tabix can also retrieve data over a network from a direct URL, provided that the index is available either at the same location as the data file or locally. Multithreading is supported for all operations except listing sequence names.[5]
References
1. Li, Heng (March 1, 2011). "Tabix: fast retrieval of sequence features from generic TAB-delimited files". Bioinformatics. 27 (5): 718–719. doi:10.1093/bioinformatics/btq671. ISSN 1367-4803. PMC 3042176. PMID 21208982.
2. "VCF+tabix Track Format". UCSC Genome Browser. University of California, Santa Cruz. Retrieved January 26, 2021.
3. Buffalo, Vince (2015). "Out-of-Memory Approaches: Tabix and SQLite". Bioinformatics data skills (1st ed.). California: O'Reilly. p. 427. ISBN 978-1-4493-6737-4. OCLC 916120899.
4. "Samtools/Htslib". GitHub. May 2, 2022.
5. "tabix(1) manual page". www.htslib.org. Retrieved April 17, 2025.