BIG-TB: A Benchmark Dataset for Genomic Resistance Prediction and Interpretability in Mycobacterium tuberculosis
Manuscript in preparation, 2025
A unified 17K-isolate benchmark dataset for genotype-to-phenotype prediction across 11 WHO-priority antibiotics, integrating genomic, proteomic, and evolutionary modalities.
BIG-TB provides standardized train/test splits, harmonized variant annotation, and interpretability metrics for model comparison. It supports research into causal variant recovery and cross-drug generalization.