Posts by Collection

experience

Software Engineer (AI & ML)

M2SYS Technology, Dhaka, Bangladesh
Jul. 2020 – Feb. 2022

Delivered production ML systems for biometric security and contextual recommendations.

  • Developed image spoofing detection pipelines and contextual recommendation systems using deep learning and NLP.
  • Automated backend workflows with Camunda BPM, integrating ML inference into large-scale enterprise deployments.
  • Deployed and monitored production ML services across distributed systems to meet reliability and latency targets.

AI Engineer

NITEX Solutions Ltd., Dhaka, Bangladesh
Mar. 2022 – Jul. 2022

Implemented Detectron2-based instance segmentation and OCR automation for apparel workflows.

  • Implemented Detectron2-based instance segmentation for product identification across fashion catalogs.
  • Built OCR-driven automation tools to streamline product annotation and metadata extraction.
  • Generated fashion trend moodboards by combining NLP topic modeling with computer-vision embeddings to power design workflows.

portfolio

publications

Protein Structure-Informed Regularized Linear Model Outperforms ESM for Predicting Antibiotic Resistance

Published in Program in Quantitative Genomics Conference (PQG), Harvard University, 2024

Poster highlighting a fused regularized linear model that integrates 3D structural features and surpasses ESM-based baselines for resistance prediction.

Summary: Poster presentation demonstrating how fused regularization over structural neighborhoods and biochemical feature groups yields stronger predictive accuracy than large protein language models on Mycobacterium tuberculosis resistance benchmarks.

Beyond Sequence-Only Models: Leveraging Structural Constraints for Antibiotic Resistance Prediction in Sparse Genomic Datasets

Published in ICLR 2025 MLGenX Workshop, 2025

Structural constraints paired with deep learning improve resistance prediction under extreme label sparsity.

Summary: Presents a structure-aware hybrid model that injects contact-map priors and residue-level constraints into sequence-based predictors to maintain interpretability with limited labeled isolates. Demonstrates improved accuracy across sparse genomic datasets for Mycobacterium tuberculosis.

Unveiling GPT-4V’s Hidden Challenges Behind High Accuracy on USMLE Questions

Published in Journal of Medical Internet Research, 2025

Analyzes GPT-4V performance on medical licensing questions, revealing systematic failure modes masked by headline accuracy.

Summary: Dissects GPT-4V responses to USMLE-style questions, cataloging error patterns in multimodal reasoning, visual grounding, and factual calibration. Provides actionable evaluation protocols for clinical AI deployment.

The Structural Context of Mutations in Proteins Predicts Their Effect on Antibiotic Resistance

Submitted to eLife, 2025

Protein structural context features yield state-of-the-art accuracy and interpretability for antibiotic resistance mutation prediction.

Status: Submitted to eLife. Preprint: bioRxiv 2025.09.23.676583 (2025)
Summary: Leverages residue-level structural descriptors—solvent accessibility, hydrogen-bonding networks, and ligand proximity—to explain and predict resistance across Mycobacterium tuberculosis drug targets. Structural context improves both AUROC and attribution faithfulness over sequence-only models.

BIG-TB: A Benchmark Dataset for Genomic Resistance Prediction and Interpretability in Mycobacterium tuberculosis

Manuscript in preparation, 2025

A unified 17K-isolate benchmark dataset for genotype-to-phenotype prediction across 11 WHO-priority antibiotics, integrating genomic, proteomic, and evolutionary modalities.

BIG-TB provides standardized train/test splits, harmonized variant annotation, and interpretability metrics for model comparison. It supports research into causal variant recovery and cross-drug generalization.

talks

BIG-TB: A Benchmark Dataset for Genomic Resistance Prediction and Interpretability in Mycobacterium tuberculosis

This spotlight talk presented the BIG-TB dataset — a multimodal benchmark of ~17,000 M. tuberculosis isolates curated to advance antibiotic resistance prediction and model interpretability.
The presentation highlighted how integrating sequence, structural, and evolutionary features enables models to generalize across resistance mechanisms and better align with biological reality.

Key topics discussed:

  • Dataset design principles and integration of WHO 2023 resistance catalogues
  • Evaluation of sequence-based (ESM, CNN) vs structure-aware models
  • Insights from causal variant discovery and explainability metrics (SHAP, Recall@k)

This work underscores the importance of interpretable, biologically grounded ML systems for global health and precision diagnostics.
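
For readers unfamiliar with the Recall@k metric named in the topics above, here is a minimal sketch of one common way to compute it. It assumes a model yields one attribution score per variant (for example, a mean absolute SHAP value) and that a set of known causal variants is available. The function name, variant labels, and scores are hypothetical and purely illustrative; the exact definition used for BIG-TB may differ.

```python
from typing import Dict, Set


def recall_at_k(scores: Dict[str, float], known: Set[str], k: int) -> float:
    """Fraction of known causal variants found among the k variants
    with the largest absolute attribution score."""
    if not known:
        raise ValueError("known must contain at least one variant")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank variants from most to least important by |score|.
    ranked = sorted(scores, key=lambda v: abs(scores[v]), reverse=True)
    top_k = set(ranked[:k])
    return len(top_k & known) / len(known)


# Hypothetical attribution scores; names and values are placeholders, not real data.
attributions = {
    "variant_a": 0.91,
    "variant_b": 0.40,
    "variant_c": 0.05,
    "variant_d": -0.62,
}
known_causal = {"variant_a", "variant_c", "variant_d"}

print(recall_at_k(attributions, known_causal, k=2))
```

In this sketch, a higher Recall@k means more of the known causal variants appear near the top of the model's importance ranking, which is one way to compare interpretability across sequence-based and structure-aware models.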

teaching

Teaching Assistant & Course Developer — CS520: Theory and Implementation of Advanced Software Engineering

Graduate / Upper-level Undergraduate Course, University of Massachusetts Amherst, College of Information & Computer Sciences, 2025

I have been part of the CS520: Theory and Implementation of Advanced Software Engineering teaching team at UMass Amherst across multiple semesters — Spring 2023, Fall 2023, Spring 2025, and Fall 2025 — and served as Course Developer Assistant during Summer 2023.
The course covers advanced topics in software design, testing, and quality assurance, with hands-on assignments focused on automation and reproducibility.


Roles and Responsibilities

  • Head Teaching Assistant (2025):
    Led instruction for 140+ students, managed a team of teaching assistants, and oversaw all course logistics including GitHub Classroom, Gradescope automation, and student communication.
    Provided weekly technical guidance on JUnit testing, coverage analysis, mutation testing, and CI/CD pipelines.

  • Teaching Assistant (2023):
    Conducted lab and office hours, graded assignments, and mentored students on testing frameworks, software architecture, and debugging best practices.
    Supported continuous improvements to the testing framework and grading automation.

  • Course Developer Assistant (Summer 2023):
    Collaborated with the instructor to revamp the course structure, labs, and automated grading pipelines.
    Designed modular testing templates and scripts (statement_coverage.sh, decision_coverage.sh, mutation.sh) that remain part of the current course infrastructure.


Highlights

  • Reduced grading turnaround time by 40% through automation of coverage and mutation testing workflows.
  • Designed and maintained reproducible assignments such as IE2-Triangle and Expense Tracker.
  • Coordinated multiple TA teams and ensured consistency in evaluation rubrics and lab materials.
  • Received strong student feedback for clear explanations, structured guidance, and prompt support.

Teaching Focus

  • Topics Covered: Unit testing, software design principles, mutation testing, CI/CD pipelines, and coverage analysis.
  • Technologies Used: Java, JUnit, Ant, GitHub Actions, Gradescope API, and shell automation.
  • Pedagogical Approach: Emphasized reproducibility, automated assessment, and real-world testing practices to help students develop professional-grade software reliability skills.