Est. 2001 · 25-year retrospective

Twenty-five years of mining life's code.

BIOKDD is among the longest-running workshops at the ACM SIGKDD Conference and the longest-running one devoted to biomedical data mining — from the first sequence-motif algorithms of 2001 to the digital twins of 2026. Every year, the field's evolving frontier has been tested, debated, and published here.

25 · Years (2001 – 2026)
24 · Editions held
~350 · Workshop papers
~60 · Keynote speakers
4 · Research eras
A note from the chair

Welcome to BIOKDD's 25-year retrospective. What started in 2001 as a small workshop at ACM SIGKDD has grown into the longest-running forum where data mining and biology meet — a place where new methods test themselves against new biology, where students share a session with department chairs, and where the next year's frontier is often visible eighteen months before the journals confirm it.

If you are a returning author, thank you for shaping this community. If you are new, you are very welcome here. Browse the 25-year archive below, plan to join us in August for BIOKDD 2026 on Digital Twins, and consider joining our LinkedIn group to stay in touch between editions.

— Jake Y. Chen
Triton Endowed Professor of Biomedical Informatics and Data Science · Founding Director, Systems Pharmacology AI Research Center · AIMed Lab, The University of Alabama at Birmingham
on behalf of the BIOKDD organizing committee
About BIOKDD

A field-defining forum where data mining and biomedicine meet.

BIOKDD was founded in 2001 to give biomedical data-mining research a permanent home inside the world's flagship knowledge-discovery conference. Twenty-four editions later, it has tracked — and often led — every major wave of biomedical informatics: sequence motifs, microarrays, networks, NGS, deep learning, multi-omics, large language models, and now digital twins.

"BIOKDD is where the data-mining community first encounters tomorrow's biology — and where biology first encounters tomorrow's data-mining methods."
  • One of the longest-running workshops at the ACM SIGKDD Conference — held annually since 2001, the only gap being 2009.
  • Open to full papers, abstracts and late-breaking research, all selected through peer review.
  • Selected works invited to a TCBB Special Issue — a tradition since 2008.
  • A welcoming venue for new researchers: 50–100 attendees, oral presentation for every accepted submission.
A quarter-century in four phases

From sequence motifs to digital twins — four research eras at one workshop.

Across 25 years, four distinct research phases emerge from the workshop's program. Each phase reflects what biology was producing — and what data-mining methods were ready to absorb.

Phase I · 2001 – 2008

Founding era — classical KDD meets the genome

The first eight years brought sequence motifs, microarray clustering, biclustering, frequent-pattern mining, classification of expression profiles, and the earliest gene/protein interaction studies into SIGKDD.

Sequence motifs · Microarrays · Frequent patterns · Biclustering · Protein structure
Phase II · 2010 – 2014

Networks & next-generation sequencing

NGS reshaped the data, and networks reshaped the methods. Pathway mining, network medicine, disease-gene prediction, and the first wave of structured biomedical big data made BIOKDD a meeting point for systems and computational biologists.

NGS · Network biology · Pathways · Disease genes · Predictive models
Phase III · 2015 – 2020

Deep learning & multimodal biomedicine

Deep neural nets, single-cell omics, knowledge graphs, biomedical imaging, and electronic health records arrived together. BIOKDD became the place where the data-mining and biomedical-AI communities began to share a methodological vocabulary, culminating in the COVID-19-themed 2020 edition.

Deep learning · Multi-omics · Single-cell · Knowledge graphs · EHR mining
Phase IV · 2021 – 2026

GenAI, foundation models & digital twins

The current era opens with AlphaFold and closes with patient digital twins. Workshop themes track the frontier in real time — AI in medicine (2021), large-scale ML (2023), LLMs in bioinformatics (2024), generative biomolecular design (2025), digital twins (2026).

LLMs · Foundation models · Generative design · Digital twins · Multimodal AI
The 25-year archive

Every edition, one click away.

Twenty-five years of programs, keynotes, organisers and proceedings. Click any year to open its workshop page at biokdd.org/biokddXX/.

Steering & founding committee

A small group of researchers has carried the workshop for 25 years.

BIOKDD has been organised by a continuous chain of program and general chairs since 2001, with broad scientific program committees drawn from across academia, industry and government labs.

Sustained impact · since 2007

Jake Y. Chen

Triton Endowed Professor of Biomedical Informatics and Data Science · Founding Director, Systems Pharmacology AI Research Center · The University of Alabama at Birmingham

Joined as program co-chair in 2007 and has been in a chair or steering role every year since. Brought the biology and systems-biology community into BIOKDD, broadening the workshop from a pure data-mining venue into a meeting point for biomedical informaticians, and has been the principal force behind the workshop's sustained scientific impact across its second and third decades.

AIMed Lab · aimed-lab.org
Program & publications · since 2017

Da Yan

Associate Professor, Luddy School of Informatics, Computing, and Engineering · Indiana University Bloomington

Joined as program co-chair in 2017 and has run the workshop's program every year since. Established the IEEE/ACM TCBB Special Issue as a recurring BIOKDD tradition, ensuring that selected workshop papers are routed each year to a peer-reviewed journal venue. He is the workhorse behind the workshop's annual programmatic execution.

Faculty profile · luddy.indiana.edu
Founding general chair · steering since 2001

Mohammed J. Zaki

Professor of Computer Science · Rensselaer Polytechnic Institute

Founded BIOKDD in 2001 and shaped its character through the workshop's foundational phases — the SIGKDD partnership, the open call for biological data-mining research, and the original program-committee network from which today's community grew. Has remained on the steering committee every year since, giving BIOKDD its continuous 25-year institutional memory.

Faculty profile · cs.rpi.edu/~zaki
BIOKDD 2026 · 25th edition · now open for participation

Join us this August: Digital Twins for biology and medicine.

Submissions are closed and the program is being finalized. We now invite the broader community to attend the workshop: a full day of peer-reviewed talks, an invited keynote, abstract spotlights and a late-breaking research session — all framed around digital twins for biology and medicine.

Workshop date: Aug 10, 2026
Venue: with SIGKDD 2026
Format: Full-day, in-person
Status: Submissions closed · program forthcoming
Register & attend BIOKDD 2026 →

What to expect on the day

  • Talks · Regular workshop papers: full-length peer-reviewed talks, with selected works invited to the BIOKDD Special Issue with IEEE/ACM TCBB.
  • Abs · Abstract spotlights: short oral presentations of preliminary research and ongoing collaborations.
  • LBR · Late-breaking research: published or exceptionally high-impact recent work, in a dedicated oral session.
  • Key · Invited keynote on digital twins for biomedicine, plus a moderated discussion with the program chairs.
Stay connected

Twenty-five years, 1,800+ contributors.

Across 25 years, more than 1,800 researchers have been part of BIOKDD as authors, reviewers, program-committee members, keynote speakers, and attendees. Our LinkedIn group is the smaller, active online hub where this community stays in touch between editions.

LinkedIn group · the online hub
Biokdd — Data Mining in Bioinformatics
A small but active group of BIOKDD authors, reviewers, program-committee members, and adjacent researchers in the data-mining × bioinformatics space.
Join the BIOKDD group on LinkedIn →
  • 01 Annual calls for papers are announced first in the group every spring.
  • 02 TCBB Special Issue notifications with submission deadlines and editorial guidance.
  • 03 Post-workshop discussion, slides, and recordings shared by program chairs and authors.
  • 04 Job posts & collaboration requests from the bioinformatics-AI community.