Albany 2019: 20th Conversation - Abstracts

Albany 2019
Conversation 20
June 11-15, 2019
Adenine Press (2019)

Modeling of protein-DNA binding with a multi-module deep learning framework

Transcription factor (TF)-DNA binding is a fundamental component of gene regulatory processes, but how these proteins recognize their target sites in the genome is still not completely understood. TFs can recognize their binding sites by presenting a surface that is physicochemically complementary to the physicochemical signature of the DNA, forming a series of contacts between amino acids and nucleotides. These contacts include direct hydrogen bonds, water-mediated hydrogen bonds, and hydrophobic contacts. In the last decade, several high-throughput technologies have been developed to probe TF-DNA binding mechanisms by quantitatively measuring the binding affinities of a TF for thousands or even millions of different DNA sequences in vitro. Relevant sequencing-based methods include SELEX-seq, HT-SELEX, and SMiLE-seq. These methods provide an alternative path to infer TF-DNA binding mechanisms without requiring time-consuming structural biology experiments. Here we present DeepRec (Deep Recognition for TF-DNA binding), a multi-module deep learning framework capable of building a precise predictive model for TF-DNA binding based on large-scale in vitro experimental data. The method integrates a forward perturbation-based interpretation approach that highlights the sequence patterns most important for deciphering the binding mechanisms. We demonstrate applications of our method to SELEX-seq data for the human basic helix-loop-helix (bHLH) protein MAX, the human myocyte enhancer factor-2B (MEF2B), and the human tumor suppressor protein p53. We accurately predicted DNA binding specificities and gained insights into the binding mechanisms.
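
The general idea behind a forward perturbation-based interpretation can be illustrated with a minimal Python sketch. It is not the DeepRec implementation: the function names (toy_model, perturbation_map, position_importance) and the example sequence are hypothetical, and toy_model is only a stand-in for a trained predictor that scores similarity to the E-box CACGTG, the canonical MAX recognition site. Each base is substituted in turn, the sequence is passed forward through the model again, and the change in predicted score is recorded.

```python
from typing import Callable, Dict, List

BASES = "ACGT"


def toy_model(seq: str) -> float:
    """Hypothetical stand-in for a trained binding predictor.

    Returns the best fraction of positions matching the E-box CACGTG
    over all offsets. A real model would be a trained network.
    """
    motif = "CACGTG"
    best = 0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        matches = sum(1 for a, b in zip(window, motif) if a == b)
        best = max(best, matches)
    return best / len(motif)


def perturbation_map(seq: str,
                     model: Callable[[str], float]) -> List[Dict[str, float]]:
    """For every position, substitute each alternative base and record
    the change in the model's prediction relative to the unmodified
    sequence (mutant score minus reference score)."""
    reference = model(seq)
    deltas: List[Dict[str, float]] = []
    for i, original in enumerate(seq):
        row: Dict[str, float] = {}
        for base in BASES:
            if base == original:
                continue  # skip the unperturbed base
            mutated = seq[:i] + base + seq[i + 1:]
            row[base] = model(mutated) - reference
        deltas.append(row)
    return deltas


def position_importance(deltas: List[Dict[str, float]]) -> List[float]:
    """Summarize each position by its largest absolute score change."""
    return [max(abs(d) for d in row.values()) for row in deltas]


# Assumed example sequence: an E-box core with two flanking bases per side.
seq = "TTCACGTGAA"
importance = position_importance(perturbation_map(seq, toy_model))
for base, score in zip(seq, importance):
    print(base, round(score, 3))
```

With this toy scorer, the six core positions receive a nonzero importance and the four flanking positions receive zero, because only substitutions inside the motif change the prediction. With a model trained on in vitro binding data, the same loop would instead expose whichever positions and substitutions that model has learned to weight.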

Tsu-Pei Chiu
Satyanarayan Rao
Ana Carolina D. Machado
Remo Rohs


Tsu-Pei Chiu is a postdoctoral fellow with Prof. Remo Rohs and will give a short oral presentation from the platform.

Quantitative and Computational Biology
Department of Biological Sciences
University of Southern California, USA

Email: rohs@usc.edu