Guojie Zhong  
(钟 国杰)

PhD Candidate, Department of Systems Biology, Columbia University.


Columbia University

Dept. of Systems Biology

1130 St. Nicholas Avenue

New York, NY 10032

I am currently a PhD student in Dr. Yufeng Shen’s lab in the Department of Systems Biology at Columbia University. My main research interests are computational genomics and human genetics.

My current research focuses on developing deep learning tools to predict the modes of action of missense variants, with three aims:

  1. Building efficient models for mode-of-action prediction from representations of protein sequences, structures, and functions produced by protein language models (see project RESCVE).
  2. Curating benchmark datasets for mode-of-action prediction from different domains, including clinical genetics, deep mutational scanning experiments, and protein engineering.
  3. Applying the deep learning model to reveal the modes of action of missense variants in several diseases, including neurodevelopmental disorders, autism, and congenital heart disease.

My previous PhD research experience includes:

  1. Developing statistical learning methods that leverage large-scale exome sequencing and single-cell sequencing data to understand human disease genetics (see project VBASS).
  2. Studying the cross-disease genetics of several developmental disorders, including autism, congenital heart disease, and congenital diaphragmatic hernia.

I am always fascinated by the application of novel machine learning algorithms to biological questions, especially in the genetics and genomics of human disease. These days I am exploring Transformer-based protein language models, Bayesian graphical models, and variational autoencoders.

Prior to Columbia, I received my B.S. from the Integrated Science Program at Peking University, with training mostly in biology, statistics, and computer science. I joined Dr. Zemin Zhang’s lab for my undergraduate thesis, which developed computational methods to infer cellular spatial organization and cellular interactions from single-cell genomics data (see project CSOmap).

In my free time, I enjoy playing fingerstyle guitar and badminton, both of which require a balance of power and control.

selected publications

  1. MLSB 2022
    Representation of missense variants for predicting modes of action
    G. Zhong, and Y. Shen
    Machine Learning in Structural Biology, Workshop at the 36th Conference on Neural Information Processing Systems (NeurIPS), 2022
  2. Statistical models of the genetic etiology of congenital heart disease
    G. Zhong, and Y. Shen
    Curr Opin Genet Dev, 2022
  3. Integration of gene expression data in Bayesian association analysis of rare variants
    G. Zhong, Y. A. Choi, and Y. Shen
    bioRxiv, 2022
  4. Identification and validation of candidate risk genes in endocytic vesicular trafficking associated with esophageal atresia and tracheoesophageal fistulas
    G. Zhong*, P. Ahimaz*, N. A. Edwards*, J. J. Hagen, C. Faure, and 13 more authors
    HGG Adv, 2022
  5. Reconstruction of cell spatial organization from single-cell RNA sequencing data based on ligand-receptor mediated self-assembly
    X. Ren*, G. Zhong*, Q. Zhang, L. Zhang, Y. Sun, and 1 more author
    Cell Res, 2020
  6. Landscape and Dynamics of Single Immune Cells in Hepatocellular Carcinoma
    Q. Zhang, Y. He, N. Luo, S. J. Patel, Y. Han, and 20 more authors
    Cell, 2019