Research

Overview

To fight off attacks by the microscopic bugs that cause disease, and to find and eliminate cancers, specialist cells of the immune system need to be able to recognise these invaders as foreign or abnormal. They do this using molecules known as receptors, which sit on the outward-facing surface of their cell membranes.

My PhD research will rely on the use of powerful algorithms to understand the types of immune cell receptors found in the human body during health and disease. The hope is that this will help identify new immune signatures of disease, and inform the use of these receptors for new drug development.

Past Work

MSc Artificial Intelligence, QMUL (2018-2020)

Hudson, A. and Gong, S. (2020) “Transfer Learning for Protein Structure Classification at Low Resolution.” Open Science Index, Bioengineering and Life Sciences Vol:14, No:11, 2020 waset.org/abstracts/129704

Scientists are interested in proteins because they perform a wide variety of critical functions in nature. To understand those functions, we need to be able to visualise protein structure in three dimensions at near-atomic resolution. However, obtaining such high-resolution structures can be expensive and time-consuming.

Under the leadership of Professor Sean Gong at QMUL, I investigated whether a kind of machine learning algorithm called a convolutional neural network (CNN) could be used to predict the fold of proteins whose structure had been determined at low resolution. This kind of model could feasibly be used to infer function from structures determined with relative ease.

By training an ensemble of CNNs on high-resolution protein structures labelled with their CATH domain membership, we demonstrated that it is possible to accurately label domains of protein structures solved at low resolution, including those determined using Nuclear Magnetic Resonance.
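As a rough illustration of how an ensemble can arrive at a single label, the sketch below averages the class probabilities produced by several models and picks the highest-scoring class (often called soft voting). This is an assumption made for illustration: the summary above does not say how the ensemble's outputs were combined, and the function names, class names and probability values here are hypothetical.

```python
from typing import List, Sequence


def soft_vote(member_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across ensemble members."""
    if not member_probs:
        raise ValueError("need at least one ensemble member")
    n_classes = len(member_probs[0])
    if any(len(p) != n_classes for p in member_probs):
        raise ValueError("all members must score the same set of classes")
    n_members = len(member_probs)
    return [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]


def predict_label(member_probs: Sequence[Sequence[float]],
                  class_names: Sequence[str]) -> str:
    """Return the class name with the highest averaged probability."""
    avg = soft_vote(member_probs)
    best = max(range(len(avg)), key=avg.__getitem__)
    return class_names[best]


# Hypothetical labels and scores, chosen only to illustrate the idea.
class_names = ["Mainly Alpha", "Mainly Beta", "Alpha Beta"]
member_probs = [
    [0.6, 0.3, 0.1],  # illustrative output of model 1
    [0.2, 0.5, 0.3],  # illustrative output of model 2
    [0.5, 0.4, 0.1],  # illustrative output of model 3
]
label = predict_label(member_probs, class_names)
```

Averaging probabilities rather than counting votes lets a confident model outweigh an uncertain one, which is one reason the approach is commonly used with small ensembles.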

We achieved state-of-the-art performance on this fold classification task and released the code on GitHub. Interestingly, we found that incorporating additional physico-chemical information into the training dataset did not significantly improve the performance of the classifier. This suggests that structure representations should be chosen to strike a balance between performance and storage efficiency for any given task.

The full paper is available here
