


I’m Hew, a PhD student at the University of Oxford researching generative flow models for protein folding in the Oxford Protein Informatics Group (OPIG) supervised by Prof. Charlotte Deane.

I’m also a self-employed ML and software consultant with a portfolio of work for Isomerase where I developed their flagship directed evolution tool Evoselect.

Feel free to reach out if you’re interested in collaborating or in my ML consulting.





Blog 2: Protein Modelling Pt.2 - Graph Network Messaging

Introduction

In the previous blog we outlined the statistical basis of the Potts model. The resultant Ising-like form is largely intractable for any reasonably sized protein and multiple sequence alignment. Although no closed-form solution exists, numerical methods that approximate the energy function have shown remarkable success at predicting protein residue contacts. In this blog we’ll discuss the two earliest of these approaches.

Blog 3: Protein Modelling Pt.3 - Thermodynamics and Statistical Mechanics

Introduction

In the previous blog we described the first method for approximating the Potts model via message passing, as proposed by Weigt et al. The resulting method was somewhat inefficient, relying on slow iterative belief propagation. In this blog I will walk you through the next iteration in methods for approximating the Potts model, specifically the Mean Field approximation approach pioneered by the same group that introduced message passing. We will walk through the paper by Morcos et al. and their more elegant and efficient solution to the Potts model, which marries statistical mechanics and thermodynamics with evolutionary sequence analysis.

Blog 1: Protein Modelling Pt.1 - From Similarity to Structure Prediction

Introduction

Biology has typically been the least quantitative of the sciences. However, over recent decades the surge in sequencing data, made possible by next-generation sequencing, has facilitated the application of statistical and machine learning to biology. In this blog I will describe what first got me into computational biology and what helped power the first drastic improvements in protein structure prediction introduced by AlphaFold2. Specifically, I will discuss what is now called evolutionary or Direct Coupling Analysis (DCA) and focus on several key papers that formulated this fascinating application of statistical mechanical principles to biology.

Flow Matching From the Mathematics

Introduction

In the world of computational structural biology you might have heard of diffusion models as the current big thing in generative modelling. Diffusion models are great primarily because they look cool when you visualise the denoising process that generates a protein structure (check out the RFdiffusion Colab notebook), but also because they are state of the art for generating diverse and designable protein backbone structures.
