The fundamental goal of this project is to understand more about the structure of the huntingtin protein molecule, i.e. what does this long chain of amino acids look like in three dimensions in the cells of our bodies? By understanding more about what this protein looks like, we may be able to infer what huntingtin normally does in the cell, and perhaps how this changes when the huntingtin gene is expanded, as seen in Huntington’s disease.
In this computational modelling experiment, I used the Phyre2 server to generate models of the huntingtin structure. These are predictions based on the huntingtin protein sequence and its alignment to the sequences of proteins with known structures; you can read more about this method in this paper. As such, the models should be treated with caution, as they are not backed up by a wealth of experimental data. Nonetheless, they do provide some interesting insight into the potential structure of huntingtin.
As usual, I have uploaded all of my data to Zenodo, but you can read a summary of my conclusions below:
- The high confidence models determined by Phyre2 show the expected secondary structural features of huntingtin – extended alpha helical regions.
- These alpha helical motifs are generally folded into HEAT repeats (2 antiparallel helices joined by a short loop to form a helical hairpin) or armadillo repeats (3 helices: H2 and H3 packed together in an antiparallel fashion, perpendicular to the shorter H1, with a sharp loop between H1 and H2 mediated by a conserved glycine) in the predicted models, again in line with previous predictions by InterPro.
- High confidence models (>80%) containing these tertiary structure features were built for regions 130-399, 626-936 and 2745-3091. The region of huntingtin sequence covered by each of these models corresponds to domain predictions by InterPro as well as to the putative domain boundaries determined experimentally by limited proteolysis and mass spectrometry.
- Despite fairly confident prediction of alpha-helical secondary structure motifs throughout the region of huntingtin sequence from 1201-2400, no high confidence models were built for this region, consistent with the InterPro analysis and, to some degree, the experimental data.
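As a rough illustration, the residue ranges quoted above can be tabulated in a few lines of Python to see how much sequence each model covers. This is only a sketch: the `Region` type and the `overlaps` helper are hypothetical names introduced here, the ranges are assumed to be 1-based and inclusive, and the only data used are the boundaries listed in the summary.

```python
from itertools import combinations
from typing import NamedTuple


class Region(NamedTuple):
    """A stretch of huntingtin sequence (assumed 1-based, inclusive)."""
    start: int
    end: int
    modelled: bool  # True if a >80% confidence model was built

    @property
    def length(self) -> int:
        # Inclusive residue count for the range.
        return self.end - self.start + 1


# Boundaries taken directly from the summary above.
REGIONS = [
    Region(130, 399, True),
    Region(626, 936, True),
    Region(1201, 2400, False),  # helical prediction, but no confident model
    Region(2745, 3091, True),
]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two inclusive ranges share at least one residue."""
    return a.start <= b.end and b.start <= a.end


# Sanity check: none of the listed regions should overlap one another.
any_overlap = any(overlaps(a, b) for a, b in combinations(REGIONS, 2))

# Total number of residues covered by high confidence models.
modelled_residues = sum(r.length for r in REGIONS if r.modelled)

for r in REGIONS:
    status = "modelled" if r.modelled else "not modelled"
    print(f"{r.start}-{r.end}: {r.length} residues ({status})")
print(f"Residues in high confidence models: {modelled_residues}")
print(f"Any overlapping regions: {any_overlap}")
```

Keeping the boundaries in one list like this should make it easier to compare them against the InterPro predictions or the proteolysis fragments later on.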
To follow up this work, I would like to repeat this analysis with other structural prediction programmes (e.g. SWISS-MODEL), continue experimental domain determination by limited proteolysis with alternative enzymes (e.g. chymotrypsin), and then begin designing huntingtin domain constructs for BEVS expression.