C1orf41 Results
Evolution
BLAST results showed that c1orf41 has high sequence similarity to small heat shock proteins, which suggests that the protein may itself be a heat shock protein.
The multiple sequence alignment of all 27 sequences obtained from BLAST is shown below:
The sequences of platypus, urchin_156 and urchin_159 were removed due to poor alignment. In addition, the sequence for g.alga_fap was removed because it was a subsequence of g.alga_itp25.
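The subsequence check described above can be sketched in Python. This is only an illustration of the idea, not the procedure that was actually used: the function name `drop_contained` is hypothetical and the toy sequences are invented, not the real g.alga_fap or g.alga_itp25 sequences.

```python
def drop_contained(seqs):
    """Return the entries of seqs whose sequence is not contained in a longer entry."""
    kept = {}
    for name, seq in seqs.items():
        contained = any(
            other_name != name and len(other) > len(seq) and seq in other
            for other_name, other in seqs.items()
        )
        if not contained:
            kept[name] = seq
    return kept


# Toy sequences, invented for illustration only
example = {
    "g.alga_itp25": "MKTAYIAKQRQISFVK",
    "g.alga_fap": "AYIAKQRQ",
    "human": "MSTNPKPQRK",
}
print(sorted(drop_contained(example)))
```

Identical sequences are both kept here, because only strictly shorter sequences are treated as subsequences.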
The residues involved in the binding site are highlighted. However, these residues are conserved only in terrestrial eukaryotes and some marine eukaryotes. Changes in the conserved amino acids may lead to a higher binding affinity for the ligand or to a different fold of the protein, and therefore to a different function.
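One simple way to quantify how well a binding-site position is conserved is the fraction of sequences that share the most common residue in that alignment column. The sketch below assumes the alignment is available as equal-length strings with "-" for gaps; the function name `column_conservation` and the four toy sequences are hypothetical.

```python
from collections import Counter


def column_conservation(aligned, column):
    """Fraction of sequences sharing the most common non-gap residue at a 0-based column."""
    residues = [seq[column] for seq in aligned if seq[column] != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(aligned)


# Toy alignment, invented for illustration only
aligned = ["MKDE", "MKDE", "MRDE", "M-NE"]
for col in range(4):
    print(col, column_conservation(aligned, col))
```

Gaps count against conservation in this version because the denominator is the total number of sequences; dividing by the number of non-gap residues instead would be an equally reasonable choice.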
After the multiple sequence alignment was obtained, a phylogenetic tree was constructed from it. Bootstrap values are shown on each branch.
The unrooted tree above gives information on the evolutionary relationship between different organisms. This protein seems to have evolved from marine eukaryotes (red) to amphibians (blue) and then to terrestrial eukaryotes (green).
The bootstrap value indicates the reliability of a branch: the higher the bootstrap value, the more reliable the branch.
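As a rough illustration of where bootstrap values come from: alignment columns are resampled with replacement, a tree is rebuilt from each resampled alignment, and the support for a branch is the percentage of replicates in which that branch reappears. The sketch below shows only the resampling and the percentage; the tree-building step depends on the program used, which is not stated here, so it is left out. The names `bootstrap_alignment` and `support_percent` and the toy data are hypothetical.

```python
import random


def bootstrap_alignment(aligned, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    length = len(aligned[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in cols) for seq in aligned]


def support_percent(replicate_has_branch):
    """Percentage of bootstrap replicates in which a given branch was recovered."""
    return 100.0 * sum(replicate_has_branch) / len(replicate_has_branch)


# Toy alignment, invented for illustration only
aligned = ["MKDEL", "MRDEL", "MKNEV"]
replicate = bootstrap_alignment(aligned, random.Random(0))
print(replicate)

# Assumed outcome of four replicates: branch recovered in three of them
print(support_percent([True, True, False, True]))
```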
Structure
Protein Structure
PDB ID: 1TVG
Information from the PDB states that X-ray diffraction was used to solve the structure of this protein at a resolution of 1.6 Å, with an R value of 0.215. Two ligands are present in the crystal structure, a calcium (II) ion and a samarium (III) ion. The protein has 153 residues and its secondary structure consists of two helices and nine beta strands.
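Ligands such as the two ions can be listed directly from a PDB-format file, because hetero groups are stored in HETATM records with fixed column positions. The sketch below is a minimal example under stated assumptions: the helper `pdb_line` only builds toy records, the function name `hetero_groups` is hypothetical, and the serial numbers, residue numbers and coordinates are invented rather than taken from 1TVG.

```python
def pdb_line(record, serial, atom, res_name, chain, res_seq, x, y, z, element):
    """Build one fixed-column PDB coordinate line (toy helper for this example)."""
    return (
        f"{record:<6}{serial:>5} {atom:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00 20.00          {element:>2}"
    )


def hetero_groups(pdb_lines):
    """Collect distinct (residue name, chain, residue number) from HETATM records, skipping water."""
    groups = []
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20
        if res_name == "HOH":
            continue
        key = (res_name, line[21], int(line[22:26]))  # column 22, columns 23-26
        if key not in groups:
            groups.append(key)
    return groups


# Toy records: all numbers below are invented, not copied from 1TVG
lines = [
    pdb_line("ATOM", 1, "CA", "ASP", "A", 92, 1.0, 2.0, 3.0, "C"),
    pdb_line("HETATM", 2, "CA", "CA", "A", 201, 4.0, 5.0, 6.0, "CA"),
    pdb_line("HETATM", 3, "SM", "SM", "A", 202, 7.0, 8.0, 9.0, "SM"),
    pdb_line("HETATM", 4, "O", "HOH", "A", 301, 0.0, 0.0, 0.0, "O"),
]
print(hetero_groups(lines))
```

Reading by column position rather than splitting on whitespace matters for real files, where adjacent fields can run together.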
SCOP result
Figure 1: Above is the representation of c1orf41 secondary structure as presented in PDBsum.
Tertiary structure
Figure 2: This cartoon representation of c1orf41 shows that the protein is monomeric and consists of a single domain. The fold is a jelly roll barrel, in which four pairs of anti-parallel beta strands are organised into a barrel-like structure.
Figure 3: Surface representation of c1orf41. The Ca (II) ion is located within a pocket of the protein. Sm (III) interacts with Asp92 on the protein, but its presence is probably due to the method used to solve the phase problem during structure solution by X-ray crystallography.
Figure 4: Electrostatic charge distribution on the surface of c1orf41.
Structure analysis
1) Structure similarities
DALI was used to find proteins with structures similar to that of c1orf41. The results showed that sialidases, alpha-N-acetylglucosaminidases and galactose oxidases are structurally similar to this protein.
Figure 5: DALI result showing the first 35 proteins with structures similar to c1orf41. Although these proteins are similar in structure, they have very little sequence identity to c1orf41. 1xpw is also c1orf41, but its structure was solved by NMR.
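To make "sequence identity" concrete, the sketch below computes percent identity from a pairwise alignment as the share of aligned (non-gap) positions that carry the same residue. This is one common definition and an assumption here; other tools use different denominators, so the numbers need not match those reported by DALI. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of positions aligned in both sequences (no gap) that are identical."""
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs)


# Toy aligned pair, invented for illustration only
print(percent_identity("MK-TAY", "MKQSAY"))
```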
2) Domain Classification
Figure 6: The Pfam result indicates that our protein has an F5/8 type C domain, also known as the discoidin domain, which is part of the galactose binding domain superfamily.
Closer inspection of several other proteins from the DALI result showed that they also contain a discoidin domain. The DALI hits were probably obtained on the basis of this domain.
3) Possible ligand binding sites
Comparison of several proteins from the DALI result with our protein showed that a metal ion is located at a similar position in each protein.
Figure 7: The position of the calcium (II) ion (yellow) within a loop of the c1orf41 structure. A loop coordinating a metal ion was also observed in several proteins from the DALI result.
Figure 8: At the top left is the discoidin domain of galactose oxidase with a sodium ion (purple) located within a loop. The bottom figure shows the discoidin domain of a bacterial sialidase with a sodium ion (purple) and a beta-D-galactose molecule. This sodium ion is also located within a loop.
Figure 9: CASTp also identified a surface cleft located at the loop position.
Figure 10: PyMOL figure displaying the possible metal binding position on c1orf41. Residues that may be involved in coordinating the metal are shown in orange and the calcium ion in yellow. The residues are based on the CASTp result.
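Candidate coordinating residues can also be shortlisted from the coordinates alone, by keeping every residue that has an atom within a chosen distance of the metal ion. The sketch below is only an illustration: the function name `residues_near` is hypothetical, the 3.0 Å cutoff is an assumed value, and the atoms are invented rather than taken from the c1orf41 structure or the CASTp output.

```python
import math


def residues_near(atoms, center, cutoff):
    """Residues with at least one atom within cutoff (in Å) of center, sorted by residue number."""
    near = set()
    for res_name, res_num, xyz in atoms:
        if math.dist(xyz, center) <= cutoff:
            near.add((res_name, res_num))
    return sorted(near, key=lambda residue: residue[1])


# Toy atoms as (residue name, residue number, (x, y, z)); all values invented
calcium = (0.0, 0.0, 0.0)
atoms = [
    ("ASP", 10, (2.0, 1.0, 0.0)),
    ("ASP", 10, (5.0, 1.0, 0.0)),
    ("GLU", 12, (0.0, 2.4, 0.0)),
    ("LEU", 40, (8.0, 0.0, 0.0)),
]
print(residues_near(atoms, calcium, 3.0))
```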
Nest analysis by ProFunc suggested two other ligand binding sites. Nests are structural motifs that are commonly found in functionally important regions of protein structures.
Figure 11: Nest 1 and Nest 2 have scores higher than 2, so they are more likely to be functional. The NH atoms of His54 and Lys55 are accessible from a large surface cleft of the protein. The cleft is also deep, indicating that the nest is functionally important. The residues of Nest 2 are part of the loop that coordinates the Ca (II) ion.
Figure 12: PyMOL representation of Nest 1 and 2.
Figure 13: Arrows indicating the pockets of the nests that may be accessible to ligands.