Arylformamidase Methods

Latest revision as of 01:48, 10 June 2008

Literature search

A literature search was performed using PubMed.

Conservation of Catalytic Triad

Alignments were generated with ClustalX or extracted from the DALI results. Catalytic residues were identified from the literature and located in the aligned sequences.
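Locating a literature-numbered residue in a gapped alignment can be sketched as below. This is a minimal illustration, not the procedure actually used: the function names and the toy sequences are hypothetical, and conservation is taken here to mean strict identity in the alignment column.

```python
def residue_to_column(aligned_seq: str, residue_number: int, gap: str = "-") -> int:
    """Return the 0-based alignment column holding the given 1-based residue."""
    seen = 0
    for column, char in enumerate(aligned_seq):
        if char != gap:
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number exceeds sequence length")


def is_conserved(alignment: dict, ref_name: str, residue_number: int) -> bool:
    """True if every sequence carries the reference residue in that column."""
    column = residue_to_column(alignment[ref_name], residue_number)
    ref_char = alignment[ref_name][column]
    return all(seq[column] == ref_char for seq in alignment.values())


# Toy alignment (made-up sequences, not taken from 2PBL or any real entry).
toy = {
    "ref":   "MK-SDH",
    "other": "MKASDH",
}
print(residue_to_column(toy["ref"], 3))  # alignment column of the 3rd residue of "ref"
print(is_conserved(toy, "ref", 3))
```

Because gaps shift column indices, residue numbers quoted in a paper have to be mapped per sequence in this way before columns are compared.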

Analysis of HSL Family Sequence Characteristics

Members of the protein family were identified from the literature and their sequences were obtained from NCBI. A multiple sequence alignment was performed using the default parameters of ClustalX. Sequence regions characteristic of the protein family were identified from the literature and located in the multiple sequence alignment.
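Locating a characteristic region in each family member can be sketched as a simple pattern search. The text above does not name the regions, so the pattern used here (G-X-S-X-G, written as the regular expression `G.S.G`) is only an illustrative assumption, and the sequences and the `find_motif` helper are hypothetical.

```python
import re


def find_motif(sequences: dict, pattern: str) -> dict:
    """Return {name: [1-based start positions]} of a regex motif in each sequence.

    Gap characters are stripped first, so positions refer to the ungapped sequence.
    """
    # A lookahead is used so that overlapping occurrences are also reported.
    compiled = re.compile(f"(?={pattern})")
    return {
        name: [m.start() + 1 for m in compiled.finditer(seq.replace("-", ""))]
        for name, seq in sequences.items()
    }


# Made-up sequences for illustration only.
toy = {
    "seqA": "MKHGGGWAGDSAGG",
    "seqB": "MRHGG-AFGESAGA",
}
print(find_motif(toy, "G.S.G"))
```

Positions are reported on the ungapped sequence, so they can be compared directly with residue numbers quoted in the literature.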

Sequence & Homology

Using the amino acid sequence of 2PBL, a BLASTP search was performed against a non-redundant database. The top-scoring matches, up to an E-value of 3e-54 (35 sequences in total), were selected. Eukaryotic homologous sequences were found using NCBI HomoloGene. These were appended to the list and a multiple sequence alignment was performed using ClustalX.
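Selecting hits by E-value can be sketched as follows, assuming the BLASTP results were saved in the standard 12-column tabular format (subject ID in column 2, E-value in column 11). The original work does not state which output format was used, and the rows below are invented for illustration.

```python
def select_hits(tabular_text: str, max_evalue: float) -> list:
    """Return subject IDs whose E-value is at or below max_evalue, in file order."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue and subject_id not in hits:
            hits.append(subject_id)
    return hits


# Invented example rows (not real BLAST output).
rows = [
    ["query", "hitA", "62.1", "280", "100", "2", "1", "280", "1", "279", "1e-100", "350"],
    ["query", "hitB", "40.3", "275", "150", "5", "3", "277", "2", "270", "3e-054", "212"],
    ["query", "hitC", "38.0", "270", "160", "6", "5", "274", "4", "268", "2e-050", "200"],
]
blast_text = "\n".join("\t".join(row) for row in rows)

selected = select_hits(blast_text, max_evalue=3e-54)
print(selected)
```

A subject that matches in several separate segments appears on several rows, which is why duplicate IDs are skipped.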

The multiple sequence alignment was bootstrapped 1000 times and a phylogenetic tree was constructed using the neighbour-joining algorithm. FigTree was used to visualise the tree.
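The bootstrap step can be sketched as resampling alignment columns with replacement. This is a generic illustration of the technique, not the ClustalX implementation: the helper name, the fixed random seed and the toy alignment are assumptions, and tree construction on each replicate is not shown.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Build one bootstrap replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (made-up sequences of equal length).
toy = {
    "a": "MKASDH",
    "b": "MRASEH",
    "c": "MKGSDH",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "replicates of length", len(replicates[0]["a"]))
```

A neighbour-joining tree would then be built from each replicate, and the fraction of replicates that recover a given grouping is reported as its bootstrap support.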

A BLASTP search was performed using the human homologue of 2PBL. The top-scoring sequences from this search were appended to the top-scoring sequences from the original BLASTP search on the bacterial query sequence.

As above, a multiple sequence alignment was generated using ClustalX, the alignment was bootstrapped 1000 times, and a phylogenetic tree was generated using the neighbour-joining algorithm.

Structure

Several bioinformatics databases were used to obtain structural information on 2PBL: the RCSB Protein Data Bank (PDB ID: 2PBL) for the quaternary protein structure and PDBsum for the secondary structure. The DALI database was used for structural comparison with other proteins. PyMOL and ProteinWorkshop 1.5 were used for figure generation.
