Arylformamidase Methods
Literature search
A literature search was performed using PubMed.
Conservation of Catalytic Triad
Alignments were generated with ClustalX or extracted from the DALI results. Catalytic residues were identified from the literature and located in the sequence.
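Locating a literature-reported residue in an alignment amounts to converting its residue number into an alignment column and reading that column across all rows. The following is a minimal sketch of that step, not the procedure actually used here; the sequence names, the toy sequences and the choice of residue 5 are hypothetical.

```python
def alignment_column(aligned_seq, residue_number, gap="-"):
    """Return the 0-based alignment column holding a 1-based residue number."""
    seen = 0
    for column, char in enumerate(aligned_seq):
        if char != gap:
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("residue number exceeds sequence length")


def column_residues(alignment, column):
    """Collect the character each aligned sequence shows in one column."""
    return {name: seq[column] for name, seq in alignment.items()}


# Hypothetical toy alignment; names and sequences are invented for illustration.
toy_alignment = {
    "query":   "MK-GDSAGHL",
    "homol_1": "MKAGDSAG-L",
    "homol_2": "-RAGASAGHL",
}

# Assume the literature reports a catalytic serine at residue 5 of "query".
col = alignment_column(toy_alignment["query"], 5)
residues = column_residues(toy_alignment, col)
print(col, residues)
```

A residue is conserved in this sense when every row shows the same character in the mapped column.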
Analysis of HSL Family Sequence Characteristics
Members of the protein family were identified from the literature and their sequences were obtained from NCBI. A multiple sequence alignment was performed using the default parameters of ClustalX. Sequence regions characteristic of the protein family were identified from the literature and located in the multiple sequence alignment.
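As an illustration of locating a characteristic region in aligned sequences, the sketch below matches a pattern against each ungapped sequence and maps the match back to its alignment column. The pattern G-x-S-x-G (a pentapeptide common to many serine hydrolases) is an assumption chosen for the example, and the member names and sequences are hypothetical.

```python
import re

# Assumed example pattern: the G-x-S-x-G pentapeptide of many serine hydrolases.
MOTIF = re.compile(r"G.S.G")


def find_motif(aligned_seq, pattern=MOTIF, gap="-"):
    """Return (residue_start, alignment_column) for each motif match.

    The pattern is matched against the ungapped sequence; each match start
    is then mapped back to its column in the gapped alignment row.
    """
    # Alignment column of every non-gap character, in sequence order.
    columns = [i for i, ch in enumerate(aligned_seq) if ch != gap]
    ungapped = aligned_seq.replace(gap, "")
    return [(m.start() + 1, columns[m.start()]) for m in pattern.finditer(ungapped)]


# Hypothetical toy alignment rows, invented for illustration.
toy_family = {
    "member_a": "MHGGG--LVGDSAGGN",
    "member_b": "MHGGGFAL-GDSAGAN",
    "member_c": "-HGGGW-LAGHSAGGN",
}
motif_hits = {name: find_motif(row) for name, row in toy_family.items()}
for name, hits in motif_hits.items():
    print(name, hits)
```

Residue numbers may differ between family members while the alignment column stays the same, which is what marks the region as shared across the family.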
Sequence & Homology
Using the amino acid sequence of 2PBL, a BLASTP search was performed against a non-redundant database. The top-scoring matches with E-values up to 3e-54 (35 sequences in total) were selected. Eukaryotic homologous sequences were found using NCBI HomoloGene. These were appended to the list, and a multiple sequence alignment was performed using ClustalX.
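The E-value selection can be sketched as a simple filter. This example assumes the hits are available in BLAST+ tabular format (-outfmt 6), where the subject ID is the second column and the E-value the eleventh; how the search results were actually exported is not stated here. The hit names and values are hypothetical.

```python
def filter_hits(tabular_text, max_evalue):
    """Keep (subject_id, evalue) pairs with evalue <= max_evalue, best first."""
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id, evalue = fields[1], float(fields[10])
        if evalue <= max_evalue:
            hits.append((subject_id, evalue))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical rows in the 12-column tabular layout:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
rows = [
    ("2PBL", "hit_A", "98.0", "262", "5", "0", "1", "262", "1", "262", "1e-150", "520"),
    ("2PBL", "hit_B", "45.2", "250", "130", "4", "5", "254", "8", "255", "3e-54", "210"),
    ("2PBL", "hit_C", "30.1", "240", "160", "6", "10", "249", "12", "250", "2e-20", "95"),
]
toy_table = "\n".join("\t".join(row) for row in rows)

selected = filter_hits(toy_table, max_evalue=3e-54)
print(selected)
```

The comparison is inclusive, so a hit whose E-value equals the threshold is retained.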
The multiple sequence alignment was bootstrapped 1000 times, and a phylogenetic tree was constructed using the neighbour-joining algorithm. FigTree was used to visualise the tree.
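Bootstrapping an alignment means resampling its columns with replacement to build replicate alignments of the same length, each of which yields its own distance matrix and tree. The sketch below shows only the resampling step and a simple uncorrected p-distance; it is an illustration under these assumptions, not the implementation used by the alignment program, whose distance measure may differ. The sequences and the random seed are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing sites among columns where neither row has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, invented for illustration.
toy_alignment = {
    "seq_a": "MKVGDSAG",
    "seq_b": "MRVGDSAG",
    "seq_c": "MKLGHSAG",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), p_distance(toy_alignment["seq_a"], toy_alignment["seq_b"]))
```

The bootstrap support for a branch is then the proportion of replicate trees in which that grouping appears; building the neighbour-joining trees themselves is left out of this sketch.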
A second BLASTP search was performed using the human homologue of 2PBL. The top-scoring sequences from this search were appended to the original top-scoring sequences from the BLASTP search on the bacterial query sequence.
As above, a multiple sequence alignment was generated using ClustalX. The alignment was then bootstrapped 1000 times, and a phylogenetic tree was generated using the neighbour-joining algorithm.
Structure
A number of bioinformatics databases were used to obtain the protein structure of 2PBL. These included the RCSB Protein Data Bank (PDB ID: 2PBL) for the quaternary protein structure and PDBsum for the secondary structure. The DALI database was used for structural comparison with other proteins. PyMOL and ProteinWorkshop 1.5 were used for figure generation.