S-layers are paracrystalline mono-layered assemblies of (glyco)proteins which coat the surface of bacteria [, ]. Several S-layer proteins and some other cell wall proteins contain one or more copies of a domain of about 50-60 residues, which has been called SLH (for S-layer homology). Although it was originally proposed that SLH domains bind to peptidoglycan, it is now evident that pyruvylated secondary cell wall polymers (SCWPs), which are either teichoic acids, teichuronic acids, lipoteichoic acids or lipoglycans, serve as the anchoring structures for SLH motifs in the Gram-positive cell wall [, ]. However, the study of S-layer protein SbpA of Bacillus sphaericus revealed that SLH motifs are not sufficient for specific binding to SCWPs. Thus, the molecular basis explaining SLH affinity and specificity of interaction with cell wall polymers is not completely elucidated [].
Kinesin is a microtubule-associated force-producing molecular motor protein that transports numerous organelles along microtubules. Kinesin is an oligomeric complex composed of two heavy chains and two identical light chains. The light chain has been proposed to function in the coupling of cargo to the heavy chain or in the modulation of its ATPase activity. The specificity of kinesin-cargo binding is thought to depend on the type of light chain that a kinesin molecule contains, where different isoforms of kinesin light chains are associated with different types of cargo, mitochondria and membranes of the Golgi complex [, ]. The structure of Drosophila kinesin light chain was shown to have a core composed of a coiled-coil domain followed by five imperfect tandem repeats and a sixth shorter motif []. These repeats are highly conserved across species. The N and C termini are more variable and alternative splicing is responsible for the production of isoforms that differ in those two regions.
This entry represents the CRA (or CT11-RanBPM) domain, which is a protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi) and which is found in Ran-binding proteins such as Ran-binding protein 9 (RanBP9 or RanBPM) and RanBP10. RanBPM is a scaffolding protein important in regulating cellular function in both the immune system and the nervous system, and may act as an adapter protein to couple membrane receptors to intracellular signaling pathways. This domain is at the C terminus of the proteins and is the binding domain for the CRA motif, which is comprised of approximately 100 amino acids at the C-terminal of RanBPM. It was found to be important for the interaction of RanBPM with Fragile X messenger ribonucleoprotein 1 (FMRP/FMR1), but its functional significance has yet to be determined [].
The 26S proteasome is the major ATP-dependent protease in eukaryotes which plays a key role in intracellular protein degradation [, ]. The lid of the 26S proteasome contains six PCI-domain-containing proteins (Rpn3/5/6/7/9/12), two MPN-domain-containing proteins (Rpn8/11), and one peptide, Sem1 []. The only catalytically active member of the lid is Rpn11, which serves as the essential deubiquitinase of the proteasome []. The C terminus of each subunit in the lid is predicted to form one or more helices. These C-terminal helices are highly conserved and have been predicted to form a helical bundle structure []. In yeast, Rpn9 was found to be necessary for the integrity and efficiency of the 26S proteasome [, ]. This entry represents a helix domain C-terminal to the PCI domain found in Rpn9, a subunit of the 26S proteasome. C-terminal truncations of Rpn5 or Rpn9 have been shown not to cause any major lid assembly defects but to prevent the association of Rpn12 [, ].
Lysophospholipase NTE1 was identified in yeast as an endoplasmic reticulum integral membrane protein that acts as a phospholipase B, catalysing the double deacylation of phosphatidylcholine to glycerophosphocholine []. Phosphatidylcholine is the major phospholipid component of eukaryotic membranes. NTE1 plays an important role in membrane lipid homeostasis, and is responsible for the rapid phosphatidylcholine turnover in response to inositol, elevated temperatures, or when choline is present in the growth medium. NTE1-mediated phosphatidylcholine deacylation is strongly affected by Sec14p, a component of the yeast secretory machinery involved in lipid metabolism and vesicular trafficking. The mammalian and Drosophila homologues of NTE1, neuropathy target esterase and swiss cheese, respectively, have been implicated in normal brain development []. The absence of these proteins is associated with increased cytoplasmic vesicularisation and multi-layered membrane stacks. NTE1 contains two cyclic nucleotide-binding domains and a patatin domain. This entry represents the patatin domain.
This entry represents the SH3 domain of FNBP1. Formin-binding protein 1 (FNBP1, also known as formin-binding protein 17) contains an N-terminal FER-CIP4 homology (FCH) domain and a C-terminal SH3 domain. It belongs to the CIP4 (Cdc42 interacting protein-4) subfamily of the F-BAR protein family. F-BAR proteins (F for FCH, Fer-CIP4 homology domain) are proteins with an extended CIP4-Fer domain. The F-BAR proteins have been implicated in cell membrane processes such as membrane invagination, tubulation and endocytosis []. FNBP1 was originally isolated as a molecule that binds to the proline-rich region of formin []. It induces tubular membrane invaginations and participates in endocytosis []. It interacts with sorting nexin, SNX2, and is linked to acute myelogenous leukemia [].
MIA2 is expressed specifically in hepatocytes and its expression is controlled by hepatocyte nuclear factor 1 binding sites in the MIA2 promoter [, ]. It inhibits the growth and invasion of hepatocellular carcinomas (HCC) and may act as a tumour suppressor []. A mutation in MIA2 in mice resulted in reduced cholesterol and triglycerides. Since MIA2 localizes to ER exit sites, it may function as an ER-to-Golgi trafficking protein that regulates lipid metabolism []. MIA2 contains an N-terminal SH3-like domain, similar to MIA. MIA (melanoma inhibitory activity) family members include MIA, MIAL, MIA2, and MIA3 (also called TANGO). MIA was found to be strongly expressed and secreted by malignant melanomas. It contains a domain that adopts a Src Homology 3 (SH3) domain-like fold; however, it contains an additional antiparallel beta sheet and two disulfide bonds compared to classical SH3 domains. Unlike classical SH3 domains, MIA does not bind proline-rich ligands [, ].
Cortactin was originally identified as a substrate of Src kinase []. It is an actin regulatory protein that binds to the Arp2/3 complex and stabilizes branched actin filaments [, ]. It is involved in cellular processes that affect cell motility, adhesion, migration, endocytosis, and invasion [, , , ]. Cortactin contains an N-terminal acidic domain, several copies of a repeat domain found in cortactin and HS1, a proline-rich region, and a C-terminal SH3 domain []. The N-terminal region interacts with the Arp2/3 complex and F-actin, and is crucial in regulating branched actin assembly []. Cortactin also serves as a scaffold and provides a bridge to the actin cytoskeleton for membrane trafficking and signaling proteins that bind to its SH3 domain. Binding partners for the SH3 domain of cortactin include dynamin2, N-WASp, MIM, FGD1, among others []. This entry represents the SH3 domain of cortactin.
Formin-binding protein 1 (FNBP1, also known as formin-binding protein 17) contains an N-terminal FER-CIP4 homology (FCH) domain and a C-terminal SH3 domain. It belongs to the CIP4 (Cdc42 interacting protein-4) subfamily of the F-BAR protein family. F-BAR proteins (F for FCH, Fer-CIP4 homology domain) are proteins with an extended CIP4-Fer domain. The F-BAR proteins have been implicated in cell membrane processes such as membrane invagination, tubulation and endocytosis []. FNBP1 was originally isolated as a molecule that binds to the proline-rich region of formin []. It induces tubular membrane invaginations and participates in endocytosis []. It interacts with sorting nexin, SNX2, and is linked to acute myelogenous leukemia []. This entry represents the F-BAR domain of FNBP1. F-BAR domains are dimerization modules that bind and bend membranes and are found in proteins involved in membrane dynamics and actin reorganization [].
Members of the tautomerase superfamily (TSF) are characterised by a β-α-β building block and a catalytic amino terminal proline [, ]. There are five known TSF families, which are named for the first characterised member []. This entry represents the malonate semialdehyde decarboxylase (MSAD) family []. MSAD is part of a bacterial pathway for the degradation of the soil fumigant known as 1,3-dichloropropene []. There are five homologues identified, including YusQ/YodA/YrdN from Bacillus subtilis, IolK from Lactobacillus casei strain BL23, and Bp4401 from Burkholderia phymatum. IolK was found in a degradative pathway for myo-inositol, but its disruption does not affect growth or myo-inositol utilization []. Bp4401 has a modest hydratase activity. YrdN, YodA, and IolK have comparable decarboxylase and hydratase activities, but this activity is missing in YusQ and Bp4401 [].
Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. The molecular structure of alanine racemase from Bacillus stearothermophilus was determined by X-ray crystallography to a resolution of 1.9 Å []. The alanine racemase monomer is composed of two domains, an eight-stranded alpha/beta barrel at the N terminus, and a C-terminal domain essentially composed of β-strands. The pyridoxal 5'-phosphate (PLP) cofactor lies in and above the mouth of the alpha/beta barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C terminus of the first β-strand of the alpha/beta barrel. This N-terminal domain is also found in the PROSC (proline synthetase co-transcribed bacterial homologue) family of proteins, which are not known to have alanine racemase activity.
Bicaudal-C (BICC, BICC1 in vertebrates) is an RNA-binding protein with translational repression function []. It is involved in the regulation of embryonic differentiation and plays a role in the regulation of Dvl (Dishevelled) signaling, particularly in the correct cilia orientation and nodal flow generation []. In Drosophila, disruption of BICC can disturb the normal migration direction of the anterior follicle cell of oocytes []. In mammals, mutations in this gene are associated with polycystic kidney disease and it was suggested that the BICC1 protein can indirectly interact with ANKS6 protein (ANKS6 is also associated with polycystic kidney disease) through some protein and RNA intermediates []. BICC1 contains N-terminal K homology RNA-binding vigilin-like repeats and a C-terminal SAM domain. This entry represents the SAM (sterile alpha motif) domain, which is a protein-protein interaction domain [].
Epidermal growth factor receptor kinase substrate (EPS8) is a regulator of Rac signaling [, , ]. It consists of a phosphotyrosine-binding (PTB) and an SH3 domain. PTB domains have a common PH-like fold and are found in various eukaryotic signaling molecules []. This domain was initially shown to bind peptides with an NPXY motif with differing requirements for phosphorylation of the tyrosine, although more recent studies have found that some types of PTB domains can bind to peptides lacking tyrosine residues altogether []. In contrast to SH2 domains, which recognize phosphotyrosine and adjacent carboxy-terminal residues, PTB-domain binding specificity is conferred by residues amino-terminal to the phosphotyrosine []. PTB domains are classified into three groups: phosphotyrosine-dependent Shc-like, phosphotyrosine-dependent IRS-like, and phosphotyrosine-independent Dab-like PTB domains []. This entry refers to the PTB domain found in EPS8, part of the Dab-like subgroup.
The Rho-family GTPase, Cdc42, can regulate the actin cytoskeleton through activation of Actin nucleation-promoting factor WAS protein (WASP) family members []. Mutations in WASP lead to the Wiskott-Aldrich syndrome, a paediatric disorder characterised by actin cytoskeletal defects in haematopoietic cells, leading clinically to thrombocytopenia, eczema and immunodeficiency. The WASP proteins signal to the cytoskeleton through the Arp2/3 complex, an actin-nucleating assembly that regulates the structure and dynamics of actin filament networks at the leading edge of the cell. WASP family members have unique N-terminal regions, followed by a central segment rich in proline, and a common C-terminal region. The C-terminal region contains the VCA region that binds the Arp2/3 complex and actin, while the distinct N-terminal region enables family members to activate Arp2/3 in response to different upstream signals.
This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. It contains an evolutionarily conserved signature W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF, the site of interaction with proline-rich peptides. Proteins containing this domain include RME-8 (Required for receptor-mediated endocytosis 8), a DNAJC13 protein. RME-8 was first identified as a protein that is required for endocytosis in Caenorhabditis elegans. It coordinates the activity of the WASH complex with the function of the retromer SNX dimer to control endosomal tubulation []. Proteins containing this domain also include Arabidopsis trithorax-related3 (Atxr3) and Tic56. Atxr3 is the major enzyme responsible for H3K4me3, which is critical for regulating gene expression and plant development []. Tic56 is an essential subunit of a 1-MDa protein complex at the inner chloroplast envelope membrane []. Tic56 also plays important roles in rRNA processing and chloroplast ribosome assembly [].
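The signature quoted in the entry above (W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF) can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch of that reading as a regular expression, assuming that X means any single residue and that X6-11 means a run of 6 to 11 residues. The function name `find_signature` and the test sequence are hypothetical and serve only to exercise the pattern; a plain regex scan is not how the entry itself is defined.

```python
import re

# Signature from the text: W-X-Y-X6-11-GPF-X4-M-X2-W-X3-GYF
# Assumption: X = any single residue; X6-11 = between 6 and 11 residues.
SIGNATURE = re.compile(
    r"W[A-Z]Y[A-Z]{6,11}GPF[A-Z]{4}M[A-Z]{2}W[A-Z]{3}GYF"
)


def find_signature(sequence: str):
    """Return (start, end, matched_text) for each non-overlapping match."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in SIGNATURE.finditer(seq)]


# Synthetic, made-up sequence (not a real protein) used only as a demo.
toy = "MK" + "WAY" + "S" * 8 + "GPF" + "TTTT" + "M" + "LL" + "W" + "DDD" + "GYF" + "KK"
print(find_signature(toy))
```

Real family members may deviate from the strict signature, so a scan like this is best treated as a quick filter rather than a domain assignment.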
Budding yeast Pin2 was identified as a protein that when overexpressed can induce the [PIN+] prion phenotype, which is a prerequisite for prion formation by Sup35, referred to as the [PSI+] prion []. Pin2 localization is dependent on both exo- and endocytosis. It is an exomer-dependent cargo that localizes at the plasma membrane of the bud early in the cell cycle, and the bud neck at cytokinesis. The prion-like domain (PLD) in Pin2 serves as a Pin2 retention signal in the trans-Golgi network (TGN) and may act as a stress-response element. Under environmental stress, Pin2 is endocytosed, and the PLD aggregates and causes sequestration of Pin2. This aggregation is reversible upon stress removal and Pin2 can be re-exported to the plasma membrane []. Why Pin2 needs to be retained under stress is not clear, but it may be related to the observation that Pin2 interacts with various components of the cell-wall integrity pathway.
This entry represents the PH domain of guanine nucleotide exchange factor DBS. The DBS PH domain participates in binding to both the Cdc42 and RhoA GTPases []. PH domains have diverse functions, but in general are involved in targeting proteins to the appropriate cellular location or in the interaction with a binding partner []. DBS, also called MCF2L or OST, functions as a Rho GTPase guanine nucleotide exchange factor (RhoGEF), facilitating the exchange of GDP and GTP. It was originally isolated from a cDNA screen for sequences that cause malignant growth. It plays roles in regulating clathrin-mediated endocytosis and cell migration through its activation of Rac1 and Cdc42 [, ]. Depending on cell type, DBS can also activate RhoA and RhoG [, ]. DBS contains a Sec14-like domain [], spectrin-like repeats, a RhoGEF or Dbl homology (DH) domain, a Pleckstrin homology (PH) domain [], and an SH3 domain.
Thg1 was originally characterised as synthesising the guanine nucleotide at the -1 position of the histidinyl tRNA (HtRNA). Thg1 has also been shown to have polymerase activity, which has been proposed to be the ancestral activity of this enzyme [, ]. Thg1 polymerases contain an additional region of conservation C-terminal to the core palm domain that comprises five helices and two strands []. This region has several well-conserved charged residues, including a basic residue found towards the end of the first helix of this unit, which might contribute to the Thg1-specific active site []. This C-terminal module of Thg1 is predicted to form a helical bundle that functions equivalently to the fingers of the other nucleic acid polymerases, probably in interacting with the template HtRNA [].
This domain is found in several bacterial, metazoan and chlorophyte algal proteins. A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain.
Members of the cysteine/serine-rich nuclear protein family (CSRNP) contain cysteine- and serine-rich regions and a basic domain. They are nuclear proteins that possess a transcriptional activation domain and bind the sequence AGAGTG [, ]. The proteins actively influence transcriptional activity, but it has not been established which of their domains are involved in DNA binding. It is thought that this may be mediated by the conserved cysteine-rich or the basic domain. It has been shown that CSRNP genes are down-regulated in various cancers, suggesting that they act as tumour suppressors []. This would usually imply a relation to reduced apoptosis, but this has yet to be proven; in some studies, a reduction in apoptosis was not detected as a result of deficiencies of CSRNP genes.
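The CSRNP entry above names AGAGTG as the bound sequence. As a rough illustration, the sketch below lists exact occurrences of that hexamer on both strands of a DNA string. It assumes exact matching with no degenerate positions, and the helper names and the demo sequence are hypothetical. A string match of this kind only locates candidate sites and says nothing about whether a protein actually binds there.

```python
MOTIF = "AGAGTG"  # sequence quoted in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_motif(dna: str, motif: str = MOTIF):
    """Return sorted (0-based start, strand) pairs for exact motif matches.

    '+' hits match the motif itself; '-' hits match its reverse complement,
    reported at the start position on the forward strand.
    """
    dna = dna.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = dna.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = dna.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Made-up demo sequence: one forward site and one reverse-strand site.
demo = "TT" + "AGAGTG" + "CC" + "CACTCT" + "AA"
print(find_motif(demo))
```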
The axin interaction dorsalization-associated (Aida) protein was characterised in zebrafish as a protein that utilizes its C-terminal region to interact with axis formation inhibitor (Axin), which is a microtubule-interacting scaffold protein for several distinct signalling proteins in the Wnt cascade. The C-terminal region of the Aida protein is a distinct version of the C2 domain. This Aida-type C2 domain is found in the C-terminal region of the proteins and is critical for interactions with the cytoskeleton in the context of cellular adhesion points; thus, it is combined with diverse domains related to cytoskeletal functions, e.g. EF hands, coiled coils, IQ calmodulin-binding motifs, ankyrin repeats and myosin head motor domain, or with a second lipid-binding domain, e.g. the PH domain. The Aida-type C2 domain is found only in the metazoan, choanoflagellate, chromist and chlorophyte lineages [, ]. This domain has a predominantly β-strand globular fold composed of an antiparallel β-sandwich with two β-sheets, and three short α-helices to stabilize the conformation [].
In Entamoeba histolytica, the Gal/GalNAc lectin contributes to its virulence by establishing adhesion to host cells []. The Gal/GalNAc lectin is a heterodimeric molecule composed of a transmembrane heavy (170kDa) subunit and glycosylphosphatidylinositol-anchored light 31kDa and 35kDa subunits, which are non-covalently associated with an intermediate subunit of 150kDa [, ]. Inhibition of expression of the 35kDa subunit of Gal/GalNAc lectin inhibits the cytotoxic and cytopathic activity of E. histolytica, but no decrease in adherence capacity to mammalian cells was evident. Interestingly, a carbohydrate-binding activity has been reported for the 35kDa light subunit of the lectin molecules of the closely related Entamoeba invadens []. Proteins in this entry are related to the light subunit. The light subunit consists of several polypeptide chains with considerable antigenic homology. The two light (31/35kDa) subunits of the lectin are present in two isoforms: the 31kDa isoform is glycosylphosphatidylinositol (GPI) anchored; and the 35kDa isoform is more highly glycosylated [].
Hydrophobic surface binding proteins are typically between 171 and 275 amino acids in length. Although the HsbA amino acid sequence suggests that HsbA may be hydrophilic, HsbA adsorbed to hydrophobic PBSA (Polybutylene succinate-co-adipate) surfaces in the presence of NaCl or CaCl2. When HsbA was adsorbed on the hydrophobic PBSA surfaces, it promoted PBSA degradation via the CutL1 polyesterase. CutL1 interacts directly with HsbA attached to the hydrophobic QCM electrode surface. These results suggest that when HsbA is adsorbed onto the PBSA surface, it recruits CutL1, and that when CutL1 is accumulated on the PBSA surface, it stimulates PBSA degradation []. This entry also includes an antigenic cell wall galactomannoprotein from Aspergillus fumigatus, which is a protein of 284 amino acid residues. It contains a serine- and threonine-rich region for O-glycosylation, a signal peptide, and a putative glycosylphosphatidylinositol attachment signal sequence. Ultrastructural analysis showed that the protein is present in the cell walls of hyphae and conidia [].
This superfamily represents the N-terminal domain of UPF0234 uncharacterised proteins, which includes YajQ. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period [, ]. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains [, ]. The polypeptide chain of YajQ is folded into two domains with identical folding topology. Each domain has a four-stranded antiparallel β-sheet flanked on one side by two α-helices. This structural motif is a characteristic feature of many RNA-binding proteins [].
This superfamily represents the C-terminal domain of UPF0234 uncharacterised proteins, which includes YajQ. It is also found in UPF0381 uncharacterised proteins. In Pseudomonas syringae, YajQ functions as a host protein involved in the temporal control of bacteriophage Phi6 gene transcription. It has been shown to bind to the phage's major structural core protein P1, most likely activating transcription by acting indirectly on the RNA polymerase. YajQ may remain bound to the phage particles throughout the infection period [, ]. Earlier, YajQ was characterized as a putative nucleic acid-binding protein based on the similarity of its (ferredoxin-like) three-dimensional topology with that of RNP-like RNA-binding domains [, ]. The polypeptide chain of YajQ is folded into two domains with identical folding topology. Each domain has a four-stranded antiparallel β-sheet flanked on one side by two α-helices. This structural motif is a characteristic feature of many RNA-binding proteins [].
This entry represents a group of E3 ubiquitin-protein ligases, including CHIP from eukaryotes and LubX from bacteria. CHIP is a multifunctional protein that functions both as a co-chaperone and an E3 ubiquitin-protein ligase. It couples protein folding and proteasome mediated degradation by interacting with heat shock proteins (e.g. HSC70) and ubiquitinating their misfolded client proteins thereby targeting them for proteasomal degradation [, ]. It is also important for cellular differentiation and survival (apoptosis), as well as susceptibility to stress. It targets a wide range of proteins, such as expanded ataxin-1, ataxin-3, huntingtin, and androgen receptor, which play roles in glucocorticoid response, tau degradation, and both p53 and cAMP signaling [, ]. LubX is an E3 ubiquitin ligase that interferes with host's ubiquitination pathway. LubX contains two U-box domains and was shown to interact with a diverse group of mammalian E2-conjugating enzymes including UBE2W, UBEL6, and members of the UBE2D and UBE2E families to direct ubiquitination of mammalian Cdc2-like kinase 1 (Clk1) [, ].
Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions []. CinA is the first gene in the competence-inducible (cin) operon, and is thought to be specifically required at some stage in the process of transformation []. This family consists of putative competence-damaged proteins from the cin operon, and nicotinamide-nucleotide (NMN) amidohydrolase proteins. In the case of T. thermophilus, CinA was shown to have both NMN deamidase and ADP-ribose pyrophosphatase activities [].
RbsD is a component of the ribose operon. It was originally thought to be a high affinity ribose transport protein, but further analysis [] shows that it is a D-ribose pyranase. It catalyzes the interconversion of beta-pyran and beta-furan forms of D-ribose. It also catalyzes the conversion between beta-allofuranose and beta-allopyranose. FucU is a component of the fucose operon and is an L-fucose mutarotase, involved in the anomeric conversion of L-fucose. It also exhibits a pyranase activity for D-ribose []. Both have been classified in the RbsD/FucU family of proteins. Members of this family are ubiquitous, having been found in organisms from eubacteria to mammals. The crystal structure of Bacillus subtilis RbsD reveals a three-layer (α-β-α) subunit that associates into a homodecameric assembly [].
Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants []. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing []. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes []. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome [].
There is a unique sequence domain at the C terminus of all known 4.1 proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. The CTD was first described as a 22/24kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates [].
C-terminal jelly roll/Ig-like domain (C-JID) was defined in cryogenic electron microscopy (cryoEM) structures of plant intracellular immune receptors containing Toll/interleukin-1 receptor (TIR), nucleotide-binding (NB-ARC) and leucine-rich repeat (LRR) domains (TIR-NLRs) [, ]. Structurally, the C-JID core is represented by a β-sandwich made up of 8 to 9 β-strands. C-JID matches the so-called post-LRR or C-terminal non-LRR domain detected earlier via MEME and BLAST searches [, ]. The domain showed a strong distribution bias towards TIR-NLRs of dicotyledonous plant species despite broader taxonomic distribution of TIR-NLR in plant groups [, ]. Structure-function analyses of cryoEM structures suggest that C-JID domains play a role in substrate recognition, such as binding to effector proteins from pathogens, and thus are involved in the initiation of signalling by TIR-NLR receptors [, ]. Presence of C-JID (or post-LRR) and its importance for the function of Arabidopsis TIR-NLR RPS4 that partners with RRS1 for effector recognition suggest that C-JID has additional functions [, , ].
Proteins containing this domain are highly divergent in their overall sequence; however, they share a common region of roughly 200 amino acids known as the SEC7 domain [[cite27373159], ]. The 3D structure of the domain displays several α-helices []. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian guanine-nucleotide-exchange factors []. SEC7 domain-containing proteins are guanine nucleotide exchange factors (GEFs) specific for the ADP-ribosylation factors (ARFs), Ras-like GTPases which are important for vesicular protein trafficking. These proteins can be divided into five families, based on domain organisation and conservation of primary amino acid sequence: GBF/BIG, cytohesins, EFA6, BRAGs, and F-box []. They are found in all eukaryotes, and are involved in membrane remodeling processes throughout the cell [].
The late transcription region 2 (L2) of Adenovirus type 2 has an ORF of 80 residues positioned between nucleotides 17,676 and 17,915. It encodes an 11K polypeptide, which has the initiating methionine residue removed, leaving a 79-residue product. The L2 region-encoded 11K polypeptide is arginine rich (21%) and has a predicted molecular weight of 8,715. It was cleaved by the viral endoprotease to give two products which co-migrated on sodium dodecyl sulphate-polyacrylamide gels as virion polypeptide X []. The role of the L2 precursor, by virtue of its two domains, might be to condense the viral prochromatin for encapsidation. Subsequent cleavage within the particle after residue 31 releases the cross-link and prepares the viral chromatin for a relaxed conformation, which is required during infection and uncoating. Cleavage seems to be necessary for infectivity. This family consists of several adenovirus late L2 mu core protein or protein X sequences [].
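The coordinates and lengths quoted in the entry above can be checked with simple arithmetic. The sketch below assumes the nucleotide positions are 1-based and inclusive, which the text does not state, and the function name `orf_codon_count` is hypothetical. Under that assumption the interval spans 240 nucleotides, i.e. 80 codons, and removing the initiating methionine leaves the 79-residue product mentioned in the text. Whether the quoted interval includes a stop codon is not specified in the source.

```python
def orf_codon_count(start: int, end: int) -> int:
    """Number of codons in an interval, assuming 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"interval length {length} is not a multiple of 3")
    return length // 3


# Coordinates quoted in the text for the Adenovirus type 2 L2 ORF.
codons = orf_codon_count(17_676, 17_915)

# The text states that the initiating methionine is removed.
mature_length = codons - 1

print(codons, mature_length)
```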
This entry consists of p25 and p26 proteins from the Beet necrotic yellow vein virus (BNYVV), which is a plant pathogenic virus []. It has a positive-sense single-stranded RNA genome, and its particles are rod-shaped and non-enveloped. The virus is transmitted by Polymyxa betae, a fungus from the order Plasmodiophorales. p25 is an RNA-3-encoded protein that is responsible for the production of rhizomania symptoms of sugar beet roots []. An estimate of the ratio between synonymous and non-synonymous substitution rates (omega) with maximum-likelihood models showed that the p25 sequences presented the highest (of three benyvirus proteins) mean omega values with strong positive selection acting on 14 amino acids, and particularly on amino acid 68, where the omega value was the highest so far encountered in plant viruses [].
VirB9 is a component of the type IV secretion system, which is employed by pathogenic bacteria to export virulence proteins directly from the bacterial cytoplasm into the host cell. Unlike the more common type III secretion system, type IV systems evolved from the conjugative apparatus, which is used to transfer DNA between cells. VirB9 was initially identified as an essential virulence gene on the Agrobacterium tumefaciens Ti plasmid. In the pilin-like conjugative structure, VirB9 appears to form a stabilizing complex in the outer membrane, by interacting with the lipoprotein VirB7. The heterodimer has been shown to stabilize other components of the type IV system [, , , ]. This entry represents the C-terminal domain of VirB9. It is also found in TrbG, a probable conjugal transfer protein from Rhizobium [], and CagX, a component of the Helicobacter pylori cag PAI-encoded type IV secretion system [].
This superfamily entry represents the C-terminal domain of Ofd1 (oxoglutarate and iron-dependent oxygenase, from Schizosaccharomyces pombe), a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N (the N-terminal transcription factor domain of Sre1) in the presence of oxygen []. The domain is conserved from yeasts to humans and was also characterised in S. cerevisiae Tpa1 [, ]. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts [].
This entry describes one of at least three types of phospho-2-dehydro-3-deoxyheptonate aldolase (DAHP synthase). This enzyme catalyzes the first of 7 steps in the biosynthesis of chorismate, the last common precursor of all three aromatic amino acids and of PABA, ubiquinone and menaquinone. Some members of this family, including an experimentally characterised member from Bacillus subtilis, are bifunctional, with a chorismate mutase domain N-terminal to this region. The member of this family from Synechocystis PCC 6803, CcmA, was shown to be essential for carboxysome formation. However, no other candidate for this enzyme is present in that species even though chorismate biosynthesis does occur; other species that have this protein lack carboxysomes but appear to make chorismate; and a requirement of CcmA for carboxysome formation does not preclude a role in chorismate biosynthesis.
Alkaline phosphatase (EC 3.1.3.1) (ALP) [] is a zinc and magnesium-containing metalloenzyme which hydrolyzes phosphate esters, optimally at high pH. It is found in nearly all living organisms, with the exception of some plants. In Escherichia coli, ALP is found in the periplasmic space. In yeast, it is found in lysosome-like vacuoles, and in mammals it is a glycoprotein attached to the membrane by a GPI-anchor. In mammals, four different isozymes are currently known []. Three of them are tissue-specific: the placental, placental-like (germ cell) and intestinal isozymes. The fourth form is tissue non-specific and was previously known as the liver/bone/kidney isozyme. Streptomyces species involved in the synthesis of streptomycin (SM), an antibiotic, express a phosphatase (gene strK) which is highly related to ALP. It specifically cleaves both streptomycin-6-phosphate and, more slowly, streptomycin-3"-phosphate []. This entry represents the ALP active site, and includes the region around the active site serine. It also matches the active site of the related enzyme, streptomycin-6-phosphate phosphatase.
Glycation is a nonenzymatic covalent reaction between proteins and endogenous reducing sugars or dicarbonyls (methylglyoxal, glyoxal) that results in protein inactivation. DJ-1 was described in vitro as a protein deglycase that repaired methylglyoxal- and glyoxal-glycated proteins [, ]. Since then there have been reports both against [] and in support of [] this role for DJ-1. Furthermore, in support of its deglycase activity, DJ-1 and its bacterial homologues have been shown to repair methylglyoxal- and glyoxal-glycated nucleotides and nucleic acids []. This ability would make DJ-1 a target for diabetes and cancer research []. DJ-1, also known as Park7, has been associated with human parkinsonism []. Also included in this family is YajL from Escherichia coli, the bacterial homologue of DJ-1 [, ]. The proteins in this group are classified as either DJ-1 putative peptidases or non-peptidase homologues in MEROPS peptidase family C56 (clan PC(C)).
This domain is characterised by two well-conserved short regions separated by a region that is variable in both sequence and length. The first of the two regions is found in a large number of proteins outside this group, a number of which have been characterised as methyltransferases. One member of this group, FkbM, was shown to be required for a specific methylation in the biosynthesis of the immunosuppressant FK506 in Streptomyces strain MA6548 []. This domain is also found in other methyltransferases such as SdnD from the fungus Sordaria araneosa, involved in the biosynthesis of glycoside antibiotics with tetracyclic diterpene aglycone structure [], and in AMB antimetabolite synthase AmbE from Pseudomonas aeruginosa, which participates in the biosynthesis of the antimetabolite L-2-amino-4-methoxy-trans-3-butenoic acid (AMB), a non-proteinogenic amino acid which is toxic for prokaryotes and eukaryotes [, ].
The malate dehydrogenase (MDH) of some extremophiles is more similar to the L-lactate dehydrogenases (L-LDH) from various sources than to other MDHs []. The archaebacterial MDH deviates from the eubacterial and eukaryotic enzymes in having a low selectivity for the coenzyme (NAD(H) or NADP(H)) and in catalysing the reduction of oxaloacetate to malate more efficiently than the reverse reaction []. It has been suggested that this class of dinucleotide cofactor-dependent dehydrogenases does not contain a Rossmann-fold motif, as was previously believed to be the case []. The enzyme is a dimer, where each subunit consists of three domains: domain I, domain II (NADPH binding domain), and domain III. Domain I contains N- and C-terminal regions and consists of a four-helix bundle []. The NADPH binding domain is formed of a seven-stranded antiparallel β-sheet fold []. This superfamily consists of the NADPH binding domain found in bacterial and archaeal enzymes with malate, L-lactate, L-sulpholactate dehydrogenase activities, and related proteins.
Lectins occur in plants, animals, bacteria and viruses. Initially described for their carbohydrate-binding activity [], they are now recognised as a more diverse group of proteins, some of which are involved in protein-protein, protein-lipid or protein-nucleic acid interactions []. There are at least twelve structural families of lectins, of which the C-type (Ca2+-dependent) lectins are one. C-type lectins can be further divided into seven subgroups based on additional non-lectin domains and gene structure: (I) hyalectans, (II) asialoglycoprotein receptors, (III) collectins, (IV) selectins, (V) NK group transmembrane receptors, (VI) macrophage mannose receptors, and (VII) simple (single domain) lectins []. The term 'C-type lectin domain' was introduced to distinguish a carbohydrate-recognition domain (CRD) which is present in all Ca2+-dependent lectins, but not in other types of animal lectins. However, there are proteins with modules similar in overall structure to CRDs that serve functions other than sugar binding. Therefore, a more general term, C-type lectin-like domain, was introduced to refer to such domains, although both terms are sometimes used interchangeably []. This superfamily represents a structural domain found in C-type lectins, as well as in other proteins, including: the C-terminal domain of invasin [] and intimin []; the Link domain, which includes the Link module of tumor necrosis factor-inducible gene 6 protein (TSG-6) [] (a hyaladherin with important roles in inflammation and ovulation) and the hyaluronan binding domain of CD44 (which contains an extra N-terminal β-strand and C-terminal β-hairpin) [], and which may have emerged as a result of a deletion of the long loop region from an ancestral canonical C-type lectin domain []; and endostatin [] and the endostatin domain of collagen alpha 1 (XV) [], these domains being decorated with many insertions in the common fold.
This entry represents the MRG protein family, whose members include MORF4L1/2 (MRG15/MRGX) and MSL3L1/2 from humans, ESA1-associated factor 3 (Eaf3) from yeasts and male-specific lethal 3 (MSL3) from flies. They contain an N-terminal chromodomain that binds H3K36me3, a histone mark associated with transcription elongation []. Saccharomyces cerevisiae Eaf3 is a component of both NuA4 histone acetyltransferase and Rpd3S histone deacetylase complexes [, ]. It was found that Eaf3 mediates preferential deacetylation of coding regions through an interaction between the Eaf3 chromodomain and methylated H3-K36 that presumably results in preferential association of the Rpd3 complex []. The Drosophila MSL proteins (MSL1, MSL2, MSL3, MLE, and MOF) are essential for elevating transcription of the single X chromosome in the male (X chromosome dosage compensation) []. Together with two partly redundant non-coding RNAs, roX1 and roX2, they form the MSL complex, also known as dosage compensation complex or DCC. The MSL complex upregulates transcription by spreading the histone H4 Lys16 (H4K16) acetyl mark [] and allows compensation for the loss of one X-chromosomal allele by increasing the transcription from the retained allele []. The MSL3 chromodomain has been shown to bind DNA and methylated H4K20 in vitro []. Human MORF4L1, also known as MRG15, is a component of the NuA4 histone acetyltransferase complex that transcriptionally activates genes by acetylation of nucleosomal histones H4 and H2A. This modification may both alter nucleosome-DNA interactions and promote interaction of the modified histones with other proteins which positively regulate transcription. The NuA4 complex may also play a direct role in DNA repair when directly recruited to sites of DNA damage. MRG15 is also a component of the mSin3A/Pf1/HDAC complex, which acts to repress transcription by deacetylation of nucleosomal histones.
MRG15 was found to interact with PALB2, a tumour suppressor protein that plays a crucial role in DNA damage repair by homologous recombination []. Furthermore, MRG15 plays a role in the response to double strand breaks (DSBs) by recruiting the BRCA complex (BRCA1, PALB2, BRCA2 and RAD51) to sites of damaged DNA [, ].
The BURP domain was named after the proteins in which it was first identified: BNM2, USP, RD22, and PG1beta. It is found in the C terminus of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain-containing proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain - a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain []. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions []. Some proteins known to contain a BURP domain are listed below [, , ]: Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis. Field bean USPs, abundant non-storage seed proteins with unknown function. Soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein, and SALI3-2, a protein that is up-regulated by aluminium. Soybean seed coat BURP-domain protein 1 (SCB1), which might play a role in the differentiation of the seed coat parenchyma cells. Arabidopsis RD22 drought-induced protein. Maize ZRP2, a protein of unknown function in cortex parenchyma. Tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits. Cereal RAFTIN, which is essential specifically for the maturation phase of pollen development.
This entry represents a family of acid phosphatases [, ] from plants which are closely related to the class B non-specific acid phosphatase OlpA (which is believed to be a 5'-nucleotide phosphatase) and somewhat more distantly to another class B phosphatase, AphA. Together these three clades define a subfamily of Acid phosphatase (Class B), which corresponds to the IIIB subfamily of the haloacid dehalogenase (HAD) superfamily of aspartate nucleophile hydrolases. It has been reported that the best substrates were purine 5'-nucleoside phosphates []. This is in concordance with the assignment of the Haemophilus influenzae hel protein as a 5'-nucleotidase; however, there is presently no other evidence to support this specific function for this family of plant phosphatases. Many genes from this family have been annotated as vegetative storage proteins (VSPs) due to their close homology with these earlier-characterised gene products, which are highly expressed in leaves. There are significant differences, however, including expression levels and distribution []. The most important difference is the lack in authentic VSPs of the nucleophilic aspartate residue, which is instead replaced by serine, glycine or asparagine. Thus these proteins cannot be expected to be active phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the Glycine max (Soybean) VSP []. In 1994 this assertion was refuted by the separation of the activity from the VSP. This entry explicitly excludes the VSPs which lack the nucleophilic aspartate. The possibility exists, however, that some members of this family may, while containing all of the conserved HAD-superfamily catalytic residues, lack activity and have a function related to the function of the VSPs rather than the acid phosphatases.
Members of this group contain a modified version of the HD-GYP domain and an uncharacterised N-terminal domain. There is currently no experimental data for members of this group. HD-GYP is a conserved domain found in response regulator modules of various signal transduction systems. The involvement of the HD-GYP domain in signal transduction was originally proposed on the basis of its association with CheY-like and other signal transduction domains [] and was later directly demonstrated experimentally by showing that RpfG is involved in regulation of the biosynthesis of extracellular endoglucanase and polysaccharide []. A modification of the HD-GYP domain, which is found in this group and several smaller groups, lacks the conserved distal portion of the domain and has certain substitutions in the characteristic metal-binding residues [] of the HD superfamily phosphohydrolases, which likely render it catalytically inactive. Note that the prototypical HD domain is not recognised in many members of this group. The exact mode of action and targets of the HD-GYP output domain are not known []. HD-GYP proteins belong to the HD domain superfamily of metal-dependent phosphohydrolases; HD designates the principal conserved residues implicated in metal binding and catalysis []. The HD-GYP version of the HD-type domain has many additional highly conserved residues, including a conserved GYP motif, hence its name [, ]. It has been noted that the highly conserved sequence of the HD-GYP domain suggests high substrate specificity []. On the basis of its association with the GGDEF diguanylate cyclase domain, it has also been predicted that the HD-GYP domain may be involved in the metabolism of cyclic diguanylate or in dephosphorylation of some phosphotransfer domain [].
This entry represents C3a, C4a and C5a anaphylatoxins, which are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. Complement components C3, C4 and C5 are large glycoproteins that have important functions in the immune response and host defence []. They have a wide variety of biological activities and are proteolytically activated by cleavage at a specific site, forming a- and b-fragments []. The a-fragments form distinct structural domains of approximately 76 amino acids, coded for by a single exon within the complement protein gene. The C3a, C4a and C5a components are referred to as anaphylatoxins [, ]: they cause smooth muscle contraction, histamine release from mast cells, and enhanced vascular permeability []. They also mediate chemotaxis, inflammation, and generation of cytotoxic oxygen radicals []. The proteins are highly hydrophilic, with a mainly α-helical structure held together by 3 disulphide bridges []. Fibulins are secreted glycoproteins that become incorporated into a fibrillar extracellular matrix when expressed by cultured cells or added exogenously to cell monolayers [, ]. The five known members of the family share an elongated structure and many calcium-binding sites, owing to the presence of tandem arrays of epidermal growth factor-like domains. They have overlapping binding sites for several basement-membrane proteins, tropoelastin, fibrillin, fibronectin and proteoglycans, and they participate in diverse supramolecular structures. The amino-terminal domain I of fibulin consists of three anaphylatoxin-like (AT) modules, each approximately 40 residues long and containing four or six cysteines.
The structure of an AT module was determined for the complement-derived anaphylatoxin C3a, and was found to be a compact α-helical fold that is stabilised by three disulphide bridges in the pattern Cys1-4, Cys2-5 and Cys3-6 (where Cys is cysteine). The bulk of the remaining portion of the fibulin molecule is a series of nine EGF-like repeats [].
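The Cys1-4, Cys2-5, Cys3-6 bridge pattern described for the AT module can be made concrete with a minimal Python sketch. It numbers the cysteines of a sequence in order of appearance and reports which residue positions each bridge would join. The function name and the toy sequence are hypothetical illustrations, not real C3a or fibulin data, and the sketch assumes exactly six cysteines (four-cysteine AT modules are not handled).

```python
def at_module_disulfides(sequence):
    """Return 1-based residue position pairs for the Cys1-4, Cys2-5, Cys3-6 pattern.

    Assumption: the sequence contains exactly six cysteines, numbered
    Cys1..Cys6 in order of appearance from the N terminus.
    """
    cys_positions = [i + 1 for i, aa in enumerate(sequence.upper()) if aa == "C"]
    if len(cys_positions) != 6:
        raise ValueError(
            f"expected 6 cysteines, found {len(cys_positions)}"
        )
    # Pair Cys1 with Cys4, Cys2 with Cys5, Cys3 with Cys6.
    return [(cys_positions[k], cys_positions[k + 3]) for k in range(3)]


# Toy sequence (hypothetical): cysteines at positions 2, 5, 9, 14, 16 and 19.
toy_sequence = "ACAACAAACAAAACACAAC"
bridges = at_module_disulfides(toy_sequence)
print(bridges)
```

Applied to a real AT module sequence, the returned pairs would only be predictions under the stated pattern; they do not replace a structure determination.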
Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5), are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers []. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin. This repeat was first described in 1986 by Lawler and Hynes []. It was found in the thrombospondin protein, where it is repeated 3 times. Now a number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9) [], as well as extracellular matrix proteins like mindin, F-spondin [], SCO-spondin and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium [, ], are known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis [] and apoptosis []. The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling []. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat []. The TSP1 repeat structure has a disulfide-rich fold with all-beta sheets, each with three antiparallel strands.
The EH (for Eps15 Homology) domain is a protein-protein interaction module of approximately 95 residues which was originally identified as a repeated sequence present in three copies at the N terminus of the tyrosine kinase substrates Eps15 and Eps15R [, ]. The EH domain was subsequently found in several proteins implicated in endocytosis, vesicle transport and signal transduction in organisms ranging from yeast to mammals. EH domains are present in one to three copies and they may include calcium-binding domains of the EF-hand type [, ]. Eps15 is divided into three domains: domain I contains signatures of a regulatory domain, including a candidate tyrosine phosphorylation site and EF-hand-type calcium-binding domains, domain II presents the characteristic heptad repeats of coiled-coil rod-like proteins, and domain III displays a repeated aspartic acid-proline-phenylalanine motif similar to a consensus sequence of several methylases []. EH domains have been shown to bind specifically but with moderate affinity to peptides containing short, unmodified motifs through predominantly hydrophobic interactions. The target motifs are divided into three classes: class I consists of the consensus Asn-Pro-Phe (NPF) sequence; class II consists of aromatic and hydrophobic di- and tripeptide motifs, including the Phe-Trp (FW), Trp-Trp (WW), and Ser-Trp-Gly (SWG) motifs; and class III contains the His-(Thr/Ser)-Phe motif (HTF/HSF) [, ]. The structure of several EH domains has been solved by NMR spectroscopy. The fold consists of two helix-loop-helix motifs characteristic of EF-hand domains, connected by a short antiparallel β-sheet. The target peptide is bound in a hydrophobic pocket between two alpha helices. Sequence analysis and structural data indicate that not all the EF-hands are capable of binding calcium because of substitutions of the calcium-liganding residues in the loop [, , ].
This domain is often implicated in the regulation of protein transport/sorting and membrane trafficking. Messenger RNA translation initiation and cytoplasmic poly(A) tail shortening require the poly(A)-binding protein (PAB) in yeast. The PAB-dependent poly(A) ribonuclease (PAN) is organised into distinct domains containing repeated sequence elements [].
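The three classes of EH-domain target motifs listed above (class I: NPF; class II: FW, WW, SWG; class III: HTF/HSF) lend themselves to a simple pattern scan. The following Python sketch reports where such motifs occur in a peptide; the names `EH_MOTIF_CLASSES` and `find_eh_motifs` and the example peptide are hypothetical, and a match indicates only a candidate motif by sequence, not demonstrated binding.

```python
import re

# Motif classes as described in the text; one-letter amino acid codes.
EH_MOTIF_CLASSES = {
    "I": re.compile(r"NPF"),         # Asn-Pro-Phe
    "II": re.compile(r"FW|WW|SWG"),  # Phe-Trp, Trp-Trp, Ser-Trp-Gly
    "III": re.compile(r"H[TS]F"),    # His-(Thr/Ser)-Phe
}


def find_eh_motifs(peptide):
    """Return (class, motif, 1-based start) tuples, ordered by position.

    Matches within one class are non-overlapping, as found by re.finditer.
    """
    seq = peptide.upper()
    hits = []
    for motif_class, pattern in EH_MOTIF_CLASSES.items():
        for match in pattern.finditer(seq):
            hits.append((motif_class, match.group(), match.start() + 1))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical peptide containing one motif of each class.
example_hits = find_eh_motifs("AANPFGGHSFAAWW")
for motif_class, motif, start in example_hits:
    print(f"class {motif_class}: {motif} at residue {start}")
```

Because the class II motifs are only two or three residues long, chance matches are frequent in real sequences, so hits from a scan like this would need experimental support.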
SH2-bearing genes cloned from Dictyostelium include two transcription factors, STATa and STATc, and a signaling factor, SHK1 (shkA). A database search of the Dictyostelium discoideum genome revealed two additional putative STAT sequences, dd-STATb and dd-STATd, and four additional putative SHK genes, dd-SHK2 (shkB), dd-SHK3 (shkC), dd-SHK4 (shkD), and dd-SHK5 (shkE). This entry contains SH2 domains of shkD and shkE. All of the SHK members are most closely related to the protein kinases found in plants. However, these kinases in plants are not conjugated to any SH2 or SH2-like sequences. Alignment data indicate that the SHK SH2 domains carry some features of the STAT SH2 domains in Dictyostelium. When STATc's linker domain was used for a BLAST search, the sequence between the protein kinase domain and the SH2 domain (the linker) of SHK was recovered, suggesting a close relationship among these molecules within this region. SHK's linker domain is predicted to contain an α-helix which is indeed homologous to that of STAT. Based on the phylogenetic alignment, SH2 domains can be grouped into two categories, STAT-type and Src-type. SHK family members are in between, but are closer to the STAT-type, which indicates a close relationship between SHK and STAT families in their SH2 domains and further supports the notion that the SHK linker-SH2 domain evolved from a STAT or STATL (STAT-like Linker-SH2) domain found in plants. In SHK, STAT, and SPT6, the linker-SH2 domains all reside exclusively in the C-terminal regions. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites [, , , ].
This entry represents the TsaD protein family, which is widely distributed. TsaD and its archaeal homologue Kae1 belong to the Kae1/TsaD family, a conserved protein family with unknown function. This entry includes bacterial TsaD and its homologues, such as Qri7 (which localizes to the mitochondria) from budding yeast []. TsaD (also known as Gcp or YgjD) was originally described as a glycoprotease essential for cell viability [] and a critical mediator involved in the modification of cell wall peptidoglycan synthesis and/or cell division []. Gcp is a member of the Kae1/TsaD family, required for the formation of a threonylcarbamoyl group on adenosine at position 37 in tRNAs that read codons beginning with adenine []. YgjD has been renamed as TsaD, and it has been shown that YgjD and the proteins YrdC (TsaC), YjeE (TsaE), and YeaZ (TsaB) are necessary and sufficient for t6A biosynthesis in vitro, and may constitute a complex []. The first characterised member of the Kae1/TsaD family was annotated as Gcp for O-sialoglycoprotein endopeptidase [], but this activity could not be confirmed []. Later, its homologue, Kae1 from Pyrococcus abyssi, was shown to have DNA-binding properties and apurinic-endonuclease activity []. Members of this family have since been studied in yeast, archaea and bacteria, resulting in sometimes conflicting data, several proposed functions and annotations, but no definitive characterisation. For instance, some members have been linked to DNA maintenance in bacteria and mitochondria [] and transcription regulation and telomere homeostasis in eukaryotes [, ], but their function remained unclear. Recent research indicates that this family is involved in the biosynthesis of N6-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs that read codons beginning with adenine [].
The intracellular second messenger cyclic adenosine monophosphate (cAMP) exerts many of its physiological effects by activating cAMP-dependent protein kinase (PKA), which in turn phosphorylates and regulates the functions of downstream protein targets including ion channels, enzymes, and transcription factors. PKA is a tetrameric enzyme composed of two regulatory (R) and two catalytic (C) subunits. Binding of 2 cAMP molecules to each R subunit leads to holoenzyme dissociation into the R dimer and two active C subunits [, , ]. There are 4 different R subunits divided into two types, type I (RI-alpha and RI-beta) and type II (RII-alpha and RII-beta), and two main C subunits (C-alpha and C-beta) []. Type I PKA is predominantly cytoplasmic, whereas type II PKA usually associates with specific cellular structures and organelles. The intracellular organization of PKA is controlled through the association with AKAPs (A-kinase-anchoring proteins) [, , ]. PKA plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis. The cAMP/PKA signaling pathway regulates glucose homeostasis at multiple levels including insulin and glucagon secretion, glucose uptake, glycogen synthesis and breakdown, and gluconeogenesis []. The cAMP/PKA pathway acts downstream of GPCRs and regulates the activities of key molecules involved in insulin secretion, including GLUT2, KATP, and Cav []. While the genes encoding the alpha and beta PKA C subunits are present in all vertebrates, in primates a third subunit, C-gamma, is encoded by an intronless gene, PRKACG, which was thought to be a retrotransposon []. This isoform was isolated in human testis [], and mutations in the gene lead to a bleeding disorder known as platelet-type 19 (BDPLT19), associated with impaired platelet activation and cytoskeleton reorganization [].
This entry represents a transmembrane domain with a (10,12) β-barrel structure. This domain is found in the outer membrane peptidase omptin and the outer membrane adhesin OpcA. The outer membrane omptin belongs to the MEROPS family A26 (clan AF). The omptin family comprises a number of novel outer membrane-associated serine peptidases that are distinct from trypsin-like peptidases in that they cleave polypeptides between two basically-charged amino acids []. The enzyme is sensitive to the serine protease inhibitor diisopropylfluorophosphate and to divalent cations such as copper, zinc and iron [], and is temperature regulated, with activity decreasing at lower temperatures [, ]. Temperature regulation is most prominently shown in the Yersinia pestis coagulase/fibrinolysin protein, where coagulase activity is prevalent below 30 degrees Celsius, and fibrinolysin (protease) activity is prevalent above this point, the optimum temperature being 37 degrees []. It is possible that this assists in 'flea blockage' and transmission of the bacteria to animals []. The outer membrane adhesin OpcA is Neisseria species-specific. OpcA (formerly called 5C) was isolated from Neisseria meningitidis, causative agent of meningococcal meningitis and septicemia. An outer membrane protein embedded in the lipid bilayer, OpcA was shown to play an important role in meningococcal adhesion and invasion of both epithelial and endothelial cells, mediating attachment to host cells by binding proteoglycan cell-surface receptors []. OpcA forms a 10-stranded β-barrel with five highly mobile extracellular loops that protrude above the surface of the membrane []. These extracellular loops combine to form a crevice in the external surface that is lined by positively charged residues, which is predicted to be a binding site for proteoglycan polysaccharides involved in pathogenesis. Conformational changes in the extracellular loops modulate the surface of OpcA, which could affect the proteoglycan binding site [].
These conformational changes could also lead to pore opening.
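The omptin specificity noted in this entry, cleavage between two basic amino acids, can be illustrated with a minimal Python sketch that lists candidate scissile bonds in a sequence. The sketch assumes that "basic" means lysine (K) or arginine (R); the function name and test peptide are hypothetical, and real cleavage also depends on substrate context and accessibility, which are not modelled here.

```python
# Assumption: lysine and arginine are treated as the basic residues.
BASIC_RESIDUES = frozenset("KR")


def dibasic_cleavage_sites(sequence):
    """Return 1-based positions i where residues i and i+1 are both basic.

    Each returned i marks a candidate scissile bond between residue i
    and residue i+1.
    """
    seq = sequence.upper()
    return [
        i + 1
        for i in range(len(seq) - 1)
        if seq[i] in BASIC_RESIDUES and seq[i + 1] in BASIC_RESIDUES
    ]


# Hypothetical peptide: K2-R3, R6-K7 and K7-K8 are adjacent basic pairs.
candidate_sites = dibasic_cleavage_sites("AKRGGRKKA")
print(candidate_sites)
```

A run of three basic residues yields two adjacent candidate bonds, as in the R6-K7-K8 stretch of the example.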
Eukaryotic eIF-5A was initially thought to function as a translation initiation factor, based on its ability to stimulate methionyl-puromycin synthesis. However, subsequent work revealed a role for eIF5A in translation elongation [, ]. Depletion or inactivation of eIF-5A in the yeast Saccharomyces cerevisiae (Baker's yeast) resulted in the accumulation of polysomes and an increase in ribosomal transit times. Addition of recombinant eIF-5A from yeast, but not a derivative lacking hypusine, enhanced the rate of tripeptide synthesis in vitro. Moreover, inactivation of eIF-5A mimicked the effects of the eEF2 inhibitor sordarin, indicating that eIF-5A might function together with eEF2 to promote ribosomal translocation. Finally, it was shown that eIF5A is specifically required to promote peptide-bond formation between consecutive proline residues. It has been proposed to stimulate the peptidyl-transferase activity of the ribosome and facilitate the reactivity of poor substrates like proline [].eIF-5A is a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively [, , ]. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Ne-(4-amino-2-hydroxybutyl)lysine) that is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported []. The archaeal IF-5A proteins have not been studied as comprehensively as their eukaryotic homologues, though the crystal structure of the Pyrobaculum aerophilum protein has been determined. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. 
The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding []. This entry represents the archaeal IF-5A proteins.
The BURP domain was named after the proteins in which it was first identified: BNM2, USP, RD22, and PG1beta. It is found in the C terminus of a number of plant cell wall proteins, which are defined not only by the BURP domain, but also by the overall similarity in their modular construction. The BURP domain-containing proteins consist of either three or four modules: (i) an N-terminal hydrophobic domain, a presumptive transit peptide, joined to (ii) a short conserved segment or other short segment, (iii) an optional segment consisting of repeated units which is unique to each member, and (iv) the C-terminal BURP domain. Although the BURP domain proteins share primary structural features, their expression patterns and the conditions under which they are expressed differ. The presence of the conserved BURP domain in diverse plant proteins suggests an important and fundamental functional role for this domain []. It is possible that the BURP domain represents a general motif for localization of proteins within the cell wall matrix. The other structural domains associated with the BURP domain may specify other target sites for intermolecular interactions []. Some proteins known to contain a BURP domain are listed below [, , ]: Brassica protein BNM2, which is expressed during the induction of microspore embryogenesis. Field bean USPs, abundant non-storage seed proteins with unknown function. Soybean USP-like proteins ADR6 (or SALI5-4A), an auxin-repressible, aluminium-inducible protein, and SALI3-2, a protein that is up-regulated by aluminium. Soybean seed coat BURP-domain protein 1 (SCB1), which might play a role in the differentiation of the seed coat parenchyma cells. Arabidopsis RD22 drought-induced protein. Maize ZRP2, a protein of unknown function in cortex parenchyma. Tomato PG1beta, the beta-subunit of polygalacturonase isozyme 1 (PG1), which is expressed in ripening fruits. Cereal RAFTIN, which is essential specifically for the maturation phase of pollen development.
Growth hormone (GH) is a pituitary hormone involved in cell and overall body growth, carbohydrate-protein-lipid metabolism and osmotic homeostasis. Control of GH release was initially ascribed to 2 pathways: stimulation by hypothalamic GH-releasing hormone (GHRH) and inhibition by somatostatin. More recently, synthetic compounds, termed GH secretagogues (GHS), were shown to stimulate GH release strongly. This effect is elicited by an orphan G protein-coupled receptor (GPCR), subsequently named the GHS receptor (GHS-R). The endogenous ligand for this receptor was purified from rat and human stomach and named ghrelin []. The purified cDNA for ghrelin encodes a 117 amino acid prepropeptide. The first 23 amino acid residues form a signal peptide that is cleaved to leave proghrelin. Residues 24-51 are then cleaved out to yield the ghrelin peptide, discarding the C-terminal fragment []. The resulting 28-residue peptide is biologically inactive; esterification with n-octanoic acid at Ser3 is required for biological activity. Ghrelin mRNA is expressed mainly in the stomach in a distinct endocrine cell type in the submucosal layer, known as X/A-like cells. The active peptide is secreted into the bloodstream rather than the stomach. Ghrelin-responsive cells are found in abundance in a limited area of the hypothalamic arcuate nucleus (ARC), a region involved in control of food intake. As well as releasing GH indirectly via its action on the ARC region of the hypothalamus, ghrelin also appears to be able to stimulate GH release via direct action on the pituitary []. A further variant of the ghrelin peptide exists in rat stomach, des-Gln14-ghrelin. This is produced by alternative splicing and does not require the esterification by n-octanoic acid for biological activity. However, its presence in only small quantities in the stomach suggests ghrelin is the major active form.
The ghrelin active peptide and the GHS receptor share sequence similarity with motilin and the motilin receptor, respectively, suggesting an evolutionary relationship.
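The processing coordinates given above can be written out as a small sketch. It only restates the boundaries from the text (a 117-residue prepropeptide, signal peptide 1-23, ghrelin 24-51); the C-terminal fragment range 52-117 is inferred from those numbers rather than stated, and all function and variable names are illustrative assumptions.

```python
# Segment boundaries of the ghrelin prepropeptide, 1-based and inclusive.
# Signal peptide and ghrelin ranges come from the text; the C-terminal
# fragment range is inferred from the stated total length (assumption).
PREPRO_LENGTH = 117
SEGMENTS = {
    "signal_peptide": (1, 23),
    "ghrelin": (24, 51),
    "c_terminal_fragment": (52, PREPRO_LENGTH),
}


def segment_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def slice_segment(sequence, start, end):
    """Extract a 1-based inclusive range from a sequence string."""
    return sequence[start - 1:end]


def mature_to_prepro(position, segment="ghrelin"):
    """Map a position in a processed segment back to prepropeptide numbering."""
    start, _ = SEGMENTS[segment]
    return start + position - 1


lengths = {name: segment_length(*bounds) for name, bounds in SEGMENTS.items()}
# Ser3 of mature ghrelin, the octanoylation site, in prepropeptide numbering.
ser3_in_prepro = mature_to_prepro(3)
```

With these boundaries the ghrelin segment is 28 residues long, matching the text, and the three segments together account for all 117 residues.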
Huntingtin-interacting protein 1 (HIP1) belongs to the Sla2 family. HIP1 was first identified due to its interaction with huntingtin (htt), a protein that when mutated is involved in the genetic neurodegenerative disorder Huntington's disease []. Later, HIP1 was found to play a role in trafficking and is linked to cancers []. HIP1 depends on clathrin for its membrane localisation and plays a role in pit maturation and formation of the coated vesicle []. Besides endocytosis, HIP1 is also involved in cellular processes such as tumorigenesis [], transcription regulation [] and cell death []. The Sla2 family, also known as the Sla2p/HIP1/HIP1R family, includes Sla2p from budding yeasts [], End4 from fission yeasts [], and Huntingtin-interacting protein 1 (HIP1) and HIP1-related (HIP1R) from humans and mice []. They are adaptor proteins that link actin to clathrin and endocytosis in the clathrin-mediated endocytosis (CME) pathway. They contain the ENTH and ANTH (E/ANTH) domain that binds both inositol phospholipids and proteins that contribute to the nucleation and formation of clathrin coats on membranes []. They also contain an I/LWEQ (talin-like) domain, which is an actin-binding domain found in proteins that serve as linkers between the actin cytoskeleton and cellular compartments []. The talin-like domains of Sla2p and HIP1R have been shown to bind F-actin (filamentous actin); however, the same level of actin binding has not been observed with HIP1 []. The central domain of the Sla2p/HIP1/HIP1R proteins contains a coiled-coil domain and several consensus sequences enabling protein-protein interactions []. Yeast Sla2p has been extensively studied: Sla2p arrives at existing endocytic patches with a delay of ~25 seconds relative to clathrin and dissociates simultaneously with clathrin upon recruitment of actin-related proteins [, ]. It serves as a bridge, connecting the endocytic patch to the cortical actin cytoskeleton [].
Coronins are evolutionarily conserved WD-repeat-containing proteins mostly involved in actin cytoskeleton organisation. The WD40 motif is found in a multitude of eukaryotic proteins involved in a variety of cellular processes []. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The final 40 amino acids are predicted to form a coiled-coil in a coronin homodimer []. Coronin was first identified as an actin binding protein in Dictyostelium discoideum. It was named Coronin because of its association with crown-shaped cell surface projections of growth-phase D. discoideum cells []. Since then, several Coronin homologues and isoforms have been identified from yeast to human. Mammalian Coronin isoforms include Coronin 1A/B/C, Coronin 2A/B, Coronin 6 and Coronin 7. The yeast Coronin homologue is known as Crn1, while the Drosophila homologue is known as pod1. In budding yeast, Crn1 regulates the actin filament nucleation/branching activity of the actin-related protein 2/3 (Arp2/3) complex through interaction with the Arc35p subunit []. Mammalian Coronin 1A is exclusively expressed in leukocytes and involved in the regulation of leukocyte-specific signaling events []. The crystal structure of Coronin 1A has been solved [, ]. Mammalian Coronin 1B can protect new (ATP-rich) filaments from the F-actin-severing protein Cofilin and dismantle old (ADP-rich) filaments by inducing Arp2/3 dissociation in lamellipodia [, ]. It is worth noting that Coronin 7 in this entry has not been shown to interact with actin []. Unlike most of the Coronin isoforms, it binds to the outer side of Golgi complex membranes and acts as a mediator of cargo vesicle formation at the trans-Golgi network [].
Wnt proteins constitute a large family of secreted molecules that are involved in intercellular signalling during development. The name derives from the first 2 members of the family to be discovered: int-1 (mouse) and wingless (Drosophila) []. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals []. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer's disease [, ]. Wnt-mediated signalling is believed to proceed initially through binding to cell surface receptors of the frizzled family; the signal is subsequently transduced through several cytoplasmic components to β-catenin, which enters the nucleus and activates the transcription of several genes important in development []. Several non-canonical Wnt signalling pathways have also been elucidated that act independently of β-catenin. Canonical and non-canonical Wnt signalling branches are highly interconnected, and cross-regulate each other []. Members of the Wnt gene family are defined by their sequence similarity to mouse Wnt-1 and Wingless in Drosophila. They encode proteins of ~350-400 residues in length, with orthologues identified in several, mostly vertebrate, species. Very little is known about the structure of Wnts as they are notoriously insoluble, but they share the following features characteristic of secretory proteins: a signal peptide, several potential N-glycosylation sites and 22 conserved cysteines [] that are probably involved in disulphide bonds. The Wnt proteins seem to adhere to the plasma membrane of the secreting cells and are therefore likely to signal over only a few cell diameters.
Fifteen major Wnt gene families have been identified in vertebrates, with multiple subtypes within some classes. Wnt-7 cDNA was isolated from mouse using a PCR-based strategy, where it was found to be expressed in adult tissues, particularly in brain and lung []. Two subtypes are known to exist, designated Wnt-7A and B. Wnt-7A has been implicated in development of the uterus, cerebellum and limbs; Wnt-7B is believed to play a role in lung and placental development.
Nitrile hydratases are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins, encoded by flanking regions of the nitrile hydratase genes, that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated due to the association of nitric oxide (NO) to the iron, while in the light the enzyme is active by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site [, ]. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide [, ].
Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols. Thiocyanate hydrolase (SCNase) is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia []. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase were homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase was homologous to the alpha subunit of nitrile hydratase []. This entry represents the beta subunit.
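The conserved alpha-subunit metal-binding motif CXLCSC described above can be located in a sequence with a simple pattern search. The following is a minimal sketch, assuming X stands for any single residue in one-letter code; the function name and the toy sequence are made up for illustration and do not correspond to any real nitrile hydratase.

```python
import re

# CXLCSC, with X taken to mean any single residue (assumption).
MOTIF = re.compile(r"C[A-Z]LCSC")


def find_motif(sequence):
    """Return (1-based start position, matched residues) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein.
toy_sequence = "MKTAVCTLCSCYPWG"
hits = find_motif(toy_sequence)
```

A plain regular expression is enough here because the motif has a fixed length and a single wildcard position; profile-based methods would be needed for more divergent signatures.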
This entry contains members of the ADAM-TS2 family of metallopeptidases that belong to MEROPS peptidase family M12, subfamily M12B: adamalysin (clan MA). Proteolysis of the extracellular matrix plays a critical role in establishing tissue architecture during development and in tissue degradation in diseases such as cancer, arthritis, Alzheimer's disease and a variety of inflammatory conditions []. The proteolytic enzymes responsible for this process are members of diverse protease families, including the secreted zinc metalloproteases (MPs) []. Recently, a new MP family, ADAM-TS (a disintegrin-like and metalloprotease domain with thrombospondin type I modules) has been identified. The family consists of at least 20 members that share a high degree of sequence similarity and conserved domain organisation [, ]. The defining domains of the ADAM-TS family are (from N- to C-termini) a pre-pro metalloprotease domain of the reprolysin type, a snake venom disintegrin-like domain, a thrombospondin type-I (TS) module, a cysteine-rich region, and a cysteine-free (spacer) domain []. Domain organisation following the spacer domain C terminus shows some variability in certain ADAM-TS members, principally in the number of additional TS domains. Members of the ADAM-TS family have been implicated in a range of diseases. ADAM-TS1, for example, is reported to be involved in inflammation and cancer cachexia [], whilst recessively inherited ADAM-TS2 mutations cause Ehlers-Danlos syndrome type VIIC, a disorder characterised clinically by severe skin fragility []. ADAM-TS4 is an aggrecanase involved in arthritic destruction of cartilage []. ADAM-TS2 was initially termed procollagen I/II amino-propeptide processing enzyme (PCINP). It was re-classified as ADAM-TS2 on the basis of cDNA sequence similarity with ADAM-TS1 [].
In vitro studies have shown that stable expression of ADAM-TS2 cDNA in mammalian cells results in secretion of an active recombinant enzyme that specifically cleaves type I procollagen [].
Protein phosphorylation, which plays a key role in most cellular activities, is a reversible process mediated by protein kinases and phosphoprotein phosphatases. Protein kinases catalyse the transfer of the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. Phosphoprotein phosphatases catalyse the reverse process. Protein kinases fall into three broad classes, characterised with respect to substrate specificity []: serine/threonine-protein kinases; tyrosine-protein kinases; and dual specificity protein kinases (e.g. MEK, which phosphorylates both Thr and Tyr on target proteins). Protein kinase function is evolutionarily conserved from Escherichia coli to human []. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation []. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins. The catalytic subunits of protein kinases are highly conserved, and several structures have been solved [], leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases []. TESK1 (testis-specific protein kinase 1) is a protein kinase with a structure composed of an N-terminal protein kinase domain and a C-terminal proline-rich domain and is most closely related to the LIM motif-containing protein kinase (LIMK) subfamily []. TESK1 has kinase activity with dual specificity on both serine/threonine and tyrosine residues []. When expressed in HeLa cells, TESK1 stimulates the formation of actin stress fibres and focal adhesions and functions downstream of integrins through phosphorylation and inactivation of cofilin [].
In a yeast two-hybrid screen, Sprouty4 was identified as a binding partner of TESK1 [], and was subsequently found to negatively regulate cell spreading by inhibiting the kinase activity of TESK1 [].
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor []. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released []. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota [, ]; most are predicted to be nuclear or nucleolar proteins []. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA [, ]. A classification of RCMTs has been proposed on the basis of sequence similarity []. According to this classification, RCMTs are divided into 8 distinct subfamilies []. Recently, a new RCMT subfamily, termed RCMT9, was identified []. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM []. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence []. The prototypical member of the Nucleolar Protein 2 RCMT subfamily, the S. cerevisiae NOP2, is an essential nucleolar protein required for pre-rRNA processing and 60S ribosomal subunit assembly [] that acts as a ribosomal RNA methyltransferase [, ].
Its human homologue, the proliferation-associated nucleolar antigen P120, is a promising tumour marker []. P120 has been implicated in rRNA biogenesis [, ], and is also proposed to act as an rRNA methyltransferase [].
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor []. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released []. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota [, ]; most are predicted to be nuclear or nucleolar proteins []. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA [, ]. A classification of RCMTs has been proposed on the basis of sequence similarity []. According to this classification, RCMTs are divided into 8 distinct subfamilies []. Recently, a new RCMT subfamily, termed RCMT9, was identified []. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM []. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence []. As mentioned above, RCMT9 is a novel subtype of RCMT-related proteins. Putative orthologues of this subfamily have been detected only in Viridiplantae, Alveolata, Euglenozoa and Mycetozoa taxa. Members of this group are distantly related to the Nuclear protein 1 (NCL1) subfamily [].
TssC (also known as VipB) is a family of Gram-negative type VI secretion system components of the tail sheath. They have been known as COG3517. These sheath components, of which there are many copies in the sheath, are also variously referred to as TssC. On contact with another bacterial cell the sheath contracts and pushes the puncturing device and tube through the cell envelope, puncturing the target bacterial cell []. VipA and VipB (TssB and TssC) proteins were shown to form a cog-wheel-like tubular structure in V. cholerae that was noticed to resemble T4 phage gp18 polysheath. Two β-strands of VipA and four β-strands of VipB intertwine, forming the middle layer of the sheath. The sheath assembles around an inner Hcp tube and is attached to a structure called a baseplate that spans the bacterial membranes. Importantly, the VipA/VipB sheath was shown to form a long contractile organelle in V. cholerae and in E. coli, suggesting that sheath contraction powers the secretion []. This entry includes TssC mostly from Bacteroidetes. The type VI secretion system (T6SS) is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system which fires toxins into target cells upon contraction of its TssBC sheath []. Thirteen essential core proteins are conserved in all T6SSs: the membrane-associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA [].
Nitrile hydratases are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins, encoded by flanking regions of the nitrile hydratase genes, that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated due to the association of nitric oxide (NO) to the iron, while in the light the enzyme is active by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site [, ]. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide [, ].
Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols. Thiocyanate hydrolase (SCNase) is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia []. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase were homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase was homologous to the alpha subunit of nitrile hydratase [].
The Ras association domain (RASSF) proteins are named due to the presence of a Ras association (RA) domain in their N or C terminus that can potentially interact with the Ras GTPase family of proteins. These GTPases control a variety of cellular processes, such as membrane trafficking, apoptosis, and proliferation. RASSF proteins contain several other functional domains that modulate associations with other proteins. RASSF proteins with the RA domain at the C terminus (which are termed C-terminal or classical RASSF) usually also include a Salvador-RASSF-Hippo (SARAH) domain, which is involved in several protein-protein interactions and in homo- and heterodimerisation of RASSF isoforms. N-terminal RASSF proteins (with the RA domain in the N terminus) do not usually contain a SARAH domain []. At least 10 RASSF family members have been characterised (with multiple splice variants), many of which have been shown to play a role in tumour suppression. RASSF proteins also act as scaffolding agents in microtubule stability, regulate mitotic cell division, control cell migration and cell adhesion, and modulate NF-κB activity and the duration of inflammation. Loss of RASSF expression through promoter methylation has been shown in numerous types of cancer, including leukemia, melanoma, breast and prostate cancer []. RASSF9 is one of the N-terminal RASSF proteins, characterised by an RA domain in the N terminus. It was previously called P-CIP1 (peptidylglycine alpha-amidating monooxygenase (PAM) COOH-terminal interactor protein-1), and was believed to play a role in the regulation of PAM in endosomal pathways []. Very little else is known about the function of RASSF9, but it has been suggested to be involved in epidermal homeostasis [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences rather than the ligand they bind; those that are unmatched to known natural ligands are designated orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [, , ]. Neurotensin is a 13-residue peptide transmitter, sharing significant similarity in its 6 C-terminal amino acids with several other neuropeptides, including neuromedin N.
This region is responsible for the biological activity, the N-terminal portion having a modulatory role. Neurotensin is distributed throughout the central nervous system, with highest levels in the hypothalamus, amygdala and nucleus accumbens. It induces a variety of effects, including analgesia, hypothermia and increased locomotor activity. It is also involved in regulation of dopamine pathways. In the periphery, neurotensin is found in endocrine cells of the small intestine, where it leads to secretion and smooth muscle contraction. The existence of 2 neurotensin receptor subtypes, with differing affinities for neurotensin and differing sensitivities to the antihistamine levocabastine, was originally demonstrated by binding studies in rodent brain. Two neurotensin receptors (NT1 and NT2) with such properties have since been cloned and have been found to be G-protein-coupled receptor family members []. The NT2 receptor was cloned from rat, mouse and human brains based on its similarity to the NT1 receptor. The receptor was found to be a low-affinity, levocabastine-sensitive receptor for neurotensin. Unlike the high-affinity NT1 receptor, NT2 is insensitive to guanosine triphosphate and has low sensitivity to sodium ions []. Highest levels of expression of the receptor are found in the brain, in regions including the olfactory system, cerebral and cerebellar cortices, hippocampus and hypothalamic nuclei. The distribution is distinct from that of the NT1 receptor, with only a few areas (diagonal band of Broca, medial septal nucleus and suprachiasmatic nuclei) expressing both receptor subtypes. The receptor has also been found at lower levels in the kidney, uterus, heart and lung []. Activation of the NT2 receptor by non-peptide agonists suggests that the receptor can couple to phospholipase C, phospholipase A2 and MAP kinase. A functional response to neurotensin, however, is weak [] or absent, and neurotensin appears to act as an antagonist of the receptor [].
It has been suggested that a substance other than neurotensin may act as the natural ligand for this receptor.
Neurotransmitter ligand-gated ion channels are transmembrane receptor-ion channel complexes that open transiently upon binding of specific ligands, allowing rapid transmission of signals at chemical synapses [, ]. Five of these ion channel receptor families have been shown to form a sequence-related superfamily: Nicotinic acetylcholine receptor (AchR), an excitatory cation channel in vertebrates and invertebrates; in vertebrate motor endplates it is composed of alpha, beta, gamma and delta/epsilon subunits; in neurons it is composed of alpha and non-alpha (or beta) subunits []. Glycine receptor, an inhibitory chloride ion channel composed of alpha and beta subunits []. Gamma-aminobutyric acid (GABA) receptor, an inhibitory chloride ion channel; at least four types of subunits (alpha, beta, gamma and delta) are known []. Serotonin 5HT3 receptor, of which there are seven major types (5HT3-5HT7) []. Glutamate receptor, an excitatory cation channel of which at least three types have been described (kainate, N-methyl-D-aspartate (NMDA) and quisqualate) []. These receptors possess a pentameric structure (made up of varying subunits), surrounding a central pore. All known sequences of subunits from neurotransmitter-gated ion channels are structurally related. They are composed of a large extracellular glycosylated N-terminal ligand-binding domain, followed by three hydrophobic transmembrane regions which form the ionic channel, followed by an intracellular region of variable length. A fourth hydrophobic region is found at the C-terminal of the sequence [, ]. Serotonin (5-hydroxytryptamine, 5-HT) is widely distributed in both the central and peripheral nervous system, where it acts as a neurotransmitter and neuromodulator []. It has been implicated in several aspects of brain function, including regulation of affective states, ingestive behavior and addiction.
5-HT can activate a number of different receptor subtypes that produce diverse neuronal responses, principally through activation of G-protein-mediated signalling pathways. Signalling through the 5-HT3 receptor (5-HT3R) differs, since this subtype belongs to the ligand-gated ion channel (LGIC) superfamily, which also includes the inhibitory gamma-aminobutyric acid type A and glycine receptors, and excitatory nicotinic acetylcholine receptors (nAChR) []. 5-HT3 receptor function has been implicated in a variety of neural processes, including pain perception, emesis, anxiety and drug abuse. Like the other members of the LGIC superfamily, the 5HT3R exhibits a high degree of sequence similarity, and therefore putative structural similarity, with nAChRs []. Thus, functional 5HT3Rs comprise a pentamer: the ion channel is formed at the centre of a rosette formed between five homologous subunits. Two classes of 5-HT3R subunit are currently known, termed 5-HT3A and 5-HT3B. Whilst homomeric pentamers of 5-HT3A form functional receptors, heteromeric assemblies display channel conductances, cation permeabilities and current-voltage relationships typical of characterised neuronal 5-HT3 channels []. The proposed topology of 5-HT3R subunits comprises four putative transmembrane (TM) domains (designated M1-4); a large extracellular N-terminal region (~200 amino acids); and a variable cytoplasmic loop between M3 and M4. The M2 domains from each subunit are thought to form the channel pore. The agonist binding site is formed by the N terminus, which, on binding, induces a conformational change in the channel pore, a process often referred to as "gating" []. Opening of the pore allows cation flux through the neuronal membrane and depolarises the membrane potential.
Thus, 5-HT3Rs may be thought of as excitatory receptors []. Whilst it was initially thought that 5-HT3Rs comprised a homopentamer of alpha subunits, the channel conductance and permeability to anions were different in homomeric receptors from those observed in native channels. More recently, another 5-HT3 receptor subunit, 5-HT3B, was identified and cloned from a human brain cDNA library []. This subunit was unable to form functional channels when expressed alone in oocytes, but produced functional receptors when injected with 5-HT3A into the same cell. It is thought that 5HT3B contributes towards tissue-specific functional changes in 5-HT3-mediated signalling [].
Glutamate synthase (GOGAT, GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate (2-OG) and L-glutamine via intramolecular channeling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation []. GOGAT is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor []. There are four classes of glutamate synthase (GOGAT) [], []: 1. Bacterial NADPH-dependent GOGAT (NADPH-GOGAT, ). This standard bacterial NADPH-GOGAT (GltS) is composed of a large (alpha, GltB) subunit, and a small (beta, GltD) subunit. 2. Ferredoxin-dependent GOGAT in cyanobacteria and plants (Fd-GOGAT from photosynthetic cells, ) displays a single-subunit structure corresponding to the large bacterial subunit. 3. Pyridine-linked GOGAT in both photosynthetic and nonphotosynthetic eukaryotes (eukaryotic GOGAT or NADH-GOGAT, ) displays a single-subunit structure corresponding to the fusion of the small and the large bacterial subunits. 4. The archaeal type with stand-alone proteins corresponding to the N-terminal, FMN-binding, and the C-terminal domains of the large subunit and to the small subunit. The large (alpha, GltB) subunit of bacterial glutamate synthase (GOGAT) consists of three domains: the N-terminal amidotransferase domain (), the central domain, and the C-terminal domain (). This entry represents a stand-alone version of the central domain. The stand-alone form occurs in the archaeal-type of GOGAT, where the large subunit is represented by three separate proteins, corresponding to the three domains of the "standard" bacterial enzyme []. The second (central) domain of the bacterial GOGAT large subunit consists of a linker domain and the FMN-binding domain (). The FMN-binding domain has a beta/alpha barrel topology.
In this domain, the 2-iminoglutarate intermediate, formed upon the addition of ammonia onto 2-oxoglutarate, is reduced by the FMN cofactor, producing the second molecule of L-glutamate []. This domain also contains the enzyme's 3Fe-4S cluster []. All members of this entry contain the FMN-binding domain. However, they lack the linker domain, and some have 1-3 copies of (4Fe-4S binding domain) in the N-terminal region. Originally, only the ORF encoding the central domain of GOGAT was recognised and annotated as GltB in archaea, and the rest of the large subunit was thought to be missing, which may have led to some misannotations []. This led to speculations that the archaeal form of the GOGAT large subunit is the ancestral minimum form of the enzyme. Later analysis showed, however, that in all archaea where the large subunit has been found, its entire sequence is represented by three separate ORFs [].
Nitrile hydratases () are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins, encoded by flanking regions of the nitrile hydratase genes, that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated due to the association of nitric oxide (NO) with the iron, while in the light the enzyme is activated by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site [, ]. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide [, ].
Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols. Thiocyanate hydrolase (SCNase) is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia []. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase were homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase was homologous to the alpha subunit of nitrile hydratase []. This entry represents the structural domain of the alpha subunit of both iron- and cobalt-containing nitrile hydratases; the alpha subunit is a duplication of two structural repeats, each consisting of 4 layers, alpha/beta/beta/alpha []. This structure is also found in the related protein, the gamma subunit of thiocyanate hydrolase (SCNase).
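The conserved metal-binding peptide CXLCSC mentioned above can be located in a sequence with a simple pattern search. The following is a minimal sketch, assuming that X stands for any single residue; the function name and the example sequences are hypothetical and are not taken from any real nitrile hydratase.

```python
import re

# X in CXLCSC is assumed to mean "any residue", so it becomes "." in the pattern.
# The lookahead reports overlapping matches too, should any occur.
METAL_MOTIF = re.compile(r"(?=C.LCSC)")


def find_metal_motif(sequence):
    """Return the 0-based start position of every CXLCSC match in a protein sequence."""
    return [match.start() for match in METAL_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence fragment, used only to exercise the function.
    print(find_metal_motif("MAAVCTLCSCYPW"))
```

A match only indicates that the motif is present; it does not distinguish the iron-containing from the cobalt-containing enzymes, since the text states that both types share the sequence.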
RNA (C5-cytosine) methyltransferases (RCMTs) catalyse the transfer of a methyl group to the 5th carbon of a cytosine base in RNA sequences to produce C5-methylcytosine. RCMTs use the cofactor S-adenosyl-L-methionine (SAM) as a methyl donor []. The catalytic mechanism of RCMTs involves an attack by the thiolate of a Cys residue on position 6 of the target cytosine base to form a covalent link, thereby activating C5 for methyl-group transfer. Following the addition of the methyl group, a second Cys residue acts as a general base in the beta-elimination of the proton from the methylated cytosine ring. The free enzyme is restored and the methylated product is released []. Numerous putative RCMTs have been identified in archaea, bacteria and eukaryota [, ]; most are predicted to be nuclear or nucleolar proteins []. The Escherichia coli Ribosomal RNA Small-subunit Methyltransferase Beta (RSMB) FMU (FirMicUtes) represents the first protein identified and characterised as a cytosine-specific RNA methyltransferase. RSMB was reported to catalyse the formation of C5-methylcytosine at position 967 of 16S rRNA [, ]. A classification of RCMTs has been proposed on the basis of sequence similarity []. According to this classification, RCMTs are divided into 8 distinct subfamilies []. Recently, a new RCMT subfamily, termed RCMT9, was identified []. Members of the RCMT family contain a core domain, responsible for the cytosine-specific RNA methyltransferase activity. This 'catalytic' domain adopts the Rossmann fold for the accommodation of the cofactor SAM []. The RCMT subfamilies are also distinguished by N-terminal and C-terminal extensions, variable both in size and sequence []. The rRNA small subunit methyltransferase B (RsmB) protein, often referred to as Fmu, has been demonstrated to methylate only C967 of the 16S ribosomal RNA and to produce only m5C at that position []. The structure of the E. coli protein has been determined [].
It contains three subdomains which share structural homology to DNA m5C methyltransferases and two RNA binding protein families. The N-terminal sequence shares homology to another (noncatalytic) RNA binding protein, e.g. the ribosomal RNA antiterminator protein NusB (). The catalytic lobe of the N1 domain comprises the conserved core identified in all of the putative RNA m5C MTase sequences. Although the N1 domain is structurally homologous to known RNA binding proteins, there is no clear sequence motif that defines its role in RNA binding and recognition. At the functional centre of the catalytic lobe is the MTase domain of Fmu (residues 232-429), which adopts a fold typical of known AdoMet-dependent methyltransferases. In spite of the lack of a conserved RNA binding motif in the N1 domain, the close association of the N1 and MTase domains suggests that any RNA bound in the active site of the MTase domain is likely to interact with the N1 domain.
Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an Arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Api m 3. Melittin is the principal protein component of the venom of the honeybee, Apis mellifera. It inhibits protein kinase C, Ca2+/calmodulin-dependent protein kinase II, myosin light chain kinase and Na+/K+-ATPase (synaptosomal membrane) and is a cell membrane lytic factor. Melittin is a small peptide with no disulphide bridge; the N-terminal part of the molecule is predominantly hydrophobic and the C-terminal part is hydrophilic and strongly basic. The molecular mechanisms underlying the various effects of melittin on membranes have not been completely defined and much of the evidence indicates that different molecular mechanisms may underlie different actions of the peptide []. Extensive work with melittin has shown that the venom has multiple effects, probably as a result of its interaction with negatively charged phospholipids. It inhibits well known transport pumps such as the Na+-K+-ATPase and the H+-K+-ATPase. Melittin increases the permeability of cell membranes to ions, particularly Na+ and indirectly Ca2+, because of the Na+-Ca2+-exchange.
This effect results in marked morphological and functional changes, particularly in excitable tissues such as cardiac myocytes. In some other tissues, e.g., cornea, not only Na+ but also Cl- permeability is increased by melittin. Similar effects to melittin on H+-K+-ATPase have been found with the synthetic amphipathic polypeptide Trp-3 []. The study of melittin in model membranes has been useful for the development of methodology for determination of membrane protein structures. A molecular dynamics simulation of melittin in a hydrated dipalmitoylphosphatidylcholine (DPPC) bilayer was carried out. The effect of melittin on the surrounding membrane was localised to its immediate vicinity, and its asymmetry with respect to the two layers may be a result of the fact that it is not fully transmembranal. Melittin's hydrophilic C terminus anchors it at the extracellular interface, leaving the N terminus "loose" in the lower layer of the membrane [].
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases belongs to MEROPS peptidase family S49 (protease IV family, clan S-). The predicted active site serine for members of this family occurs in a transmembrane domain. Signal peptides of secretory proteins seem to serve at least two important biological functions. First, they are required for protein targeting to and translocation across membranes, such as the eubacterial plasma membrane and the endoplasmic reticular membrane of eukaryotes.
Second, in addition to their role as determinants for protein targeting and translocation, certain signal peptides have a signalling function. During or shortly after pre-protein translocation, the signal peptide is removed by signal peptidases. The integral membrane protein, SppA (protease IV), of Escherichia coli was shown experimentally to degrade signal peptides. The member of this family from Bacillus subtilis has only been shown to be required for efficient processing of pre-proteins under conditions of hyper-secretion []. These enzymes have a molecular mass of around 67 kDa and a duplication such that the N-terminal half shares extensive homology with the C-terminal half; the E. coli enzyme was shown to form homotetramers. E. coli SohB, which is most closely homologous to the C-terminal duplication of SppA, is predicted to perform a similar function of small peptide degradation, but in the periplasm. Many prokaryotes have a single SppA/SohB homologue that may perform the function of either or both.
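The relationship between the linear order of the catalytic triad and clan membership described above (HDS for clan PA, DHS for clan SB, SDH for clan SC) can be sketched as a small lookup. This is an illustrative sketch only: the function names are hypothetical, the residue positions in the example are supplied by the caller, and a matching order is merely suggestive, since the text says the arrangement commonly, not always, reflects clan relationships.

```python
# Triad orders for the three clans named in the text.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(positions):
    """Return the letters H, D and S sorted by their sequence position, e.g. 'HDS'."""
    if set(positions) != {"H", "D", "S"}:
        raise ValueError("expected positions for exactly H, D and S")
    return "".join(sorted(positions, key=positions.get))


def clan_from_triad(positions):
    """Return the clan whose triad order matches, or None if the order is not in the table."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


if __name__ == "__main__":
    # Illustrative positions only: histidine first, then aspartate, then serine.
    print(clan_from_triad({"H": 57, "D": 102, "S": 195}))
```

An order that is absent from the table, such as HSD, returns None rather than a guess.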
Von Ebner's gland protein (VEGP), a protein highly expressed by the small acinar von Ebner's salivary glands of the tongue, but not in the secretory duct, undertakes the selective binding of sapid chemicals and their transport to taste receptors [] in salivary secretions. VEGP can help to clear the bitter-tasting compound denatonium benzoate in vivo [], suggesting a possible clearance function in taste reception, although it fails to bind other bitter compounds []. VEGP is also secreted by the lachrymal gland into tear fluid, where, historically, it has been called tear prealbumin []. Together with lysozyme and lactoferrin, VEGP forms 70-80% of total tear protein, although diseases affecting the lachrymal gland decrease this. Tear VEGP has been suggested to enhance the bactericidal activity of lysozyme and to have an anti-microbial function, perhaps through transported compounds with anti-bacterial properties []. VEGP has been shown to bind retinol [], and can be co-extracted with fatty acids, particularly stearate and palmitate, phospholipids, glycolipids and fatty alcohols (including cholesterol) []. VEGP may act as a transporter of lipids, synthesised in the tarsal, or meibomian, glands of the eyelid, to the thin film they form at the tear-fluid/air interface. Recently, two lipocalins, specifically expressed in the posterior and vomeronasal glands of the mouse nasal septum, have been identified and were suggested to act in the chemoreception of as-yet unidentified small lipophilic pheromones []. One of these proteins was immunolocalised on the vomeronasal sensory epithelium, the site of primary pheromone reception, and the immunoreactivity was greatest during periods when contact between animals plays an important role in modulating behaviour. Canis familiaris (dog) allergen 1 (Can f 1) is the major allergen present in dog dander and is produced by tongue epithelial tissue []. Some of the proteins in this family are allergens.
Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an Arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation. The allergens in this family include allergens with the following designations: Bos d 2 and Can f 2.
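The basic designation rule described above (first three letters of the genus, a space, the first letter of the species name, a space, and an Arabic number) can be written as a short function. This is a minimal sketch under stated assumptions: the function name is hypothetical, the capitalisation follows the designations quoted in the text, and the disambiguation step for species whose designations collide (adding extra letters) is deliberately not implemented.

```python
def allergen_designation(genus, species, number):
    """Build a designation such as 'Can f 1' from a genus, a species name and a number."""
    genus = genus.strip()
    species = species.strip()
    if len(genus) < 3 or not species:
        raise ValueError("genus needs at least three letters and species must not be empty")
    if number < 1:
        raise ValueError("the allergen number must be a positive integer")
    # First three letters of the genus, first letter of the species, then the number.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


if __name__ == "__main__":
    print(allergen_designation("Canis", "familiaris", 1))
```

Applied to the species named in these entries, Canis familiaris with number 2 gives Can f 2, and Apis mellifera with number 3 gives Api m 3.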
Chloride channels (CLCs) constitute an evolutionarily well-conserved family of voltage-gated channels that are structurally unrelated to the other known voltage-gated channels. They are found in organisms ranging from bacteria to yeasts and plants, and also to animals. Their functions in higher animals likely include the regulation of cell volume, control of electrical excitability and trans-epithelial transport []. The first member of the family (CLC-0) was expression-cloned from the electric organ of Torpedo marmorata [], and subsequently nine CLC-like proteins have been cloned from mammals. They are thought to function as multimers of two or more identical or homologous subunits, and they have varying tissue distributions and functional properties. To date, CLC-0, CLC-1, CLC-2, CLC-4 and CLC-5 have been demonstrated to form functional Cl- channels; whether the remaining isoforms do so is either contested or unproven. One possible explanation for the difficulty in expressing activatable Cl- channels is that some of the isoforms may function as Cl- channels of intracellular compartments, rather than of the plasma membrane. However, they are all thought to have a similar transmembrane (TM) topology, initial hydropathy analysis suggesting 13 hydrophobic stretches long enough to form putative TM domains []. Recently, the postulated TM topology has been revised, and it now seems likely that the CLCs have 10 (or possibly 12) TM domains, with both N- and C-termini residing in the cytoplasm []. A number of human disease-causing mutations have been identified in the genes encoding CLCs. Mutations in CLCN1, the gene encoding CLC-1, the major skeletal muscle Cl- channel, lead to both recessively and dominantly-inherited forms of muscle stiffness or myotonia []. Similarly, mutations in CLCN5, which encodes CLC-5, a renal Cl- channel, lead to several forms of inherited kidney stone disease [].
These mutations have been demonstrated to reduce or abolish CLC function. CLC-4 was initially identified as a putative member of the CLC family following mapping of the human Xp22.3 chromosome region []. Together with CLC-5 and CLC-3, it forms a distinct branch of the CLC gene family. Initial expression studies of CLC-4 did not yield measurable Cl- currents; however, recent studies of human CLC-4 have revealed that it gives rise to Cl- currents that rapidly activate at positive voltages, and are sensitive to extracellular pH, with currents decreasing when pH falls below 6.5 [].
Chloride channels (CLCs) constitute an evolutionarily well-conserved family of voltage-gated channels that are structurally unrelated to the other known voltage-gated channels. They are found in organisms ranging from bacteria to yeasts and plants, and also to animals. Their functions in higher animals likely include the regulation of cell volume, control of electrical excitability and trans-epithelial transport []. The first member of the family (CLC-0) was expression-cloned from the electric organ of Torpedo marmorata [], and subsequently nine CLC-like proteins have been cloned from mammals. They are thought to function as multimers of two or more identical or homologous subunits, and they have varying tissue distributions and functional properties. To date, CLC-0, CLC-1, CLC-2, CLC-4 and CLC-5 have been demonstrated to form functional Cl- channels; whether the remaining isoforms do so is either contested or unproven. One possible explanation for the difficulty in expressing activatable Cl- channels is that some of the isoforms may function as Cl- channels of intracellular compartments, rather than of the plasma membrane. However, they are all thought to have a similar transmembrane (TM) topology, initial hydropathy analysis suggesting 13 hydrophobic stretches long enough to form putative TM domains []. Recently, the postulated TM topology has been revised, and it now seems likely that the CLCs have 10 (or possibly 12) TM domains, with both N- and C-termini residing in the cytoplasm []. A number of human disease-causing mutations have been identified in the genes encoding CLCs. Mutations in CLCN1, the gene encoding CLC-1, the major skeletal muscle Cl- channel, lead to both recessively and dominantly-inherited forms of muscle stiffness or myotonia []. Similarly, mutations in CLCN5, which encodes CLC-5, a renal Cl- channel, lead to several forms of inherited kidney stone disease [].
These mutations have been demonstrated to reduce or abolish CLC function. CLC-1 was the first member of the CLC family cloned from mammalian species [], and has 998 amino acid residues (human isoform). It is principally expressed in skeletal muscle, but low transcript levels can be detected in kidney, heart and smooth muscle. In skeletal muscle, it gives rise to the majority of the muscle membrane Cl- conductance (which accounts for ~70-80% of the total resting conductance). These channels are partially open under resting conditions, and it is likely that following a prolonged series of muscle action potentials, they act to reduce excitability, limiting tetanic activation. As mentioned above, mutations in CLC-1 can cause recessive (Becker) as well as dominant (Thomsen) myotonia. Such mutations reduce channel function, rendering skeletal muscle hyperexcitable. This leads to defective muscle relaxation after voluntary contraction.
This entry represents a group of signal transduction response regulators which contain a modified version of the HD-GYP domain as an output domain. Response regulators of the microbial two-component signal transduction systems typically consist of an N-terminal CheY-like receiver (phosphoacceptor) domain and a C-terminal output (usually DNA-binding) domain. In response to an environmental stimulus, a phosphoryl group is transferred from the His residue of the sensor histidine kinase to an Asp residue in the CheY-like receiver domain of the cognate response regulator [, , ]. Phosphorylation of the receiver domain induces conformational changes that activate an associated output domain, which in turn triggers the response. Phosphorylation-induced conformational changes in the response regulator molecule have been demonstrated in direct structural studies []. HD-GYP is a conserved domain found in response regulator modules of various signal transduction systems. The involvement of the HD-GYP domain in signal transduction was originally proposed on the basis of its association with CheY-like and other signal transduction domains [] and was later directly demonstrated experimentally by showing that RpfG is involved in regulation of the biosynthesis of extracellular endoglucanase and polysaccharide []. A modification of the HD-GYP domain, which is found in this group, , and several smaller groups, lacks the conserved distal portion of the domain and has certain substitutions in the characteristic metal-binding residues [] of the HD superfamily phosphohydrolases, which likely render it catalytically inactive. Note that the prototypical HD domain () is not recognised in many members of this group. The exact mode of action and targets of the HD-GYP output domain are not known []. HD-GYP proteins are associated with the HD domain superfamily of metal-dependent phosphohydrolases; HD designates the principal conserved residues implicated in metal binding and catalysis [].
The HD-GYP version of the HD-type domain has many additional highly conserved residues, including a conserved GYP motif, hence its name [, ]. It has been noted that the highly conserved sequence of the HD-GYP domain suggests high substrate specificity []. On the basis of its association with the GGDEF diguanylate cyclase domain, it has also been predicted that the HD-GYP domain may be involved in the metabolism of cyclic diguanylate or in dephosphorylation of some phosphotransfer domain [].
DNA ligase (polydeoxyribonucleotide synthase) is the enzyme that joins two DNA fragments by catalysing the formation of an internucleotide ester bond between phosphate and deoxyribose. It is active during DNA replication, DNA repair and DNA recombination. There are two forms of DNA ligase: one requires ATP (), the other NAD (), the latter being restricted to eubacteria. Eukaryotic, archaebacterial, viral and some eubacterial DNA ligases are ATP-dependent. The first step in the ligation reaction is the formation of a covalent enzyme-AMP complex. The co-factor ATP is cleaved to pyrophosphate and AMP, with the AMP being covalently joined to a highly conserved lysine residue in the active site of the ligase. The activated AMP residue is then transferred to the 5' phosphate of the nick, before the nick is sealed by phosphodiester-bond formation and AMP elimination [, ]. Vertebrate cells encode three well-characterised DNA ligases (DNA ligases I, III and IV), all of which are related in structure and sequence. With the exception of the atypically small PBCV-1 viral enzyme, two regions of primary sequence are common to all members of the family. The catalytic region comprises six conserved sequence motifs (I, III, IIIa, IV, V-VI); motif I includes the lysine residue that is adenylated in the first step of the ligation reaction. The function of the second, less well-conserved region is unknown. When folded, each protein comprises two distinct sub-domains: a large amino-terminal sub-domain ('domain 1') and a smaller carboxy-terminal sub-domain ('domain 2'). The ATP-binding site of the enzyme lies in the cleft between the two sub-domains. Domain 1 consists of two antiparallel beta sheets flanked by alpha helices, whereas domain 2 consists of a five-stranded beta barrel and a single alpha helix, which form the oligonucleotide-binding fold [, ]. This domain is found in many but not all ATP-dependent DNA ligase enzymes ().
It is thought to be involved in DNA binding and in catalysis. In human DNA ligase I (), and in Saccharomyces cerevisiae (Baker's yeast) (), this region was necessary for catalysis, and separated from the amino terminus by targeting elements. In Vaccinia virus () this region was not essential for catalysis, but its deletion decreased the affinity for nicked DNA and decreased the rate of strand joining at a step subsequent to enzyme-adenylate formation [].
Cytochrome P450 enzymes are a superfamily of haem-containing mono-oxygenases that are found in all kingdoms of life, and which show extraordinary diversity in their reaction chemistry. In mammals, these proteins are found primarily in microsomes of hepatocytes and other cell types, where they oxidise steroids, fatty acids and xenobiotics, and are important for the detoxification and clearance of various compounds, as well as for hormone synthesis and breakdown, cholesterol synthesis and vitamin D metabolism. In plants, these proteins are important for the biosynthesis of several compounds such as hormones, defensive compounds and fatty acids. In bacteria, they are important for several metabolic processes, such as the biosynthesis of the antibiotic erythromycin in Saccharopolyspora erythraea (Streptomyces erythraeus). Cytochrome P450 enzymes use haem to oxidise their substrates, using protons derived from NADH or NADPH to split the oxygen so a single atom can be added to a substrate. They also require electrons, which they receive from a variety of redox partners. In certain cases, cytochrome P450 can be fused to its redox partner to produce a bi-functional protein, such as with P450BM-3 from Bacillus megaterium [], which has haem and flavin domains. Organisms produce many different cytochrome P450 enzymes (at least 58 in humans), which together with alternative splicing can provide a wide array of enzymes with different substrate and tissue specificities. Individual cytochrome P450 proteins follow the nomenclature: CYP, followed by a number (family), then a letter (subfamily), and another number (protein); e.g. CYP3A4 is the fourth protein in family 3, subfamily A. In general, family members should share >40% identity, while subfamily members should share >55% identity. Cytochrome P450 proteins can also be grouped by two different schemes. One scheme was based on a taxonomic split: class I (prokaryotic/mitochondrial) and class II (eukaryotic microsomes).
The other scheme was based on the number of components in the system: class B (3-components) and class E (2-components). These classes merge to a certain degree. Most prokaryotes and mitochondria (and fungal CYP55) have 3-component systems (class I/class B) - a FAD-containing flavoprotein (NAD(P)H-dependent reductase), an iron-sulphur protein and P450. Most eukaryotic microsomes have 2-component systems (class II/class E) - NADPH:P450 reductase (FAD and FMN-containing flavoprotein) and P450. There are exceptions to this scheme, such as 1-component systems that resemble class E enzymes [, , ]. The class E enzymes can be further subdivided into five sequence clusters, groups I-V, each of which may contain more than one cytochrome P450 family (e.g. CYP1 and CYP2 are both found in group I). The divergence of the cytochrome P450 superfamily into B- and E-classes, and further divergence into stable clusters within the E-class, appears to be very ancient, occurring before the appearance of eukaryotes. Cytochrome P450 has a multihelical structure.
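The CYP naming convention and the identity thresholds described above can be illustrated with a short sketch. The function names are hypothetical, and two points are assumptions rather than statements from the text: the parser accepts subfamilies of more than one letter, and the thresholds are applied as strict cut-offs even though the text presents them only as a general guideline ("in general").

```python
import re

# CYP, family number, subfamily letter(s), protein number, e.g. CYP3A4.
# Accepting more than one subfamily letter is an assumption beyond the text.
CYP_NAME = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$")


def parse_cyp(name):
    """Split a name such as 'CYP3A4' into (family, subfamily, protein)."""
    match = CYP_NAME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a full CYP protein name: {name!r}")
    family, subfamily, protein = match.groups()
    return int(family), subfamily, int(protein)


def relationship_from_identity(percent_identity):
    """Apply the >55% (subfamily) and >40% (family) guideline to a pairwise identity."""
    if percent_identity > 55:
        return "same subfamily"
    if percent_identity > 40:
        return "same family"
    return "different families"


if __name__ == "__main__":
    print(parse_cyp("CYP3A4"))
    print(relationship_from_identity(47.5))
```

Family-level names without a subfamily, such as CYP55, are rejected by this parser because they do not identify an individual protein.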
Aquaporins are water channels, present in both higher and lower organisms, that belong to the major intrinsic protein family. Most aquaporins are highly selective for water, though some also facilitate the movement of small uncharged molecules such as glycerol []. In higher eukaryotes these proteins play diverse roles in the maintenance of water homeostasis, indicating that membrane water permeability can be regulated independently of solute permeability. In microorganisms, however, many of which do not contain aquaporins, they do not appear to play such a broad role. Instead, they assist specific microbial lifestyles within the environment, e.g. they confer protection against freeze-thaw stress and may help maintain water permeability at low temperatures []. The regulation of aquaporins is complex, including transcriptional, post-translational, protein-trafficking and channel-gating mechanisms that are frequently distinct for each family member. Structural studies show that aquaporins are present in the membrane as tetramers, though each monomer contains its own channel [, , ]. The monomer has an overall "hourglass" structure made up of three structural elements: an external vestibule, an internal vestibule, and an extended pore which connects the two vestibules. Substrate selectivity is conferred by two mechanisms. Firstly, the diameter of the pore physically limits the size of molecules that can pass through the channel. Secondly, specific amino acids within the molecule regulate the preference for hydrophobic or hydrophilic substrates. Aquaporins are classified into two subgroups: the aquaporins (also known as orthodox aquaporins), which transport only water, and the aquaglyceroporins, which transport glycerol, urea, and other small solutes in addition to water [, ]. Aquaporin-9 was identified from human leukocytes by homology cloning []. AQP9 has unusually broad solute permeability.
It is expressed in hepatocyte plasma membranes and also in lung, small intestine and spleen cells []. Expression of AQP9 in liver was induced up to 20-fold in rats fasted for 24 to 96 hours, and the AQP9 level gradually declined after re-feeding []. AQP9 shares greater sequence identity with AQP3 and AQP7 than with other members of the family, suggesting that these 3 proteins belong to a subfamily.
Nitrile hydratases () are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins, encoded by flanking regions of the nitrile hydratase genes, that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated by the association of nitric oxide (NO) with the iron, while in the light the enzyme is activated by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site [, ]. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide [, ].
Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols. Thiocyanate hydrolase (SCNase) is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia []. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase are homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase is homologous to the alpha subunit of nitrile hydratase []. This entry represents the structural domain of the alpha subunit of both iron- and cobalt-containing nitrile hydratases; the alpha subunit is a duplication of two structural repeats, each consisting of 4 layers, alpha/beta/beta/alpha []. This structure is also found in the related protein, the gamma subunit of thiocyanate hydrolase (SCNase).
Eukaryotic eIF-5A was initially thought to function as a translation initiation factor, based on its ability to stimulate methionyl-puromycin synthesis. However, subsequent work revealed a role for eIF-5A in translation elongation [, ]. Depletion or inactivation of eIF-5A in the yeast Saccharomyces cerevisiae (Baker's yeast) resulted in the accumulation of polysomes and an increase in ribosomal transit times. Addition of recombinant eIF-5A from yeast, but not a derivative lacking hypusine, enhanced the rate of tripeptide synthesis in vitro. Moreover, inactivation of eIF-5A mimicked the effects of the eEF2 inhibitor sordarin, indicating that eIF-5A might function together with eEF2 to promote ribosomal translocation. Finally, it was shown that eIF-5A is specifically required to promote peptide-bond formation between consecutive proline residues. It has been proposed to stimulate the peptidyl-transferase activity of the ribosome and facilitate the reactivity of poor substrates like proline []. eIF-5A is a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively [, , ]. IF-5A is the sole protein in eukaryotes and archaea to contain the unusual amino acid hypusine (Ne-(4-amino-2-hydroxybutyl)lysine) that is an absolute functional requirement. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported []. The archaeal IF-5A proteins have not been studied as comprehensively as their eukaryotic homologues, though the crystal structure of the Pyrobaculum aerophilum protein has been determined. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores.
The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding []. This family also includes the Woronin body major protein Hex1, whose sequence and structure are similar to eukaryotic initiation factor 5A (eIF-5A), suggesting that they share a common ancestor []. Woronin bodies are important for stress resistance and virulence [].
The membrane attack complex/perforin (MACPF) domain is conserved in bacteria, fungi, mammals and plants. It was originally identified and named as being common to five complement components (C6, C7, C8-alpha, C8-beta, and C9) and perforin. These molecules perform critical functions in innate and adaptive immunity. The MAC family proteins and perforin are known to participate in lytic pore formation. In response to pathogen infection, a sequential and highly specific interaction between the constituent elements occurs to form transmembrane channels which are known as the membrane-attack complex (MAC). Only a few other MACPF proteins have been characterised and several are thought to form pores for invasion or protection [, , ]. Examples are proteins from malarial parasites [], the cytolytic toxins from sea anemones [], and proteins that provide plant immunity [, ]. Functionally uncharacterised MACPF proteins are also evident in pathogenic bacteria such as Chlamydia spp [] and Photorhabdus luminescens (Xenorhabdus luminescens) []. The MACPF domain is commonly found to be associated with other N- and C-terminal domains, such as TSP1 (see ), LDLRA (see ), EGF-like (see ), Sushi/CCP/SCR (see ), FIMAC or C2 (see ). They probably control or target MACPF function [, ]. The MACPF domain oligomerizes, undergoes conformational change, and is required for lytic activity. The MACPF domain consists of a central kinked four-stranded antiparallel beta sheet surrounded by alpha helices and beta strands, forming two structural segments. Overall, the MACPF domain has a thin L-shaped appearance. MACPF domains exhibit limited sequence similarity but contain a signature [YW]-G-[TS]-H-[FY]-x(6)-G-G motif [, , ]. Some proteins known to contain a MACPF domain are listed below: Vertebrate complement proteins C6 to C9.
Complement factors C6 to C9 assemble to form a scaffold, the membrane attack complex (MAC), that permits C9 polymerisation into pores that lyse Gram-negative pathogens [, ]. Vertebrate perforin. It is delivered by natural killer cells and cytotoxic T lymphocytes and forms oligomeric pores (12 to 18 monomers) in the plasma membrane of either virus-infected or transformed cells. Arabidopsis thaliana (Mouse-ear cress) constitutively activated cell death 1 (CAD1) protein. It is likely to act as a mediator that recognises plant signals for pathogen infection []. Arabidopsis thaliana (Mouse-ear cress) necrotic spotted lesions 1 (NSL1) protein []. Venomous sea anemone Phyllodiscus semoni (Night anemone) toxins PsTX-60A and PsTX-60B []. Venomous sea anemone Actineria villosa (Okinawan sea anemone) toxin AvTX-60A []. Plasmodium sporozoite microneme protein essential for cell traversal 2 (SPECT2). It is essential for the membrane-wounding activity of the sporozoite and is involved in its traversal of the sinusoidal cell layer prior to hepatocyte-infection []. P. luminescens Plu-MACPF. Although nonlytic, it was shown to bind to cell membranes []. Chlamydial putative uncharacterised protein CT153 [].
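The signature motif quoted in this entry, [YW]-G-[TS]-H-[FY]-x(6)-G-G, is written in a PROSITE-like pattern syntax. As a minimal sketch, such a pattern can be turned into a regular expression and used to scan a sequence. The helper names `prosite_to_regex` and `find_motif` are hypothetical, and the converter only handles the subset of the syntax used by this motif (single residues, bracketed alternatives, `x`, and fixed repeat counts).

```python
import re
from typing import List, Tuple

MACPF_SIGNATURE = "[YW]-G-[TS]-H-[FY]-x(6)-G-G"

# One pattern element: 'x', a single residue, or a bracketed set, optionally
# followed by a fixed repeat count such as (6).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-like pattern into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        if residue == "x":          # 'x' means any residue
            residue = "."
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_motif(sequence: str, pattern: str = MACPF_SIGNATURE) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]
```

Because MACPF domains show limited sequence similarity, a hit to this motif alone should be read as a hint rather than as evidence that a protein contains the domain.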
Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes []. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Many families of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence []. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases []. Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base []. The geometric orientations of the catalytic residues are similar between families, despite different protein folds []. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC) [, ]. This group of serine peptidases, which includes HetR, is associated with heterocystous cyanobacteria and belongs to MEROPS peptidase family S48 (clan S-). HetR is a DNA-binding serine-type protease required for heterocyst differentiation in heterocystous cyanobacteria under conditions of nitrogen deprivation. Site-specific mutagenesis of Ser-152 in HetR from Anabaena sp. (strain PCC 7120) showed that this residue is one of the peptidase active site residues. It was suggested that peptidase activity might be needed for repression of HetR overproduction under conditions of nitrogen deprivation [].
Modification of Cys-48 prevented disulphide-bond formation and homodimerisation of HetR and DNA-binding. The homodimer of HetR binds the promoter regions of hetR, hepA, and patS, suggesting a direct control of the expression of these genes by HetR. The pentapeptide RGSGR, which is present at the C terminus of PatS, blocks heterocyst formation, inhibits the DNA binding of HetR and prevents hetR up-regulation [].
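The statement that the linear order of the catalytic triad reflects clan membership (HDS in clan PA, DHS in clan SB, SDH in clan SC) can be sketched as a small lookup. The function names below are hypothetical, and the mapping covers only the three clans named in this entry; an order outside that table is reported as unknown rather than guessed.

```python
from typing import Dict, Optional

# Triad orders and clans exactly as given in the entry text.
TRIAD_ORDER_TO_CLAN: Dict[str, str] = {
    "HDS": "PA",  # chymotrypsin clan
    "DHS": "SB",  # subtilisin clan
    "SDH": "SC",  # carboxypeptidase clan
}


def triad_order(positions: Dict[str, int]) -> str:
    """Return the triad residues sorted by sequence position, e.g. 'HDS'.

    `positions` maps the one-letter codes H, D and S to residue numbers.
    """
    if set(positions) != {"H", "D", "S"}:
        raise ValueError("expected exactly the residues H, D and S")
    return "".join(sorted(positions, key=lambda residue: positions[residue]))


def clan_for_triad(positions: Dict[str, int]) -> Optional[str]:
    """Look up the clan implied by the triad order, or None if not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))
```

Using the conventional chymotrypsin numbering (His-57, Asp-102, Ser-195), `clan_for_triad({"H": 57, "D": 102, "S": 195})` resolves to the PA clan.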
This entry includes proteins with a JmjC domain. Proteins are bifunctional, acting as histone lysine demethylases and ribosomal histidine hydroxylases. Proteins include: Bifunctional lysine-specific demethylase and histidyl-hydroxylase NO66 (also known as Jumanjic domain protein 1) []. Ribosomal oxygenase 1 (also known as ribosomal oxygenase NO66), which specifically demethylates 'Lys-4' (H3K4me) and 'Lys-36' (H3K36me) of histone H3 []. Ribosomal oxygenase 2, which demethylates trimethylated 'Lys-9' on histone H3 (H3K9me3), leading to an increase in ribosomal RNA expression []. It also hydroxylates 60S ribosomal protein L27a on 'His-39' []. 50S ribosomal protein L16 3-hydroxylase from Escherichia coli, which catalyzes the hydroxylation of 50S ribosomal protein L16 on 'Arg-81' []. The JmjN and JmjC domains are two non-adjacent domains which have been identified in the jumonji family of transcription factors. Although it was originally suggested that the JmjN and JmjC domains always co-occur and might form a single functional unit within the folded protein, the JmjC domain was later found without the JmjN domain in organisms from bacteria to human [, , ]. Proteins containing the JmjC domain are predicted to be metalloenzymes that adopt the cupin fold and are candidates for enzymes that regulate chromatin remodelling []. The cupin fold is a flattened β-barrel structure containing two sheets of five antiparallel β-strands that form the walls of a zinc-binding cleft. Based on the crystal structures of the JmjC domain-containing proteins FIH and JHDM3A/JMJD2A, the JmjC domain forms an enzymatically active pocket that coordinates Fe(II) and alphaKG. Three amino-acid residues within the JmjC domain bind to the Fe(II) cofactor and two additional residues bind to alphaKG []. JmjC domains were identified in numerous eukaryotic proteins containing domains typical of transcription factors, such as PHD, C2H2, ARID/BRIGHT and zinc fingers [, ].
The JmjC domain has been shown to function in a histone demethylation mechanism that is conserved from yeast to human []. JmjC domain proteins may be protein hydroxylases that catalyse a novel histone modification []. The human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalysing hydroxylation [].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. Human annexin A10 (annexin 14) was first identified in silico by searches of dbEST with a number of divergent annexins [].
The analysis revealed single human and mouse ESTs corresponding to a novel and rarely expressed annexin in which three of the four tetrad core repeats lack the calcium-binding domain. It was proposed that this subtype, together with A5 annexin, gave rise to the Type VI octad through a process of gene duplication and fusion in early chordate evolution [].
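The 'type 2' calcium-binding motif given in this entry, 'GxGT-[38 residues]-D/E', can be read as G, any residue, G, T, then 38 intervening residues, then an acidic residue (D or E). A minimal sketch of scanning a sequence for candidate sites under that literal reading follows; `type2_sites` is a hypothetical helper, the spacing is taken as exactly 38 residues, and a hit is only a candidate, since real repeats (such as those in annexin A10) may lack a functional site.

```python
import re
from typing import List

# G, any residue, G, T, exactly 38 residues, then D or E.
# The lookahead lets overlapping candidates be reported as well.
_TYPE2_SITE = re.compile(r"(?=(G.GT.{38}[DE]))")


def type2_sites(sequence: str) -> List[int]:
    """Return 0-based start positions of candidate type 2 Ca2+-binding sites."""
    return [match.start() for match in _TYPE2_SITE.finditer(sequence.upper())]
```

Counting the candidates per repeat is one simple way to flag a core in which some repeats have lost the motif.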
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Annexin type XI.
The N-terminal of annexin XI is hydrophobic, rich in glycine, tyrosine, and proline residues, and is larger than that of the other annexin members. Annexin XI is ubiquitously expressed in a variety of tissues and cell types of eukaryotes, but its subcellular distribution varies considerably. Some growth and differentiation conditions favour the presence of annexin XI in the nucleus, whereas others favour either a cytoplasmic distribution or both. Annexin XI is upregulated in mitotic cells and stains mitotic spindles. It is required for cytokinesis completion []. Ca2+ was found to influence both the association of annexin XI with tubulin and the nuclear or cytoplasmic subcellular localisation of annexin XI []. The human orthologue of annexin XI was found to be identical to the 56K autoantigen, found in individuals with a range of autoimmune diseases such as rheumatoid arthritis [].
G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [, , ]. Thrombin is a coagulation protease that activates platelets, leukocytes, endothelial and mesenchymal cells at sites of vascular injury, acting partly through an unusual proteolytically activated GPCR [].
Gene knockout experiments have provided definitive evidence for a second thrombin receptor in mouse platelets and have suggested tissue-specific roles for different thrombin receptors. Because the physiological agonist at the receptor was originally unknown, it was provisionally named protease-activated receptor (PAR) []. At least 4 PAR subtypes have now been characterised. Thus, the thrombin and PAR receptors constitute a fledgling receptor family that shares a novel proteolytic activation mechanism []. The human thrombin receptor, designated protease-activated receptor 4 (PAR4), has been cloned and characterised []. Northern blot analysis showed that PAR4 mRNA was expressed in a number of tissues, high levels being present in lung, pancreas, thyroid, testis and small intestine. Using fluorescence in situ hybridisation, the human PAR4 gene has been mapped to chromosome 19p12 [].
SETD3 is a protein-histidine N-methyltransferase that specifically mediates methylation of actin at 'His-73' []. It was initially reported to have histone methyltransferase activity and methylate 'Lys-4' and 'Lys-36' of histone H3 (H3K4me and H3K36me). However, this conclusion was based on mass spectrometry data wherein mass shifts were inconsistent with a bona fide methylation event. In vitro, the protein-lysine methyltransferase activity is weak compared to the protein-histidine methyltransferase activity []. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group most frequently from S-adenosyl L-methionine (SAM or AdoMet) to a nucleophilic acceptor such as oxygen leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule [, , ]. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM []. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids [, , ]. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [, , ].
More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor. A review published in 2003 [] divides all methyltransferases into 5 classes based on the structure of their catalytic domain (fold): class I: Rossmann-like alpha/beta; class II: TIM beta/α-barrel alpha/beta; class III: tetrapyrrole methylase alpha/beta; class IV: SPOUT alpha/beta; class V: SET domain all beta. A more recent paper [] based on a study of the Saccharomyces cerevisiae methyltransferome argues for four more folds: class VI: transmembrane all alpha; class VII: DNA/RNA-binding 3-helical bundle all alpha; class VIII: SSo0622-like alpha beta; class IX: thymidylate synthetase alpha beta. Class V proteins contain the SET domain, usually flanked by other domains forming the so-called pre- and post-SET regions. Except for the members of the SETD3 family, which N-methylate histidine in beta-actin (EC 2.1.1.85) [, ], enzymes belonging to this class N-methylate lysine in proteins. Most of them are histone methyltransferases (EC 2.1.1.43) like the histone H3-K9 methyltransferase dim-5 or the histone H3-K4 methyltransferase SETD7 [, ]. Some others methylate the large subunit of the enzyme ribulose-bisphosphate-carboxylase/oxygenase (RuBisCO) (EC 2.1.1.127) in plants; in these enzymes the SET domain is interrupted by a novel domain []. Cytochrome c lysine N-methyltransferases (EC 2.1.1.59) do not possess a SET domain, or at least not a SET domain detected by any of the detection methods; however they do display a SET-like region and for this reason they are also assigned to this class [].
Nitrile hydratases () are bacterial enzymes that catalyse the hydration of nitrile compounds to the corresponding amides. They are used as biocatalysts in acrylamide production, one of the few commercial scale bioprocesses, as well as in environmental remediation for the removal of nitriles from waste streams. Nitrile hydratases are composed of two subunits, alpha and beta, and are normally active as a tetramer, alpha(2)beta(2). Nitrile hydratases contain either a non-haem iron or a non-corrinoid cobalt centre, both types sharing a highly conserved peptide sequence in the alpha subunit (CXLCSC) that provides all the residues involved in coordinating the metal ion. Each type of nitrile hydratase specifically incorporates its metal with the help of activator proteins, encoded by flanking regions of the nitrile hydratase genes, that are necessary for metal insertion. The Fe-containing enzyme is photo-regulated: in the dark the enzyme is inactivated by the association of nitric oxide (NO) with the iron, while in the light the enzyme is activated by photo-dissociation of NO. The NO is held in place by a claw setting formed through specific oxygen atoms in two modified cysteines and a serine residue in the active site [, ]. The cobalt-containing enzyme is unaffected by NO, but was shown to undergo a similar effect with carbon monoxide [, ].
Fe- and cobalt-containing enzymes also display different inhibition patterns with nitrophenols. Thiocyanate hydrolase (SCNase) is a cobalt-containing metalloenzyme with a cysteine-sulphinic acid ligand that hydrolyses thiocyanate to carbonyl sulphide and ammonia []. The two enzymes, nitrile hydratase and SCNase, are homologous over regions corresponding to almost the entire coding regions of the genes: the beta and alpha subunits of thiocyanate hydrolase are homologous to the amino- and carboxyl-terminal halves of the beta subunit of nitrile hydratase, and the gamma subunit of thiocyanate hydrolase is homologous to the alpha subunit of nitrile hydratase []. This entry represents the alpha subunit of both iron- and cobalt-containing nitrile hydratases; the alpha subunit is a duplication of two structural repeats, each consisting of 4 layers, alpha/beta/beta/alpha. It excludes the thiocyanate hydrolase gamma subunit of Thiobacillus thioparus, a sequence that appears to have evolved from within the family of nitrile hydratase alpha subunits but which differs by several indels and a more rapid accumulation of point mutations.
This entry includes TESPA1, ITPRID1 (also known as CCDC129) and ITPRID2 (also known as SSFA2). SSFA2, also known as Ki-ras-induced actin-interacting protein (KRAP), interacts with inositol 1,4,5-trisphosphate receptor (IP3R, also known as ITPR) []. SSFA2 was first localised as a membrane-bound form with extracellular regions, suggesting it might be involved in the regulation of filamentous actin and signals from the outside of the cells []. It has now been shown to be critical for the proper subcellular localisation and function of IP3R. Inositol 1,4,5-trisphosphate receptor functions as the Ca2+ release channel on specialised endoplasmic reticulum membranes, so the subcellular localisation of IP3R is crucial for its proper function []. TESPA1 (Thymocyte-expressed positive selection-associated protein 1) is required for the development and maturation of T-cells, its function being essential for the late stages of thymocyte development. It plays a role in T-cell antigen receptor (TCR)-mediated activation of the ERK and NFAT signaling pathways, possibly by serving as a scaffolding protein that promotes the assembly of the LAT signalosome in thymocytes []. TESPA1 shows sequence homology to SSFA2 and physically associates with IP3R in T and B lymphocytes []. The function of ITPRID1 is not clear.
This entry includes TMEPAI and LRAD4 (C18ORF1). They contain two PY motifs and one Smad-interacting motif (SIM) domain. Both are known to inhibit transforming growth factor-beta (TGF-beta) signalling by competing with Smad anchor for receptor activation for binding to receptor-regulated Smad [, ]. TMEPAI was originally identified as a highly androgen-induced gene by serial analysis of gene expression in androgen-treated LNCaP prostate cancer (CaP) cells []. It is a type I transmembrane protein that has an N-terminal extracellular domain and a single transmembrane domain. TMEPAI contains two PY motifs that can be targeted by the WW domain. It is involved in a negative feedback loop to control the duration and intensity of TGF-beta signaling, which regulates growth suppression, apoptosis induction, extracellular matrix production, and differentiation []. It is also involved in androgen receptor signalling, phosphatase and tensin homologue deleted on chromosome 10 signalling, and formation of autophagosomes, in addition to degradation of TbetaRI (TGF-beta type I receptor) through lysosomes []. TMEPAI has been linked to cancers [, , ]. Low-density lipoprotein receptor class A domain-containing protein 4 (LRAD4) is a negative regulator of TGF-beta signaling. TMEPAI might help LRAD4 to inhibit TGF-beta signaling in a coordinated manner when cells are stimulated with high levels of TGF-beta []. LRAD4 is elevated in hepatic cancers and tumour tissues [].
The Schlafen (SLFN) family includes several mouse and human member genes that have been implicated in important functions, such as the control of cell proliferation, induction of immune responses, and the regulation of viral replication [, , , ]. Mouse and human SLFN proteins are regulated by interferons (IFNs) []. All SLFNs contain a unique Slfn box. The SLFN family comprises three groups, based on the size of the encoded proteins [, , , ]: Group 1: Slfn1, Slfn2 and Slfn Like 1; Group 2: Slfn3, Slfn4 and Slfn12; Group 3: Slfn5, Slfn8-11, Slfn13 and Slfn14. Compared with group 1 proteins, group 2 and 3 proteins contain an extra SWADL domain C-terminal to the AAA domain. Group 3 proteins also possess a large extension C-terminal to their SWADL domain. This C-terminal extension is homologous to superfamily I of DNA/RNA helicases []. It has been proposed that the divergent AAA ATPase domain may function as an RNA-binding domain []. This entry also includes orthopoxvirus sequences. Analyses indicate that a member of the Schlafen family was horizontally transferred from murine rodents to orthopoxviruses, where it is hypothesised to play a role in allowing the virus to survive host immune defense mechanisms [].
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. An evolutionarily conserved sequence of around 70 amino acids, the KH domain is present in a wide variety of nucleic acid-binding proteins. The KH domain binds RNA, and can function in RNA recognition []. It is found in multiple copies in several proteins, where the copies can function cooperatively or independently. For example, in the AU-rich element RNA-binding protein KSRP, which has 4 KH domains, KH domains 3 and 4 behave as independent binding modules to interact with different regions of the AU-rich RNA targets []. The solution structure of the first KH domain of FMR1 [] and of the C-terminal KH domain of hnRNP K [] determined by nuclear magnetic resonance (NMR) revealed a β-α-α-β-β-α structure. Proteins containing KH domains include: bacterial and organelle PNPases []; archaeal and eukaryotic exosome subunits []; eukaryotic and prokaryotic RS3 ribosomal proteins []; vertebrate Fragile X messenger ribonucleoprotein 1 (FMR1) []; vigilin, which has 14 KH domains []; AU-rich element RNA-binding protein KSRP; hnRNP K, which contains 3 KH domains; and human onconeural ventral antigen-1 (NOVA-1) []. According to structural analyses [, , ], the KH domain can be separated into two groups, type 1 and type 2.
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins []. It has been shown to bind RNA [, ]. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently []. According to structural analyses [, , ], the KH domain can be separated into two groups. The first group, or type-1, contains a β-α-α-β-β-α structure, whereas in type-2 the two last β-sheets are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein Era.
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins []. It has been shown to bind RNA [, ]. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently []. According to structural analyses [, , ], the KH domain can be separated into two groups. The first group, or type-1, contains a β-α-α-β-β-α structure, whereas in type-2 the two last β-sheets are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferase (); vertebrate Fragile X messenger ribonucleoprotein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
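The short conserved region shared by the two KH folds (VIGXXGXXI) and the two element orders given above can be illustrated with a minimal Python sketch. It assumes that each X stands for any single residue, and it encodes α-helices as "a" and β-elements as "b"; the function names and the example sequence are hypothetical and serve only as illustration. A match to this short pattern does not by itself establish the presence of a KH domain.

```python
import re

# Assumption: each "X" in VIGXXGXXI stands for any single amino acid.
# The lookahead lets overlapping matches be reported as well.
KH_CORE = re.compile(r"(?=(VIG[A-Z]{2}G[A-Z]{2}I))")

# Order of secondary-structure elements as given in the text
# (a = alpha-helix, b = beta element).
KH_TOPOLOGIES = {"type-1": "baabba", "type-2": "abbaab"}


def find_kh_core(sequence):
    """Return (0-based start, matched segment) for every core-motif hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in KH_CORE.finditer(seq)]


def classify_kh_topology(elements):
    """Map a string of 'a'/'b' elements to 'type-1', 'type-2' or None."""
    for name, topology in KH_TOPOLOGIES.items():
        if elements == topology:
            return name
    return None


if __name__ == "__main__":
    # Made-up peptide, not a real protein sequence.
    print(find_kh_core("MKVIGKGGKNIAA"))
    print(classify_kh_topology("baabba"))
```

In practice, domain assignment relies on profile-based methods rather than a single pattern, so this sketch is only a first-pass filter.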
This entry represents a family of proteins that includes flap endonucleases from bacteria and viruses. Flap endonucleases (FENs) catalyse the exonucleolytic hydrolysis of blunt-ended duplex DNA substrates and the endonucleolytic cleavage of 5'-bifurcated nucleic acids at the junction formed between single and double-stranded DNA []. Escherichia phage T5 encodes the flap endonuclease D15, which catalyses both the 5'-exonucleolytic and structure-specific endonucleolytic hydrolysis of branched DNA molecules [, , ]. In bacteriophage T4, disruption of the rnh gene (which encodes a FEN, known historically as T4 RNase H) results in slower, less accurate DNA replication. T4 RNase H has both 5' nuclease and flap endonuclease activities []. In prokaryotes, the essential FEN reaction can be performed by the N-terminal 5'-3' exonuclease domain present on DNA polymerase I. Some eubacteria, however, possess a second FEN-encoding gene, in addition to their DNA polymerase I FEN domain [, ]. Two distinct classes of these independent bacterial FENs exist: ExoIX (Xni) from Escherichia coli and SaFEN (Staphylococcus aureus FEN). SaFEN has both FEN and 5'-3' exonuclease activities. Xni (also known as YgdG) was previously identified as a 3'-5' exonuclease and named exonuclease IX (exonuclease 9) [, ], but was subsequently found to possess flap endonuclease activity and no exonuclease activity [, , ].
The type VI secretion system (T6SS) appears to be confined to Proteobacteria. It is important for bacterial pathogenesis, but it is also found in non-pathogenic bacteria, suggesting that T6SS involvement is not limited to virulence []. T6SS was identified in Vibrio cholerae [] and Pseudomonas aeruginosa [], and exports Hcp (Haemolysin-Coregulated Protein) and a class of proteins named Vgr (Val-Gly Repeats). In addition to Vgr and Hcp proteins, T6SS is characterised by the presence of an AAA+ Clp-like ATPase and of two additional genes, icmF and dotU, encoding homologues of T4SS stabilising proteins []. The type VI secretion system spike proteins VgrG2a and VgrG2b contain a region homologous to the (gp27)3-(gp5)3 phage-tail proteins, which is followed by a domain of unknown function (DUF2345). Unlike VgrG2a, VgrG2b belongs to a subclass of VgrG proteins, called evolved VgrGs, that have an additional C-terminal extension with a putative zinc-dependent metallopeptidase domain. VgrG2b acts directly as an effector and promotes internalization by interacting with the host gamma-tubulin ring complex []. It also elicits toxicity in the bacterial periplasm and disrupts bacterial cell morphology. This toxicity is counteracted by a cognate immunity protein []. In addition, it allows the delivery of the Tle3 antibacterial toxin to target cells, where it exerts its toxicity []. This entry represents DUF2345, which is found in the C-terminal region of VgrG2a/b from Pseudomonas aeruginosa. This domain, present in both proteins, folds as a β-prism [, ].