This entry represents the ERM family of proteins. The ERM family consists of three closely related proteins, ezrin, radixin and moesin []. Ezrin was first identified as a constituent of microvilli [], radixin as a barbed end-capping actin-modulating protein from isolated junctional fractions [], and moesin as a heparin-binding protein [], which is particularly important in immunity, acting on both T- and B-cell homeostasis and self-tolerance [, ]. Members of this family have been associated with axon-associated Schwann cell (SC) motility and the maintenance of the polarity of these cells []. A tumour suppressor molecule responsible for neurofibromatosis type 2 (NF2) is highly similar to ERM proteins and has been designated merlin (moesin-ezrin-radixin-like protein) []. ERM molecules contain three domains: an N-terminal globular domain, an extended α-helical domain and a charged C-terminal domain []. Ezrin, radixin and merlin also contain a polyproline region between the helical and C-terminal domains. The N-terminal domain is highly conserved and is also found in merlin, band 4.1 proteins and other members of the band 4.1 superfamily; it is designated the FERM domain []. ERM proteins crosslink actin filaments with plasma membranes. They co-localise with CD44 at actin filament-plasma membrane interaction sites, associating with CD44 via their N-terminal domains and with actin filaments via their C-terminal domains []. The α-helical region is involved in intramolecular masking of protein-protein interaction sites, which regulates the activity of these proteins [].
This entry represents the frizzled domain superfamily. The frizzled (fz) domain is an extracellular domain of about 120 amino acids. It was first identified in the alpha-1 chain of type XVIII collagen and in members of the Frizzled family of seven transmembrane (7TM) proteins, which act as receptors for secreted Wingless (Wg)/Wnt glycoproteins []. In addition to these proteins, one or two copies of the fz domain are also found [, , , , ] in: the Frzb family of secreted frizzled-like proteins; Smoothened, another 7TM receptor involved in hedgehog signaling; carboxypeptidase Z (CPZ); the transmembrane serine protease corin (atrial natriuretic peptide-converting enzyme); and two receptor tyrosine kinase (RTK) subfamilies, the Ror family and the muscle-specific kinase (MuSK) family. As the fz domain contains 10 cysteines which are largely conserved, it has also been called the cysteine-rich domain (CRD) []. The fz domain also contains several other highly conserved residues; for example, a basic amino acid follows C6, and a conserved proline residue lies four residues C-terminal to C9 []. The crystal structure of a fz domain shows that it is predominantly α-helical, with all cysteines forming disulphide bonds. In addition to helical regions, two short β-strands at the N terminus form a minimal β-sheet, with the second β-strand passing through a knot created by disulphide bonds []. Several fz domains have been shown to be both necessary and sufficient for Wg/Wnt ligand binding, strongly suggesting that the fz domain is a Wg/Wnt-interacting domain [, ].
Diol dehydratase (propanediol dehydratase) and glycerol dehydratase undergo concomitant, irreversible inactivation by glycerol during catalysis [, ]. This inactivation is mechanism-based and involves cleavage of the Co-C bond of the cobalamin cofactor, coenzyme B12 (AdoCbl), forming 5'-deoxyadenosine and a modified coenzyme []. Irreversible inactivation of the enzyme results from tight binding to the modified, inactive cobalamin [, ]. The glycerol-inactivated enzyme undergoes rapid reactivation in the presence of free AdoCbl, ATP, and Mg2+ (or Mn2+) []. Reactivation is mediated by a complex of two proteins: a large subunit (DdrA/PduG, ) and a small subunit (DdrB/PduH) [, ]. The two subunits of the reactivating factor for glycerol dehydratase have been shown to form a tight complex that serves to reactivate the glycerol-inactivated holoenzyme, as well as the O2-inactivated holoenzyme, in vitro []. It is believed that this reactivating factor replaces an enzyme-bound, adenine-lacking inactive cobalamin with a free, adenine-containing active cobalamin []. PduG and PduH, part of the propanediol utilization pdu operon, are believed to have a similar function in the reactivation of propanediol dehydratase. PduG was also proposed, on the basis of genetic tests, to be a cobalamin adenosyltransferase involved in the conversion of inactive cobalamin (B12) to AdoCbl []. However, this function has since been shown to belong to another protein, PduO [].
Bestrophin is a 68 kDa basolateral plasma membrane protein expressed in retinal pigment epithelial (RPE) cells. It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterised by a depressed light peak in the electrooculogram (EOG) []. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa, which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P) amino acid sequence motif. Bestrophin is a plasma membrane protein localised to the basolateral surface of RPE cells, consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of calcium-activated chloride channels (CaCC) [], indicating a direct role for bestrophin in generating the light peak [, , ]. Bestrophins are also permeable to other monovalent anions, including bicarbonate, bromide, iodide, thiocyanate and nitrate [, ]. Structural analysis revealed that the N-terminal region of the proteins is highly conserved and sufficient for CaCC activity. The C-terminal region has low sequence identity. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal domain, altering the electrophysiological properties of the channel [, ].
Bestrophin is a 68 kDa basolateral plasma membrane protein expressed in retinal pigment epithelial (RPE) cells. It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterised by a depressed light peak in the electrooculogram (EOG) []. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa, which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P) amino acid sequence motif. Bestrophin is a plasma membrane protein localised to the basolateral surface of RPE cells, consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of calcium-activated chloride channels (CaCC) [], indicating a direct role for bestrophin in generating the light peak [, , ]. Bestrophins are also permeable to other monovalent anions, including bicarbonate, bromide, iodide, thiocyanate and nitrate [, ]. Structural analysis revealed that the N-terminal region of the proteins is highly conserved and sufficient for CaCC activity. The C-terminal region has low sequence identity. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal domain, altering the electrophysiological properties of the channel [, ]. This entry also includes uncharacterised proteins belonging to protein family UPF0187.
Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) [, , ] proteins (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. This family was first described for its role in stimulating the initiation of adenovirus DNA replication []. In vertebrates there are four members (NFIA, NFIB, NFIC and NFIX), and an orthologue from Caenorhabditis elegans has been described, called Nuclear factor I family protein (NFI-I) []. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thus function by regulating cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans []. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors, and it interacts with basal transcription factors and with histone H3 [].
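The palindromic recognition sequence quoted in the NF-I entry above (5'-TGGCANNNTGCCA-3') lends itself to a simple pattern search. The following is a minimal sketch, assuming that N stands for any of A, C, G or T and that only exact matches to the consensus are of interest; the names NFI_SITE, reverse_complement and find_nfi_sites are illustrative and not taken from any database or library. Real binding sites may deviate from the consensus, which this sketch does not model.

```python
import re

# Consensus from the text: TGGCA, a 3-nt spacer (NNN), then TGCCA.
# A lookahead is used so that overlapping occurrences are all reported.
NFI_SITE = re.compile(r"(?=TGGCA[ACGT]{3}TGCCA)")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_nfi_sites(seq: str) -> list[int]:
    """Return 0-based start positions of exact consensus matches."""
    return [m.start() for m in NFI_SITE.finditer(seq.upper())]


if __name__ == "__main__":
    site = "TGGCAAATTGCCA"  # hypothetical example sequence
    print(find_nfi_sites("cc" + site + "gg"))
    # Because the half-sites are palindromic, the reverse complement
    # of a matching site also matches the consensus.
    print(find_nfi_sites(reverse_complement(site)))
```

Since the two half-sites are reverse complements of each other, scanning one strand is sufficient for exact consensus matches; a position weight matrix would be a more realistic choice for degenerate sites.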
The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins []. It has been shown to bind RNA [, ]. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA-binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently []. According to structural analysis [, , ], KH domains can be separated into two groups. The first group, or type-1, contains a β-α-α-β-β-α structure, whereas in type-2 the last two β-strands are located in the N-terminal part of the domain (α-β-β-α-α-β). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA-binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferase (); vertebrate Fragile X messenger ribonucleoprotein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
This superfamily represents a β-barrel domain found at the C terminus of alanine racemase () and in group IV pyridoxal-5'-phosphate (PLP)-dependent decarboxylases, such as eukaryotic ornithine decarboxylase (), arginine decarboxylase () and diaminopimelate decarboxylase (). These enzymes belong to the same structural family []. Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. This domain is found in both prokaryotic and eukaryotic proteins [, ]. The molecular structure of alanine racemase from Bacillus stearothermophilus (Geobacillus stearothermophilus) was determined by X-ray crystallography to a resolution of 1.9 Å []. The alanine racemase monomer is composed of two domains, an eight-stranded α/β barrel at the N terminus and a C-terminal domain essentially composed of β-strands. The pyridoxal 5'-phosphate (PLP) cofactor lies in and above the mouth of the α/β barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C terminus of the first β-strand of the α/β barrel. Eukaryotic ornithine decarboxylase (ODC) acts as a homodimer to produce putrescine (1,4-diaminobutane) from ornithine, where putrescine is the precursor of other polyamines in animals, plants, and bacteria. Arginine decarboxylase is also involved in putrescine biosynthesis. This is the first committed step in polyamine biosynthesis. Alanine racemase is a structurally homologous enzyme. Both proteins share a common α/β barrel that binds the cofactor via a Schiff base on the C-terminal end of the barrel []. Diaminopimelate decarboxylase (DapDC) catalyses the final step of lysine biosynthesis in bacteria.
This superfamily was originally identified in Drosophila and called mago nashi; it is a strict maternal-effect, grandchildless-like gene []. The protein is an integral member of the exon junction complex (EJC). The EJC is a multiprotein complex that is deposited on spliced mRNAs after intron removal at a conserved position upstream of the exon-exon junction, and transported to the cytoplasm, where it has been shown to influence translation, surveillance, and localization of the spliced mRNA. It consists of four core proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP, and is thought to be a binding platform for more peripherally and transiently associated factors along the path of mRNA travel. Mago and Y14 form a stable heterodimer that stabilizes the complex by inhibiting the ATPase activity of eIF4AIII. The Mago-Y14 heterodimer has been shown to interact with the cytoplasmic protein PYM, an EJC disassembly factor, and specifically binds to the karyopherin nuclear receptor importin 13 [, , , , , , , , , , , , , , , , ]. The human homologue has been shown to interact with an RNA-binding protein, ribonucleoprotein rbm8 () []. An RNAi knockout of the Caenorhabditis elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination [], but the protein is also found in hermaphrodites and other organisms without sexual differentiation. Structurally, Mago nashi has a β(4)-α-β(2)-α fold arranged into two layers (α/β) with an antiparallel β-sheet.
Homeobox domain (also known as homeodomain) proteins are transcription factors that share a related DNA-binding homeodomain []. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins. The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC) [, , ].
Flavin-containing monooxygenases (FMOs) constitute a family of xenobiotic-metabolising enzymes []. Using an NADPH cofactor and FAD prosthetic group, these microsomal proteins catalyse the oxygenation of nucleophilic nitrogen, sulphur, phosphorus and selenium atoms in a range of structurally diverse compounds. FMOs have been implicated in the metabolism of a number of pharmaceuticals, pesticides and toxicants. In man, lack of hepatic FMO-catalysed trimethylamine metabolism results in trimethylaminuria (fish odour syndrome). Five mammalian forms of FMO are now known and have been designated FMO1-FMO5 [, , , , ]. This is a recent nomenclature based on comparison of amino acid sequences, and has been introduced in an attempt to eliminate the confusion inherent in multiple, laboratory-specific designations and tissue-based classifications []. Following the determination of the complete nucleotide sequence of Saccharomyces cerevisiae (Baker's yeast) [], a novel gene was found to encode a protein with similarity to mammalian monooxygenases. In Aspergillus, flavin-containing monooxygenases ustF1 and ustF2 are components in the biosynthesis of the antimitotic tetrapeptide ustiloxin B, a secondary metabolite. The monooxygenases modify the side chain of the intermediate S-deoxyustiloxin H [].
This entry represents the C-terminal region of several eukaryotic and archaeal RuvB-like 1 (Pontin or TIP49a) and RuvB-like 2 (Reptin or TIP49b) proteins. The N-terminal domain contains the AAA ATPase, central region domain. In zebrafish, the liebeskummer (lik) mutation causes development of hyperplastic embryonic hearts. lik encodes Reptin, a component of a DNA-stimulated ATPase complex. Beta-catenin and Pontin, a DNA-stimulated ATPase that is often part of complexes with Reptin, are in the same genetic pathways. The Reptin/Pontin ratio serves to regulate heart growth during development, at least in part via the beta-catenin pathway []. TBP-interacting protein 49 (TIP49) was originally identified as a TBP-binding protein, and two related proteins are encoded by individual genes, tip49a and tip49b. Although the function of this gene family has not been elucidated, its members are thought to play a critical role in nuclear events because they interact with various kinds of nuclear factors and have DNA helicase activities. TIP49a has been suggested to act as an autoantigen in some patients with autoimmune diseases [].
RIN3, a member of the RIN (also known as Ras interaction/interference) family, has multifunctional domains including SH2 and proline-rich (PR) domains in the N-terminal region, and RIN-family homology (RH), VPS9 and Ras-association (RA) domains in the C-terminal region. RIN proteins function as Rab5-GEFs. RIN3 stimulates the formation of GTP-bound Rab31, a Rab5-subfamily GTPase, and forms enlarged vesicles and tubular structures, where it colocalizes with Rab31. Transferrin appears to be transported partly through the RIN3-positive vesicles to early endosomes. RIN3 interacts via its Pro-rich domain with amphiphysin II, which contains an SH3 domain and participates in receptor-mediated endocytosis. RIN3, a Rab5 and Rab31 GEF, plays an important role in the transport pathway from the plasma membrane to early endosomes. Mutations in the region between the SH2 and RH domains of RIN3 specifically abolished its GEF action on Rab31, but not on Rab5. RIN3 was also found to partially translocate the cation-dependent mannose 6-phosphate receptor from the trans-Golgi network to peripheral vesicles, and this translocation depends on its Rab31-GEF activity. These data indicate that RIN3 specifically acts as a GEF for Rab31 []. This entry represents the SH2 domain of RIN3.
Proteins in this entry are EutA ethanolamine utilization proteins, reactivating factors for ethanolamine ammonia lyase, encoded by the ethanolamine utilization eut operon []. The holoenzyme of adenosylcobalamin-dependent ethanolamine ammonia-lyase (EutBC, , ), which is part of the ethanolamine utilization pathway [, , ], undergoes suicidal inactivation during catalysis as well as inactivation in the absence of substrate. The inactivation involves the irreversible cleavage of the Co-C bond of the coenzyme. The inactivated holoenzyme undergoes rapid and continuous reactivation in the presence of ATP, Mg2+, and free adenosylcobalamin in permeabilised cells (in situ), homogenate, and cell extracts of Escherichia coli. The EutA protein is essential for reactivation. It was demonstrated with purified recombinant EutA that both the suicidally inactivated and O2-inactivated holoethanolamine ammonia lyase underwent rapid reactivation in vitro by EutA in the presence of adenosylcobalamin, ATP, and Mg2+ []. The inactive enzyme-cyanocobalamin complex was also activated in situ and in vitro by EutA under the same conditions. Thus EutA is believed to be the only component of the reactivating factor for ethanolamine ammonia lyase. Reactivation and activation occur through the exchange of modified coenzyme for free intact adenosylcobalamin []. Bacteria that harbor the ethanolamine utilization pathway can use ethanolamine as a source of carbon and nitrogen. For more information on the ethanolamine utilization pathway, please see , .
Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells. Interleukin-11 (IL-11) is a pleiotropic cytokine that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of IL-11 []. The cytokine also possesses anti-inflammatory activity, and has been proposed as a therapeutic agent in the treatment of chronic inflammatory diseases, such as Crohn's disease and rheumatoid arthritis []. Although IL-11 was initially believed to be restricted to mammals, subsequent studies demonstrated it to be expressed in ray-finned fish [, ]. Two fish paralogues have now been identified, designated IL-11A and IL-11B []. This entry represents IL-11A, which is expressed in intestine, gills, spleen, head kidney, brain, skin and muscle []. However, unlike IL-11B, it is not expressed in peripheral blood leukocytes []. IL-11A is thought to play a role in the host immune response to infection [].
Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells. Interleukin-11 (IL-11) is a pleiotropic cytokine that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of IL-11 []. The cytokine also possesses anti-inflammatory activity, and has been proposed as a therapeutic agent in the treatment of chronic inflammatory diseases, such as Crohn's disease and rheumatoid arthritis []. Although IL-11 was initially believed to be restricted to mammals, subsequent studies demonstrated it to be expressed in ray-finned fish [, ]. Two fish paralogues have now been identified, designated IL-11A and IL-11B []. This entry represents IL-11B, which is expressed at high levels in peripheral blood leukocytes []. It is thought to play a role in the response to bacterial infection, and may be involved in antiviral responses [].
The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain, a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved [, ]. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 16 (also known as nerve growth factor receptor (NGFR) and p75NTR) acts as a low-affinity receptor for neurotrophins. The receptor mediates a variety of contradictory cellular functions, including cell survival or apoptosis, promotion or inhibition of axonal growth, and facilitation or attenuation of proliferation, depending on the cellular context []. The receptor may also play a role in inflammation, and has been implicated in the pathogenesis of asthma []. A single partial match was also found, , a translated human cDNA sequence that fails to match motifs 1 and 2.
The tify domain is a 36-amino acid domain found only among Embryophyta (land plants). It has been named after the most conserved amino acid pattern (TIF[F/Y]XG) it contains, but was previously known as the Zim domain. As the use of uppercase characters (TIFY) might imply that the domain is fully conserved across proteins, lowercase lettering has been chosen in an attempt to highlight the reality of its natural variability. Based on the domain architecture, tify domain-containing proteins can be classified into two groups. Group I is formed by proteins possessing a CCT (CONSTANS, CO-like, and TOC1) domain and a GATA-type zinc finger in addition to the tify domain. Group II contains proteins characterised by the tify domain but lacking a GATA-type zinc finger. Tify domain-containing proteins might be involved in developmental processes, and some of them have features that are characteristic of transcription factors: a nuclear localisation and the presence of a putative DNA-binding domain []. Proteins known to contain a tify domain include: Arabidopsis thaliana GATA transcription factors (Zinc-finger protein expressed in Inflorescence Meristem, ZIM), putative transcription factors involved in inflorescence and flower development [, ]; A. thaliana ZIM-like proteins (ZML) []; and A. thaliana Protein TIFY 1-11 [].
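The core pattern that gives the tify domain its name, TIF[F/Y]XG, can be written as a regular expression for a quick scan of a protein sequence. The sketch below assumes X means any single residue and matches only the strict core pattern; as the entry itself stresses, the motif varies in nature, so a strict match will miss divergent members and is no substitute for a profile-based search. The function name find_tify_motifs and the example sequences are hypothetical.

```python
import re

# Strict form of the core pattern from the text: T-I-F-[F or Y]-any-G.
TIFY_CORE = re.compile(r"TIF[FY].G")


def find_tify_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched residues) for each strict core match."""
    return [(m.start(), m.group()) for m in TIFY_CORE.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    for seq in ("MAATIFYGGKV", "MKTIFAAG"):
        print(seq, find_tify_motifs(seq))
```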
There are two forms of Pix proteins: alpha Pix (also called Rho guanine nucleotide exchange factor (GEF) 6, p90Cool-2 or ARHGEF6) and beta Pix (GEF7, p85Cool-1 or ARHGEF7), which activate small GTPases by exchanging bound GDP for free GTP. betaPix contains an N-terminal SH3 domain, a RhoGEF/DH domain, a PH domain, a GIT1 binding domain (GBD), and a C-terminal coiled-coil (CC) domain []. It acts as a GEF for both Cdc42 and Rac1 [], and plays important roles in regulating neuroendocrine exocytosis, focal adhesion maturation, cell migration, synaptic vesicle localization, and insulin secretion [, , , ]. alphaPix differs in that it contains a calponin homology (CH) domain, which interacts with beta-parvin, N-terminal to the SH3 domain. alphaPix is an exchange factor for Rac1 and Cdc42 and mediates Pak activation on cell adhesion to fibronectin. Mutations in alphaPix can cause X-linked mental retardation. alphaPix also interacts with Huntington's disease protein (htt), and enhances the aggregation of mutant htt (muthtt) by facilitating SDS-soluble muthtt-muthtt interactions. The DH-PH domain of alphaPix is required for its binding to htt. In the majority of Rho GEF proteins, the DH-PH domain is responsible for the exchange activity [, , , , ]. This entry represents the PH domain of ARHGEF6 and ARHGEF7.
Deoxythymidine diphosphate (dTDP)-4-keto-6-deoxy-D-hexulose 3,5-epimerase (RmlC, ) is involved in the biosynthesis of dTDP-L-rhamnose, an essential component of the bacterial cell wall; it converts dTDP-4-keto-6-deoxy-D-glucose to dTDP-4-keto-L-rhamnose. The crystal structure of RmlC from Methanobacterium thermoautotrophicum was determined in the presence and absence of a substrate analogue. RmlC is a homodimer comprising a central jelly roll motif, which extends in two directions into longer β-sheets. Binding of dTDP is stabilised by ionic interactions with the phosphate group and by a combination of ionic and hydrophobic interactions with the base. The active site, which is located in the centre of the jelly roll, is formed by residues that are conserved in all known RmlC sequence homologues. The active site is lined with a number of charged residues and a number of residues with hydrogen-bonding potential, which together comprise a potential network for substrate binding and catalysis. The active site is also lined with aromatic residues, which provide favorable environments for the base moiety of dTDP and potentially for the sugar moiety of the substrate []. Also included in this family is the probable dTDP-4-oxo-2,6-dideoxy-D-glucose 3,5-epimerase from Streptomyces antibioticus, which is required for the biosynthesis of the aglycone antibiotic oleandomycin [].
Members of this family of short peptides are precursors to thiolactone (unless Cys is replaced by Ser) cyclic autoinducer peptides, used in quorum-sensing systems in Gram-positive bacteria. The best characterised is the AgrD precursor, processed by the AgrB protein. Nearby proteins regularly encountered include a histidine kinase and a response regulator. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector, RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr [].
Saccharomyces cerevisiae strains containing the erg8-1 mutation are temperature sensitive for growth due to a defect in phosphomevalonate kinase, an enzyme of isoprene and ergosterol biosynthesis. Subcloning and DNA sequencing have defined the functional ERG8 regulon as an 850 bp upstream region and an adjacent 1,272 bp open reading frame. The deduced ERG8 protein contains 424 residues and shows no similarity to known proteins, except within a putative ATP-binding domain present in many kinases []. Enzymes that share the N-terminal Gly/Ser-rich putative ATP-binding region include galactokinase, homoserine kinase, mevalonate kinase and phosphomevalonate kinase. Homoserine kinase is a homodimeric enzyme involved in threonine biosynthesis. Sequence comparison of the yeast enzyme with the corresponding proteins from bacterial sources reveals the presence of several highly conserved regions, the pattern of occurrence of which suggests that the ancestral sequences might have been composed from separate (functional) domains. A block of similar residues, found towards the C terminus, is also present in many other proteins involved in threonine (or serine) metabolism; this motif may therefore represent the binding site for the hydroxy-amino acids. Limited similarity was detected between a motif conserved among the homoserine kinases and consensus sequences found in other mono- or dinucleotide-binding proteins [].
Chloroperoxidase (CPO), also known as heme haloperoxidase, is a ~250-residue heme-containing glycoprotein that is secreted by various fungi. Chloroperoxidase was first identified in Caldariomyces fumago, where it catalyses the hydrogen peroxide-dependent chlorination of cyclopentanedione during the biosynthesis of the antibiotic caldariomycin. Additionally, heme haloperoxidase catalyses the iodination and bromination of a wide range of substrates. Besides performing H2O2-dependent halogenation reactions, the enzyme catalyses dehydrogenation reactions. Chloroperoxidase also functions as a catalase, facilitating the decomposition of hydrogen peroxide to oxygen and water. Furthermore, chloroperoxidase catalyses P450-like oxygen insertion reactions. The capability of chloroperoxidase to perform these diverse reactions makes it one of the most versatile of all known heme proteins [, ]. Despite functional similarities with other heme enzymes, chloroperoxidase folds into a novel tertiary structure dominated by eight helical segments []. Structurally, chloroperoxidase is unique, but it shares features with both peroxidases and P450 enzymes. As in cytochrome P450 enzymes, the proximal heme ligand is a cysteine, but, as in peroxidases, the distal side of the heme is polar. However, unlike other peroxidases, the normally conserved distal arginine is lacking and the catalytic acid-base is a glutamic acid and not a histidine [].
TssK is an essential baseplate component of the type VI secretion system (T6SS) that connects the membrane complex, the baseplate and the tail components [, , ]. The structure of TssK has been solved, revealing that the protein organises into a tightly packed trimer [, ]. Each TssK monomer comprises three domains: an N-terminal β-sandwich domain, a linker and a four-helix-bundle middle domain, and a C-terminal domain. The N-terminal domain of TssK is structurally homologous to the shoulder domain of phage receptor-binding proteins, and the C-terminal domain binds the T6SS membrane complex []. This family includes TssK proteins and the TssK homologues ImpJ and vasE. The T6SS is a supra-molecular bacterial complex that resembles phage tails. It is a toxin delivery system which fires toxins into target cells upon contraction of its TssBC sheath []. Thirteen essential core proteins are conserved in all T6SSs: the membrane associated complex TssJ-TssL-TssM, the baseplate proteins TssE, TssF, TssG, and TssK, the bacteriophage-related puncturing complex composed of the tube (Hcp), the tip/puncturing device VgrG, and the contractile sheath structure (TssB and TssC). Finally, the starfish-shaped dodecameric protein, TssA, limits contractile sheath polymerization at its distal part when TagA captures TssA [].
Avidin [] is a minor constituent of egg white in several groups of oviparous vertebrates. Avidin, which was discovered in the 1920s, takes its name from the avidity with which it binds biotin. These two molecules bind so strongly that it is extremely difficult to separate them. Streptavidin is a protein produced by Streptomyces avidinii which also binds biotin and whose sequence is evolutionarily related to that of avidin. Avidin and streptavidin both form homotetrameric complexes of noncovalently associated chains. Each chain forms a very strong and specific non-covalent complex with one molecule of biotin. The three-dimensional structures of both streptavidin [, ] and avidin [] have been determined and revealed them to share a common fold: an eight-stranded anti-parallel β-barrel with a repeated +1 topology enclosing an internal ligand binding site. Fibropellins I and III [] are proteins that form the apical lamina of the sea urchin embryo, a component of the extracellular matrix. These two proteins have a modular structure composed of a CUB domain, followed by a variable number of EGF repeats and a C-terminal avidin-like domain.
This region is found in Connector enhancer of kinase suppressor of ras 2 (CNK2), Connector enhancer of kinase suppressor of ras 3 (CNK3) and CNK3/IPCEF1 fusion protein. This domain of unknown function is situated between the PDZ and PH domains in CNK2 and CNK3/IPCEF1 proteins and after the PDZ domain in CNK3 (which does not have a PH domain). CNK2 is predominantly expressed in neural tissues, where it is critical for postsynaptic density morphology, and it has been implicated in X-linked intellectual disability (ID). CNK2 was first described as a regulator of Ras/MAPK signalling through binding to the Ras effector RAF; further studies concluded that it acts as a scaffold for multiple signalling cascades. It is able to direct the localisation of regulatory proteins within the cell and influences the behaviour of important regulatory molecules [, ]. CNK3 regulates aldosterone-induced and epithelial sodium channel (ENaC)-mediated sodium transport through regulation of ENaC cell surface expression, acting as a scaffold protein that coordinates the assembly of an ENaC-regulatory complex (ERC) []. CNK3/IPCEF1 is required for hepatocyte growth factor (HGF)-dependent activation of Arf6 and HGF-stimulated cell migration [].
PGC-1-related coactivator (PRC or PPRC1) was first identified as a transcriptional coactivator that shares structural and functional features with PGC-1alpha. It belongs to the PGC-1 family []. Similar to other PGC-1 members, PRC has a function in growth-regulated mitochondrial biogenesis. Unlike other PGC-1 family members, PRC mRNA is induced by serum growth factors in the absence of de novo protein synthesis, which places the PRC gene in a class of immediate early or primary response genes []. In the mouse germ line, knock-out of PRC causes early embryonic lethality []. PRC is also required for the induction of an inflammatory/stress response to multiple metabolic insults []. The PGC-1 family members, including PGC-1alpha, PGC-1beta and PRC, are transcriptional coactivators that regulate cell metabolic processes ranging from mitochondrial biogenesis to oxidative respiration []. They target NRF-1, a transcription factor that binds to a palindromic sequence in the cytochrome c promoter and is associated with the expression of many genes required for expression and function of the respiratory chain. They also bind to CREB (cAMP response element-binding protein) and ERRalpha (estrogen-related receptor alpha). However, knockout of individual components in the PGC-1 family can result in diverse phenotypes ranging from early embryonic lethality to complete viability with very mild effects on global mitochondrial content and function. They contain an N-terminal region with a nuclear receptor coactivator signature (LXXLL) and a C-terminal region with an RS domain and an RNA recognition motif [].
This domain covers the NSP16 region of the coronavirus polyprotein. It was originally named NSP13 and later changed to NSP16 to distinguish it from the helicase region []. NSP16 is a 7-methylguanine-triphosphate-adenosine (m7GpppA)-specific, SAM-dependent 2'-O-MTase that has selective RNA binding properties and is a cap-0 binding protein. It contains a highly conserved catalytic tetrad (K-D-K-E) that is a hallmark of RNA 2'-O-MTases [, , ]. NSP16 plays a key role in viral replication as it is involved in immune response evasion through the 2'-O-methylation of coronavirus RNA, which is essential for preventing recognition by the host. It mimics the human protein CMTr1 but, unlike CMTr1, it requires NSP10 as a binding partner to activate its enzymatic activity []. Structural analysis of NSP16 from SARS-CoV-2 identified a cryptic pocket (also present in other coronaviruses such as SARS-CoV-1 and MERS) that is not present in human CMTr1. This pocket, together with the fact that NSP16 is one of the most conserved proteins of SARS-CoV-2 and related viruses, makes it an interesting target for developing new antiviral treatments for COVID-19 and other diseases caused by coronaviruses [].
This entry represents a predicted periplasmic protein, called MoxJ or MxaJ, that is required for methanol oxidation in Methylobacterium extorquens []. There are two differing opinions on the role of this protein. The first is based on the homology of MoxJ, and suggests that it is the substrate-binding protein of an ABC transporter associated with methanol oxidation. The gene is also found in genomes with, and only two or three genes away from, a corresponding permease and ATP-binding cassette gene pair. The other opinion is that MoxJ is an accessory factor or additional subunit of methanol dehydrogenase itself []. Mutational studies show that expression of the PQQ-dependent two-subunit methanol dehydrogenase (MxaF and MxaI) in Methylobacterium extorquens depends on MoxJ, which possibly acts as a chaperone for enzyme assembly or as a third subunit. A homologous N-terminal sequence was found in Acetobacter methanolicus as a 32 kDa third subunit []. It is thought that MoxJ may in fact be both: a component of a periplasmic enzyme that converts methanol to formaldehyde, and a component of an ABC transporter that delivers the resulting formaldehyde to the cell's interior.
This family represents Kae1 and its homologues from Archaea. They belong to the Kae1/TsaD family. Its partner kinase Bud32 is fused with it in about half of the known archaeal genomes []. The pair, which appears universal in the archaea, corresponds to the EKC/KEOPS complex in eukaryotes []. The first characterised member of the Kae1/TsaD family was annotated as Gcp for O-sialoglycoprotein endopeptidase [], but this activity could not be confirmed []. Later, its homologue, Kae1 from Pyrococcus abyssi, was shown to have DNA-binding properties and apurinic-endonuclease activity []. Members of this family have since been studied in yeast, archaea and bacteria, resulting in sometimes conflicting data and several proposed functions and annotations, but no definitive characterisation. For instance, some members have been linked to DNA maintenance in bacteria and mitochondria [] and to transcription regulation and telomere homeostasis in eukaryotes [, ], but their function remained unclear. Recent research indicates that this family is involved in the biosynthesis of N6-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs that read codons beginning with adenine [].
The KEOPS/EKC complex is a tRNA modification complex involved in the biosynthesis of N6-threonylcarbamoyladenosine (t6A), a universally conserved tRNA modification found on ANN-codon recognizing tRNAs []. In archaea and eukaryotes, KEOPS is composed of Kae1, a highly conserved protein present in bacteria, archaea and eukaryotes, the kinase PRPK/Bud32, and three additional polypeptides (Pcc1, Cgi121, and Gon7) [, ]. This family represents Kae1 and its homologues from archaea and eukaryotes. The first characterised member of the Kae1/TsaD family was annotated as Gcp for O-sialoglycoprotein endopeptidase [], but this activity could not be confirmed []. Later, its homologue, Kae1 from Pyrococcus abyssi, was shown to have DNA-binding properties and apurinic-endonuclease activity []. Members of this family have since been studied in yeast, archaea and bacteria, resulting in sometimes conflicting data and several proposed functions and annotations, but no definitive characterisation. For instance, some members have been linked to DNA maintenance in bacteria and mitochondria [] and to transcription regulation and telomere homeostasis in eukaryotes [, ], but their function remained unclear. Recent research indicates that this family is involved in the biosynthesis of N6-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs that read codons beginning with adenine [].
Gelsolin is an actin-modulating protein that severs F-actin, caps the barbed ends of actin filaments preventing monomer exchange, and promotes the nucleation step of actin polymerisation [, ]. It can be regulated by Ca2+ and phosphoinositides []. The interaction between gelsolin and tropomyosin modulates actin dynamics []. Gelsolin also plays a role in ciliogenesis []. The structure of gelsolin has been solved []. Villin is an actin-binding protein that is found in a variety of tissues. It is able to bind to the barbed end of actin filaments with high affinity and can sever filaments []. In addition, villin's activity is important for actin bundling in certain cell types []. It was first isolated as a major component of the core of intestinal microvilli []. The villin/gelsolin family includes other actin-binding proteins such as severin and supervillin []. Six large repeating segments occur in gelsolin and villin, and three similar segments in severin and fragmin. While the multiple repeats have yet to be related to any known function of the actin-severing proteins, the superfamily appears to have evolved from an ancestral sequence of 120 to 130 amino acid residues [].
Programmed cell death 10 protein (PDCD10/CCM3) is part of the CCM complex and is required for neuronal migration []. It also has roles outside of this complex []: it is crucial in vascularization and in angiogenesis, as it functions in vessel permeability and stability []. PDCD10/CCM3 was originally discovered to be upregulated during granulocyte apoptosis and is thought to play a role in cell death []. However, a specific role for PDCD10 in cell survival is not clear, as both pro-survival and pro-apoptotic effects have been reported [, ]. PDCD10/CCM3 contains an N-terminal dimerisation domain and a C-terminal focal adhesion targeting-homology (FAT-H) domain []. There are three CCM proteins: CCM1 (also known as KRIT1), CCM2 (also known as OSM and malcavernin) and CCM3 (also known as PDCD10). Mutations in the genes encoding CCM proteins cause cerebral cavernous malformations (CCMs), a disease characterised by dilated leaky blood vessels, especially in the neurovasculature, that result in increased risk of stroke, focal neurological defects and seizures. The CCM proteins can form a trimeric complex. They can also interact with a range of signaling, cytoskeletal and adaptor proteins that may account for their roles in a range of basic cellular processes including cell adhesion, migration, polarity and apoptosis [].
Formylmethanofuran:tetrahydromethanopterin formyltransferase (Ftr) is involved in C1 metabolism in methanogenic archaea, sulphate-reducing archaea and methylotrophic bacteria. It catalyses the following reversible reaction: N-formylmethanofuran + 5,6,7,8-tetrahydromethanopterin = methanofuran + 5-formyl-5,6,7,8-tetrahydromethanopterin. Ftr from the thermophilic methanogen Methanopyrus kandleri (optimum growth temperature 98 degrees C) is a hyperthermophilic enzyme that is absolutely dependent on the presence of lyotropic salts for activity and thermostability. The crystal structure of Ftr reveals a homotetramer composed essentially of two dimers. Each subunit is subdivided into two tightly associated lobes, both consisting of a predominantly antiparallel beta sheet flanked by alpha helices forming an alpha/beta sandwich structure. The approximate location of the active site was detected in a region close to the dimer interface []. Ftr from the mesophilic methanogen Methanosarcina barkeri and from the sulphate-reducing archaeon Archaeoglobus fulgidus have a similar structure []. In the methylotrophic bacterium Methylobacterium extorquens, Ftr interacts with three other polypeptides to form an Ftr/cyclohydrolase complex which catalyses the hydrolysis of formyl-tetrahydromethanopterin to formate during growth on C1 substrates []. This entry represents the ferredoxin-like Ftr N-terminal domain.
The members of this family are similar to gene products 9 (gp9) and 10 (gp10) of bacteriophage T4. Both proteins are components of the viral baseplate []. Gp9 connects the long tail fibres of the virus to the baseplate and triggers tail contraction after viral attachment to a host cell. The protein is active as a trimer, with each monomer being composed of three domains. The N-terminal domain consists of an extended polypeptide chain and two alpha helices. The alpha1 helix from each of the three monomers in the trimer interacts with its counterparts to form a coiled-coil structure. The middle domain is a seven-stranded β-sandwich that is thought to be a novel protein fold. The C-terminal domain is thought to be essential for gp9 trimerisation and is organised into an eight-stranded antiparallel β-barrel, which was found to resemble the 'jelly roll' fold found in many viral capsid proteins. The long flexible region between the N-terminal and middle domains may be required for the function of gp9 to transmit signals from the long tail fibres []. Together with gp11, gp10 initiates the assembly of wedges that then go on to associate with a hub to form the viral baseplate [].
This domain (LRAT domain) is found in a variety of proteins, including lecithin retinol acyltransferase (LRAT), HRAS-like suppressors (HRASLS1-5) and proteins FAM84A and FAM84B. Acyltransferase LRAT is the main enzyme that catalyzes vitamin A esterification []. HRASLS enzymes are also referred to as LRAT-like proteins because of their sequence homology to LRAT []. The basic structural motif of the LRAT domain is composed of a four-strand antiparallel β-sheet and three α-helices. The longest α-helix (alpha3) is packed against the β-sheet, and the two other shorter α-helices are located on the sides. A highly conserved catalytic Cys, identified as the acylation site, is located near the N terminus of alpha3. This arrangement defines the active site location, which is embedded into a well defined groove formed by the extended loops between beta1-beta2, beta3-beta4, and the N terminus of the alpha3 helix. The side chain of the Cys is packed against the β-sheet core of the domain, placing it in close proximity to a conserved His from the beta2 strand. The β-sheet is spread open on one end, allowing formation of a hydrogen bond between the His and the Cys. The third member of this catalytic triad is a polar residue in the neighboring beta3 strand. The Cys residue was shown to act as a nucleophile and form a covalent thiol-acyl intermediate in the catalytic process [, , ].
The host-selective, cysteine-rich necrotrophic effector Tox1 (SnTox1) found in Parastagonospora nodorum contains 6 cysteine residues, a common feature of some fungal avirulence effectors such as the Avr and ECP effectors from Cladosporium fulvum. The high content of cysteine residues and high stability suggest that SnTox1 may function in the plant apoplastic space, which is abundant in plant defense components. Protein sequence analysis indicates that SnTox1 contains a C-terminal chitin binding (CB)-like motif. Three-dimensional (3D) structure-based sequence alignment suggested that the putative CB motif in SnTox1 was more similar to those of plant-specific ChtBDs than to Avr4 proteins, which are related to invertebrate ChtBDs. Furthermore, SnTox1 contained all secondary-structure-related residues including the strictly conserved β-strand-forming 'CCS' motif found only in plant-specific ChtBD1 proteins [, ]. It interacts with the host Snn1 protein, conferring susceptibility. Binding of SnTox1 to chitin in the fungal cell wall protects the pathogen from chitinase activity. This entry represents the putative chitin binding-like domain found in SnTox1 from Parastagonospora nodorum.
In general, ModA proteins serve as initial receptors in the ABC transport of molybdate in eubacteria and archaea. Bacteria and archaea import molybdenum and tungsten from the environment in the form of the oxyanions molybdate (MoO₄²⁻) and tungstate (WO₄²⁻). After binding molybdate with high affinity, they interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPases. This interaction triggers ligand translocation across the cytoplasmic membrane, energized by ATP hydrolysis []. In contrast to the structures of the two ModA homologues from Escherichia coli and Azotobacter vinelandii, where the oxygen atoms are tetrahedrally arranged around the metal centre, the structure of Pyrococcus furiosus ModA/WtpA (PfModA) has shown a binding site for molybdate and tungstate in which the central metal atom is in a hexacoordinate configuration. This octahedral geometry was rather unexpected []. The ModA proteins belong to the PBP2 superfamily of periplasmic binding proteins that differ in size and ligand specificity, but have similar tertiary structures consisting of two globular subdomains connected by a flexible hinge. They have been shown to bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap [, , ]. This entry represents a domain found in the ModA protein from Azotobacter vinelandii and its homologues.
Spiders are widely acknowledged to produce potent and selective toxins. In addition to the conventional neurotoxins and cytotoxins, venom of lynx spiders (genus Oxyopes) was found to contain two-domain modular toxins named spiderines: OspTx1a, 1b, 2a and 2b [, , ]. Spiderines consist of two distinct modules separated by a short linker. The N-terminal part (~40 residues) contains no cysteine residues, is highly cationic, forms amphipathic alpha-helical structures in a membrane-mimicking environment, and shows potent cytolytic effects on cells of various origins. The short linker closely resembles the processing quadruplet motif (PQM), which is known to indicate the processing cleavage site in precursors of spider toxins and to separate the prosequence from the mature chain. The C-terminal part (~60 residues) is a disulfide-rich domain reticulated by five S-S bridges that is homologous to one-domain oxytoxins (OxyTx1 and OxyTx2) from Oxyopes species. Oxytoxins are disulfide-rich polypeptides that contain five disulfide bridges and block L-, N- and P/Q-type voltage-sensitive calcium ion channels (VSCCs) []. The core of the oxytoxin-like domain is the inhibitor cystine knot (ICK) or knottin motif. The domain is stabilised by five disulfides and 13 hydrogen bonds. Two antiparallel β-strands form a short β-sheet, and there are two β-turns in the N-terminal part of the domain. The C1-C5, C2-C6, and C4-C9 disulfides contribute to the ICK motif, whereas C7-C8 stabilises the extended loop of the β-hairpin and C3-C10 staples the lengthy C-terminal of the domain to its core []. This entry represents the oxytoxin-type ICK domain.
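The disulfide connectivity described above can be written down as a small data structure and sanity-checked. The following is a minimal sketch, assuming the ten cysteines are numbered C1 to C10 in sequence order as in the text; the names `DISULFIDES`, `ICK_DISULFIDES` and `each_cysteine_paired_once` are illustrative and do not come from any database or library.

```python
# Disulfide pairs as listed in the entry text, with cysteines numbered
# C1..C10 in order of appearance in the oxytoxin-type domain.
DISULFIDES = [(1, 5), (2, 6), (4, 9), (7, 8), (3, 10)]

# The first three pairs are the ones the text assigns to the ICK motif.
ICK_DISULFIDES = DISULFIDES[:3]


def each_cysteine_paired_once(pairs, n_cys=10):
    """Return True if every cysteine 1..n_cys appears in exactly one pair."""
    used = sorted(c for pair in pairs for c in pair)
    return used == list(range(1, n_cys + 1))


if __name__ == "__main__":
    print(len(DISULFIDES), "disulfides;",
          "consistent pairing:", each_cysteine_paired_once(DISULFIDES))
```

The check only confirms that the five stated bridges account for all ten cysteines exactly once; it says nothing about the three-dimensional knot topology.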
This entry represents the RNA recognition motif (RRM) of synaptojanin-1. Synaptojanin-1 was originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis []. It also acts as an Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase, with a putative role in clathrin-mediated endocytosis [, ]. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p [], a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2 []. Synaptojanin-1 has two tissue-specific alternative splicing isoforms, synaptojanin-145 expressed in brain and synaptojanin-170 expressed in peripheral tissues. Synaptojanin-145 is very abundant in nerve terminals and may play an essential role in the clathrin-mediated endocytosis of synaptic vesicles []. In contrast to synaptojanin-145, synaptojanin-170 contains three unique asparagine-proline-phenylalanine (NPF) motifs in the C-terminal region, and may function as a potential binding partner for Eps15, a clathrin coat-associated protein acting as a major substrate for the tyrosine kinase activity of the epidermal growth factor receptor [].
Synaptojanin-1 was originally identified as one of the major Grb2-binding proteins that may participate in synaptic vesicle endocytosis []. It also acts as an Src homology 3 (SH3) domain-binding brain-specific inositol 5-phosphatase, with a putative role in clathrin-mediated endocytosis [, ]. Synaptojanin-1 contains an N-terminal domain homologous to the cytoplasmic portion of the yeast protein Sac1p [], a central inositol 5-phosphatase domain followed by a putative RNA recognition motif (RRM), and a C-terminal proline-rich region mediating the binding of synaptojanin-1 to various SH3 domain-containing proteins including amphiphysin, SH3p4, SH3p8, SH3p13, and Grb2 []. Synaptojanin-1 has two tissue-specific alternative splicing isoforms, synaptojanin-145 expressed in brain and synaptojanin-170 expressed in peripheral tissues. Synaptojanin-145 is very abundant in nerve terminals and may play an essential role in the clathrin-mediated endocytosis of synaptic vesicles []. In contrast to synaptojanin-145, synaptojanin-170 contains three unique asparagine-proline-phenylalanine (NPF) motifs in the C-terminal region, and may function as a potential binding partner for Eps15, a clathrin coat-associated protein acting as a major substrate for the tyrosine kinase activity of the epidermal growth factor receptor [].
This entry represents the single capsid protein of infectious hypodermal and haematopoietic necrosis virus (IHHNV), found particularly in shrimp densovirus. Densoviruses are a subfamily of the parvoviruses. The capsid protein has an eight-stranded anti-parallel β-barrel 'jelly roll' motif similar to that found in many icosahedral viruses, including other parvoviruses. The N-terminal portion of the IHHNV coat protein adopts a 'domain-swapped' conformation relative to its twofold-related neighbour. The loops connecting the strands of the structurally conserved jelly roll motif differ considerably in structure and length from those of other parvoviruses. IHHNV was first reported as a highly lethal disease of juvenile shrimp in 1983, and has only one type of capsid protein, which lacks the phospholipase A2 activity that has been implicated as a requirement during parvoviral host cell infection. Recombinant virus-like particles are composed of 60 copies of the 37.5 kDa coat protein, which is the smallest parvoviral capsid protein reported thus far. The small size of the PstDNV capsid protein makes the system attractive as a model for studying assembly mechanisms of icosahedral virus capsids [].
Interleukins (IL) are a group of cytokines that play an important role in the immune system. They modulate inflammation and immunity by regulating growth, mobility and differentiation of lymphoid and other cells. Interleukin-11 (IL-11) is a pleiotropic cytokine that stimulates megakaryocytopoiesis, resulting in increased production of platelets, as well as activating osteoclasts, inhibiting epithelial cell proliferation and apoptosis, and inhibiting macrophage mediator production. These functions may be particularly important in mediating the hematopoietic, osseous and mucosal protective effects of IL-11 []. The cytokine also possesses anti-inflammatory activity, and has been proposed as a therapeutic agent in the treatment of chronic inflammatory diseases, such as Crohn's disease and rheumatoid arthritis []. Although IL-11 was initially believed to be restricted to mammals, subsequent studies demonstrated it to be expressed in fish [, ]. Despite close similarity in gene structure and conservation of key amino acids between fish and mammalian IL-11, they share relatively low overall amino acid identity and may not necessarily be functionally analogous []. This entry represents interleukin-11 found in mammals.
The tumour necrosis factor (TNF) receptor (TNFR) superfamily comprises more than 20 type-I transmembrane proteins. Family members are defined based on similarity in their extracellular domain - a region that contains many cysteine residues arranged in a specific repetitive pattern []. The cysteines allow formation of an extended rod-like structure, responsible for ligand binding []. Upon receptor activation, different intracellular signalling complexes are assembled for different members of the TNFR superfamily, depending on their intracellular domains and sequences []. Activation of TNFRs can therefore induce a range of disparate effects, including cell proliferation, differentiation, survival, or apoptotic cell death, depending upon the receptor involved [, ]. TNFRs are widely distributed and play important roles in many crucial biological processes, such as lymphoid and neuronal development, innate and adaptive immunity, and maintenance of cellular homeostasis []. Drugs that manipulate their signalling have potential roles in the prevention and treatment of many diseases, such as viral infections, coronary heart disease, transplant rejection, and immune disease []. TNF receptor 8 (also known as CD30 and Ki-1 antigen) was originally described as a marker of Hodgkin's and Reed-Sternberg cells in Hodgkin's lymphoma. Expression of the receptor is largely restricted to virus-infected lymphocytes, neoplasms of lymphoid origin and a subset of activated T cells that produce Th2-type cytokines []. The receptor has pleiotropic biological functions, including induction of apoptosis and enhancement of cell survival [].
Dual specificity phosphatases (DUSPs) are members of the superfamily of protein tyrosine phosphatases [, ]. They remove the phosphate group from both phospho-tyrosine and phospho-serine/threonine residues. They are structurally similar to tyrosine-specific phosphatases but with a shallower active site cleft and a distinctive active site signature motif, HCxxGxxR [, , ]. They are characterized as VHR- [, ] or Cdc25-like [, ]. In general, DUSPs are classified into the following subgroups []: Slingshot phosphatases; Phosphatase of regenerating liver (PRL); Cdc14 phosphatases; Phosphatase and tensin homologue deleted on chromosome 10 (PTEN)-like and myotubularin phosphatases; Mitogen-activated protein kinase phosphatases (MKPs); Atypical DUSPs. The atypical DUSPs share a high degree of similarity with the MKP subgroup, but lack the N-terminal regulatory domain, which provides the substrate specificity towards the MAP kinases. These atypical DUSPs form a heterogeneous group and have in common the presence of a single catalytic PTP domain. VHR was the first characterised member of this subfamily; its crystal structure is known [, ]. The function of many atypical DUSPs remains unknown, although some have been related to regulation of MAP kinase pathways [, , ]. VHR has also been related to the control of cell senescence []. The atypical DUSPs can be subdivided into two groups (termed A and B) on the basis of sequence similarity. Each of these subgroups is characterised by its own distinctive set of motifs, the functions of which are as yet unknown. This entry represents atypical dual specificity phosphatase subfamily A.
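The active site signature HCxxGxxR mentioned above can be expressed as a regular expression, reading each x as any single residue. The following is a minimal sketch under that assumption; the function name `find_dusp_signature` and the toy sequence are hypothetical and chosen only for illustration, and this pattern is not the curated signature used by any database.

```python
import re

# HCxxGxxR: His, Cys, any two residues, Gly, any two residues, Arg.
DUSP_SIGNATURE = re.compile(r"HC[A-Z]{2}G[A-Z]{2}R")


def find_dusp_signature(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each HCxxGxxR hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in DUSP_SIGNATURE.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MKTAVHCAAGVSRSLEW"
    print(find_dusp_signature(toy))
```

A match only indicates that the eight-residue pattern is present; it does not by itself establish that a sequence is a phosphatase.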
The ERM family consists of three closely-related proteins, ezrin, radixin and moesin [, ]. Ezrin was first identified as a constituent of microvilli, radixin as a barbed, end-capping actin-modulating protein from isolated junctional fractions, and moesin as a heparin-binding protein []. ERM proteins crosslink actin filaments with plasma membranes. They co-localise with CD44 at actin filament plasma membrane interaction sites, associating with CD44 via their N-terminal domains and with actin filaments via their C-terminal domains []. A tumour suppressor molecule responsible for neurofibromatosis type 2 (NF2) is highly similar to ERM proteins and has been designated merlin (moesin-ezrin-radixin-like protein) []. ERM molecules contain 3 domains: an N-terminal globular domain, an extended α-helical domain and a charged C-terminal domain []. Ezrin, radixin and merlin also contain a polyproline linker region between the helical and C-terminal domains. The N-terminal domain is highly conserved and is also found in merlin, band 4.1 proteins and members of the band 4.1 superfamily, designated the FERM domain. This entry represents the α-helical domain, which is involved in intramolecular masking of protein-protein interaction sites, which regulates the activity of these proteins [, ].
The group of polyomaviruses is formed by the homonymous murine virus (Py) as well as other representative members such as the simian virus 40 (SV40) and the human BK and JC viruses []. Their large T antigen (T-ag) protein binds to and activates DNA replication from the origin of DNA replication (ori). Insofar as is known, the T-ag binds to the origin first as a monomer to its pentanucleotide recognition element. The monomers are then thought to assemble into hexamers and double hexamers, which constitute the form that is active in initiation of DNA replication. When bound to the ori, T-ag double hexamers encircle DNA []. T-ag is a multidomain protein that contains an N-terminal J domain, which mediates protein interactions, a central origin-binding domain (OBD), and a C-terminal superfamily 3 helicase domain []. This entry represents the helicase domain of LTag, which assembles into a hexameric structure containing a positively charged central channel that can bind both single- and double-stranded DNA []. ATP binding and hydrolysis trigger large conformational changes which are thought to be coupled to the melting of origin DNA and the unwinding of duplex DNA []. These conformational changes cause the angles and orientations between regions of a monomer to alter, creating what was described as an "iris"-like motion in the hexamer. In addition, six beta hairpins on the channel surface move longitudinally along the central channel, possibly serving as a motor for pulling DNA into the LTag double hexamer for unwinding.
Chorismate synthase (CS; 5-enolpyruvylshikimate-3-phosphate phospholyase; 1-carboxyvinyl-3-phosphoshikimate phosphate-lyase; EC 4.2.3.5) catalyzes the seventh and final step in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin cofactor (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation [, ]. It is a protein of about 360 to 400 amino-acid residues. Depending on their capacity to regenerate the reduced form of FMN, chorismate synthases are divided into two groups: enzymes, mostly from plants and eubacteria, that sequester reduced FMN from the cellular environment are monofunctional, while those that can generate reduced FMN at the expense of NADPH, such as the enzymes found in fungi and the flagellated protozoan Euglena gracilis, are bifunctional, having an additional NADPH:FMN oxidoreductase activity. Recently, bifunctionality of the Mycobacterium tuberculosis enzyme (MtCS) was determined by measurements of both chorismate synthase and NADH:FMN oxidoreductase activities. Since shikimate pathway enzymes are present in bacteria, fungi and apicomplexan parasites (such as Toxoplasma gondii, Plasmodium falciparum, and Cryptosporidium parvum) but absent in mammals, they are potentially attractive targets for the development of new therapies against infectious diseases such as tuberculosis (TB) [, , , , , , , , , ]. This entry represents conserved regions from chorismate synthase that are rich in basic residues.
This entry represents fructose-1,6-bisphosphatase (FBPase), a critical regulatory enzyme in gluconeogenesis that catalyses the removal of the 1-phosphate from fructose 1,6-bisphosphate to form fructose 6-phosphate [, ]. It is involved in many different metabolic pathways and is found in most organisms. FBPase requires metal ions for catalysis (Mg2+ and Mn2+ being preferred) and the enzyme is potently inhibited by Li+. The fold of fructose-1,6-bisphosphatase was noted to be identical to that of inositol-1-phosphatase (IMPase) []. Inositol polyphosphate 1-phosphatase (IPPase), IMPase and FBPase share a sequence motif (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) which has been shown to bind metal ions and participate in catalysis. This motif is also found in the distantly related fungal, bacterial and yeast IMPase homologues. It has been suggested that these proteins define an ancient structurally conserved family involved in diverse metabolic pathways, including inositol signalling, gluconeogenesis, sulphate assimilation and possibly quinone metabolism []. In mammalian FBPase, a lysine residue has been shown to be involved in the catalytic mechanism []. The region around this residue is highly conserved and can be used as a signature pattern for FBPase and sedoheptulose-1,7-bisphosphatase (SBPase), an enzyme found in plant chloroplasts and in photosynthetic bacteria that is functionally and structurally related to FBPase []. SBPase catalyses the hydrolysis of sedoheptulose 1,7-bisphosphate to sedoheptulose 7-phosphate, a step in the Calvin reductive pentose phosphate cycle. This signature contains the active site lysine; note, however, that in some bacterial FBPase sequences the active site lysine is replaced by an arginine.
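The shared metal-binding motif described above (Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser) can be written in one-letter code as a simple pattern. The following is a minimal sketch, assuming a plain regular-expression scan is an acceptable stand-in for a curated signature; the function name and the toy sequence are hypothetical and do not correspond to any real FBPase, IMPase or IPPase.

```python
import re

# Asp-Pro-Ile/Leu-Asp-Gly/Ser-Thr/Ser in one-letter amino-acid code.
METAL_BINDING_MOTIF = re.compile(r"DP[IL]D[GS][TS]")


def find_metal_binding_motif(sequence):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [
        (match.start() + 1, match.group())
        for match in METAL_BINDING_MOTIF.finditer(sequence.upper())
    ]


# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "MKAVLEDPLDGSSNIAAGDPIDSTWR"
hits = find_metal_binding_motif(toy_sequence)
print(hits)
```

A match only indicates that the six-residue pattern is present; it says nothing about fold or activity, which is why curated signatures are normally combined with structural or family-level evidence.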
Nuclear factor I (NF-I) or CCAAT box-binding transcription factor (CTF) [, , ] (also known as TGGCA-binding proteins) are a family of vertebrate nuclear proteins which recognise and bind, as dimers, the palindromic DNA sequence 5'-TGGCANNNTGCCA-3'. This family was first described for its role in stimulating the initiation of adenovirus DNA replication []. In vertebrates there are four members (NFIA, NFIB, NFIC and NFIX), and an orthologue from Caenorhabditis elegans has been described, called Nuclear factor I family protein (NFI-I) []. The CTF/NF-I proteins are individually capable of activating transcription and DNA replication, and thus function in regulating cell proliferation and differentiation. They are involved in normal development and have been associated with developmental abnormalities and cancer in humans []. In a given species, there are a large number of different CTF/NF-I proteins, generated both by alternative splicing and by the occurrence of four different genes. CTF/NF-I proteins contain 400 to 600 amino acids. The N-terminal 200 amino-acid sequence, almost perfectly conserved in all species and genes sequenced, mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The C-terminal 100 amino acids contain the transcriptional activation domain. This activation domain is the target of gene expression regulatory pathways elicited by growth factors, and it interacts with basal transcription factors and with histone H3 []. This entry represents the 200 amino-acid DNA-binding domain found at the N terminus of CTF/NF-I proteins. It mediates site-specific DNA recognition, protein dimerisation and adenovirus DNA replication. The CTF/NF-I DNA-binding domain contains four conserved Cys residues, which are required for its DNA-binding activity [].
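The palindromic character of the NF-I recognition sequence 5'-TGGCANNNTGCCA-3' can be checked directly: the two half-sites are reverse complements of each other, separated by three unspecified bases. The sketch below is illustrative only; the helper names and the toy DNA string are assumptions, and an exact-match scan ignores the degenerate sites that real binding proteins may tolerate.

```python
import re

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# 5'-TGGCANNNTGCCA-3', with N standing for any of the four bases.
NFI_SITE = re.compile(r"TGGCA[ACGT]{3}TGCCA")


def find_nfi_sites(dna):
    """Return (1-based start, matched text) for each non-overlapping site."""
    return [(m.start() + 1, m.group()) for m in NFI_SITE.finditer(dna.upper())]


# The right half-site is the reverse complement of the left half-site.
half_sites_pair = reverse_complement("TGGCA") == "TGCCA"

# Hypothetical toy sequence containing one full site.
toy_dna = "ccgaTGGCAactTGCCAttg"
sites = find_nfi_sites(toy_dna)
print(half_sites_pair, sites)
```

Because the half-sites are reverse complements, the same pattern is found when the opposite strand is scanned, which is consistent with the proteins binding the site as dimers.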
Although ATP is the most common phosphoryl group donor for kinases, certain hyperthermophilic archaea, such as Thermococcus litoralis and Pyrococcus furiosus, utilise unusual ADP-dependent glucokinases (ADPGKs) and phosphofructokinases (ADPPFKs) in their glycolytic pathways [, , ]. ADPGKs and ADPPFKs exhibit significant similarity and form an ADP-dependent kinase (ADPK) family, which was tentatively named the PFKC family []. A ~460-residue ADPK domain is also found in a bifunctional ADP-dependent gluco/phosphofructo-kinase (ADP-GK/PFK) from Methanocaldococcus jannaschii (Methanococcus jannaschii) as well as in homologous proteins present in several eukaryotes [, ]. Structure determination for eukaryotic ADPGK revealed an overall fold similar to that of archaeal orthologues, with some differences in secondary structural elements. In the nucleotide-binding loop of eukaryotic ADPGK there is a disulfide bond between conserved cysteines; one of the cysteines, which coordinates the AMP, defines a nucleotide-binding motif that appears to be unique to eukaryotic ADPGKs. Mammalian enzymes are specific for glucose []. The whole structure of the ADPK domain can be divided into large and small α/β subdomains. The larger subdomain, which carries the ADP binding site, consists of a twisted 12-stranded β-sheet flanked on both faces by 13 α-helices and three 3(10) helices, forming an α/β three-layer sandwich. The smaller subdomain, which covers the active site, forms an α/β two-layer structure containing five β-strands and four α-helices. The ADP molecule is buried in a shallow pocket in the large subdomain. The binding of substrate sugar induces a structural change, the small domain closing to form a complete substrate sugar binding site [, , ].
Dual specificity phosphatases (DUSPs) are members of the superfamily of protein tyrosine phosphatases [, ]. They remove the phosphate group from both phospho-tyrosine and phospho-serine/threonine residues. They are structurally similar to tyrosine-specific phosphatases but with a shallower active site cleft and a distinctive active site signature motif, HCxxGxxR [, , ]. They are characterized as VHR- [, ] or Cdc25-like [, ]. In general, DUSPs are classified into the following subgroups []: Slingshot phosphatases; Phosphatase of regenerating liver (PRL); Cdc14 phosphatases; Phosphatase and tensin homologue deleted on chromosome 10 (PTEN)-like and myotubularin phosphatases; Mitogen-activated protein kinase phosphatases (MKPs); and Atypical DUSPs. The atypical DUSPs share a high degree of similarity with the MKP subgroup, but lack the N-terminal regulatory domain, which provides the substrate specificity towards the MAP kinases. These atypical DUSPs form a heterogeneous group and have in common the presence of a single catalytic PTP domain. VHR was the first characterised member of this subfamily; its crystal structure is known [, ]. The function of many atypical DUSPs remains unknown, although some have been related to regulation of MAP kinase pathways [, , ]. VHR has also been related to the control of cell senescence []. The atypical DUSPs can be subdivided into two groups (termed A and B) on the basis of sequence similarity. Each of these subgroups is characterised by its own distinctive set of motifs, the functions of which are as yet unknown. This entry also includes DUSP1, which does not belong to the atypical DUSP family.
The homeobox domain or homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates [, ]. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two α-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. The motif is very similar in sequence and structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeotic proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the β-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, while for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
Homeodomain proteins are transcription factors that share a related DNA-binding homeodomain []. The homeodomain was initially identified in Drosophila melanogaster (Fruit fly) homeotic and segmentation proteins, but is well conserved throughout metazoans [, ]. The homeodomain binds DNA through a helix-turn-helix (HTH) structure, consisting of approximately 20 residues []. The HTH motif is composed of two α-helices that make intimate contacts with the DNA; the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions. These interactions occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure and is joined to the second through a short turn. Most proteins which contain a homeobox domain can be classified [, ], on the basis of their sequence characteristics, into three subfamilies: engrailed, antennapedia and paired. A number of different proteins contain homeodomains, including Drosophila engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and Hox proteins. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The homeodomain motif is very similar in sequence and structure to domains in other DNA-binding proteins, including recombinases, GARP response regulators, human telomeric protein, AraC type transcriptional activator and tetracycline repressor [, , ]. This entry represents a conserved region of some 20 amino-acid residues that is located at the C-terminal end of the 'homeobox' domain and serves as a signature pattern for this subfamily of proteins [].
Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila []. In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human []. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is also an NLS domain at residues 88-116, and a DNA-binding and dimerisation domain at residues 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity [].
The Histidine Triad (HIT) motif, His-x-His-x-His-x-x (x, a hydrophobic amino acid), was identified as being highly conserved in a variety of organisms []. The crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles []. Hint homologues, including rabbit Hint and yeast Hnt1, hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product, and function as positive regulators of Cdk7/Kin28 in vivo []. Fhit homologues are diadenosine polyphosphate hydrolases [] and function as tumour suppressors in human and mouse [], though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis []. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them []. The bovine protein kinase C inhibitor, PKCI-1, is an inhibitor protein that binds zinc without the use of zinc-finger motifs []. Each protein molecule binds one zinc ion via a novel binding site containing three closely spaced histidine residues []. This region, referred to as the histidine triad (HIT) [], has been identified in various prokaryotic and eukaryotic proteins of uncertain function []. The signature pattern used in this entry contains the region of the histidine triad and includes the three conserved histidine residues which are thought to bind the zinc ion.
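The His-x-His-x-His-x-x motif, with x a hydrophobic residue, can be illustrated as a residue-class pattern. This is only a sketch: the set of residues counted as hydrophobic below is an assumption made for illustration (the text does not define it), the function name and toy sequence are hypothetical, and the pattern is not the signature actually used by this entry.

```python
import re

# Assumed hydrophobic residue class; the source text does not enumerate it.
HYDROPHOBIC = "AVILMFWC"

# His-x-His-x-His-x-x, with every x drawn from the assumed hydrophobic class.
HIT_MOTIF = re.compile(
    rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motif(sequence):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in HIT_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "GSAHVHLHVLGGRQ"
hit_positions = find_hit_motif(toy_sequence)
print(hit_positions)
```

Widening or narrowing `HYDROPHOBIC` changes which sequences match, so any real use would need the residue class taken from a curated definition rather than from this assumption.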
This family includes plant serine/threonine receptor-like kinases related to CRINKLY4 (CR4), a protein involved in developmental processes in plant and endosperm that was first isolated in maize [, ]. Mutations in this protein affect cell wall thickness and structure, cuticle formation and vesicle trafficking, and cause tumor-like outgrowths, with similar effects seen in rice [, , ]. Arabidopsis thaliana contains an orthologue of CR4, ACR4, and four CRINKLY4-related proteins (CRR or CCR): AtCRR1, AtCRR2, AtCRR3 and AtCRR4 (also known as CRINKLY 4-related kinase 1, AtCRK1) []. Phylogenetic analysis showed that the CR4 family of receptor kinases can be divided into three clades: one including CR4, CCR1 and CCR2, a second including CCR3 and CCR4 family members, and a third, more distant clade including sequences from algae and Selaginella moellendorffii with transmembrane and/or kinase domains []. Kinase assays showed that ACR4 is an active serine/threonine kinase, while CCR1 and CCR2 are nearly inactive in autophosphorylation assays []. CR4 family members are characterised by the presence of seven 'crinkly' repeats in the extracellular part, which is required both for signalling and for normal protein internalisation; the repeats include a conserved C(X~10)CWG sequence motif. The Cys residues in the extracellular 'crinkly' repeat domain are likely to form stabilizing disulfide bridges []. Another feature of the CR4 family is that the extracellular domain shows homology to the three Cys-rich repeats of the TUMOR NECROSIS FACTOR RECEPTOR (TNFR) extracellular domain []. This family represents the CRINKLY4-related proteins CCR3 and CCR4.
This domain corresponds to the mature part of the Ecp2 effector protein from the tomato pathogen Cladosporium fulvum. Effectors are low molecular weight proteins that are secreted by bacteria, oomycetes and fungi to manipulate their hosts and adapt to their environment. Ecp2 is a 165 amino acid secreted protein that was originally identified as a virulence factor in C. fulvum, since its disruption reduces virulence of the fungus on tomato plants. It has recently been determined that Ecp2 is a member of a novel multigene superfamily that is widely distributed and highly diversified within the fungal kingdom and has been designated Hce2, for Homologs of C. fulvum Ecp2 effector. Although Ecp2 is present in most organisms as a small secreted protein, the mature part of this protein can be found fused to other protein domains, including the fungal Glycoside hydrolase family 18 and other, unknown, protein domains. The intrinsic function of Ecp2 remains unknown, but it is postulated to be a necrosis-inducing factor in plants that contributes to pathogenicity on the host [].
Eukaryotic NAC, an abundant heterodimer composed of two homologous subunits, reversibly binds eukaryotic ribosomes and is located in direct proximity to nascent polypeptides as they emerge from the ribosome []. Despite being implicated in diverse cellular functions, its role in vivo is still not completely understood. It is thought that NAC may function as a shield protecting the nascent chain from inappropriate interactions with cytosolic factors, and could regulate further translation of the polypeptide through its interaction with the ribosome. Archaeal NAC is a homodimer which appears to be functionally analogous to the eukaryotic form, associating with ribosomes and contacting the nascent polypeptide chain emerging on the ribosome []. It has two domains: the NAC domain, which it shares with eukaryotic NAC, and a C-terminal UBA domain also found on the alpha, but not beta, subunit of eukaryotic NAC. The NAC domain forms a six-stranded β-barrel structure which shows some similarities to the OB fold. This domain appears to be responsible for dimerisation, nucleic acid binding, and nascent polypeptide binding. The UBA domain is typical of ubiquitin- or polyubiquitin-binding proteins; its physiological role is unclear, but it has been suggested that it may compete with the proteasome for ubiquitin, thus inhibiting ubiquitin-mediated protein degradation.
Analysis of the metabolome of a number of methanogenic archaea revealed that they produce a number of unusual compounds of low molecular weight, one of which is cyclic 2,3-diphosphoglycerate (cDPG). This compound has been found in several methanogens at high concentrations of up to 1 M. The highest intracellular concentrations of cDPG were detected in the hyperthermophilic methanogens Methanothermus fervidus (optimal growth temperature, 83 °C) and Methanopyrus kandleri (optimal growth temperature, 98 °C). Additionally, the intracellular concentration of cDPG increases with temperature up to the optimal growth temperature in M. fervidus. It has been proposed that cDPG is used as an energy store and/or is involved in the process of thermoadaptation []. In M. fervidus, cDPG is synthesized in two steps starting from 2-phosphoglycerate (2-PG). In the first reaction, 2-phosphoglycerate kinase (2PGK) phosphorylates 2-PG using ATP, resulting in 2,3-diphosphoglycerate (2,3-DPG) and ADP. The cDPG synthetase (cDPGS) catalyses the subsequent intramolecular cyclization of 2,3-DPG to cDPG, consuming a second molecule of ATP. Both 2PGK and cDPGS activities have been demonstrated in other cDPG-containing methanogens: Methanobacterium bryantii, Methanobacterium thermoautotrophicum and M. kandleri [].
Proteins containing this domain are a family of phosphoinositide phosphatases with substrates that include phosphatidylinositol-4,5-diphosphate and phosphatidylinositol-3,4,5-trisphosphate. This family is conserved in deuterostomes; VSP was first identified as a sperm flagellar plasma membrane protein in Ciona intestinalis []. Gene duplication events in primates resulted in the presence of paralogs, transmembrane phosphatase with tensin homology (TPTE) and TPTE2, that retain protein domain architecture but, in the case of TPTE, have lost catalytic activity. TPTE, also called cancer/testis antigen 44 (CT44), may play a role in the signal transduction pathways of the endocrine or spermatogenic function of the testis. TPTE2, also called TPTE and PTEN homologous inositol lipid phosphatase (TPIP), occurs in several differentially spliced forms; TPIP alpha displays phosphoinositide 3-phosphatase activity and is localized on the endoplasmic reticulum, while TPIP beta is cytosolic and lacks detectable phosphatase activity [, ]. VSP/TPTE proteins contain an N-terminal voltage sensor consisting of four transmembrane segments, a protein tyrosine phosphatase (PTP)-like phosphoinositide phosphatase catalytic domain, followed by a regulatory C2 domain [].
Lectins are structurally diverse proteins that bind to specific carbohydrates. This family includes the VIP36 and ERGIC-53 lectins. These two proteins were the first members of the family of animal lectins similar to the leguminous plant lectins []. The alignment for this family is towards the N terminus, where the similarity of VIP36 and ERGIC-53 is greatest. Although they have been identified as a family of animal lectins, this alignment also includes yeast sequences []. ERGIC-53 is a 53 kDa protein, localised to the intermediate region between the endoplasmic reticulum and the Golgi apparatus (ER-Golgi-Intermediate Compartment, ERGIC). It was identified as a calcium-dependent, mannose-specific lectin []. Its dysfunction has been associated with combined factors V and VIII deficiency, suggesting an important and substrate-specific role for ERGIC-53 in the glycoprotein-secreting pathway [, ]. The L-type lectin-like domain has an overall globular shape composed of a β-sandwich of two major twisted antiparallel β-sheets. The β-sandwich comprises a major concave β-sheet and a minor convex β-sheet, in a variation of the jelly roll fold [, , , ].
The SGT1-specific (SGS) domain is a module of ~90 amino acids, which was initially identified in eukaryotic Sgt1 proteins []. It was later also found in calcyclin-binding proteins []. The SGS domain has been shown to bind to proteins of the S100 family, which are thought to function as sensors of calcium ion concentration in the cell []. In budding yeasts, Sgt1 is required for SCF (Skp1p/Cdc53p-Cullin-F-box)-mediated ubiquitination, cyclic AMP pathway activity and kinetochore function []. Its Schizosaccharomyces pombe homologue, Git7, is required for glucose and cyclic AMP signaling, cell wall integrity, and septation []. Its two homologues in Arabidopsis, SGT1a and SGT1b, can complement two yeast temperature-sensitive sgt1 mutant alleles, suggesting that the fundamental cellular function(s) of yeast SGT1 in SCF-mediated protein ubiquitylation are conserved. Moreover, SGT1a and SGT1b can act as cochaperones with HSP90 and HSC70 and function in regulating multiple resistance (R) genes and environmental responses [, , , ]. The SGS domain of SGT1 is a key determinant of the HSC70-SGT1 association []. Calcyclin (S100A6) is a member of the S100A family of calcium-binding proteins and appears to play a role in cell proliferation [].
This family consists of several eukaryotic transcription elongation Spt5 proteins. These proteins contain two copies of a domain (Supt5) that is characteristic of proteins involved in chromatin regulation. An NGN domain separates the Supt5 domains. In yeast Spt5 protein, this domain possesses an RNP-like fold and is thought to confer affinity for Spt4 protein. The Supt5 domains are followed by four to five copies of a KOW domain, which is present in many ribosomal proteins. Three transcription-elongation factors, Spt4, Spt5 and Spt6, are conserved among eukaryotes and are essential for transcription via modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans, including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles []. This information was partially derived from InterPro.
Human HSPA4 (also known as 70 kDa heat shock protein 4, APG-2, HS24/P52, hsp70 RY, and HSPH2) responds to acidic pH stress, is involved in the radioadaptive response, is required for normal spermatogenesis and is overexpressed in hepatocellular carcinoma [, , ]. It participates in a pathway along with NBS1 (Nijmegen breakage syndrome 1, also known as p85 or nibrin), heat shock transcription factor 4b (HSF4b), and HSPA14 (belonging to a different HSP70 subfamily) that induces tumor migration, invasion, and transformation []. HSPA4 expression in sperm was increased in men with oligozoospermia, especially in those with varicocele []. HSPA4 belongs to the 105/110 kDa heat shock protein (HSP105/110) subfamily of the HSP70-like family []. HSP105/110s are believed to function generally as co-chaperones of HSP70 chaperones, acting as nucleotide exchange factors (NEFs) to remove ADP from their HSP70 chaperone partners during the ATP hydrolysis cycle. HSP70 chaperones assist in protein folding and assembly, and can direct incompetent 'client' proteins towards degradation. Like HSP70 chaperones, HSP105/110s have an N-terminal nucleotide-binding domain (NBD) and a C-terminal substrate-binding domain (SBD) []. This entry represents the N-terminal nucleotide-binding domain of HSPA4.
Homeodomain proteins are transcription factors that share a related DNA-binding homeodomain []. The homeodomain was initially identified in Drosophila melanogaster (Fruit fly) homeotic and segmentation proteins, but is well conserved throughout metazoans [, ]. The homeodomain binds DNA through a helix-turn-helix (HTH) structure, consisting of approximately 20 residues []. The HTH motif is composed of two α-helices that make intimate contacts with the DNA; the second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions. These interactions occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure and is joined to the second through a short turn. Most proteins which contain a homeobox domain can be classified [, ], on the basis of their sequence characteristics, into three subfamilies: engrailed, antennapedia and paired. A number of different proteins contain homeodomains, including Drosophila engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and Hox proteins. Hox genes encode homeodomain-containing transcriptional regulators that operate differential genetic programs along the anterior-posterior axis of animal bodies []. The homeodomain motif is very similar in sequence and structure to domains in other DNA-binding proteins, including recombinases, GARP response regulators, human telomeric protein, AraC type transcriptional activator and tetracycline repressor [, , ]. This entry identifies a conserved region of some 20 amino-acid residues, specific to engrailed proteins, located at the C-terminal end of the 'homeobox' domain; the specific function of these residues is unclear.
Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force-generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules. Dynein is composed of a number of ATP-binding large subunits, intermediate-size subunits and small subunits. Among the small subunits, there is a group of highly conserved proteins which make up this family [, , ]. Proteins in this family act as one of several non-catalytic accessory components of the cytoplasmic dynein 1 complex that are thought to be involved in linking dynein to cargos and to adapter proteins that regulate dynein function, and may play a role in changing or maintaining the spatial distribution of cytoskeletal structures. In yeast, a member of this family was identified as a component of the nuclear pore complex, where it may contribute to the stable association of the Nup82 subcomplex with the nuclear pore complex []. Both type 1 (DLC1) and type 2 (DLC2) dynein light chains have a similar two-layer α-β core structure consisting of beta-alpha(2)-beta-X-beta(2) [, ].
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Yersinia spp. secrete effector proteins called YopB and YopD that facilitate the spread of other translocated proteins through the type III needle and the host cell cytoplasm []. In turn, the transcription of these moieties is thought to be regulated by another gene, lcrV, found on the Yops virulon that encodes the entire type III system []. The product of this gene, LcrV protein, also regulates the secretion of YopD through the type III translocon [], and itself acts as a protective "V" antigen for Yersinia pestis, the causative agent of plague []. Recently, a homologue of the Y. pestis LcrV protein (PcrV) was found in Pseudomonas aeruginosa, an opportunistic pathogen. In vivo studies using mice found that immunisation with the protein protected burned animals from infection by P. aeruginosa, and enhanced survival. In addition, it is speculated that PcrV determines the size of the needle pore for type III secreted effectors [].
The zonula occludens proteins (ZO-1, ZO-2 and ZO-3) are a family of tight junction (TJ)-associated proteins that function as cross-linkers, anchoring the TJ strand proteins to the actin-based cytoskeleton []. Each protein contains three PDZ (postsynaptic density, disc-large, ZO-1) domains, a single SH3 (Src Homology-3) domain and a GK (guanylate kinase) domain, the presence of which identifies them as members of the membrane-associated guanylate kinase (MAGUK) protein family. They also share an acidic domain at the C-terminal region of the molecules not found in other MAGUK proteins. It has been demonstrated that the first PDZ domain is involved in binding the C-terminal -Y-V motif of claudins []. By contrast, the occludin-binding domain of ZO-1 has been shown to lie in the GK and acidic domains []. Although the precise location of the actin-binding motif has not been elucidated, it appears to be within the C-terminal half of the molecules, since transfection of this region into fibroblasts induces co-localisation of ZO-1 and ZO-2 with actin fibres. This entry represents ZO-2, which was first identified as a 160 kDa protein that co-immunoprecipitates with ZO-1 []. It shares ~65% overall similarity with ZO-1 and ZO-3 proteins, with highest levels of similarity in the MAGUK and acidic domains. In vitro binding studies indicate that ZO-2 may interact directly with ZO-1 through its second PDZ domain, although it does not appear to bind directly to ZO-3.
The zonula occludens proteins (ZO-1, ZO-2 and ZO-3) are a family of tight junction (TJ)-associated proteins that function as cross-linkers, anchoring the TJ strand proteins to the actin-based cytoskeleton []. Each protein contains three PDZ (postsynaptic density, disc-large, ZO-1) domains, a single SH3 (Src Homology-3) domain and a GK (guanylate kinase) domain, the presence of which identifies them as members of the membrane-associated guanylate kinase (MAGUK) protein family. They also share an acidic domain at the C-terminal region of the molecules not found in other MAGUK proteins. It has been demonstrated that the first PDZ domain is involved in binding the C-terminal -Y-V motif of claudins []. By contrast, the occludin-binding domain of ZO-1 has been shown to lie in the GK and acidic domains []. Although the precise location of the actin-binding motif has not been elucidated, it appears to be within the C-terminal half of the molecules, since transfection of this region into fibroblasts induces co-localisation of ZO-1 and ZO-2 with actin fibres. This entry represents ZO-3, which was first identified as a 130 kDa protein that co-immunoprecipitates with ZO-1 []. It shares ~65% overall similarity with ZO-1 and ZO-2 proteins, with highest levels of similarity in the MAGUK and acidic domains. In vitro binding studies indicate that ZO-3 may interact directly with ZO-1 through its second PDZ domain, although it does not appear to bind directly to ZO-2 [].
TASK is a member of the TWIK-related (two P-domain) K+ channel family identified in human tissues []. It is widely distributed, being particularly abundant in the pancreas and placenta, but it is also found in the brain, heart, lung and kidney. Its amino acid identity to TWIK-1 and TREK-1 is rather low, being about 25-28%. However, it is thought to share the same topology of four TM segments, with two P-domains. TASK is very sensitive to variations in extracellular pH in the physiological range, changing from fully-open to closed in approximately 0.5 pH units around pH 7.4. Thus, it may well be a biological sensor of external pH variations. Potassium channel subfamily K member 3 (KCNK3, also known as TASK-1) was the first member of the TASK family to be cloned. It is widely distributed, being particularly abundant in the pancreas and placenta, but is also found in the brain, heart, lung and kidney. In addition to the maintenance of the resting membrane potential, it is also involved in K+ transport associated with recycling/secretion and the modulation of electrical activity of excitable cells [].
Avidin [] is a minor constituent of egg white in several groups of oviparous vertebrates. Avidin, which was discovered in the 1920s, takes its name from the avidity with which it binds biotin. These two molecules bind so strongly that it is extremely difficult to separate them. Streptavidin is a protein produced by Streptomyces avidinii which also binds biotin and whose sequence is evolutionarily related to that of avidin. Avidin and streptavidin both form homotetrameric complexes of noncovalently associated chains. Each chain forms a very strong and specific non-covalent complex with one molecule of biotin. The three-dimensional structures of both streptavidin [, ] and avidin [] have been determined and revealed them to share a common fold: an eight-stranded anti-parallel β-barrel with a repeated +1 topology enclosing an internal ligand binding site. Fibropellins I and III [] are proteins that form the apical lamina of the sea urchin embryo, a component of the extracellular matrix. These two proteins have a modular structure composed of a CUB domain (see), followed by a variable number of EGF repeats and a C-terminal avidin-like domain.
Liprin-alpha-3 is a member of the LAR (leukocyte common antigen-related) protein tyrosine phosphatase-interacting protein (liprin) family []. Liprin-alpha family proteins in mammals are expressed at high levels in the brain and are engaged in high-affinity interactions with many presynaptic active zone proteins []. Liprin-alpha-3 is one of the predominant liprin isoforms in the hippocampus [, ]. It interacts with RIM1-alpha and other active zone proteins to form a protein scaffold in the presynaptic nerve terminal []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains. The N-terminal coiled coils of liprin-alpha act as binding regions for several synaptic proteins, while the SAM repeats can bind to both phosphatases and protein kinases []. The autophosphorylation of liprin regulates its association with LAR []. Interestingly, all liprin-alpha genes are subject to alternative splicing, which is regulated in a developmental manner []. The structure of the human CASK/liprin-alpha/liprin-beta ternary complex has been revealed [].
Liprin-alpha-2 is a member of the LAR (leukocyte common antigen-related) protein tyrosine phosphatase-interacting protein (liprin) family []. Liprin-alpha family proteins in mammals are expressed at high levels in the brain and are engaged in high-affinity interactions with many presynaptic active zone proteins []. Liprin-alpha-2 is one of the predominant liprin isoforms in the hippocampus [, ]. It organises presynaptic ultrastructure and controls synaptic output by regulating the size of the synaptic vesicle pool []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains. The N-terminal coiled coils of liprin-alpha act as binding regions for several synaptic proteins, while the SAM repeats can bind to both phosphatases and protein kinases []. The autophosphorylation of liprin regulates its association with LAR []. Interestingly, all liprin-alpha genes are subject to alternative splicing, which is regulated in a developmental manner []. The structure of the human CASK/liprin-alpha/liprin-beta ternary complex has been revealed [].
This entry represents liprin-beta, mostly from invertebrates. Drosophila liprin-beta interacts with liprin-alpha and is required for NMJ (neuromuscular junction) growth []. C. elegans liprin-beta, also known as hlb-1, or liprin-beta homolog, is a regulator of the organisation and function of neuromuscular junctions []. Hlb-1 may be involved in the control of ageing in C. elegans []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. In C. elegans and Drosophila, there is only one liprin-alpha (known as Syd-2 and Dliprin-alpha, respectively) and one liprin-beta. Another liprin family member, liprin-gamma, has also been identified in Drosophila []. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains []. The N-terminal coiled coils of liprin-alpha mediate interactions with adapter proteins at the presynaptic active zone, while the SAM repeats bind proteins such as LAR receptor tyrosine phosphatase [].
This entry represents liprin-alpha, mostly from invertebrates. In C. elegans, where it is also known as syd-2, it regulates the differentiation of presynaptic termini [] and plays an important role in presynaptic active zone formation []. A conserved coiled-coil LH1 domain in syd-2 is essential for its self-assembly during presynaptic assembly []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. In C. elegans and Drosophila, there is only one liprin-alpha (known as Syd-2 and Dliprin-alpha, respectively) and one liprin-beta. Another liprin family member, liprin-gamma, has also been identified in Drosophila []. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains []. The N-terminal coiled coils of liprin-alpha mediate interactions with adapter proteins at the presynaptic active zone, while the SAM repeats bind proteins such as LAR receptor tyrosine phosphatase [].
Liprin-alpha-4 is a member of the LAR (leukocyte common antigen-related) protein tyrosine phosphatase-interacting protein (liprin) family []. Liprin-alpha family proteins in mammals are expressed at high levels in the brain and are engaged in high-affinity interactions with many presynaptic active zone proteins []. Liprin-alpha-4 is most abundant at parallel fibre-Purkinje cell synapses in the molecular layer of the cerebellum []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains. The N-terminal coiled coils of liprin-alpha act as binding regions for several synaptic proteins, while the SAM repeats can bind to both phosphatases and protein kinases []. The autophosphorylation of liprin regulates its association with LAR []. Interestingly, all liprin-alpha genes are subject to alternative splicing, which is regulated in a developmental manner []. The structure of the human CASK/liprin-alpha/liprin-beta ternary complex has been revealed [].
Liprin-beta-1 is a member of the LAR (leukocyte common antigen-related) protein tyrosine phosphatase-interacting protein (liprin) family []. Liprin-beta-1 interacts with metastasis-associated protein S100A4 (Mts1), and this interaction results in the inhibition of liprin-beta-1 phosphorylation by protein kinase C and protein kinase CK2 in vitro []. In Xenopus, it plays a role in the maintenance of lymphatic vessel integrity []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains. The N-terminal coiled coils of liprin-alpha act as binding regions for several synaptic proteins, while the SAM repeats can bind to both phosphatases and protein kinases []. The autophosphorylation of liprin regulates its association with LAR []. Interestingly, all liprin-alpha genes are subject to alternative splicing, which is regulated in a developmental manner []. The structure of the human CASK/liprin-alpha/liprin-beta ternary complex has been revealed [].
Liprin-alpha-1 (LIP1, PPFIA1) is a member of the LAR (leukocyte common antigen-related) protein tyrosine phosphatase-interacting protein (liprin) family []. In mast cells, liprin-alpha-1 facilitates exocytosis and cell spreading []. The PPFIA1 gene is a candidate gene for acute lung injury risk following major trauma []. Liprins were originally identified as binding partners of the receptor protein tyrosine phosphatase LAR (leukocyte common antigen-related), which functions in axon guidance and mammary gland development []. In vertebrates, there are two families of liprins, liprin-alpha and liprin-beta, which have four (alpha1-4) and two (beta1-2) members. Liprins contain an N-terminal coiled-coil domain and a C-terminal liprin homology (LH) region comprised of three sterile alpha motif (SAM) domains. The N-terminal coiled coils of liprin-alpha act as binding regions for several synaptic proteins, while the SAM repeats can bind to both phosphatases and protein kinases []. The autophosphorylation of liprin regulates its association with LAR []. Interestingly, all liprin-alpha genes are subject to alternative splicing, which is regulated in a developmental manner []. The structure of the human CASK/liprin-alpha/liprin-beta ternary complex has been revealed [].
Map (MHC class II analogous protein), also known as eap (extracellular adherence protein) and p70, is found in Staphylococcus aureus. It is a cell-wall associated protein, which is capable of binding to a number of different extracellular matrix glycoproteins and plasma proteins, and to the cell surface of S. aureus. Besides its broad binding specificity, map has been shown to be important in the adherence to and internalization of S. aureus by eukaryotic cells, as well as being capable of modulating the inflammatory response through its interactions with ICAM-1 (intercellular adhesion molecule-1), although its biological role in vivo remains unclear to date []. The protein consists of a signal peptide followed by a unique sequence of about 20 amino acids and four to six repeated MAP domains of 110 amino acid residues. Within each repeat there is a subdomain consisting of 31 residues that was found to be highly homologous to the N-terminal beta-chain of many MHC class II molecules []. This entry represents the MAP domain. The crystal structure of this domain has been solved and shows a core fold that is comprised of an α-helix lying diagonally across a five-stranded, mixed β-sheet. This structure is very similar to the C-terminal domain of bacterial superantigens [].
The transport of peptides into cells is a well-documented biological phenomenon which is accomplished by specific, energy-dependent transporters found in organisms as diverse as bacteria and humans. The amino acid/peptide transporter family of proteins is distinct from the ABC-type peptide transporters and was uncovered by sequence analysis of a number of recently discovered peptide transport proteins []. This family consists of bacterial proton-dependent oligopeptide transporters, although related transporters are also found in yeast, plants and animals. They function by proton symport in a 1:1 stoichiometry, although the stoichiometry varies between species. Structurally, these transporters present a conserved architecture consisting of 14 transmembrane α-helices with N-terminal and C-terminal six-helix bundles connected by two transmembrane α-helices (HA and HB) []. Dipeptide and tripeptide permease A (DtpA) is a proton-dependent permease that transports di- and tripeptides, and structurally related peptidomimetics such as aminocephalosporins, into the cell [, , ]. The protein shows a distinct preference for dipeptides and tripeptides composed of L-amino acids, and discriminates dipeptides on the basis of the position of charges within the substrate.
Formin homology (FH) proteins play a crucial role in the reorganisation of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis []. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains []. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other [], and may also act to inhibit actin polymerisation []. The FH3 domain () is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) () required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD). This superfamily represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains [].
The short-chain dehydrogenases/reductases family (SDR) [, ] is a very large family of enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of this family to be characterised was Drosophila alcohol dehydrogenase, this family used to be called [, , ] 'insect-type', or 'short-chain' alcohol dehydrogenases. Most members of this family are proteins of about 250 to 300 amino acid residues. Most dehydrogenases possess at least 2 domains [], the first binding the coenzyme, often NAD, and the second binding the substrate. This latter domain determines the substrate specificity and contains amino acids involved in catalysis. Little sequence similarity has been found in the coenzyme binding domain although there is a large degree of structural similarity, and it has therefore been suggested that the structure of dehydrogenases has arisen through gene fusion of a common ancestral coenzyme nucleotide sequence with various substrate specific domains []. This entry contains a signature pattern for this family of proteins which covers one of the best conserved regions. It includes two perfectly conserved residues, a tyrosine and a lysine. The tyrosine residue participates in the catalytic mechanism.
The MHYT domain is a domain of about 190 residues, which was named after its conserved amino acid pattern. The MHYT domain consists of six predicted transmembrane (TM) segments, connected by short arginine-rich cytoplasmic loops and periplasmic loops that are also rich in charged amino acid residues. Three of its TM segments have a very similar amino acid sequence motif with highly conserved methionine, histidine and tyrosine residues located near the outer face of the cytoplasmic membrane. The MHYT domain has been found in several phylogenetically distant bacteria, either as a separate, single domain protein, or fused to a LytTR-type DNA-binding helix-turn-helix, or fused to signaling domains, such as histidine kinase, GGDEF, EAL, and PAS. It has been proposed that the MHYT domain serves as a sensor domain in some bacterial two-component signal transduction systems as well as in a variety of other bacterial proteins. A model of the membrane topology of the MHYT domain indicates that its conserved residues could coordinate one or two copper ions, suggesting a role in sensing oxygen, CO or NO [].
This is a membrane localization domain found in multiple families of bacterial toxins including all of the clostridial glucosyltransferase toxins and various MARTX toxins (multifunctional-autoprocessing RTX toxins) []. In the C-terminal fragment of the Pasteurella multocida toxin (PMT, also known as dermonecrotic toxin), structural analysis has indicated that the C1 domain possesses a signal that leads the toxin to the cell membrane. Furthermore, the C1 domain was found to structurally resemble the phospholipid-binding domain of C. difficile toxin B []. Functional studies in Vibrio cholerae indicate that the subdomain at the N terminus of RID (Rho-inactivation domain), homologous to the membrane targeting C1 domain of Pasteurella multocida toxin, is a conserved membrane localization domain essential for proper localization []. The RID of the MARTX toxin is responsible for inactivating the Rho family of small GTPases in Vibrio cholerae. MARTX is a bacterial toxin that self-processes by a cysteine peptidase mechanism []. The Vibrio cholerae RTX toxin is an autoprocessing cysteine protease whose activity is stimulated by the intracellular environment []. This cysteine peptidase belongs to MEROPS peptidase family C80.
This entry represents serine peptidases that belong to MEROPS peptidase family S1A (clan PA(S)). They are novel enzymes with three tandem serine protease domains in a single polypeptide chain. Polyserase-2 (polyserine protease-2) is the second identified human enzyme with several tandem serine protease domains. The first serine protease domain contains all characteristic features of these enzymes, whereas the second and third domains lack one residue of the catalytic triad of serine proteases and are predicted to be catalytically inactive. Both full-length polyserase-2 and its first serine protease domain hydrolyse synthetic peptides used for assaying serine proteases. The activity of the isolated domain was greater than that of the entire protein, suggesting that the two catalytically inactive serine protease domains of polyserase-2 may modulate the activity of the first domain. Polyserase-2 is expressed in foetal kidney, adult skeletal muscle, liver, placenta, prostate, and heart, and in tumour cell lines derived from lung and colon adenocarcinomas. In contrast to polyserase-1, this protein is a secreted enzyme whose three protease domains remain integral parts of a single polypeptide chain [].
Amphiphysins belong to the expanding BAR (Bin-Amphiphysin-Rvsp) family of proteins, all members of which share a highly conserved N-terminal BAR domain, which has predicted coiled-coil structures required for amphiphysin dimerisation and plasma membrane interaction []. Almost all members also share a conserved C-terminal Src homology 3 (SH3) domain, which mediates their interactions with the GTPase dynamin and the inositol-5'-phosphatase synaptojanin 1 in vertebrates and with actin in yeast. The central region of these proteins is the most variable. In mammals, the central region of amphiphysin I and amphiphysin IIa contains a proline-arginine-rich region for endophilin binding and a CLAP domain for binding to clathrin and AP-2. The interactions mediated by both the central and C-terminal domains are believed to be modulated by protein phosphorylation [, ]. Amphiphysins are proteins that are thought to be involved in clathrin-mediated endocytosis, actin function, and signalling pathways. In vertebrates, amphiphysins may regulate, but are not essential for, clathrin-mediated endocytosis of synaptic vesicles (SVs). However, in Drosophila, amphiphysin is not involved at all in SV endocytosis but is required for T-tubule structure and excitation-contraction coupling in muscles, and plays a role in membrane morphogenesis in developing photoreceptors and a variety of other cells []. Amphiphysin 2 was the second amphiphysin family member found in mammals. The gene encoding it has been found to be alternatively spliced. The various products have been named: BIN-1, Sh3P9, BRAMP-2 and ALP-1. They have different distribution patterns, with the largest form (~95 kDa) being expressed solely in the brain, where it shares a very similar (if not identical) distribution pattern to amphiphysin 1 [].
This entry represents the Rab29/Rab38/Rab32 subfamily. They are members of the Rab family of small GTPases. Human Rab32 was first identified in platelets but it is expressed in a variety of cell types, where it functions as an A-kinase anchoring protein (AKAP) []. Rab32 and the closely related Rab38 are functionally redundant regulators of melanosomal protein trafficking and melanocyte pigmentation [, ]. Ras-related protein Rab29 (also known as RAB7L) is related to Rab32 and Rab38. Rab29 regulates phagocytosis and traffic from the Golgi to the lysosome []. It is associated with the trans-Golgi network (TGN) and is essential for maintaining the integrity of the TGN. Together with LRRK2, it plays a role in the retrograde trafficking pathway for recycling proteins, such as the mannose-6-phosphate receptor (M6PR), between lysosomes and the Golgi apparatus in a retromer-dependent manner []. Rab29, Rab32 and Rab38 can be cleaved by GtgE, which is an effector protein from Salmonella Typhimurium that modulates trafficking of the Salmonella-containing vacuole (SCV) []. By targeting these GTPases, GtgE allows survival of the pathogen by preventing the delivery of antimicrobial factors to the SCV [].
Proteins in this entry are members of the radical SAM superfamily of enzymes that utilise an iron-sulphur redox cluster and S-adenosylmethionine to carry out diverse radical-mediated reactions []. This group of proteins is frequently encoded in the same locus as squalene-hopene cyclase (SHC, ) and other proteins associated with the biosynthesis of hopanoid natural products. The linkage between SHC and this radical SAM enzyme is strong; one is nearly always observed in the same genome where the other is found. A hopanoid biosynthesis locus was described in Zymomonas mobilis consisting of the genes for HpnA-E and SHC (HpnF) []. Continuing past SHC are the genes for a phosphorylase enzyme (ZMO0873, i.e. HpnG, ) and this radical SAM enzyme (ZMO0874), which we name here HpnH. Granted, in Z. mobilis, HpnH is in a convergent orientation with respect to HpnA-G, but one gene beyond HpnH and running in the same convergent direction is IspH (ZMO0875, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase), an essential enzyme of IPP biosynthesis and therefore essential for the biosynthesis of hopanoids. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene, several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift.
The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates ([intenz:2.4.1.-]) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Proteins in this entry contain glycosyl transferase family 2 domains which are responsible, generally, for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. These proteins are often encoded in the same genetic locus as squalene-hopene cyclase genes, and are never associated with genes for the metabolism of phytoene. Indeed, proteins in this entry appear to never be encoded in a genome lacking squalene-hopene cyclase (SHC), although not all genomes encoding SHC have this glycosyl transferase. In the organism Zymomonas mobilis the linkage of this protein to hopanoid biosynthesis has been noted and it was named HpnB []. Hopanoids are known to feature polar glycosyl head groups in many organisms.
Diol dehydratase () and glycerol dehydratase () are two iso-functional enzymes that can each catalyse the conversion of 1,2-propanediol, 1,2-ethanediol and glycerol to the corresponding deoxy aldehydes (propionaldehyde, acetaldehyde and 3-hydroxypropionaldehyde, respectively). This reaction proceeds by a radical mechanism involving coenzyme B12 (adenosylcobalamin, AdoCbl) as an essential cofactor. Even though they catalyse the same reaction, these two enzymes differ in three respects: (1) their substrate preferences differ (diol dehydratase has a higher affinity for 1,2-propanediol and glycerol dehydratase for glycerol []); (2) they participate in different pathways (dihydroxyacetone [DHA] pathway for glycerol dehydratase and 1,2-propanediol degradation pathway for diol dehydratase); and (3) in those organisms where both enzymes are produced (such as Klebsiella and Citrobacter), the genes for them are independently regulated: glycerol dehydratase is induced when Klebsiella pneumoniae grows in glycerol-containing medium, whereas diol dehydratase is fully induced when it grows in propane-1,2-diol-containing medium, but only slightly in the glycerol medium [, ]. Crystal structures, mechanism of action and structure-function relationship with the coenzyme B12 have been extensively studied for these enzymes []. Diol/glycerol dehydratases undergo inactivation during catalysis and require a reactivating factor. Propanediol dehydratase was found to be associated with and is believed to be encased in the proteinaceous shell of polyhedral organelles []. Both diol dehydratase and glycerol dehydratase comprise three subunits: PduC/PduD/PduE or PddA/PddB/PddC for propanediol dehydratase, and GldA/GldB/GldC or DhaB/DhaC/DhaE for glycerol dehydratase. This entry represents the small subunit PduE/PddC/GldC/DhaE.
Diol dehydratase () and glycerol dehydratase () are two iso-functional enzymes that can each catalyse the conversion of 1,2-propanediol, 1,2-ethanediol and glycerol to the corresponding deoxy aldehydes (propionaldehyde, acetaldehyde and 3-hydroxypropionaldehyde, respectively). This reaction proceeds by a radical mechanism involving coenzyme B12 (adenosylcobalamin, AdoCbl) as an essential cofactor. Even though they catalyse the same reaction, these two enzymes differ in three respects: (1) their substrate preferences differ (diol dehydratase has a higher affinity for 1,2-propanediol and glycerol dehydratase for glycerol []); (2) they participate in different pathways (dihydroxyacetone [DHA] pathway for glycerol dehydratase and 1,2-propanediol degradation pathway for diol dehydratase); and (3) in those organisms where both enzymes are produced (such as Klebsiella and Citrobacter), the genes for them are independently regulated: glycerol dehydratase is induced when Klebsiella pneumoniae grows in glycerol-containing medium, whereas diol dehydratase is fully induced when it grows in propane-1,2-diol-containing medium, but only slightly in the glycerol medium [, ]. Crystal structures, mechanism of action and structure-function relationship with the coenzyme B12 have been extensively studied for these enzymes []. Diol/glycerol dehydratases undergo inactivation during catalysis and require a reactivating factor. Propanediol dehydratase was found to be associated with and is believed to be encased in the proteinaceous shell of polyhedral organelles []. Both diol dehydratase and glycerol dehydratase comprise three subunits: PduC/PduD/PduE [] or PddA/PddB/PddC [] for propanediol dehydratase, and GldA/GldB/GldC or DhaB/DhaC/DhaE for glycerol dehydratase. This entry represents the medium subunit PduD/PddB/GldB/DhaC.
This entry includes archaeal exosome complex component Csl4 and its homologues from eukaryotes. In budding yeast, Csl4 is also known as Ski4 due to its superkiller (SKI) phenotype, first described as a more efficient ability to kill sensitive non-killer yeast strains [, ]. Later, it was found to be part of the yeast exosome complex involved in 3'-5' RNA processing and degradation in both the nucleus and the cytoplasm []. Csl4 is a non-catalytic component of the exosome, a complex involved in RNA processing and degradation [, ]. The exact composition of the exosome varies, depending on the organism or the subcellular localization, but in all cases it is composed of a ring-shaped core made of three heterodimers (Rrp41p/Rrp45p, Rrp43p/Rrp46p, Rrp42p/Mtr3p) stabilized by the presence of three other proteins (Csl4/Ski4, Rrp4p, Rrp40p) that form a cap []. The presence of different proteins in the cap may enable interactions with different substrates. It has been shown that the archaeal DnaG protein needs Csl4 for binding to the exosome. DnaG is a poly(A)-binding protein and enhances the degradation of adenine-rich transcripts by the Csl4-exosome [].
Avidin [] is a minor constituent of egg white in several groups of oviparous vertebrates. Avidin, which was discovered in the 1920s, takes its name from the avidity with which it binds biotin. These two molecules bind so strongly that it is extremely difficult to separate them. Streptavidin is a protein produced by Streptomyces avidinii which also binds biotin and whose sequence is evolutionarily related to that of avidin. Avidin and streptavidin both form homotetrameric complexes of non-covalently associated chains. Each chain forms a very strong and specific non-covalent complex with one molecule of biotin. The three-dimensional structures of both streptavidin [, ] and avidin [] have been determined and reveal that they share a common fold: an eight-stranded anti-parallel β-barrel with a repeated +1 topology enclosing an internal ligand binding site. Fibropellins I and III [] are proteins that form the apical lamina of the sea urchin embryo, a component of the extracellular matrix. These two proteins have a modular structure composed of a CUB domain, followed by a variable number of EGF repeats and a C-terminal avidin-like domain. This entry represents this avidin-like domain.
The SANT domain is a motif of ~50 amino acids present in proteins involved in chromatin remodelling and transcription regulation. This eukaryotic domain was identified in nuclear receptor co-repressors and named after switching-defective protein 3 (Swi3), adaptor 2 (Ada2), nuclear receptor co-repressor (N-CoR) and transcription factor (TF)IIIB []. Although SANT domains show remarkable sequence and structural similarity to the DNA-binding helix-turn-helix (HTH) domain of the myb-like tandem repeat, their function is not DNA binding. Instead, SANT domains are protein-protein interaction modules and some can bind to histone tails (e.g. in Ada2 and SMRT). The SANT domain has been proposed to function as a histone-interaction module that couples histone-tail binding to enzyme catalysis for the remodelling of nucleosomes [, ]. SANT domains are found in combination with other domains, such as the SWIRM domain, the ZZ-type zinc finger, the C2H2-type zinc finger, the GATA-type zinc finger, the MPN domain and the DEAH ATP-helicase domain. The three-dimensional structure of the SANT domain consists of three α-helices [], similar to the DNA-binding myb-type HTH domain. Because of the strong resemblance, the SANT domain can also be detected as a myb-like "DNA-binding" domain. Most SANT domains have acidic amino acids at the start of helix 2 and in helix 3, while myb-like DNA-binding domains have more positively charged residues, in particular in their third 'recognition' helix. Another distinguishing property of SANT domains is the presence of bulky aromatic and hydrophobic residues in the centre of helix 3, which are incompatible with the DNA contacts made by myb-like DNA-binding domains.
Metallothioneins (MT) are small proteins that bind heavy metals, such as zinc, copper, cadmium and nickel. They have a high content of cysteine residues that bind the metal ions through clusters of thiolate bonds [, ]. An empirical classification into three classes has been proposed by Fowler and coworkers [] and Kojima []. Members of class I are defined to include polypeptides related in the positions of their cysteines to equine MT-1B, and include mammalian MTs as well as MTs from crustaceans and molluscs. Class II groups MTs from a variety of species, including sea urchins, fungi, insects and cyanobacteria. Class III MTs are atypical polypeptides composed of gamma-glutamylcysteinyl units []. This original classification system has been found to be limited, in the sense that it does not allow clear differentiation of patterns of structural similarities, either between or within classes. Subsequently, a new classification was proposed on the basis of sequence similarity derived from phylogenetic relationships, which basically proposes an MT family for each main taxonomic group of organisms []. This superfamily represents the structural domain of metallothioneins from both eukaryotes and prokaryotes []. These proteins have a metal (iron)-bound fold that is duplicated, consisting of clear structural/sequence repeats.
RuvB-like helicase 1 (RUVBL1 or TIP49a) has single-stranded DNA-stimulated ATPase and ATP-dependent DNA helicase (3' to 5') activity. It forms a homohexamer which is critical for ATP hydrolysis, and forms a dodecamer with RUVBL2 [, ]. RUVBL1 is an essential cofactor for oncogenic transformation by c-Myc []. It is a component of the NuA4 histone acetyltransferase [], BAF53 [], MLL1/MLL [], INO80 [] and R2TP [] complexes. The NuA4 histone acetyltransferase complex (also known as the TRRAP/TIP60-containing histone acetyltransferase complex) acetylates nucleosomal histones H4 and H2A, thereby activating selected genes for transcription, and is a key regulator of transcription, cellular response to DNA damage and cell cycle control []. In yeast, where the complex was first identified, NuA4 consists of at least ACT1, ARP4, YAF9, VID21, SWC4, EAF3, EAF5, EAF6, EAF7, EPL1, ESA1, TRA1 and YNG2 []. In humans, the complex is composed of the histone acetyltransferase KAT5 (also known as TIP60) plus the subunits EP400, TRRAP/PAF400, BRD8/SMAP, EPC1, MAP1/DNMAP1, RUVBL1/TIP49, RUVBL2, ING3, actin, ACTL6A/BAF53A, MORF4L1/MRG15, MORF4L2/MRGX, MRGBP, YEATS4/GAS41, VPS72/YL1 and MEAF6 [].
RuvB-like helicase 2 (RUVBL2 or TIP49b) has single-stranded DNA-stimulated ATPase and ATP-dependent DNA helicase (3' to 5') activity. It forms a homohexamer which is critical for ATP hydrolysis, and forms a dodecamer with RUVBL1 [, ]. RUVBL2 is an essential cofactor for oncogenic transformation by c-Myc []. It is a component of the NuA4 histone acetyltransferase [], BAF53 [], MLL1/MLL [], INO80 [] and R2TP [] complexes. The NuA4 histone acetyltransferase complex (also known as the TRRAP/TIP60-containing histone acetyltransferase complex) acetylates nucleosomal histones H4 and H2A, thereby activating selected genes for transcription, and is a key regulator of transcription, cellular response to DNA damage and cell cycle control []. In yeast, where the complex was first identified, NuA4 consists of at least ACT1, ARP4, YAF9, VID21, SWC4, EAF3, EAF5, EAF6, EAF7, EPL1, ESA1, TRA1 and YNG2 []. In humans, the complex is composed of the histone acetyltransferase KAT5 (also known as TIP60) plus the subunits EP400, TRRAP/PAF400, BRD8/SMAP, EPC1, MAP1/DNMAP1, RUVBL1/TIP49, RUVBL2, ING3, actin, ACTL6A/BAF53A, MORF4L1/MRG15, MORF4L2/MRGX, MRGBP, YEATS4/GAS41, VPS72/YL1 and MEAF6 [].
This entry represents reverse gyrases found in both bacteria and archaea. Reverse gyrase, a fusion of a type I topoisomerase domain and a helicase domain, introduces positive supercoiling to increase the melting temperature of DNA double strands. Generally, these gyrases are encoded as a single polypeptide. An exception was found in Methanopyrus kandleri, where the enzyme is split within the topoisomerase domain, yielding a heterodimer of gene products designated RgyB and RgyA. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) cleave one strand of the DNA duplex, and type II enzymes (topoisomerases II, IV and VI) cleave both strands []. Type I topoisomerases can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). Most type I topoisomerases are ATP-independent and are responsible for relaxing positively and/or negatively supercoiled DNA. Reverse gyrase is unique among type IA topoisomerases in that it requires ATP and can introduce positive supercoils into DNA.