The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This entry represents the ePHD finger of Histone-lysine N-methyltransferase 2B (KMT2B), also known as MLL2. KMT2B is a second human homologue of Drosophila trithorax, located on chromosome 19 [, ]. It belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2) and is vital for normal mammalian embryonic development []. KMT2B functions as the catalytic subunit in the MLL2 complex, which contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex. The MLL2 complex is highly active and specific for histone 3 lysine 4 (H3K4) methylation, which stimulates chromatin transcription in a SAM- and H3K4-dependent manner []. Moreover, KMT2B plays a critical role in memory formation by mediating hippocampal H3K4 di- and trimethylation []. It is also required for RNA polymerase II association and protection from DNA methylation at the MagohB CpG island promoter []. KMT2B contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, this ePHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.
Cell division cycle and apoptosis regulator protein 1 (CCAR1) associates with components of the Mediator and p160 coactivator complexes that play a role as intermediaries transducing regulatory signals from upstream transcriptional activator proteins to basal transcription machinery. CCAR1 also functions as a p53 coactivator and regulates expression of key proliferation-inducing genes []. Cell cycle and apoptosis regulator protein 2 (CCAR2, also known as DBC-1) regulates biological processes such as transcription, heterochromatin formation, metabolism, mRNA splicing, apoptosis, and cell proliferation []. It is a core component of the DBIRD complex, which affects local transcript elongation rates and alternative splicing of a large set of exons embedded in (A + T)-rich DNA regions []. It binds to SIRT1 and is a negative regulator of SIRT1 []. DBC-1 has been implicated in tumorigenesis []. This entry also includes protein SHORT ROOT IN SALT MEDIUM 1 (RSA1, also known as EMB1579) from Arabidopsis. It regulates the transcription of several genes involved in the detoxification of reactive oxygen species generated by salt stress and the SOS1 gene that encodes a plasma membrane Na(+)/H(+) antiporter essential for salt tolerance []. RSA1 is localised to the nucleus and the loss of function of RSA1 affects global transcription and mRNA splicing [].
This is the bZIP domain found in plant transcription factors with similarity to Oryza sativa RF2a and RF2b, which are important for plant development. RF2a and RF2b interact with each other, as homodimers or heterodimers, and activate transcription from the RTBV (rice tungro bacilliform virus) promoter, which is regulated by sequence-specific DNA-binding proteins that bind to the essential cis element BoxII. They differ in binding affinity for BoxII, expression patterns in different rice organs, and subcellular localisation. Transgenic rice plants with increased RF2a and RF2b display increased resistance to rice tungro disease (RTD) with no impact on plant development [, ]. bZIP domains from Arabidopsis have been classified into 11 groups (groups A-I and S); those included in this entry belong to group I, such as VIP1 or PosF21 (also known as bZIP transcription factor 59) [, , , ]. bZIP factors act in networks of homo- and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription [].
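The heptad spacing described above (leucine residues 7 amino acids apart) can be illustrated with a minimal sketch. The function name, the toy sequence, and the threshold of four consecutive heptads are assumptions made for illustration; the entry itself states only the 7-residue spacing, and real zippers often tolerate substitutions at some positions.

```python
def heptad_leucine_runs(sequence, min_repeats=4):
    """Return (start, count) pairs for runs of leucines spaced exactly 7 residues apart.

    start is the 0-based position of the first leucine in the run and count is
    the number of leucines found at start, start+7, start+14, ...
    min_repeats is an illustrative threshold, not a value from the entry.
    """
    seq = sequence.upper()
    runs = []
    for i, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that continue a run started 7 residues earlier.
        if i >= 7 and seq[i - 7] == "L":
            continue
        count = 0
        j = i
        while j < len(seq) and seq[j] == "L":
            count += 1
            j += 7
        if count >= min_repeats:
            runs.append((i, count))
    return runs


# Hypothetical toy sequence (not a real bZIP protein): leucines at 3, 10, 17, 24, 31.
toy = "AAA" + "LKEQAEE" + "LRRKNSA" + "LEAENKR" + "LQAEVGG" + "L" + "AA"
print(heptad_leucine_runs(toy))
```

A scan of this kind only flags candidate zipper segments; it says nothing about the basic region or about dimerization specificity.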
This domain can be found in plant bZIP transcription factors, including Arabidopsis thaliana G-box binding factor 1 (GBF1) [], Zea mays Opaque-2 [] and Ocs element-binding factor 1 (OCSBF-1) [], Triticum aestivum Histone-specific transcription factor HBP1 (or HBP-1a) [], Petroselinum crispum Light-inducible protein CPRF3 and CPRF6, and Nicotiana tabacum BZI-3 [], among many others []. Opaque-2 plays a role in affecting lysine content and carbohydrate metabolism, acting indirectly on starch/amino acid ratio []. bZIP G-box binding factors (GBFs) contain an N-terminal proline-rich domain in addition to the bZIP domain. GBFs are involved in developmental and physiological processes in response to stimuli such as light or hormones [, ]. bZIP factors act in networks of homo- and heterodimers in the regulation of a diverse set of cellular processes. The bZIP structural motif contains a basic region and a leucine zipper, composed of alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization of leucine zippers creates a pair of adjacent basic regions that bind DNA and undergo conformational change. Dimerization occurs in a specific and predictable manner resulting in hundreds of dimers having unique effects on transcription [].
The Sema domain occurs in semaphorins, which are a large family of secreted and transmembrane proteins, some of which function as repellent signals during axon guidance. Sema domains also occur in plexins [], receptors for multiple classes of semaphorins, in hepatocyte growth factor receptor, and in viral proteins []. The Sema domain is characterised by a conserved set of cysteine residues, which form four disulphide bonds to stabilise the structure. The Sema domain fold is a variation of the beta propeller topology, with seven blades radially arranged around a central axis. Each blade contains a four-stranded (strands A to D) antiparallel beta sheet. The inner strand of each blade (A) lines the channel at the centre of the propeller, with strands B and C of the same repeat radiating outward, and strand D of the next repeat forming the outer edge of the blade. The large size of the Sema domain is not due to a single inserted domain but results from the presence of additional secondary structure elements inserted in most of the blades. The Sema domain uses a 'loop and hook' system to close the circle between the first and the last blades. The blades are constructed sequentially with an N-terminal β-strand closing the circle by providing the outermost strand (D) of the seventh (C-terminal) blade. The β-propeller is further stabilised by an extension of the N terminus, providing an additional, fifth β-strand on the outer edge of blade 6 [, , ].
Arf6 (ADP ribosylation factor 6) proteins localize to the plasma membrane, where they perform a wide variety of functions. In its active, GTP-bound form, Arf6 is involved in cell spreading, Rac-induced formation of plasma membrane ruffles, cell migration, wound healing, and Fc-mediated phagocytosis. Arf6 appears to change the actin structure at the plasma membrane by activating Rac, a Rho family protein involved in membrane ruffling. Arf6 is required for, and enhances, Rac-induced ruffle formation. Arf6 can regulate dendritic branching in hippocampal neurons, and in yeast it localizes to the growing bud, where it plays a role in polarized growth and bud site selection. In leukocytes, Arf6 is required for chemokine-stimulated migration across endothelial cells. Arf6 also plays a role in down-regulation of beta2-adrenergic receptors and luteinizing hormone receptors by facilitating the release of sequestered arrestin to allow endocytosis. Arf6 is believed to function at multiple sites on the plasma membrane through interaction with a specific set of GEFs, GAPs, and effectors []. Arf6 has been implicated in breast cancer and melanoma cell invasion [, ], and in actin remodelling at the invasion site of Chlamydia infection []. The Arf6 homologue in Saccharomyces cerevisiae is known as Arf3 [].
The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes [, ]. These proteins are classified as v-SNAREs and t-SNAREs based on their localisation on the vesicle or target membrane, while another classification scheme defines R-SNAREs and Q-SNAREs based on the conserved arginine or glutamine residue in the centre of the SNARE motif []. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25-like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, is required for fusion. The N- and C-terminal coiled-coil domains of members of the SNAP-25 family and the most C-terminal coiled-coil domain of the syntaxin family are related to each other and form a new homology domain of approximately 60 amino acids. This domain is also found in other known proteins involved in vesicular membrane traffic, some of which belong to different protein families [].
Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine []. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set [, ]. The domains MsrA and MsrB reduce different epimeric forms of methionine sulphoxide. This group represents MsrA, the crystal structure of which has been determined in a number of organisms. In Mycobacterium tuberculosis, the MsrA structure has been determined to 1.5 Angstrom resolution []. In contrast to the three catalytic cysteine residues found in previously characterised MsrA structures, M. tuberculosis MsrA represents a class containing only two functional cysteine residues. The overall structure shows no resemblance to the structures of MsrB () from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate. In a number of pathogenic bacteria, including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused, with MsrA N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis, a thioredoxin domain is fused to the N terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.
Peptide methionine sulphoxide reductase (Msr) reverses the inactivation of many proteins due to the oxidation of critical methionine residues by reducing methionine sulphoxide, Met(O), to methionine []. It is present in most living organisms, and the cognate structural gene belongs to the so-called minimum gene set [, ]. The domains MsrA and MsrB reduce different epimeric forms of methionine sulphoxide. This group represents MsrB, the crystal structure of which has been determined to 1.8 Angstrom resolution []. The overall structure shows no resemblance to the structures of MsrA () from other organisms, though the active sites show approximate mirror symmetry. In each case, conserved amino acid motifs mediate the stereo-specific recognition and reduction of the substrate. Unlike the MsrA domain, the MsrB domain activates the cysteine or selenocysteine nucleophile through a unique Cys-Arg-Asp/Glu catalytic triad. The collapse of the reaction intermediate most likely results in the formation of a sulphenic or selenenic acid moiety. Regeneration of the active site occurs through a series of thiol-disulphide exchange steps involving another active site Cys residue and thioredoxin. In a number of pathogenic bacteria, including Neisseria gonorrhoeae, the MsrA and MsrB domains are fused, with MsrA N-terminal to MsrB. This arrangement is reversed in Treponema pallidum. In N. gonorrhoeae and Neisseria meningitidis, a thioredoxin domain is fused to the N terminus. This may function to reduce the active sites of the downstream MsrA and MsrB domains.
Bacteria synthesise a set of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins [, ]. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae []. The integration host factor (IHF), a dimer of closely related chains that seems to function in genetic recombination as well as in translational and transcriptional control [], is found in enterobacteria. Viral members include the African swine fever virus protein Pret-047 (also known as A104R or LMW5-AR) []. The exact function of these proteins is not yet clear, but they are capable of wrapping DNA and stabilising it against denaturation under extreme environmental conditions. The structure is known for one of these proteins []. The protein exists as a dimer, and two "β-arms" function as the non-specific binding site for bacterial DNA.
Temperature downshift produces a number of changes in cellular physiology including decreased membrane fluidity, reduced mRNA transcription and translation due to the stabilisation of secondary structures, inefficient folding of some proteins, and reduced enzyme activity []. In response to this, bacteria produce a set of proteins, known as the cold-shock proteins (CSPs), to counteract these harmful effects. This entry represents a family of small CSPs consisting of CspA, one of the first cold-shock proteins identified, and its homologues. Note that while some members of this family are induced during cold shock, some are either constitutively expressed or induced by other stresses such as nutrient starvation. While information about the physiological functions of the CSPs is limited, their structural properties have been well studied [, , , ]. These proteins have a five-stranded β-barrel structural fold and are part of the wider oligonucleotide/oligosaccharide-binding (OB) family. They preferentially bind pyrimidine-rich regions of single-stranded RNA and DNA with high affinity, but not double-stranded RNA or DNA. Thus it is postulated that CSPs may act as RNA chaperones by destabilising RNA secondary structures. Experimental evidence suggests that they bind mRNA and regulate ribosomal translation, the rate of mRNA degradation and termination of transcription, functions that are important during normal growth as well as cold shock.
Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction []. This entry represents predicted bifunctional nitroreductase/nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferases.
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The vitamin D receptor (VDR) mediates the signal of 1-alpha,25-dihydroxyvitamin D3 by binding to vitamin D responsive elements; it functions either as a homodimer, or as a heterodimer of vitamin D and retinoid X receptor subunits. Deficiency of VDR causes type IIA rickets [].
An impressive property of mussels is their ability to stick to wet surfaces. Exactly how they do this is unclear, but they are known to exploit bundles of threads, each of which has a fibrous collagenous core coated with adhesive proteins []. These proteins are able to displace water from a wet surface and then set to form tight junctions. The adhesive protein of Mytilus coruscus (Sea mussel) contains 848 amino acids, including a 20-residue signal peptide, a 21-residue non-repetitive linker and a repetitive domain that constitutes the bulk of the protein. The representative repeat motif of this domain, YKPK(I/P)(S/T)YPP(T/S), is similar to that of Mytilus galloprovincialis (Mediterranean mussel). The codon usage patterns for the same amino acids differ in different positions of the decapeptide motif [, ]. Almost identical nucleotide sequences appear several times in the repetitive region, suggesting that mussel adhesive protein genes have evolved through repeat duplication []. The repeat motif is reminiscent of repeat units found in extensins, a group of plant proteins involved in the strengthening of the cell wall in response to mechanical stress.
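To make the degenerate decapeptide notation YKPK(I/P)(S/T)YPP(T/S) concrete, the following minimal sketch expands it into every literal peptide it denotes. The variable names are illustrative assumptions, and the expansion says nothing about how often each variant actually occurs in the protein.

```python
from itertools import product

# One string per motif position; positions with alternatives list each allowed residue.
# Encodes YKPK(I/P)(S/T)YPP(T/S) as quoted in the entry.
MOTIF_POSITIONS = ["Y", "K", "P", "K", "IP", "ST", "Y", "P", "P", "TS"]

# Cartesian product over the positions yields every decapeptide the notation covers.
variants = ["".join(residues) for residues in product(*MOTIF_POSITIONS)]

for peptide in variants:
    print(peptide)
```

Three positions with two alternatives each give 2 x 2 x 2 = 8 distinct decapeptides.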
This entry represents the archaeal-type nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferases. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction [].
The homocysteine (Hcy) binding domain is an ~300-residue module which is found in a set of enzymes involved in alkyl transfer to thiols: (1) prokaryotic and eukaryotic B12-dependent methionine synthase (MetH) (EC 2.1.1.13), a large, modular protein that catalyses the transfer of a methyl group from methyltetrahydrofolate (CH3-H4folate) to Hcy to form methionine, using cobalamin as an intermediate methyl carrier; (2) mammalian betaine-homocysteine S-methyltransferase (BHMT) (EC 2.1.1.5), which catalyses the transfer of a methyl group from glycine betaine to Hcy, forming methionine and dimethylglycine; (3) plant selenocysteine methyltransferase (EC 2.1.1.-); and (4) plant and fungal AdoMet homocysteine S-methyltransferases (EC 2.1.1.10). The Hcy-binding domain utilises a Zn(Cys)3 cluster to bind and activate Hcy. It has been shown to form a (beta/alpha)8 barrel. The Hcy-binding domain barrel is distorted to form the metal- and substrate-binding sites. To accommodate the substrate, strands 1 and 2 of the barrel are loosely joined by nonclassic hydrogen bonds; to accommodate the metal, strands 6 and 8 are drawn together and strand 7 is extruded from the end of the barrel. The cysteines ligating the catalytic zinc atom are located at the C-terminal ends of strands 6 and 8 [, ].
The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to DnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of DnaK, which interacts stably with the polypeptide substrate []. Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation []. DnaJ consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acid residues, a glycine and phenylalanine-rich domain ('G/F' domain), a central cysteine-rich domain (CR-type zinc finger) containing four repeats of a CXXCXGXG motif which can coordinate two zinc atoms, and a C-terminal domain (CTD) []. This entry represents the central cysteine-rich (CR) domain of DnaJ proteins. This central cysteine-rich domain (CR-type zinc finger) has an overall V-shaped extended β-hairpin topology and contains four repeats of the motif CXXCXGXG where X is any amino acid. The isolated cysteine-rich domain folds in a zinc-dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DnaJ cysteine-rich domain and various hydrophobic peptides has been found [].
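The CXXCXGXG repeat described above translates directly into a regular expression. The sketch below is illustrative only: the function name and the toy sequence are assumptions (the sequence is invented and is not a real DnaJ protein), and the zinc count simply applies the statement that each set of two repeats binds one unit of zinc.

```python
import re

# C-X-X-C-X-G-X-G, where X is any amino acid (as stated in the entry).
CR_MOTIF = re.compile(r"C..C.G.G")


def find_cr_repeats(sequence):
    """Return (0-based start, matched text) for each non-overlapping CXXCXGXG match."""
    return [(m.start(), m.group()) for m in CR_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence containing four repeats separated by short linkers.
toy = "MAAACPTCHGSGAKKCDACSGTGEVVCPHCQGRGLLLCKTCNGKGQ"
repeats = find_cr_repeats(toy)

# Two repeats per bound zinc, per the entry; which repeats pair up is not modelled here.
expected_zinc_sites = len(repeats) // 2

print(repeats)
print(expected_zinc_sites)
```

A plain pattern scan like this tolerates any residue at the X positions, so it will also match unrelated sequences by chance; it is a first-pass filter, not a domain predictor.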
The haloacid dehalogenase (HAD) superfamily includes phosphatases, phosphonatases, P-type ATPases, beta-phosphoglucomutases, phosphomannomutases, and dehalogenases, which are involved in a variety of cellular processes ranging from amino acid biosynthesis to detoxification []. Crystal structures of proteins from the HAD superfamily show that these proteins all share a conserved alpha/beta-domain classified as a hydrolase fold, which is similar to the Rossmann fold []. This conserved domain usually contains an insertion (sub)domain. For example, the crystal structure of a phosphoglycolate phosphatase from Thermoplasma acidophilum [] revealed two distinct domains, a larger core domain and a smaller cap domain. The large domain is composed of a centrally located five-stranded parallel β-sheet with strand order S10, S9, S8, S1, S2 and a small β-hairpin, strands S3 and S4. This central sheet is flanked by a set of three α-helices on one side and two helices on the other. The topology of the large domain is conserved; however, structural variation is observed in the smaller domain among the different functional classes of the haloacid dehalogenase superfamily. The large HAD-like superfamily of hydrolases comprises P-type ATPases, phosphatases, epoxide hydrolases and L-2-haloacid dehalogenases [].
Polyamines such as spermidine and spermine are essential for cellular growth under most conditions, being implicated in a large number of cellular processes including DNA, RNA and protein synthesis. S-adenosylmethionine decarboxylase (AdoMetDC) plays an essential regulatory role in the polyamine biosynthetic pathway by generating the n-propylamine residue required for the synthesis of spermidine and spermine from putrescine [, ]. Unlike many amino acid decarboxylases, AdoMetDC uses a covalently bound pyruvate residue as a cofactor rather than the more common pyridoxal 5'-phosphate. These proteins can be divided into two main groups which show little sequence similarity either to each other or to other pyruvoyl-dependent amino acid decarboxylases: class I enzymes found in bacteria and archaea, and class II enzymes found in eukaryotes. In both groups the active enzyme is generated by the post-translational autocatalytic cleavage of a precursor protein. This cleavage generates the pyruvate precursor from an internal serine residue and results in the formation of two non-identical subunits, termed alpha and beta, which form the active enzyme. This entry represents a set of known and predicted AdoMetDC enzymes from proteobacteria. The Escherichia coli enzyme is a heterooctamer composed of four alpha and four beta chains, and is dependent on magnesium for activity [].
This entry includes pannexins from vertebrates and innexins from invertebrates []. Gap junctions are composed of membrane proteins, which form a channel permeable for ions and small molecules connecting the cytoplasm of adjacent cells. Although gap junctions provide similar functions in all multicellular organisms, until recently it was believed that vertebrates and invertebrates use unrelated proteins for this purpose. While the connexin family of gap junction proteins is well-characterised in vertebrates, no homologues have been found in invertebrates. In turn, gap junction molecules with no sequence homology to connexins have been identified in insects and nematodes. It has been suggested that these proteins are specific invertebrate gap junctions, and they were thus named innexins (invertebrate analog of connexins) []. As innexin homologues were recently identified in other taxonomic groups including vertebrates, indicating their ubiquitous distribution in the animal kingdom, they were called pannexins (from the Latin pan, all or throughout, and nexus, connection or bond) [, , ]. Genomes of vertebrates probably carry a conserved set of 3 pannexin paralogs (PANX1, PANX2 and PANX3). Invertebrate genomes may contain more than a dozen pannexin (innexin) genes. Vinnexins, viral homologues of pannexins/innexins, were identified in Polydnaviruses that occur in obligate symbiotic associations with parasitoid wasps. It was suggested that virally encoded vinnexin proteins may function to alter gap junction proteins in infected host cells, possibly modifying cell-cell communication during encapsulation responses in parasitized insects [, ]. Structurally, pannexins are similar to connexins. Both types of protein consist of a cytoplasmic N-terminal domain, followed by four transmembrane segments that delimit two extracellular and one cytoplasmic loops; the C-terminal domain is cytoplasmic.
This entry represents anaerobic, class III ribonucleotide reductase. The mechanism of the enzyme involves a glycine-centred radical [], a C-terminal zinc binding site [], and a set of conserved active site cysteines and asparagines []. This enzyme requires an activating component, NrdG, a radical-SAM domain containing enzyme (). Together the two form an alpha-2/beta-2 heterodimer. Ribonucleotide reductase (RNR) catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, and bacteriophage, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in strict or facultative anaerobic bacteria, bacteriophage, and archaea, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical []. Many organisms have more than one class of RNR present in their genomes. All three RNRs have a ten-stranded α-β barrel domain that is structurally similar to the domain of PFL (pyruvate formate lyase) []. The class III enzyme from phage T4 consists of two subunits; this model covers the larger subunit, which contains the active and allosteric sites.
This entry contains the RNA-dependent RNA polymerase (RdRp) of human coronavirus HKU1, murine hepatitis virus, and similar proteins from betacoronaviruses in the embecovirus subgenus (A lineage). Coronaviruses (CoVs) utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of bat coronavirus HKU9 and similar proteins from betacoronaviruses in the nobecovirus subgenus (D lineage). Coronaviruses (CoVs) utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenus (C lineage). Coronaviruses (CoVs) utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of Severe acute respiratory syndrome coronavirus (SARS-CoV), SARS-CoV-2 (also known as 2019 novel CoV (2019-nCoV) or COVID-19 virus), and similar proteins from betacoronaviruses in the sarbecovirus subgenus (B lineage). Coronaviruses (CoVs) utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of deltacoronaviruses. Coronaviruses (CoVs) utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of alphacoronaviruses, including the human coronaviruses (HCoVs) HCoV-NL63 and HCoV-229E. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir []. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
This entry contains the RNA-dependent RNA polymerase (RdRp) of gammacoronaviruses, including the RdRp of avian infectious bronchitis virus (IBV) and similar proteins. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (NSPs) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. A key component, the RNA-dependent RNA polymerase (RdRp, also known as NSP12), catalyses the synthesis of viral RNA and thus plays a central role in the replication and transcription cycle of CoV, possibly interacting with its co-factors, NSP7 and NSP8. RdRp is therefore considered a primary target for nucleotide analog antiviral inhibitors such as remdesivir []. NSP12 contains an RdRp domain as well as a large N-terminal extension that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture. The RdRp domain resembles a right hand with three functional subdomains, called fingers, palm, and thumb. All RdRps contain conserved polymerase motifs (A-G), located in the palm (A-E motifs) and finger (F-G) subdomains. All these motifs have been implicated in RdRp fidelity, such as processes of correct incorporation and reorganization of nucleotides [].
These proteins are members of the wider radical SAM superfamily of enzymes that utilise an iron-sulphur redox cluster and S-adenosylmethionine to carry out diverse radical-mediated reactions []. The Acidithiobacillus ferrooxidans ATCC 23270 protein (AFE_0975) is encoded in the same locus as the genes for squalene-hopene cyclase (SHC, ) and other proteins associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus) (Reut_B4901) this protein is encoded adjacent to the genes for HpnAB, IspH and HpnH (), although SHC itself is elsewhere in the genome. Notably, this protein (here named HpnJ) and three others form a conserved set (HpnIJKL) which occurs in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling []. This group includes Zymomonas mobilis, the organism in which the initial hopanoid biosynthesis locus was described, consisting of the genes HpnA-E and SHC (HpnF) []. Continuing past SHC are genes encoding a phosphorylase enzyme (ZMO0873, i.e. HpnG, ) and another radical SAM enzyme (ZMO0874), HpnH. Although the genes are discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. One of the well-described hopanoid intermediates is bacteriohopanetetrol. In the conversion from hopene, several reactions must occur in the side chain for which a radical mechanism might be reasonable. These include the four (presumably anaerobic) hydroxylations and a methyl shift.
LRP chaperone MESD (also known as mesoderm development candidate 2) represents a set of highly conserved proteins found from nematodes to humans. It is a chaperone that specifically assists with the folding of β-propeller/EGF modules within the family of low-density lipoprotein receptors (LDLRs). It also acts as a modulator of the Wnt pathway, since some LDLRs are coreceptors for the canonical Wnt pathway, and is essential for specification of embryonic polarity and mesoderm induction []. The Drosophila homologue, known as boca, is an endoplasmic reticulum protein required for wingless signaling and trafficking of LDL receptor family members []. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence, as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs) []. The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region. The central folded domain flanked by natively unstructured regions is the structure necessary for facilitating maturation of LRP6 (low-density lipoprotein receptor-related protein 6) [].
The LEM (LAP2, emerin, MAN1) domain is a globular module of approximately 40 amino acids, which is mostly found in the nucleoplasmic portions of metazoan inner nuclear membrane proteins. The LEM domain has been shown to mediate binding to BAF (barrier-to-autointegration factor) and BAF-DNA complexes. BAF dimers bind to double-stranded DNA non-specifically and thereby bridge DNA molecules to form a large, discrete nucleoprotein complex [, ]. The solution structure of the LEM domain reveals that it is composed of a three-residue N-terminal helical turn and two large parallel alpha helices interacting through a set of conserved hydrophobic amino acids. The two helices, which are connected by a long loop, are oriented at an angle of ~45 degrees [, ]. Proteins known to contain a LEM domain include: Vertebrate inner nuclear membrane protein MAN1. Vertebrate lamina-associated polypeptide 2 (LAP2) or thymopoietin. Mammalian emerin (EMD). In human, defects in EMD are a cause of X-linked Emery-Dreifuss muscular dystrophy (X-EDMD), a disorder characterised by early contractures, muscle wasting and weakness, and cardiomyopathy. Xenopus laevis Smad1 antagonistic effector (SANE). Drosophila melanogaster otefin (OTE). Caenorhabditis elegans W01G7.5 protein.
Proteins containing this domain include the coronavirus (CoV) non-structural protein 5 (NSP5), also called the Main protease (Mpro) or 3C-like protease (3CLpro), from gammacoronaviruses [, ]. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/NSP5 is a key enzyme in this process, making it a high-value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/NSP5 consist of three domains, with the first two containing anti-parallel β-barrels and the third consisting of an arrangement of α-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/NSP5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs [, , , , ].
Proteins containing this domain include the coronavirus (CoV) non-structural protein 5 (NSP5), also called the Main protease (Mpro) or 3C-like protease (3CLpro), found in alphacoronaviruses [, , , , , , , , , , , , ]. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/NSP5 is a key enzyme in this process, making it a high-value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/NSP5 consist of three domains, with the first two containing anti-parallel β-barrels and the third consisting of an arrangement of α-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/NSP5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs [, , , , ].
Proteins containing this domain include the coronavirus (CoV) non-structural protein 5 (NSP5), also called the Main protease (Mpro) or 3C-like protease (3CLpro), found in deltacoronaviruses [, , ]. CoVs utilize a multi-subunit replication/transcription machinery. A set of non-structural proteins (Nsps) generated as cleavage products of the ORF1a and ORF1ab viral polyproteins assemble to facilitate viral replication and transcription. Mpro/NSP5 is a key enzyme in this process, making it a high-value target for the development of anti-coronavirus therapeutics. These enzymes belong to the MEROPS peptidase C30 family, where the active site residues His and Cys form a catalytic dyad. The structures of Mpro/NSP5 consist of three domains, with the first two containing anti-parallel β-barrels and the third consisting of an arrangement of α-helices. The catalytic residues are found in a cleft between the first two domains. Mpro/NSP5 requires a Gln residue in the P1 position of the substrate and space for only small amino-acid residues such as Gly, Ala, or Ser in the P1' position; since there is no known human protease with a specificity for Gln at the cleavage site of the substrate, these viral proteases are suitable targets for the development of antiviral drugs [, , , , ].
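The P1/P1' preference described for Mpro/NSP5 can be illustrated with a minimal Python sketch. It encodes only the two constraints stated above (Gln at P1, then Gly, Ala or Ser at P1'); real substrate recognition is likely to depend on additional positions, so the hits are candidates only. The function name and the toy sequence are hypothetical and do not come from any real polyprotein.

```python
# Small residues tolerated at P1' according to the description above.
SMALL_P1_PRIME = {"G", "A", "S"}


def candidate_cleavage_sites(seq):
    """Return 0-based indices of Gln residues (P1) followed by Gly/Ala/Ser (P1').

    Cleavage would occur after the returned residue. This is a deliberately
    simplified filter, not a validated cleavage-site predictor.
    """
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - 1)
        if seq[i] == "Q" and seq[i + 1] in SMALL_P1_PRIME
    ]


# Made-up peptide used purely for illustration.
toy_sequence = "MKTLQSGAVLQAEEQPRLQG"
sites = candidate_cleavage_sites(toy_sequence)
for i in sites:
    print(f"P1 Gln at index {i}, P1' = {toy_sequence[i + 1]}")
```

In the toy peptide, the Gln followed by Pro is skipped because Pro is not in the small-residue set.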
Dual specificity phosphatases (DUSPs) are members of the superfamily of protein tyrosine phosphatases [, ]. They remove the phosphate group from both phospho-tyrosine and phospho-serine/threonine residues. They are structurally similar to tyrosine-specific phosphatases but with a shallower active site cleft and a distinctive active site signature motif, HCxxGxxR [, , ]. They are characterized as VHR- [, ] or Cdc25-like [, ]. In general, DUSPs are classified into the following subgroups []: Slingshot phosphatases; Phosphatase of regenerating liver (PRL); Cdc14 phosphatases; Phosphatase and tensin homologue deleted on chromosome 10 (PTEN)-like and myotubularin phosphatases; Mitogen-activated protein kinase phosphatases (MKPs); Atypical DUSPs. The atypical DUSPs share a high degree of similarity with the MKP subgroup, but lack the N-terminal regulatory domain, which provides the substrate specificity towards the MAP kinases. These atypical DUSPs form a heterogeneous group and have in common the presence of a single catalytic PTP domain. VHR was the first characterised member of this subfamily; its crystal structure is known [, ]. The function of many atypical DUSPs remains unknown, although some have been related to regulation of MAP kinase pathways [, , ]. VHR has also been related to the control of cell senescence []. The atypical DUSPs can be subdivided into two groups (termed A and B) on the basis of sequence similarity. Each of these subgroups is characterised by its own distinctive set of motifs, the functions of which are as yet unknown. This entry represents atypical dual specificity phosphatase subfamily A.
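As a minimal illustration of the active site signature motif HCxxGxxR (x for any residue), the following Python sketch scans a protein sequence with a regular expression. The function name and the toy sequence are hypothetical; a motif hit alone does not establish that a protein is a DUSP.

```python
import re

# HCxxGxxR with x = any residue; the lookahead also reports overlapping hits.
DUSP_SIGNATURE = re.compile(r"(?=(HC[A-Z]{2}G[A-Z]{2}R))")


def find_dusp_signature(seq):
    """Return (0-based start, matched 8-mer) for every HCxxGxxR occurrence."""
    return [(m.start(), m.group(1)) for m in DUSP_SIGNATURE.finditer(seq.upper())]


# Made-up sequence for illustration only.
print(find_dusp_signature("MSGLHCAAGVSRSATLV"))
```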
Oxygenic photosynthesis uses two multi-subunit photosystems (I and II) located in the cell membranes of cyanobacteria and in the thylakoid membranes of chloroplasts in plants and algae. Photosystem II (PSII) has a P680 reaction centre containing chlorophyll 'a' that uses light energy to carry out the oxidation (splitting) of water molecules, and to produce ATP via a proton pump. Photosystem I (PSI) has a P700 reaction centre containing chlorophyll that takes the electron and associated hydrogen donated from PSII to reduce NADP+ to NADPH. Both ATP and NADPH are subsequently used in the light-independent reactions to convert carbon dioxide to glucose using the hydrogen atom extracted from water by PSII, releasing oxygen as a by-product. The photosynthetic reaction centres (RCs) of aerotolerant organisms contain a heterodimeric core, built up of two strongly homologous polypeptides, each of which contributes five transmembrane peptide helices to hold a pseudo-symmetric double set of redox components. Two molecules of PscD are housed within a subunit. PscD may be involved in stabilising the PscB component since it is found to co-precipitate with FMO (Fenna-Matthews-Olson BChl a-protein) and PscB. It may also be involved in the interaction with ferredoxin [].
Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates [, , , , ]. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few []. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a set of zinc fingers found in bacteriophage transcriptional activators and some bacterial proteins []. These zinc fingers all contain the consensus sequence: C-X(2)-C-X(3)-A-X(2)-R-X(15)-C-X(4)-C-X(3)-F [].
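The consensus sequence for these zinc fingers can be turned into a regular expression for a quick sequence scan. The sketch below is a minimal Python example under stated assumptions: the two-residue gap after the conserved Ala is written as X(2), X(n) is read as exactly n residues of any type, and the helper name and test peptide are hypothetical (the peptide is synthetic filler, not a real phage protein).

```python
import re

# Consensus from the entry text, with every gap written in the X(n) form.
CONSENSUS = "C-X(2)-C-X(3)-A-X(2)-R-X(15)-C-X(4)-C-X(3)-F"


def consensus_to_regex(pattern):
    """Convert a dash-separated consensus (fixed residues and X(n) gaps) to a regex."""
    parts = []
    for element in pattern.split("-"):
        gap = re.fullmatch(r"X\((\d+)\)", element)
        if gap:
            parts.append("[A-Z]{%s}" % gap.group(1))  # n residues of any type
        elif element == "X":
            parts.append("[A-Z]")  # a single residue of any type
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)  # fixed residue
        else:
            raise ValueError("unsupported pattern element: %r" % element)
    return re.compile("".join(parts))


ZNF_REGEX = consensus_to_regex(CONSENSUS)

# Synthetic peptide that satisfies the spacing, with Gly as filler.
toy_finger = (
    "C" + "GG" + "C" + "GGG" + "A" + "GG" + "R"
    + "G" * 15 + "C" + "GGGG" + "C" + "GGG" + "F"
)
match = ZNF_REGEX.search("MK" + toy_finger + "LL")
print(match.span() if match else "no match")
```

A hit only shows that the spacing of conserved residues is compatible with the consensus; it says nothing about metal binding or function.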
This entry represents the PR/SET domain found in PR domain zinc finger protein 6 (PRDM6, also known as PRISM). PRDM6 has been shown to co-localise with histone H4 and to methylate H4-K20 []. It has also been shown to act as a transcriptional repressor by interacting with class I histone deacetylases and the G9a histone methyltransferase []. It promotes the transition from differentiated to proliferative smooth muscle by suppressing differentiation and maintaining the proliferative potential of vascular smooth muscle cells []. It also plays a role in endothelial cells by inhibiting endothelial cell proliferation, survival and differentiation []. Mutations in the PRDM6 gene have been linked to nonsyndromic patent ductus arteriosus (PDA), which is a common congenital heart defect (CHD) with both inherited and acquired causes []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger proteins 7 and 9 (PRDM7/9). PRDM7 (also termed PR domain-containing protein 7) is a primate-specific histone methyltransferase that is the result of a recent gene duplication of PRDM9. It selectively catalyses the trimethylation of H3 lysine 4 (H3K4me3) []. PRDM9 (also termed PR domain-containing protein 9) is a histone methyltransferase that specifically trimethylates 'Lys-4' of histone H3 (H3K4me3) during meiotic prophase and is essential for proper meiotic progression. It also efficiently mono-, di-, and trimethylates H3K36. Aberrant PRDM9 expression is associated with genome instability in cancer [, , , ]. PRDM9 has also been shown to perform intramolecular automethylation on multiple lysine residues localised to a lysine-rich region on the post-SET domain []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
Dual specificity phosphatases (DUSPs) are members of the superfamily of protein tyrosine phosphatases [, ]. They remove the phosphate group from both phospho-tyrosine and phospho-serine/threonine residues. They are structurally similar to tyrosine-specific phosphatases but with a shallower active site cleft and a distinctive active site signature motif, HCxxGxxR [, , ]. They are characterized as VHR- [, ] or Cdc25-like [, ]. In general, DUSPs are classified into the following subgroups []: Slingshot phosphatases; Phosphatase of regenerating liver (PRL); Cdc14 phosphatases; Phosphatase and tensin homologue deleted on chromosome 10 (PTEN)-like and myotubularin phosphatases; Mitogen-activated protein kinase phosphatases (MKPs); Atypical DUSPs. The atypical DUSPs share a high degree of similarity with the MKP subgroup, but lack the N-terminal regulatory domain, which provides the substrate specificity towards the MAP kinases. These atypical DUSPs form a heterogeneous group and have in common the presence of a single catalytic PTP domain. VHR was the first characterised member of this subfamily; its crystal structure is known [, ]. The function of many atypical DUSPs remains unknown, although some have been related to regulation of MAP kinase pathways [, , ]. VHR has also been related to the control of cell senescence []. The atypical DUSPs can be subdivided into two groups (termed A and B) on the basis of sequence similarity. Each of these subgroups is characterised by its own distinctive set of motifs, the functions of which are as yet unknown. This entry also includes DUSP1, which does not belong to the atypical DUSP family.
Bacteria synthesise a set of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins [, ]. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae []. The integration host factor (IHF), a dimer of closely related chains which seem to function in genetic recombination as well as in translational and transcriptional control [], is found in enterobacteria. Viral proteins include the African swine fever virus protein Pret-047 (also known as A104R or LMW5-AR) []. The exact function of these proteins is not yet clear, but they are capable of wrapping DNA and stabilising it against denaturation under extreme environmental conditions. The structure is known for one of these proteins []. The protein exists as a dimer and two "β-arms" function as the non-specific binding site for bacterial DNA. The signature pattern for this entry represents a twenty-residue sequence which includes three perfectly conserved positions. According to the tertiary structure of one of these proteins [], this pattern spans exactly the first half of the flexible DNA-binding arm.
There are two types of fatty acid synthase systems. The type I system is found in metazoans and is carried out by a multifunctional polypeptide with multiple active sites. In contrast, the type II system found in bacteria and plants consists of a set of discrete monofunctional proteins, each encoded by a separate gene. ACP1 is central to both of these pathways because it functions to ferry the pathway intermediates between active site centres or enzymes. ACPs are also critical to the function of other metabolic pathways such as polyketide synthases. The type II fatty acid synthase ACPs are abundant, small, acidic proteins that carry the acyl intermediates attached as thioesters to the terminus of the 4'-phosphopantetheine prosthetic group. This prosthetic group is added post-translationally to apoACP by holo-(acyl carrier protein) synthase (AcpS), which transfers the 4'-phosphopantetheine moiety of CoA to a serine residue of apoACP. The crystal structures of a number of the type II fatty acid synthase ACPs have been determined. The structures reveal a novel trimeric arrangement of molecules resulting in three active sites [, ].
Salmonella and related proteobacteria secrete large amounts of proteins into the culture media. The major secreted proteins are either flagellar proteins or virulence factors [], secreted through the flagellar or virulence export structures respectively. Both secretion systems penetrate the inner and outer membranes and their components bear substantial sequence similarity. The flagellum and the needle-like pilus also look fairly similar to each other []. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. It is believed that the family of type III flagellar and pilus inner membrane proteins are used as structural moieties in a complex with several other subunits []. One such set of inner membrane proteins, labeled "S" here for nomenclature purposes, includes the Salmonella and Shigella SpaS, the Yersinia YscU, Rhizobium Y4YO, and the Erwinia HrcU genes, Salmonella FlhB and Escherichia coli EscU [, , , ]. This superfamily represents the C-terminal domain of the type III secretion system substrate exporters. Many of the proteins containing this domain undergo autocatalytic cleavage promoted by cyclisation of a conserved asparagine.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or regulation of MAPKs (Mitogen-activated protein kinases) signaling pathways. It actually consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (after upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear to date, although it appears to represent a good candidate for a GTP/GDP exchange activity [, ]. The general characteristics of DENN domains - three regions, dDENN, DENN itself, and uDENN, having different patterns of sequence conservation and separated by sequences of variable length - suggest that they are composed of at least three sub-domains which may feature distinct folds but which are always associated due to functional and/or structural constraints []. Some proteins known to contain a tripartite DENN domain are listed below: Rat Rab3 GDP/GTP exchange protein (Rab3GEP); Human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; Mouse Rab6 interacting protein 1 (Rab6IP1); Human SET domain-binding factor 1 (SBF1); Human suppressor of tumorigenicity 5 (ST5); Human C-MYC promoter-binding protein IRLB. This entry represents the dDENN domain.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or regulation of MAPKs (Mitogen-activated protein kinases) signaling pathways. It actually consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (after upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear to date, although it appears to represent a good candidate for a GTP/GDP exchange activity [, ]. The general characteristics of DENN domains - three regions, dDENN, DENN itself, and uDENN, having different patterns of sequence conservation and separated by sequences of variable length - suggest that they are composed of at least three sub-domains which may feature distinct folds but which are always associated due to functional and/or structural constraints []. Some proteins known to contain a tripartite DENN domain are listed below: Rat Rab3 GDP/GTP exchange protein (Rab3GEP); Human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; Mouse Rab6 interacting protein 1 (Rab6IP1); Human SET domain-binding factor 1 (SBF1); Human suppressor of tumorigenicity 5 (ST5); Human C-MYC promoter-binding protein IRLB. This entry represents the uDENN domain.
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, the tyrosine phosphatase SptP, reverses the actions brought about by SopE []. SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell []. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection.
The circumsporozoite (CS) protein is the most prominent surface antigen on the sporozoite of the malaria parasite, Plasmodium spp. The sporozoite is the infectious stage of the Plasmodium life cycle, the form in which malaria is passed from the mosquito vector to the mammalian host []. Antibodies to this protein are used in the field to detect exposure to malaria [] and it is a target for several vaccines []. The sequence of the CS protein consists of head and tail regions, which are largely conserved, and a large set of low-complexity repeats, which vary across strains and species []. The C-terminal region is probably used for anchoring the protein to the cell membrane, while the central repeat sequences would be the surface antigen of the organism. The repeats, which encode the immunodominant epitope of the CS protein, diverge more rapidly than the remainder of the gene. It is thought that the maintenance and evolution of the repeats is achieved via a mechanism that acts not at the protein level, but rather directly on the DNA sequence [].
Proteins in this entry contain glycosyl transferase family 2 domains which are responsible, generally, for the transfer of nucleotide-diphosphate sugars to substrates such as polysaccharides and lipids. The Acidithiobacillus ferrooxidans ATCC 23270 protein (AFE_0974) is encoded in the same locus as the genes for squalene-hopene cyclase (SHC) and other proteins associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus) this protein (Reut_B4902) is encoded adjacent to the genes for HpnAB, IspH and HpnH, although SHC itself is encoded elsewhere in the genome. Notably, this protein (here named HpnI) and three others form a conserved set (HpnIJKL) which occurs in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling []. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes HpnA-E and SHC (HpnF) []. Continuing past SHC are found genes for a phosphorylase enzyme (ZMO0873, i.e. HpnG) and another radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL. Hopanoids are known to feature polar glycosyl head groups in many organisms.
Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior []. There have been four secretion systems described in animal enteropathogens such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia. The type III secretion system is of great interest as it is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis []. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself [], type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. It is believed that the family of type III inner membrane proteins are used as structural moieties in a complex with several other subunits [], including the ATPase necessary for driving the secretion system. One such set of inner membrane proteins, termed "P" here for nomenclature purposes, includes the Salmonella and Shigella SpaP, the Yersinia YscR, the Erwinia HrcR, and the Xanthomonas Pro2 genes [], as well as several FliP flagellar biosynthesis genes []. FliP is an ~30 kDa protein containing three or four transmembrane (TM) regions.
Treacher Collins Syndrome (TCS) is an autosomal dominant disorder of craniofacial development, the features of which include conductive hearing loss and cleft palate [, ]; it is the most common of the human mandibulo-facial dysostosis disorders []. The TCS locus has been mapped to human chromosome 5q31.3-32 and the mutated gene identified (TCOF1) []. To date, 35 mutations have been reported in TCOF1, all but one of which result in the introduction of a premature-termination codon into the predicted protein, Treacle. The observed mutational spectrum supports the hypothesis that TCS results from haploinsufficiency. Treacle is a low complexity protein of 1,411 amino acids whose predicted protein structure contains a set of highly polar repeated motifs []. These motifs are common to nucleolar trafficking proteins in other species and are predicted to be phosphorylated by casein kinase. In concert with this observation, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localisation signals []. Throughout the open reading frame are found mutations in TCS families and several polymorphisms. It has thus been suggested that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development. This entry contains Treacle and other related proteins.
This superfamily represents a structural domain which consists of three α-helices, including the arfaptin homology (AH) domain and the BAR (Bin-Amphiphysin-Rvs) domain. The arfaptin homology (AH) domain is a protein domain found in a range of proteins, including arfaptins, protein kinase C-binding protein PICK1 [] and mammalian 69 kDa islet cell autoantigen (ICA69) []. The AH domain of arfaptin has been shown to dimerise and to bind Arf and Rho family GTPases [, ], including ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The AH domain consists of three α-helices arranged as an extended antiparallel α-helical bundle. Two arfaptin AH domains associate to form a highly elongated, crescent-shaped dimer [, ]. Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis, involved in the formation of clathrin-coated vesicles by promoting the assembly of a protein complex at the plasma membrane and by directly assisting in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs) domain, which is required for their in vivo function and their ability to tubulate membranes []. The crystal structure of these proteins suggests that the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases.
Many biosynthesis clusters for secondary metabolites feature a glycosyltransferase gene next to a P450 homologue, often with the P450 lacking a critical heme-binding Cys. These P450-derived sequences seem to be allosteric activators of glycosyltransferases such as the members of this family. This entry represents a set of related glycosyltransferases, many of which can be recognised as activator-dependent from genomic context. Proteins in this entry include: 3-alpha-mycarosylerythronolide B desosaminyl transferase EryCIII from Saccharopolyspora erythraea. It catalyzes the conversion of alpha-L-mycarosylerythronolide B into erythromycin D in the erythromycin biosynthesis pathway []. Aclacinomycin-T 2-deoxy-L-fucose transferase AknK from Streptomyces galilaeus. It is involved in the biosynthesis of the trisaccharide moiety characteristic of the antitumor drug aclacinomycins []. Aklavinone 7-beta-L-rhodosaminyltransferase AknS from Streptomyces galilaeus. It is involved in the biosynthesis of the anthracycline antitumor agent aclacinomycin A []. Tylactone mycaminosyltransferase from Streptomyces fradiae. It is involved in the biosynthesis of the macrolide antibiotic tylosin derived from the polyketide lactone tylactone []. Erythronolide mycarosyltransferase EryBV from Saccharopolyspora erythraea. It is involved in the biosynthesis of the macrolide antibiotic erythromycin []. TDP-daunosamine transferase DnrS from Streptomyces peucetius. It is involved in the biosynthesis of the anthracyclines carminomycin and daunorubicin (daunomycin), which are aromatic polyketide antibiotics that exhibit high cytotoxicity and are widely applied in the chemotherapy of a variety of cancers [].
Lantibiotic genes reside on the bacterial chromosome, where they cluster with genes that adapt and secrete them to the extracellular space. Many of these so-called 'pathogenicity islands' have been characterised, including the epidermin (epi) cluster in Staphylococcus epidermidis, and the nisin (nis) cluster in Lactococcus lactis []. The gene encoding the lantibiotic is flanked by 3 regulatory genes: 2 that are usually involved in a 2-component regulatory system, and another that cleaves the signal peptide from the precursor to produce the mature lantibiotic. This protein (usually designated with a "P" suffix - nisP, mutP, etc.) is highly conserved amongst pathogenic species, and is essential for virulence and survival of the bacterium against competitors in the host []. A novel pathogenicity island in resistant Enterococcus faecalis has been sequenced. In addition to the lantibiotic Cyl gene cluster, this revealed a novel set of virulence factors involved in vancomycin resistance and pathogenicity []. Lantibiotic (lanthionine-containing antibiotic)-specific proteases are serine proteases in the subtilisin family (family S8). The proteases that cleave the N-terminal leader peptides from lantibiotics include: epiP, nsuP, mutP, and nisP. EpiP (MEROPS identifier S08.060), from Staphylococcus, is thought to cleave matured epidermin []. NsuP, a dehydratase from Streptococcus, and NisP (MEROPS identifier S08.059), a membrane-anchored protease from Lactococcus, cleave nisin []. MutP (MEROPS identifier S08.065) is highly similar to epiP and nisP and is thought to process the prepeptide mutacin III of S. mutans [].
Glutaminases deaminate glutamine to glutamate. In Bacillus subtilis, glutaminase is encoded by glnA, which is part of an operon, glnA-glnT (formerly ybgJ-ybgH), where glnT encodes a glutamine transporter. The glnA-glnT operon is regulated by the 2-component system GlnK-GlnL in response to glutamine []. This entry represents the core structural motif of a family of glutaminases that include GlnA, which are characterised by their beta-lactamase-like topology, containing a cluster of α-helices and an alpha/beta sandwich. This family describes the enzyme glutaminase, from a larger family that includes serine-dependent beta-lactamases and penicillin-binding proteins. Many bacteria have two isozymes. This model is based on selected known glutaminases and their homologues within prokaryotes, with the exclusion of highly-derived (long branch) and architecturally varied homologues, so as to achieve conservative assignments. A sharp drop in scores occurs below 250, and cutoffs are set accordingly. The enzyme converts glutamine to glutamate, with the release of ammonia. Members tend to be described as glutaminase A (glsA), where B (glsB) is unknown and may not be homologous (as in Rhizobium etli). Some species have two isozymes that may both be designated A (GlsA1 and GlsA2).
The homocysteine (Hcy) binding domain is an ~300-residue module which is found in a set of enzymes involved in alkyl transfer to thiols: Prokaryotic and eukaryotic B12-dependent methionine synthase (MetH) (EC 2.1.1.13), a large, modular protein that catalyses the transfer of a methyl group from methyltetrahydrofolate (CH3-H4folate) to Hcy to form methionine, using cobalamin as an intermediate methyl carrier. Mammalian betaine-homocysteine S-methyltransferase (BHMT) (EC 2.1.1.5). It catalyzes the transfer of a methyl group from glycine betaine to Hcy, forming methionine and dimethylglycine. Plant selenocysteine methyltransferase (EC 2.1.1.-). Plant and fungal AdoMet homocysteine S-methyltransferases (EC 2.1.1.10). The Hcy-binding domain utilises a Zn(Cys)3 cluster to bind and activate Hcy. It has been shown to form a (beta/alpha)8 barrel. The Hcy binding domain barrel is distorted to form the metal- and substrate-binding sites. To accommodate the substrate, strands 1 and 2 of the barrel are loosely joined by nonclassic hydrogen bonds; to accommodate the metal, strands 6 and 8 are drawn together and strand 7 is extruded from the end of the barrel. The cysteines ligating the catalytic zinc atom are located at the C-terminal ends of strands 6 and 8 [, ].
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or regulation of MAPK (mitogen-activated protein kinase) signaling pathways. It consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear, although it appears to be a good candidate for a GTP/GDP exchange activity [, ]. The general characteristics of DENN domains - three regions, dDENN, DENN itself, and uDENN, having different patterns of sequence conservation and separated by sequences of variable length - suggest that they are composed of at least three sub-domains which may feature distinct folds but which are always associated due to functional and/or structural constraints []. Some proteins known to contain a tripartite DENN domain are: rat Rab3 GDP/GTP exchange protein (Rab3GEP); human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP; mouse Rab6 interacting protein 1 (Rab6IP1); human SET domain-binding factor 1 (SBF1); human suppressor of tumorigenicity 5 (ST5); and human C-MYC promoter-binding protein IRLB. This entry represents the core or cDENN domain.
The hsp70 chaperone machine performs many diverse roles in the cell, including folding of nascent proteins, translocation of polypeptides across organelle membranes, coordinating responses to stress, and targeting selected proteins for degradation. DnaJ is a member of the hsp40 family of molecular chaperones, which is also called the J-protein family, the members of which regulate the activity of hsp70s. DnaJ (hsp40) binds to DnaK (hsp70) and stimulates its ATPase activity, generating the ADP-bound state of DnaK, which interacts stably with the polypeptide substrate []. Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation []. DnaJ consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acid residues, a glycine and phenylalanine-rich domain ('G/F' domain), a central cysteine-rich domain (CR-type zinc finger) containing four repeats of a CXXCXGXG motif which can coordinate two zinc atoms, and a C-terminal domain (CTD) []. This entry represents the central cysteine-rich (CR) domain of DnaJ proteins. This domain has an overall V-shaped extended β-hairpin topology and contains four repeats of the motif CXXCXGXG, where X is any amino acid. The isolated cysteine-rich domain folds in a zinc-dependent fashion. Each set of two repeats binds one unit of zinc. Although this domain has been implicated in substrate binding, no evidence of specific interaction between the isolated DnaJ cysteine-rich domain and various hydrophobic peptides has been found [].
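The CXXCXGXG repeat described above can be illustrated with a short pattern search. This is a minimal sketch, assuming one-letter amino acid codes and a plain regular expression in which X matches any residue; the function name find_cr_repeats and the toy sequence are hypothetical and are not taken from any real DnaJ protein or annotation pipeline.

```python
import re

# CXXCXGXG, with X standing for any amino acid (one-letter codes).
CR_MOTIF = re.compile(r"C..C.G.G")


def find_cr_repeats(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping CXXCXGXG matches."""
    return [m.start() for m in CR_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence with four spaced repeats (not a real protein).
toy = "MAKCPTCHGSGAAACDKCNGKGTTCSHCQGRGLLCPDCGGTG"
print(find_cr_repeats(toy))
```

A sequence with four hits is only consistent with a CR-type zinc finger; the motif alone does not establish zinc binding or the β-hairpin topology.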
Treacher Collins Syndrome (TCS) is an autosomal dominant disorder of craniofacial development, the features of which include conductive hearing loss and cleft palate [, ]; it is the most common of the human mandibulo-facial dysostosis disorders []. The TCS locus has been mapped to human chromosome 5q31.3-32 and the mutated gene identified (TCOF1) []. To date, 35 mutations have been reported in TCOF1, all but one of which result in the introduction of a premature-termination codon into the predicted protein, Treacle. The observed mutational spectrum supports the hypothesis that TCS results from haploinsufficiency. Treacle is a low complexity protein of 1,411 amino acids whose predicted protein structure contains a set of highly polar repeated motifs []. These motifs are common to nucleolar trafficking proteins in other species and are predicted to be phosphorylated by casein kinase. In concert with this observation, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localisation signals []. Throughout the open reading frame are found mutations in TCS families and several polymorphisms. It has thus been suggested that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development.
The haloacid dehalogenase (HAD) superfamily includes phosphatases, phosphonatases, P-type ATPases, beta-phosphoglucomutases, phosphomannomutases, and dehalogenases, which are involved in a variety of cellular processes ranging from amino acid biosynthesis to detoxification []. Crystal structures of proteins from the HAD superfamily show that these proteins all share a conserved alpha/beta-domain classified as a hydrolase fold, which is similar to the Rossmann fold []. This conserved domain usually contains an insertion (sub)domain. For example, the crystal structure of a phosphoglycolate phosphatase from Thermoplasma acidophilum [] revealed two distinct domains, a larger core domain and a smaller cap domain. The large domain is composed of a centrally located five-stranded parallel β-sheet with strand order S10, S9, S8, S1, S2 and a small β-hairpin, strands S3 and S4. This central sheet is flanked by a set of three α-helices on one side and two helices on the other. The topology of the large domain is conserved; however, structural variation is observed in the smaller domain among the different functional classes of the haloacid dehalogenase superfamily.
Methylation at CpG dinucleotides, the most common DNA modification in eukaryotes, has been correlated with gene silencing associated with various phenomena such as genomic imprinting, transposon and chromosome X inactivation, differentiation, and cancer. Effects of DNA methylation are mediated through proteins which bind to symmetrically methylated CpGs. Such proteins contain a specific domain of ~70 residues, the methyl-CpG-binding domain (MBD), which is linked to additional domains associated with chromatin, such as the bromodomain, the AT hook motif, the SET domain, or the PHD finger. MBD-containing proteins appear to act as structural proteins, which recruit a variety of histone deacetylase (HDAC) complexes and chromatin remodelling factors, leading to chromatin compaction and, consequently, to transcriptional repression. The MBD of MeCP2, MBD1, MBD2, MBD4 and BAZ2 mediates binding to DNA, in the case of MeCP2, MBD1 and MBD2 preferentially to methylated CpG. In the case of human MBD3 and SETDB1, the MBD has been shown to mediate protein-protein interactions [, ]. The MBD folds into an alpha/beta sandwich structure comprising a layer of twisted beta sheet, backed by another layer formed by the alpha1 helix and a hairpin loop at the C terminus. These layers are both amphipathic, with the alpha1 helix and the beta sheet lying parallel and the hydrophobic faces tightly packed against each other. The beta sheet is composed of two long inner strands (beta2 and beta3) sandwiched by two shorter outer strands (beta1 and beta4) [].
ASHR3 protein, a member of this family, interacts with the putative basic helix-loop-helix transcription factor ABORTED MICROSPORES (AMS), which is involved in anther and stamen development in Arabidopsis. This interaction is mediated by the PHD finger and the SET domain of ASHR3 []. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule [, , ]. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM []. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids [, , ]. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [, , ]. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
Laminins are large heterotrimeric glycoproteins involved in basement membrane function []. The Laminin G or LNS domain (for Laminin-alpha, Neurexin and Sex hormone-binding globulin) is a domain of around 180 amino acids found in a large and diverse set of extracellular proteins [, ]. The laminin globular (G) domain can be found in one to several copies in various laminin family members, including a large number of extracellular proteins. The C terminus of the laminin alpha chain contains a tandem repeat of five laminin G domains, which are critical for heparin-binding and cell attachment activity []. Laminin alpha4 is distributed in a variety of tissues including peripheral nerves, dorsal root ganglion, skeletal muscle and capillaries; in the neuromuscular junction, it is required for synaptic specialisation []. The structure of the laminin-G domain has been predicted to resemble that of pentraxin []. Laminin G domains can vary in their function, and a variety of binding functions have been ascribed to different LamG modules. For example, the laminin alpha1 and alpha2 chains each have five C-terminal laminin G domains, where only domains LG4 and LG5 contain binding sites for heparin, sulphatides and the cell surface receptor dystroglycan []. Laminin G-containing proteins appear to have a wide variety of roles in cell adhesion, signalling, migration, assembly and differentiation. Proteins with laminin-G domains include: laminin; merosin; agrin; neurexins; vitamin K-dependent protein S; sex steroid binding protein SBP/SHBG; Drosophila proteins Slit, Crumbs and Fat; and several proteoglycan precursors.
Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction []. This superfamily represents the α-helical structural domain found at the N terminus of these proteins.
RNA-binding motif protein 8 (RBM8), also termed binder of OVCA1-1 (BOV-1) or RNA-binding protein Y14, is one of the components of the exon-exon junction complex (EJC) []. It has two isoforms, RBM8A and RBM8B, which are identical except that RBM8B is 16 amino acids shorter at its N terminus []. Three-dimensional modelling of the RBM8 RRM domain indicates that the sequences fold into an RNA-binding domain, forming a hydrophobic core between a β-sheet and two helices. The human RBM8A protein is ubiquitously expressed; the protein is localised predominantly in the cell nucleus and diffused throughout the cytoplasm []. It preferentially associates with mRNAs produced by splicing, including both nuclear mRNAs and newly exported cytoplasmic mRNAs. Evidence suggests the protein remains associated with spliced mRNAs as a tag to indicate the position of spliced introns. Human RBM8A protein specifically binds to MAGOH, the human homologue of Drosophila mago nashi, a protein required for normal germ plasm development in the Drosophila embryo []; a similar association occurs with the Drosophila RBM8 protein, Tsunagi []. The RBM8A and RBM8B protein sequences contain a putative bipartite nuclear localisation signal [] at the N terminus, as well as a stretch of glycine residues. In addition, the RRM contained within RBM8A and RBM8B contains one set of the two consensus nucleic acid-binding motifs, RNP-1 and RNP-2, characteristic of heterogeneous nuclear ribonucleoprotein (hnRNP).
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, the tyrosine phosphatase SptP, reverses the actions brought about by SopE []. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell []. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the guanine nucleotide exchange factor domain of SopE and homologues. This domain has an α-helical structure consisting of two three-helix bundles arranged in a lambda shape [, ].
Interferon (INF)-gamma is a dimeric glycoprotein produced by activated T cells and natural killer cells. Although originally isolated based on its antiviral activity, INF-gamma also displays powerful anti-proliferative and immuno-modulatory activities, which are essential for developing appropriate cellular defences against a variety of infectious agents. The first step in eliciting these responses is the specific high affinity interaction of INF-gamma with its cell-surface receptor (INF-gammaRalpha); the complex then interacts with at least one of a family of additional species-specific accessory factors (AF-1 or INF-gammabeta), which convey different cellular responses. One such response is the association and phosphorylation of two protein tyrosine kinases (Jak-1 and Jak-2), which in turn stimulate nuclear transcription activators [, ]. The human INF-gammaR is a member of the hematopoietic cytokine receptor superfamily. It is expressed in a membrane-bound form in many cell types, and is over-expressed in tumour cells. It comprises an extracellular portion of 229 residues, a single transmembrane region, and a cytoplasmic domain of 221 residues. As with other members of its superfamily, the cytokine-binding sites are formed by a small set of closely-spaced surface loops that extend from a β-sheet core, much like antigen-binding sites on antibodies. The extracellular INF-gammaR monomer comprises two domains (domain D1 from residue 14-102, and domain D2 from residue 114-221), each resembling an Ig fold with fibronectin type III topology []. This entry refers to the interferon gamma receptor alpha subunit, also known as interferon gamma receptor 1.
This entry includes the SLC41A family members mostly from eukaryotes and archaea. Some proteins are from bacteria. The SLC41A family of divalent cation transporters includes SLC41A1, SLC41A2, SLC41A3 and MgtE [, , , ]. MgtE, the SLC41A1 orthologue found in prokaryotes, has a homodimeric architecture with five transmembrane domains at the C terminus and the cytosolic domains at the N terminus. Structural and functional analyses showed that conserved acidic side chains of MgtE play a key role as part of its selectivity filter [, ]. In humans, SLC41A1 is a Na+/Mg2+ ion exchanger that acts as a predominant Mg2+ efflux system at the plasma membrane. The transporter activity is driven by the inwardly directed electrochemical gradient for Na+ ions, and thus depends directly on the extracellular Na+ ion concentration set by the Na+/K+ pump. In this way, it generates circadian cellular Mg2+ fluxes that feed back to regulate clock-controlled gene expression and metabolism and facilitate higher energetic demands during the day. This protein is located in the cell membrane [, ]. Its expression level has been linked to diverse disorders such as preeclampsia in pregnant women, nephronophthisis, Parkinson's disease and bone mass loss. Hence, it has been suggested that this protein might be a valuable therapeutic target [, , , ]. Mouse SLC41A2 has been shown to transport Mg2+ and a range of other divalent cations: Ba2+, Ni2+, Co2+, Fe2+, or Mn2+, but not Ca2+, Zn2+, or Cu2+ [].
This entry represents the Bacterial Immunoglobulin-like 21 (BIg21) domain found in InvasinE (InvE). Invasins are members of the inverse autotransporter (IAT) family, also referred to as the type Ve secretion system. In general, they consist of an N-terminal β-barrel-like domain, which is responsible for attachment of invasin to the outer membrane region of bacteria; repetitive Immunoglobulin-like (Ig) domains, which vary significantly in number among the invasins; and the C-terminal adhesion domain (AD), which provides invasins with the specificity to bind their host target molecules. The overall structure of InvE shows a three-domain architecture in which the domains BIg20 (Bacterial Immunoglobulin-like 20) and BIg21 adopt a two-layer β-sandwich fold resembling eukaryotic members of the Immunoglobulin superfamily (IgSF), while the AD is a globular α/β-domain. The structure of BIg21 belongs to the I2 set of IgSF, with a unique modification in the C-E interstrand loop that is important for its interaction with AD. BIg21 and AD also form a functional super-domain, necessary to target the host receptor []. This domain is also found repeated in sequences from Enterobacterales such as Salmonella typhimurium and Escherichia coli. These repeats are almost always found with and are associated with RatA and RatB, the coding sequences of which are found in the pathogenicity island of Salmonella. The sequences may be determinants of pathogenicity [, ].
This entry represents FAD-dependent hydroxylases (monooxygenases) which are all believed to act in the aerobic ubiquinone biosynthesis pathway []. A separate set of hydroxylases, as yet undiscovered, are believed to be active under anaerobic conditions []. In Escherichia coli, three enzyme activities have been described: UbiB (which acts first at position 6, see ), UbiH (which acts at position 4, []) and UbiF (which acts at position 5, []). UbiH and UbiF are similar to one another and form the basis of this subfamily. Interestingly, E. coli contains another hydroxylase gene, called visC, that is highly similar to UbiF, adjacent to UbiH and, when mutated, results in a phenotype similar to that of UbiH (which has also been named visB) []. Several other species appear to have three homologues in this family, although they assort themselves differently on phylogenetic trees (e.g. Xylella and Mesorhizobium), making it difficult to ascribe a specific activity to each one. Eukaryotes appear to have only a single homologue in this subfamily (COQ6, []), which complements UbiH, but also possess a non-orthologous gene, COQ7, which complements UbiF. This entry represents the conserved site of the ubiquinone biosynthesis hydroxylase enzyme.
The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins involved in Rab-mediated processes or regulation of MAPK (mitogen-activated protein kinase) signalling pathways. It actually consists of three parts, as the original DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear to date, although it appears to be a good candidate for a GTP/GDP exchange activity [, ]. The general characteristics of DENN domains (three regions, dDENN, DENN itself, and uDENN, having different patterns of sequence conservation and separated by sequences of variable length) suggest that they are composed of at least three sub-domains which may feature distinct folds but which are always associated due to functional and/or structural constraints []. Some proteins known to contain a tripartite DENN domain are listed below: Rat Rab3 GDP/GTP exchange protein (Rab3GEP); Human mitogen-activated protein kinase activating protein containing death domain (MADD), which is orthologous to Rab3GEP; Caenorhabditis elegans regulator of presynaptic activity aex-3, the orthologue of Rab3GEP; Mouse Rab6 interacting protein 1 (Rab6IP1); Human SET domain-binding factor 1 (SBF1); Human suppressor of tumorigenicity 5 (ST5); Human C-MYC promoter-binding protein IRLB. This entry represents the C-terminal lobe of the DENN domain, which consists of a 5-stranded beta sheet surrounded by helices [].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication [].
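The 'type 2' calcium-binding motif quoted above, 'GxGT-[38 residues]-D/E', can be read as a simple sequence pattern. The following is a minimal sketch that scans a protein sequence for that pattern, assuming a strictly literal reading (G, any residue, G, T, exactly 38 residues, then D or E). The function name find_type2_motifs and the test sequence are hypothetical and introduced only for illustration; real annexin repeats may deviate from this spacing, so hits should be treated as candidates rather than confirmed binding sites.

```python
import re

# Literal reading of 'GxGT-[38 residues]-D/E' (an assumption of this sketch):
# G, any residue, G, T, exactly 38 residues of any kind, then D or E.
# The lookahead lets overlapping candidates be reported as well.
TYPE2_MOTIF = re.compile(r"(?=(G.GT.{38}[DE]))")


def find_type2_motifs(sequence):
    """Return 0-based start positions of candidate type 2 motifs."""
    return [m.start() for m in TYPE2_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence built to contain exactly one candidate motif.
    synthetic = "MA" + "GAGT" + "A" * 38 + "D" + "KK"
    print(find_type2_motifs(synthetic))
```

Since annexins share four similar repeats, a real sequence would be expected to give up to four candidate positions under this reading, although the entry text itself only says the motif is usually present.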
The nuclear hormone receptor subfamily 5 includes group A member 1 (NR5A1) or steroidogenic factor 1 (SF-1), group A member 2 (NR5A2) or liver receptor homologue-1, and FTZ-F1 (group A member 3) and FTZ-F1 beta (group B member 1) from Drosophila [, ]. SF-1 is a key regulator of steroid biosynthesis []. NR5A2 is involved in bile acid/cholesterol homeostasis and in the development of some human cancers []. Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Serine-tRNA ligase () exists as a monomer and belongs to class IIa []. The serine-tRNA ligases from a few of the archaea that belong to this group are different from the set of mutually more closely related serine-tRNA ligases from eubacteria, eukaryotes, and other archaea (). There are two distinct types of seryl-tRNA synthetase, as differentiated by primary sequence analysis, three-dimensional structure and substrate recognition mechanism: type 1 () is found in the majority of organisms (prokaryotes, eukaryotes and archaea), whereas type 2 (this entry) is confined to some methanogenic archaea []. Methanosarcina barkeri possesses two seryl-tRNA synthetases, one of each type []. The aminoacyl-tRNA synthetases (also known as aminoacyl-tRNA ligases) catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction [, ]. These proteins differ widely in size and oligomeric state, and have limited sequence homology []. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric []. Class II aminoacyl-tRNA synthetases share an anti-parallel β-sheet fold flanked by α-helices [], and are mostly dimeric or multimeric, containing at least three conserved regions [, , ]. However, tRNA binding involves an α-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan, valine, and some lysine synthetases (non-eukaryotic group) belong to class I synthetases.
The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, phenylalanine, proline, serine, threonine, and some lysine synthetases (non-archaeal group) belong to class II synthetases. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c [].
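The class assignments listed above can be captured as a small lookup table. The following is an illustrative sketch only: the three-letter codes, the set names and the function synthetase_classes are hypothetical choices made here, and the subclass designations (1a-1c, 2a-2c) are not modelled because the entry does not give the per-enzyme assignment.

```python
# Class assignment of aminoacyl-tRNA synthetases by amino acid specificity,
# transcribed from the entry text using three-letter amino acid codes.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Phe", "Pro", "Ser", "Thr"}
# Lysine synthetases occur in both classes, depending on the organism group.
EITHER_CLASS = {"Lys"}


def synthetase_classes(amino_acid):
    """Return the possible synthetase classes for a three-letter code."""
    code = amino_acid.capitalize()
    if code in EITHER_CLASS:
        return ("I", "II")
    if code in CLASS_I:
        return ("I",)
    if code in CLASS_II:
        return ("II",)
    return ()  # not one of the 20 standard specificities


if __name__ == "__main__":
    for code in ("Ser", "Lys", "Arg"):
        print(code, synthetase_classes(code))
```

Returning a tuple rather than a single label keeps the lysine case explicit instead of forcing it into one class.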
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents annexin A2, which binds two calcium ions and inhibits phospholipase A2 following dephosphorylation by protein kinases involved in the signal transduction pathway. It may also cross-link plasma membrane phospholipids with actin and the cytoskeleton, and possibly plays a part in exocytosis, since it is also involved in granule aggregation and membrane fusion.
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication [].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Annexin A1, which inhibits phospholipase A2, either in response to inflammation or following dephosphorylation by protein kinases involved in the signal transduction pathway. The protein may also associate with the cell cytoskeleton by binding to actin fibres.
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Type V annexin, which behaves as an anticoagulant, acting as an indirect inhibitor of the thromboplastin-specific complex, which is involved in the blood coagulation cascade. It may also act as a form of calcium channel.
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Type IV annexins, which have not been identified in all species known to possess annexins but are thought to be required for the budding of clathrin-coated pits. Annexin IV localises to the apical membrane of epithelial cells. It modifies membrane bilayers by increasing rigidity, reducing permeability to water and H+ ions, promoting vesicle aggregation and regulating ion conductances [].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Type III annexins, which inhibit phospholipase A2 activity and also play a role in inositol phosphate metabolism, cleaving the cyclic bond of inositol-1,2-cyclic phosphate to yield inositol-1-phosphate.
This family, found in archaea and eukaryotes, includes the only archaeal proteins markedly similar to bacterial TruB, the tRNA pseudouridine 55 synthase. However, among two related yeast proteins, the archaeal set matches yeast YLR175w far better than YNL292w. The first, termed centromere/microtubule binding protein 5 (CBF5), is an apparent rRNA pseudouridine synthase, while the second is the exclusive tRNA pseudouridine 55 synthase for both cytosolic and mitochondrial compartments. It is unclear whether archaeal proteins found by this entry modify tRNA, rRNA, or both. Yeast CBF5 plays a central role in ribosomal RNA processing. It is a probable catalytic subunit of the H/ACA small nucleolar ribonucleoprotein (H/ACA snoRNP) complex, which catalyses pseudouridylation of rRNA. This involves the isomerisation of uridine such that the ribose is subsequently attached to C5, instead of the normal N1. Pseudouridine ('psi') residues may serve to stabilise the conformation of rRNAs. CBF5 is also a centromeric DNA-CBF3-binding factor which is involved in mitotic chromosome segregation [, , , ]. The human CBF5 homologue, DKC1 (also called Dyskerin), has been implicated in a variety of disparate cellular functions. DKC1 isoform 1 is required for correct processing or intranuclear trafficking of TERC, the RNA component of the telomerase reverse transcriptase (TERT) holoenzyme []. In HeLa cells, overexpression of DKC1 isoform 3 promotes cell to cell and cell to substratum adhesion, increases the cell proliferation rate and leads to cytokeratin hyper-expression []. Mutations in the human DKC1 gene cause the X-linked form of dyskeratosis congenita (X-DC), a bone marrow failure syndrome characterised by mucosal leukoplakia, nail dystrophy, abnormal skin pigmentation, premature aging, stem cell dysfunction and increased susceptibility to cancer. DKC1 loss of function also causes the Hoyeraal-Hreidarsson syndrome, recognised as a severe X-DC allelic variant [, , , , , ].
Prokaryotic cells have a defence mechanism against sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones, whose transcription is regulated by the heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of the dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands flanked by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains []. The structural studies suggest that the inactive form of HrcA is the dimer and that this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [, ]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different temperature ranges, shows that the thermostability profiles are consistent with the difference in growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of HrcA from the CIRCE complex, with the concomitant activation of transcription of the groE and dnaK operons.
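The CIRCE architecture described above (a perfect 9 base pair inverted repeat separated by a 9 base pair spacer) can be searched for in a DNA sequence by comparing each 9-mer with the reverse complement of the 9-mer that lies 9 bases downstream of it. The following is a minimal sketch under that description only: the function names are hypothetical, and the example site uses the commonly cited CIRCE consensus TTAGCACTC-N9-GAGTGCTAA, which is an assumption taken from the general literature rather than from this entry. The scan does not check for similarity to any consensus, so it reports every perfect inverted repeat with this geometry.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase or lowercase A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(dna, arm=9, spacer=9):
    """Return 0-based starts of perfect inverted repeats (arm-spacer-arm)."""
    dna = dna.upper()
    total = 2 * arm + spacer
    hits = []
    for i in range(len(dna) - total + 1):
        left = dna[i:i + arm]
        right = dna[i + arm + spacer:i + total]
        # Skip windows with ambiguous bases; require an exact match.
        if set(left) <= set("ACGT") and right == reverse_complement(left):
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Assumed consensus arms around an arbitrary 9 bp spacer.
    site = "TTAGCACTC" + "AAATTTGGG" + "GAGTGCTAA"
    print(find_inverted_repeats("CCCC" + site + "CCCC"))
```

Keeping the arm and spacer lengths as parameters makes it easy to relax the geometry for organisms where the element differs from the B. subtilis case.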
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The progesterone receptor consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain.
The N-terminal domain is unique to the progesterone receptors and spans approximately the first 500 residues; the highly-conserved DNA-binding domain is smaller (around 65 residues) and occupies the central portion of the protein; and the hormone binding domain lies at the receptor C terminus.
This entry represents bacterial-type nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase enzymes involved in dimethylbenzimidazole synthesis []. This enzyme catalyses the synthesis of alpha-ribazole-5'-phosphate from nicotinate mononucleotide (NaMN) and 5,6-dimethylbenzimidazole (DMB). This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway.
The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction [].
Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction []. This entry represents bacterial- and archaeal-type nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase enzymes involved in dimethylbenzimidazole synthesis, as well as a group of proteins of unknown function.
This function is essential to de novo cobalamin (vitamin B12) production in bacteria. The structure of CobT has a three-layer (α/β/β) fold with a parallel β-sheet of seven strands.
This entry represents bacterial- and archaeal-type nicotinate-nucleotide-dimethylbenzimidazole phosphoribosyltransferase enzymes involved in dimethylbenzimidazole synthesis, as well as a group of proteins of unknown function. This function is essential to de novo cobalamin (vitamin B12) production in bacteria. Nicotinate mononucleotide (NaMN):5,6-dimethylbenzimidazole (DMB) phosphoribosyltransferase (CobT) plays a central role in the synthesis of alpha-ribazole-5'-phosphate, an intermediate for the lower ligand of cobalamin []. It is one of the enzymes of the anaerobic pathway of cobalamin biosynthesis, and one of the four proteins (CobU, CobT, CobC, and CobS) involved in the synthesis of the lower ligand and the assembly of the nucleotide loop [, ]. Vitamin B12 (cobalamin) is used as a cofactor in a number of enzyme-catalysed reactions in bacteria, archaea and eukaryotes []. The biosynthetic pathway to adenosylcobalamin from its five-carbon precursor, 5-aminolaevulinic acid, can be divided into three sections: (1) the biosynthesis of uroporphyrinogen III from 5-aminolaevulinic acid; (2) the conversion of uroporphyrinogen III into the ring-contracted, deacylated intermediate precorrin 6 or cobalt-precorrin 6; and (3) the transformation of this intermediate to form adenosylcobalamin []. Cobalamin is synthesised by bacteria and archaea via two alternative routes that differ primarily in the steps of section 2 that lead to the contraction of the macrocycle and excision of the extruded carbon molecule (and its attached methyl group) []. One pathway (exemplified by Pseudomonas denitrificans) incorporates molecular oxygen into the macrocycle as a prerequisite to ring contraction, and has consequently been termed the aerobic pathway. The alternative, anaerobic, route (exemplified by Salmonella typhimurium) takes advantage of a chelated cobalt ion, in the absence of oxygen, to set the stage for ring contraction [].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions.
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Annexin A9, which is a low-affinity receptor for acetylcholine known to be targeted by disease-causing pemphigus vulgaris antibodies in keratinocytes [].
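The 'type 2' calcium-binding motif quoted above, 'GxGT-[38 residues]-D/E', can be read as a simple sequence pattern. The following is a minimal sketch in Python, assuming that "38 residues" means exactly 38 arbitrary residues and that "x" stands for any single residue; the function name and the demo sequence are hypothetical and are not taken from any real annexin.

```python
import re

# 'GxGT-[38 residues]-D/E' read literally:
#   G, any residue, G, T, exactly 38 arbitrary residues, then D or E.
# The lookahead wrapper lets overlapping candidates all be reported.
TYPE2_CA_MOTIF = re.compile(r"(?=(G.GT.{38}[DE]))")


def find_type2_sites(sequence):
    """Return (0-based start, matched text) for each candidate site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TYPE2_CA_MOTIF.finditer(seq)]


# Synthetic sequence built only to exercise the pattern (assumption,
# not a real protein): two flanking residues, the motif, two more.
demo = "MK" + "GAGT" + "A" * 38 + "D" + "LL"
for start, site in find_type2_sites(demo):
    print(start, site[:4], "...", site[-1])
```

A hit from such a scan is only a candidate: whether a repeat actually binds calcium depends on its structure, and the text notes that some annexins lack functional sites in individual repeats.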
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents a fungal annexin.
The first fungal annexin was reported in 1998, encoded by the anx14 gene of the filamentous ascomycete Neurospora crassa. Phylogenetic analyses clustered the fungal annexin with homologous proteins from a number of animal species; this is consistent with the existence of an animal-fungal clade.
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents annexin D proteins found in plants.
Plant annexins generally lack N-terminal domains and functional calcium-binding sites in their second and third repeats. There are eight and nine annexin genes in the complete genomes of Arabidopsis and rice, respectively. These are a result of gene or genome duplication events [].
The Toprim (topoisomerase-primase) domain is a structurally conserved domain of ~100 amino acids that is found in bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, type IA and type II topoisomerases, bacterial and archaeal nucleases of the OLD family and bacterial DNA repair proteins of the RecR/M family. The Toprim domain can be found alone or in combination with several other domains, such as the ASM domain, the superfamily 2 helicase domain, the superfamily 3 helicase domain, the DnaB interaction domain, the C4 'little finger' domain, the CHC2 zinc finger, the ATPase domain of the HSP90-gyrase-histidine kinase superfamily, the S5 domain, the SET domain, the helix-hairpin-helix (HhH) DNA-binding domain, the mobilisation (MOB) domain or the ATPase domain of the ABC transporter/SMC superfamily. The Toprim domain is a catalytic domain involved in DNA strand breakage and rejoining []. The Toprim domain has two conserved motifs, one of which centres on a conserved glutamate and the other on two conserved aspartates (DxD). Both motifs are preceded by conserved hydrophobic regions predicted to form β-strands. The glutamate residue is probably involved in catalysis, whereas the DxD motif is involved in the co-ordination of the Mg(2+) that is required for the activity of all Toprim-containing enzymes. The Toprim domain has a compact alpha/beta fold, with four conserved strands and three helices; with the exception of the second helix and the C-terminal strands, each of these elements contains positions that are highly conserved. The Toprim domain contains three regions that can accommodate variable-sized inserts, which are particularly prominent in the topoisomerases [, , ].
The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Type XIII annexin.
Screening of a cDNA library with intestine-specific sequences revealed a cDNA deriving from apparently gut-specific mRNA. Subsequent analysis revealed that this encoded a 316-residue protein belonging to the annexin/lipocortin family (specifically, annexin XIII). Type XIII annexin is also referred to as intestine-specific annexin (ISA) [].
Tumor necrosis factor receptor superfamily member 14 (TNFRSF14), also known as herpes virus entry mediator or HVEM, ATAR, CD270, HVEA, LIGHTR or TR2, regulates T-cell immune responses by activating inflammatory as well as inhibitory signaling pathways. HVEM acts as a receptor for the canonical TNF-related ligand LIGHT (lymphotoxin-like), which exhibits inducible expression, and competes with herpes simplex virus glycoprotein D for HVEM [, ]. It also acts as a ligand for the immunoglobulin superfamily proteins BTLA (B and T lymphocyte attenuator) and CD160, a feature that distinguishes HVEM from other immune regulatory molecules and creates a functionally diverse set of intrinsic and bidirectional signaling pathways []. HVEM is highly expressed in the gut epithelium. Genome-wide association studies have shown that HVEM is an inflammatory bowel disease (IBD) risk gene, suggesting that HVEM could have a role in regulating the epithelial barrier, host defense, and the microbiota []. Mouse models have revealed that HVEM is involved in colitis pathogenesis, mucosal host defense, and epithelial immunity, thus acting as a mucosal gatekeeper with multiple regulatory functions in the mucosa. HVEM plays a critical role in both tumor progression and resistance to antitumor immune responses, possibly through direct and indirect mechanisms. It is known to be expressed in several human malignancies, including esophageal squamous cell carcinoma, follicular lymphoma and melanoma. The HVEM network may therefore be an attractive target for drug intervention [, ]. TNF-receptors are modular proteins. The N-terminal extracellular part contains a cysteine-rich region responsible for ligand-binding.
This region is composed of small modules of about 40 residues containing 6 conserved cysteines; the number and type of modules can vary in different members of the family [, , ]. This entry represents the N-terminal domain of TNFRSF14, and is also found in the orphan TNFR, UL144, present in human cytomegalovirus. UL144 binds BTLA, but not LIGHT, and inhibits T cell proliferation, selectively mimicking the inhibitory cosignaling function of HVEM [].
Over 70 metallopeptidase families have been identified to date. In these enzymes a divalent cation, which is usually zinc but may be cobalt, manganese or copper, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. In some families of co-catalytic metallopeptidases, two metal ions are observed in crystal structures ligated by five amino acids, with one amino acid ligating both metal ions. The known metal ligands are His, Glu, Asp or Lys. At least one other residue is required for catalysis, which may play an electrophilic role. Many metalloproteases contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site []. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases []. This group of metallopeptidases belongs to MEROPS peptidase family M48 (subfamily M48B). The members of this set of proteins are mostly described as probable protease htpX homologue. HtpX is a zinc-dependent endoprotease member of the membrane-localized proteolytic system in E. coli, which participates in the proteolytic quality control of membrane proteins in conjunction with FtsH, a membrane-bound and ATP-dependent protease. Biochemical characterisation revealed that HtpX undergoes self-degradation upon cell disruption or membrane solubilisation. It can also degrade casein and cleave solubilised membrane proteins, for example, SecY []. Expression of HtpX in the plasma membrane is under the control of CpxR, with the metalloproteinase active site of HtpX located on the cytosolic side of the membrane.
This suggests a potential role for HtpX in the response to mis-folded proteins [].
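The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition described above can be illustrated with two sequence patterns. This is a rough sketch in Python under stated assumptions: the residue sets used for "uncharged" and "hydrophobic" are one possible choice and are not specified by the text; 'a' is restricted to V or T although the text only says "most often"; and proline is excluded from the 'X' position as one reading of "never found in this site". All names and the demo sequences are hypothetical.

```python
import re

# Assumed residue classes (not defined in the source text).
UNCHARGED = "ACFGILMNQSTVWY"   # excludes D, E, K, R, H and P
HYDROPHOBIC = "AFILMVWY"

# Loose motif: H, E, any two residues, H.
HEXXH = re.compile(r"(?=(HE..H))")

# Stringent motif 'abXHEbbHbc' under the assumptions above.
STRINGENT = re.compile(
    rf"(?=([VT][{UNCHARGED}][^P]HE[{UNCHARGED}]{{2}}H"
    rf"[{UNCHARGED}][{HYDROPHOBIC}]))"
)


def scan_metal_site(sequence):
    """Return two lists of 0-based positions of the first His of the motif:
    hits to the loose pattern and hits to the stringent pattern."""
    seq = sequence.upper()
    loose = [m.start() for m in HEXXH.finditer(seq)]
    # The stringent pattern starts three residues before the His.
    strict = [m.start() + 3 for m in STRINGENT.finditer(seq)]
    return loose, strict


# Synthetic sequences for illustration only.
print(scan_metal_site("MKVVAHELTHAVGG"))   # passes both patterns
print(scan_metal_site("AAVVAHEDKHAVAA"))   # charged XX fails the stringent one
```

The stringent pattern is deliberately narrower than HEXXH, so it reports fewer chance matches at the cost of missing family members that deviate from the assumed residue classes.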
The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme []. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA), direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into subregions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase, and sub-region 4.2, which seems to harbor a DNA-binding 'helix-turn-helix' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors [, ]. The plastids of higher plants, originating from an ancestral cyanobacterial endosymbiont, also contain sigma factors that are encoded by a small family of nuclear genes. All plastid sigma factors belong to the superfamily of sigmaA/sigma70 and have sequences homologous to the conserved regions 1.2, 2, 3, and 4 of bacterial sigma factors []. Region 4 of sigma-70-like sigma factors is involved in binding to the -35 promoter element via a helix-turn-helix motif []. Due to the way Pfam works, the threshold has been set artificially high to prevent overlaps with other helix-turn-helix families.
Therefore there are many false negatives.
This entry represents S-adenosyl-L-methionine-dependent methyltransferases (SAM MTase). Methyltransferases transfer a methyl group from a donor to an acceptor. SAM-binding methyltransferases utilise the ubiquitous methyl donor SAM as a cofactor to methylate proteins, small molecules, lipids, and nucleic acids. All SAM MTases contain a structurally conserved SAM-binding domain consisting of a central seven-stranded β-sheet that is flanked by three α-helices per side of the sheet []. A review published in 2003 [] divides all methyltransferases into 5 classes based on the structure of their catalytic domain (fold): class I: Rossmann-like (α/β); class II: TIM β/α-barrel (α/β); class III: tetrapyrrole methylase (α/β); class IV: SPOUT (α/β); class V: SET domain (all β). Another paper [], based on a study of the Saccharomyces cerevisiae methyltransferome, argues for four more folds: class VI: transmembrane (all-α); class VII: DNA/RNA-binding 3-helical bundle (all-α); class VIII: SSo0622-like (α+β); class IX: thymidylate synthetase (α+β). The vast majority of methyltransferases belong to the Rossmann-like fold (Class I), which consists of a seven-stranded β-sheet adjoined by α-helices. The β-sheet contains a central topological switch-point resulting in a deep cleft in which SAM binds. Class I methyltransferases display two conserved positions: the first is a GxGxG motif (or at least a GxG motif) at the end of the first β-strand, which is characteristic of a nucleotide-binding site and is hence used to bind the adenosyl part of SAM; the second is an acidic residue at the end of the second β-strand that forms one hydrogen bond to each hydroxyl of the SAM ribose part. The core of these enzymes is composed of about 150 amino acids that show very strong spatial conservation [].
The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides. Members of the dynamin GTPase family appear to be ubiquitous. They catalyse diverse membrane remodelling events in endocytosis, cell division, and plastid maintenance. Their functional versatility also extends to other core cellular processes, such as maintenance of cell shape or centrosome cohesion. Members of the dynamin family are characterised by their common structure and by conserved sequences in the GTP-binding domain. The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (~280 amino acids) and the presence of two additional domains: the middle domain and the GTPase effector domain (GED), which are involved in oligomerisation and regulation of the GTPase activity. In many dynamin family members, the basic set of domains is supplemented by targeting domains, such as the pleckstrin-homology (PH) domain and proline-rich domains (PRDs), or by sequences that target dynamins to specific organelles, such as mitochondria and chloroplasts [, , ]. The dynamin-type G domain consists of a central eight-stranded β-sheet surrounded by seven alpha helices and two one-turn helices. It contains the five canonical guanine nucleotide binding motifs (G1-5). The P-loop (G1) motif (GxxxxGKS/T) is also present in ATPases (Walker A motif) and functions as a coordinator of the phosphate groups of the bound nucleotide.
A conserved threonine in switch-I (G2) and the conserved residues DxxG of switch-II (G3) are involved in Mg(2+) binding and GTP hydrolysis. The nucleotide binding affinity of dynamins is typically low, with specificity for GTP provided by the mostly conserved N/TKxD motif (G4). The G5 or G-cap motif is involved in binding the ribose moiety [, , ]. This entry represents a conserved site in the dynamin-type G domain and is based on a highly conserved region downstream of the ATP/GTP-binding motif 'A' (P-loop).
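The G-domain motifs written out in this entry (GxxxxGKS/T for G1, DxxG for G3, N/TKxD for G4) translate directly into sequence patterns. Below is a small sketch in Python, assuming "x" means any single residue and "S/T" or "N/T" means either residue at that position; the dictionary keys, function name and demo sequence are hypothetical. G2 and G5 are left out because the text gives no explicit pattern for them.

```python
import re

# Patterns transcribed from the motifs quoted in the text.
# Lookaheads allow overlapping matches to be reported.
G_MOTIFS = {
    "G1 P-loop (GxxxxGKS/T)": re.compile(r"(?=(G.{4}GK[ST]))"),
    "G3 switch-II (DxxG)": re.compile(r"(?=(D..G))"),
    "G4 (N/TKxD)": re.compile(r"(?=([NT]K.D))"),
}


def find_g_motifs(sequence):
    """Map each motif name to a list of (0-based start, matched text)."""
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in G_MOTIFS.items()
    }


# Synthetic sequence assembled for illustration (not a real dynamin).
demo = "MA" + "GGQSAGKS" + "AA" + "DLPG" + "AA" + "TKLD" + "LL"
for name, hits in find_g_motifs(demo).items():
    print(name, hits)
```

Short patterns such as DxxG occur frequently by chance, so in practice hits would need to be checked for their order and spacing within a candidate G domain rather than taken individually.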
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. Peroxisome proliferator-activated receptors (PPAR) are ligand-activated transcription factors that belong to the nuclear hormone receptor superfamily. Three cDNAs encoding PPARs have been isolated from Xenopus laevis: xPPAR alpha, beta and gamma [].
All three xPPARs appear to be activated by both synthetic peroxisome proliferators and naturally occurring fatty acids, suggesting a common mode of action for all members of this subfamily of receptors []. Furthermore, the multiplicity of the receptors suggests the existence of hitherto unknown cellular signalling pathways for xenobiotics and putative endogenous ligands []. Synonym(s): 1C nuclear receptor
Peroxisome proliferator-activated receptors (PPAR) are ligand-activated transcription factors that belong to the nuclear hormone receptor superfamily. Three subtypes of this receptor have been discovered: PPAR alpha, beta and gamma []. They control a variety of target genes involved in lipid homeostasis, diabetes and cancer []. PPAR-alpha is a regulator of lipid metabolism []. It modulates the activities of all three fatty acid oxidation systems, namely mitochondrial and peroxisomal beta-oxidation and microsomal omega-oxidation []. Oleoylethanolamide (OEA), a naturally occurring lipid that regulates feeding and body weight, has been shown to bind with high affinity to PPAR-alpha []. Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors.
During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. Novel members of the steroid receptor superfamily designated NOR-1 (neuron-derived orphan receptor) [], Nurr1 (Nur-related factor 1) [], and NGFI-B [] have been identified from forebrain neuronal cells undergoing apoptosis, from brain cortex, and from lung, superior cervical ganglia and adrenal tissue respectively.
The NOR-1 protein binds to the B1a response-element, which has been identified as the target sequence of the Nur77 family, suggesting that three members of the Nur77 family may transactivate common target gene(s) in different situations [, ]. Ewing's sarcoma is characterised by chromosomal translocations that involve the NOR protein [].