This entry represents the virulence associated protein C (VapC)-like PIN (PilT N terminus) domain of the Synechocystis sp. (strain PCC 6803) Sll0205 protein and other uncharacterized homologs. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Dissociation of the protein complex activates the ribonuclease activity of the toxin by an as yet undefined mechanism []. VapC-like PIN domains are homologs of flap endonuclease-1 (FEN1)-like PIN domains, but lack the extensive arch/clamp region and the H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region, seen in FEN1-like PIN domains []. PIN domains within this subgroup contain four highly conserved acidic residues. These putative active site residues are thought to bind Mg2+ and/or Mn2+ ions and be essential for single-stranded ribonuclease activity [, ].
Matrix metalloproteinases (MMPs) are zinc-dependent and calcium-dependent proteases that cleave within a polypeptide (endopeptidases). They degrade most components of the extracellular matrix (such as growth factors, their binding proteins, and other bioactive molecules, as well as binding sites for cell-surface molecules) and some non-extracellular-matrix molecules []. Two categories of MMPs can be recognised based on their cellular localisation: soluble vs. membrane-bound. The soluble MMPs are divided into the collagenases (MMP1, MMP8 and MMP13), gelatinases (MMP2 and MMP9), stromelysins (MMP3, MMP12) and those yet to be classified. The membrane-bound MMPs include MT1, 2, 3, 4, 5 and their hallmark is the presence of plasma membrane anchoring domains []. MMPs are highly expressed in various cancers, both by tumour cells and in surrounding stromal cells such as macrophages []. Matrix metalloproteinase-17 (MMP17; MEROPS identifier M10.017) or membrane-type matrix metalloproteinase 4 degrades various components of the extracellular matrix, such as fibrin. It may be involved in the activation of membrane-bound precursors of growth factors or inflammatory mediators, such as tumour necrosis factor-alpha []. It may also be involved in tumour progression [, ].
This entry includes the N-terminal catalytic domain of aspartokinase (AK) of the bifunctional enzyme AK-homoserine dehydrogenase (HSDH). These aspartokinases are found in bacteria (E. coli AKI-HSDHI, ThrA and E. coli AKII-HSDHII, MetL) and higher plants (Z. mays AK-HSDH). AK and HSDH are the first and third enzymes in the biosynthetic pathway of the aspartate family of amino acids. AK catalyzes the phosphorylation of Asp to beta-aspartyl phosphate. HSDH catalyzes the NADPH-dependent conversion of Asp 3-semialdehyde to homoserine. ThrA and MetL are involved in threonine and methionine biosynthesis, respectively. In E. coli, ThrA is subject to allosteric regulation by the end product L-threonine and the native enzyme is reported to be tetrameric. As with bacteria, plant AK and HSDH are feedback inhibited by pathway end products. Maize AK-HSDH is a Thr-sensitive 180 kDa enzyme. Arabidopsis AK-HSDH is an alanine-activated, threonine-sensitive enzyme whose ACT domains, located C-terminal to the AK catalytic domain, were shown to be involved in allosteric activation [, , , , , , , ].
This entry represents the N-terminal catalytic aspartokinase (AK) domain of the lysine-sensitive aspartokinase isoenzyme AKII of Bacillus subtilis 168, the lysine plus threonine-sensitive aspartokinase of Corynebacterium glutamicum, and related sequences. In B. subtilis 168, the regulation of the diaminopimelate (Dap)-lysine biosynthetic pathway involves dual control by Dap and lysine, effected through separate Dap- and lysine-sensitive aspartokinase isoenzymes. The B. subtilis 168 AKII is induced by methionine, and repressed and inhibited by lysine. Although Corynebacterium glutamicum is known to contain a single aspartokinase isoenzyme type, both the succinylase and dehydrogenase variant pathways of DAP-lysine synthesis operate simultaneously in this organism. In this organism and various other Gram-positive bacteria, the DAP-lysine pathway is feedback regulated by the concerted action of lysine and threonine. Also included in this entry are the aspartokinases of the extreme thermophile, Thermus thermophilus HB27, the Gram-negative obligate methylotroph, Methylophilus methylotrophus AS1, and those single aspartokinases found in Pseudomonas, C. glutamicum, and Amycolatopsis lactamdurans. B. subtilis 168 AKII, and the C. glutamicum, Streptomyces clavuligerus and A. lactamdurans aspartokinases are described as tetramers consisting of two alpha and two beta subunits; the alpha (44 kD) and beta (18 kD) subunits are formed by two in-phase overlapping polypeptides [, , , , , , , , , , ].
Glycoprotein hormones (or gonadotropins) are protein hormones that include the mammalian hormones follicle-stimulating hormone (FSH, also known as follitropin), luteinizing hormone (LH, also known as lutropin), thyroid-stimulating hormone (TSH, also known as thyrotropin) and human chorionic gonadotropin (hCG). These hormones are central to the complex endocrine system that regulates normal growth, sexual development, and reproductive function []. The hormones FSH, LH and TSH are secreted by the anterior pituitary gland [, ], while the choriogonadotropins are secreted by the placenta []. Glycoprotein hormone receptors are members of the rhodopsin-like G-protein coupled receptor (GPCR) family. They function as receptors for the pituitary hormones thyrotropin (TSH receptor), follitropin (FSH receptor) and lutropin (LH receptor). In mammals the LH receptor is also the receptor for the placental hormone, human chorionic gonadotropin (hCG), and so is designated the lutropin-choriogonadotropic hormone receptor (LHCG receptor). The receptors share close sequence similarity, and are characterised by large extracellular domains believed to be involved in hormone binding via leucine-rich repeats (LRR) []. This entry represents the glycoprotein hormone receptor family, which includes the follicle stimulating hormone receptor, lutropin-choriogonadotropic hormone receptor and the thyroid stimulating hormone receptor.
The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates ([intenz:2.4.1.-]) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 10 () comprises enzymes with two known activities: galactoside 3(4)-L-fucosyltransferase () and galactoside 3-fucosyltransferase (). The galactoside 3-fucosyltransferases display similarities with the alpha-2 and alpha-6-fucosyltransferases []. The biosynthesis of the carbohydrate antigen sialyl Lewis X (sLe(x)) is dependent on the activity of a galactoside 3-fucosyltransferase. This enzyme catalyses the transfer of fucose from GDP-beta-fucose to the 3-OH of N-acetylglucosamine present in lactosamine acceptors []. Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 3(4)-L-fucosyltransferase () belongs to the Lewis blood group system and is associated with Le(a/b) antigen.
Matrix metalloproteinases (MMPs) are zinc-dependent and calcium-dependent proteases that cleave within a polypeptide (endopeptidases). They degrade most components of the extracellular matrix (such as growth factors, their binding proteins, and other bioactive molecules, as well as binding sites for cell-surface molecules) and some non-extracellular-matrix molecules []. Two categories of MMPs can be recognised based on their cellular localisation: soluble vs. membrane-bound. The soluble MMPs are divided into the collagenases (MMP1, MMP8 and MMP13), gelatinases (MMP2 and MMP9), stromelysins (MMP3, MMP12) and those yet to be classified. The membrane-bound MMPs include MT1, 2, 3, 4, 5 and their hallmark is the presence of plasma membrane anchoring domains []. MMPs are highly expressed in various cancers, both by tumour cells and in surrounding stromal cells such as macrophages []. Matrix metalloproteinase-16 (MMP16; MEROPS identifier M10.016), also called MT3-MMP, degrades various components of the extracellular matrix, such as collagen type III and fibronectin. It has no effect on type I, II, IV and V collagen. However, upon interaction with CSPG4, it may be involved in degradation and invasion of type I collagen by melanoma cells []. MMP-16 can not only directly degrade some matrix molecules, but can also activate pro-MMP-2 (gelatinase A), one of the most important MMPs in tissue remodelling and cell migration [, ].
This is the ZF1 (Zinc Finger 1) domain of Zic family proteins, which are found in eukaryotes. In humans, there are five members of the Zic family that are involved in human congenital anomalies. One of them, ZIC3, causes X-linked heterotaxy (HTX1), which is a left-right axis disturbance that manifests as variable combinations of heart malformation, altered lung lobation, splenic abnormality and gastrointestinal malrotation. Zic family proteins contain multiple zinc finger domains (ZFD), which are generally composed of five tandemly repeated C2H2 zinc finger (ZF) motifs. Sequence comparison analysis reveals that this N-terminal ZF (ZF1) domain of the Zic zinc finger domains is unique in that it possesses more amino acid residues (6-38 amino acids) between the two cysteine residues of the C2H2 motif compared to Gli and Glis ZF1s or any of the other ZFs (ZF2-5) in the Gli/Glis/Zic superfamily of proteins. Mutations in cysteine 253 (C253S) or histidine 286 (H286R) in ZIC3 ZF1, which are found in heterotaxy patients, result in extranuclear localization of the mutant ZIC3 protein. Furthermore, mutations in the evolutionarily conserved amino acid residues (C253, W255, C268, H281 and H286) of ZF1 generally impair nuclear localization [].
The major intrinsic protein (MIP) family is large and diverse, possessing over 100 members that form transmembrane channels. These channel proteins function in water, small carbohydrate (e.g., glycerol), urea, NH3, CO2 and possibly ion transport, by an energy independent mechanism. They are found ubiquitously in bacteria, archaea and eukaryotes. The MIP family contains two major groups of channels: aquaporins and glycerol facilitators. The known aquaporins cluster loosely together as do the known glycerol facilitators. MIP family proteins are believed to form aqueous pores that selectively allow passive transport of their solute(s) across the membrane with minimal apparent recognition. Aquaporins selectively transport water (but not glycerol) while glycerol facilitators selectively transport glycerol but not water. Some aquaporins can transport NH3 and CO2. Glycerol facilitators function as solute nonspecific channels, and may transport glycerol, dihydroxyacetone, propanediol, urea and other small neutral molecules in physiologically important processes. Some members of the family, including the yeast FPS protein and tobacco NtTIPA, may transport both water and small solutes. The structures of various members of the MIP family have been determined by means of X-ray diffraction [, , ], revealing the fold to comprise a right-handed bundle of 6 transmembrane (TM) α-helices [, , ]. Similarities in the N- and C-terminal halves of the molecule suggest that the proteins may have arisen through tandem, intragenic duplication of an ancestral protein that contained 3 TM domains [].
The plant cytochrome b6f is located in the thylakoid membrane and functions in both linear and cyclic electron transport, providing ATP and NADPH for photosynthetic carbon fixation. The cytochrome b6f complex has eight different subunits, six being encoded in the chloroplast genome (PetA [cyt f], PetB [cyt b6], PetD, PetG, PetL, and PetN) and two in the nucleus (PetC [Rieske FeS] and PetM). The complex functions as a dimer []. In cyanobacteria, the cytochrome b6f complex contains four large subunits, including cytochrome f, cytochrome b6, the Rieske iron-sulfur protein (ISP), and subunit IV; as well as four small hydrophobic subunits, PetG, PetL, PetM, and PetN []. This entry represents the Rieske FeS protein (encoded by the PetC gene) of the cytochrome b6f complex from plants and cyanobacteria. The Rieske subunit acts by binding plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome f haem iron. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit. In plants, Rieske FeS is required for the successful assembly of the b6f complex and is essential for photosynthesis [, ].
Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles [], and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E. Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus [].
This entry includes Escherichia coli ExoIII and related proteins. They are AP endonucleases that belong to the large EEP (exonuclease/endonuclease/phosphatase) superfamily. ExoIII removes the damaged DNA at cytosines and guanines by cleaving on the 3'-side of the AP site by a beta-elimination reaction. It exhibits 3'-5'-exonuclease, 3'-phosphomonoesterase, 3'-repair diesterase and ribonuclease H activities []. Cellular DNA is spontaneously and continuously damaged by environmental and internal factors such as X-rays, UV light and agents such as the antitumor drugs bleomycin and neocarzinostatin or those that generate oxygen radicals. Apurinic/apyrimidinic (AP) sites form both spontaneously and as highly cytotoxic intermediates in the removal of the damaged base by the base excision repair (BER) pathway. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3' terminus of DNA strand breaks. AP endonucleases can be classified into two families on the basis of sequence similarity and structure. ExoIII belongs to family 1 [].
Tyrosinase () [] is a copper monooxygenase that catalyzes the hydroxylation of monophenols and the oxidation of o-diphenols to o-quinones. This enzyme, found in prokaryotes as well as in eukaryotes, is involved in the formation of pigments such as melanins and other polyphenolic compounds. Tyrosinase binds two copper ions (CuA and CuB). Each of the two copper ions has been shown [] to be bound by three conserved histidine residues. The regions around these copper-binding ligands are well conserved and also shared by some hemocyanins, which are copper-containing oxygen carriers from the hemolymph of many molluscs and arthropods [, ]. At least two proteins related to tyrosinase are known to exist in mammals: TRP-1 (TYRP1) [], which is responsible for the conversion of 5,6-dihydroxyindole-2-carboxylic acid (DHICA) to indole-5,6-quinone-2-carboxylic acid; and TRP-2 (TYRP2) [], which is the melanogenic enzyme DOPAchrome tautomerase () that catalyzes the conversion of DOPAchrome to DHICA. TRP-2 differs from tyrosinases and TRP-1 in that it binds two zinc ions instead of copper []. Other proteins that belong to this family are plant polyphenol oxidases (PPO) (), which catalyze the oxidation of mono- and o-diphenols to o-diquinones []; and Caenorhabditis elegans hypothetical protein C02C2.1.
Formyl peptide receptors (FPR) are members of the rhodopsin-like G-protein coupled receptor family and are involved in chemotaxis [, ]. They were originally identified by their ability to bind N-formyl peptides (typified by fMet-Leu-Phe (fMLP)), produced by the degradation of either bacterial or host cells [, ], but further ligands have since been discovered, including many microbial agonists derived from both bacteria and viruses [, ]. FPRs were initially found on leukocytes, but they are expressed in other cells, for example, immature dendritic cells, platelets, microglial cells, astrocytes and fibroblasts [, ]. FPRs are expressed at high levels on polymorphonuclear and mononuclear phagocytes. Formyl peptide receptors are not only involved in mediating immune cell response to infection, but also act to suppress the immune system under certain conditions []. The main responses elicited upon ligation of formylated peptides are those of morphological polarization, locomotion, production of reactive-oxygen species and release of proteolytic enzymes []. There are three formyl peptide receptor subtypes, FPR1, FPR2 and FPR3 [, ]. The sequence similarity between FPR1 and FPR2 is high (69%), and although there is also a high sequence similarity between FPR2 and FPR3 (83%), FPR3 cannot bind formylated peptides [, ]. This entry includes the formyl peptide receptors and other related receptors such as C3a and C5a anaphylatoxin chemotactic receptors [] and G-protein-coupled receptor CMKLR1 [].
Members of the mas-related receptor family (also known as oncogene-like MAS and mas-related G-protein coupled receptor MRG) have been implicated in the development, regulation and function of nociceptive neurons, specifically in the modulation of pain. Most members are orphaned, with no endogenous ligand identified. Of the human mas-related GPCRs, four (MRGPRD, MRGPRE, MRGPRF and MRGPRG) are also found in rodents, whereas MRGPRX1, MRGPRX2, MRGPRX3 and MRGPRX4 are found exclusively in primates. Certain rodent MRGs have been reported to respond to adenine [] and to RF-amide peptides, including neuropeptide FF [, ], but the relevance of these findings to man is unclear. MRGs are expressed predominantly in small diameter sensory neurons of the dorsal root ganglia, where there is emerging evidence that they may be mediators of histamine-independent itch [, ]. This entry represents mas-related G protein-coupled receptor F. It is thought to be involved with nociceptor function and development, and directly involved in the modulation of pain. The receptor is currently orphaned; however, it is thought to be activated by a neuropeptide.
Bacterial high affinity transport systems are involved in active transport of solutes across the cytoplasmic membrane. Most of the bacterial ABC (ATP-binding cassette) importers are composed of one or two transmembrane permease proteins, one or two nucleotide-binding proteins and a highly specific periplasmic solute-binding protein. In Gram-negative bacteria the solute-binding proteins are dissolved in the periplasm, while in archaea and Gram-positive bacteria, the solute-binding proteins are membrane-anchored lipoproteins [, ]. On the basis of sequence similarities, the vast majority of these solute-binding proteins can be grouped [] into eight families or clusters, which generally correlate with the nature of the solute bound. Family 5 members include: periplasmic oligopeptide-binding proteins (oppA) of Gram-negative bacteria and homologous lipoproteins in Gram-positive bacteria (oppA, amiA or appA); periplasmic dipeptide-binding proteins of Escherichia coli (dppA) and Bacillus subtilis (dppE); periplasmic murein peptide-binding protein of E. coli (mppA); periplasmic peptide-binding proteins sapA of E. coli, Salmonella typhimurium and Haemophilus influenzae; periplasmic nickel-binding protein (nikA) of E. coli; haem-binding lipoprotein (hbpA or dppA) from H. influenzae; lipoprotein xP55 from Streptomyces lividans; hypothetical proteins from H. influenzae (HI0213) and Rhizobium sp. (strain NGR234) symbiotic plasmid (y4tO and y4wM); and the HTH-type transcriptional regulator SgrR from E. coli, in which the solute-binding domain is located in the C-terminal region [].
Found in the main part of the sperm flagellum, the fibrous sheath is a cytoskeletal structure comprising two longitudinal columns connected by closely spaced circumferential ribs []. A-kinase anchoring proteins (AKAPs) secure cyclic AMP-dependent protein kinases within specific cytoplasmic domains; most abundant in the fibrous sheath is AKAP4 []. AKAP4 has been shown to bind AKAP3 and two spermatogenic cell-specific proteins, Fibrous Sheath Interacting Proteins 1 and 2 (FSIP1, FSIP2) []. How the fibrous sheath assembles is not yet fully understood. AKAP4 is synthesised and incorporated into the nascent fibrous sheath late in spermatid development; its precursor is processed in the flagellum, and only the mature form of AKAP4 appears to bind AKAP3 []. It is likely, therefore, that AKAP3 is involved in organising the basic structure of the fibrous sheath, while AKAP4 helps to complete fibrous sheath assembly. The FSIP1 protein is 435 amino acids in length, and is rich in glutamine and asparagine []. Minimum domains for its binding to AKAP4 have been localised to amino acids 351-359 and 692-721 of AKAP4 constructs [].
N-acetylmuramoyl-L-alanine amidase or MurNAc-LAA (also known as peptidoglycan aminohydrolase, NAMLA amidase, NAMLAA, Amidase 3, and peptidoglycan amidase) is an autolysin that hydrolyzes the amide bond between N-acetylmuramoyl and L-amino acids in certain cell wall glycopeptides. These proteins are Zn-dependent peptidases with highly conserved residues involved in cation co-ordination. In Escherichia coli, there are five MurNAc-LAAs present: AmiA, AmiB, AmiC and AmiD that are periplasmic, and AmpD that is cytoplasmic. Three of these (AmiA, AmiB and AmiC) belong to this family, the other two (AmiD and AmpD) do not. E. coli AmiA, AmiB and AmiC play an important role in cleaving the septum to release daughter cells after cell division []. In general, bacterial MurNAc-LAAs are members of the bacterial autolytic system and carry a signal peptide in their N termini that allows their transport across the cytoplasmic membrane. However, the bacteriophage MurNAc-LAAs are endolysins since these phage-encoded enzymes break down bacterial peptidoglycan at the terminal stage of the phage reproduction cycle. As opposed to autolysins, almost all endolysins have no signal peptides and their translocation through the cytoplasmic membrane is thought to proceed with the help of phage-encoded holin proteins []. The amidase catalytic module is fused to another functional module (cell wall binding module) either at the N or C terminus, which is responsible for high affinity binding of the protein to the cell wall [].
The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates ([intenz:2.4.1.-]) and related proteins into distinct sequence based families has been described []. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 11 comprises enzymes with only one known activity: galactoside 2-L-fucosyltransferase (). Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Galactoside 2-L-fucosyltransferase 1 () and Galactoside 2-L-fucosyltransferase 2 () belong to the Hh blood group system and are associated with H/h and Se/se antigens.
This family consists of various coronavirus matrix proteins which are transmembrane glycoproteins []. The membrane (M) protein is the most abundant structural protein and defines the shape of the viral envelope. It is also regarded as the central organiser of coronavirus assembly, interacting with all other major coronaviral structural proteins. M proteins play a critical role in protein-protein interactions (as well as protein-RNA interactions) since virus-like particle (VLP) formation in many CoVs requires only the M and envelope (E) proteins for efficient virion assembly []. Interaction of spike (S) with M is necessary for retention of S in the ER-Golgi intermediate compartment (ERGIC)/Golgi complex and its incorporation into new virions, but dispensable for the assembly process. Binding of M to nucleocapsid (N) proteins stabilises the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and, ultimately, promotes completion of viral assembly. Together, M and E protein make up the viral envelope and their interaction is sufficient for the production and release of virus-like particles (VLPs) [, ].
The membrane (M) protein is the most abundant structural protein and defines the shape of the viral envelope. It is also regarded as the central organiser of coronavirus assembly, interacting with all other major coronaviral structural proteins. M proteins play a critical role in protein-protein interactions (as well as protein-RNA interactions) since virus-like particle (VLP) formation in many CoVs requires only the M and envelope (E) proteins for efficient virion assembly []. Interaction of spike (S) with M is necessary for retention of S in the ER-Golgi intermediate compartment (ERGIC)/Golgi complex and its incorporation into new virions, but dispensable for the assembly process. Binding of M to nucleocapsid (N) proteins stabilises the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and, ultimately, promotes completion of viral assembly. Together, M and E protein make up the viral envelope and their interaction is sufficient for the production and release of virus-like particles (VLPs) [, , ]. This group contains the Membrane (M) protein of Rousettus bat coronavirus HKU9, and similar proteins from betacoronaviruses in the nobecovirus subgenus (D lineage).
The membrane (M) protein is the most abundant structural protein and defines the shape of the viral envelope. It is also regarded as the central organiser of coronavirus assembly, interacting with all other major coronaviral structural proteins. M proteins play a critical role in protein-protein interactions (as well as protein-RNA interactions) since virus-like particle (VLP) formation in many CoVs requires only the M and envelope (E) proteins for efficient virion assembly []. Interaction of spike (S) with M is necessary for retention of S in the ER-Golgi intermediate compartment (ERGIC)/Golgi complex and its incorporation into new virions, but dispensable for the assembly process. Binding of M to nucleocapsid (N) proteins stabilises the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and, ultimately, promotes completion of viral assembly. Together, M and E protein make up the viral envelope and their interaction is sufficient for the production and release of virus-like particles (VLPs) [, , ]. This entry contains the Membrane (M) protein of human coronaviruses (HCoVs), HCoV-OC43 and HCoV-HKU1, and similar proteins from betacoronaviruses in the embecovirus subgenus (A lineage).
This entry contains the Membrane (M) protein of Middle East respiratory syndrome (MERS)-related CoV, bat-CoV HKU5, and similar proteins from betacoronaviruses in the merbecovirus subgenus (C lineage). The membrane (M) protein is the most abundant structural protein and defines the shape of the viral envelope. It is also regarded as the central organiser of coronavirus assembly, interacting with all other major coronaviral structural proteins. M proteins play a critical role in protein-protein interactions (as well as protein-RNA interactions) since virus-like particle (VLP) formation in many CoVs requires only the M and envelope (E) proteins for efficient virion assembly []. Interaction of spike (S) with M is necessary for retention of S in the ER-Golgi intermediate compartment (ERGIC)/Golgi complex and its incorporation into new virions, but dispensable for the assembly process. Binding of M to nucleocapsid (N) proteins stabilises the nucleocapsid (N protein-RNA complex), as well as the internal core of virions, and, ultimately, promotes completion of viral assembly. Together, M and E protein make up the viral envelope and their interaction is sufficient for the production and release of virus-like particles (VLPs) [, , ].
Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal from the extracellular space of released transmitters, thereby terminating their actions []. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl- []. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters [, ].
This group contains Holliday junction resolvases (HJRs) of the archaeal type, Hjc and Hje proteins []. The Holliday junction is an essential intermediate of homologous recombination. Holliday junctions are four-stranded DNA complexes that are formed during recombination and related DNA repair events. In the presence of divalent cations, these junctions exist predominantly as the stacked-X form, in which the double-helical segments are coaxially stacked and twisted by 60 degrees in a right-handed direction across the junction crossover. In this structure, the stacked arms resemble two adjacent double helices, but are linked at the junction by two common strands that cross over between the duplexes [, , ]. During homologous recombination, genetic information is physically exchanged between parental DNAs via crossing single strands of the same polarity within the four-way Holliday structure. This process is terminated by the endonucleolytic activity of resolvases, which convert the four-way DNA back to two double strands. There is some structural similarity between the archaeal HJRs and type II restriction endonucleases. This similarity includes their active site configurations [].
Cellulose, an aggregate of unbranched polymers of beta-1,4-linked glucose residues, is the major component of wood and thus paper, and is synthesized by plants, most algae, some bacteria and fungi, and even some animals. The genes that synthesize cellulose in higher plants differ greatly from the well-characterised genes found in Acetobacter and Agrobacterium spp. More correctly designated as "cellulose synthase catalytic subunits", plant cellulose synthase (CesA) proteins are integral membrane proteins, approximately 1,000 amino acids in length. There are a number of highly conserved residues, including several motifs shown to be necessary for processive glycosyltransferase activity []. An operon encoding 4 proteins required for bacterial cellulose biosynthesis (bcs) in Acetobacter xylinus (Gluconacetobacter xylinus) has been isolated via genetic complementation with strains lacking cellulose synthase activity []. Nucleotide sequence analysis showed the cellulose synthase operon to consist of 4 genes, designated bcsA, bcsB, bcsC and bcsD, all of which are required for maximal bacterial cellulose synthesis in A. xylinus. The calculated molecular mass of the protein encoded by bcsD is 17.3 kDa []. The function of BcsD is unknown.
This PIN domain can be found in the Pyrobaculum aerophilum proteins Pae0151 (also known as VapC3) and Pae2754 (also known as VapC9), and their homologues []. They are similar to the PIN domains of the Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB toxins of the prokaryotic toxin/antitoxin operons, VapBC and FitAB, respectively, which are believed to be involved in growth inhibition by regulating translation. These toxins are nearly always co-expressed with an antitoxin, a cognate protein inhibitor, forming an inert protein complex. Dissociation of the protein complex activates the ribonuclease activity of the toxin by an as yet undefined mechanism [, ]. PIN domains are small protein domains identified by the presence of three strictly conserved acidic residues. Apart from these three residues, there is poor sequence conservation []. PIN domains are found in eukaryotes, eubacteria and archaea. In eukaryotes they are ribonucleases involved in nonsense-mediated mRNA decay [] and in processing of 18S ribosomal RNA []. In prokaryotes, they are the toxic components of toxin-antitoxin (TA) systems, their toxicity arising by virtue of their ribonuclease activity. The PIN domain TA systems are now called VapBC TAs (virulence associated proteins), where VapB is the inhibitor and VapC is the PIN-domain ribonuclease toxin [].
This entry represents the interlocking domain of the eukaryotic nuclear receptor coactivators CREBP and p300. The interlocking domain is a 3-helical non-globular array that forms interlocked heterodimers with its target. Nuclear receptors are ligand-activated transcription factors involved in the regulation of many processes, including development, reproduction and homeostasis. Nuclear receptor coactivators act to modulate the function of nuclear receptors. Coactivators associate with promoters and enhancers primarily through protein-protein contacts to facilitate the interaction between DNA-bound transcription factors and the transcription machinery. Many of these coactivators are structurally related, including CBP (CREB-binding protein) and p300 []. CBP and p300 both have histone acetyltransferase activity. CBP/p300 proteins function synergistically to activate transcription, acting to remodel chromatin and to recruit RNA polymerase II and the basal transcription machinery. CBP is required for proper cell cycle control, differentiation and apoptosis. The interaction of CBP/p300 with transcription factors involves several small domains. The IBiD domain in the C-terminal region of CBP is responsible for CBP interaction with IRF-3, as well as with the adenoviral oncoprotein E1A, TIF-2 coactivator, and the IRF homologue KSHV IRF-1 [].
p38 kinases are mitogen-activated protein kinases (MAPKs), serving as important mediators of cellular responses to extracellular signals. They function in the regulation of the cell cycle, cell development, cell differentiation, senescence, tumorigenesis, apoptosis, pain development and pain progression, and immune responses. p38 kinases are activated by the MAPK kinases MKK3 and MKK6, which in turn are activated by upstream MAPK kinase kinases including TAK1, ASK1, and MLK3, in response to cellular stresses or inflammatory cytokines []. p38 substrates include other protein kinases and factors that regulate transcription, nuclear export, mRNA stability and translation. p38 kinases are drug targets for the inflammatory diseases psoriasis, rheumatoid arthritis, and chronic pulmonary disease [, ]. Vertebrates contain four p38 kinases, named alpha, beta, gamma, and delta, which show varying substrate specificity and expression patterns. p38alpha/MAPK14 is expressed in most tissues and is the major isoform involved in the immune and inflammatory response. It is the central p38 MAPK involved in myogenesis []. It plays a role in regulating cell cycle check-point transition and promoting cell differentiation. p38alpha also regulates cell proliferation and death through crosstalk with the JNK pathway []. Its substrates include MAPK activated protein kinase 2 (MK2), MK5, and the transcription factors ATF2 and Mitf [].
This entry represents the serine/threonine-protein kinase TOR (target of rapamycin), which was first identified by mutations in yeast that confer resistance to the growth inhibitory properties of rapamycin []. TOR proteins are structurally and functionally conserved in all eukaryotes examined. However, yeasts contain two Tor proteins (Tor1 and Tor2), while higher eukaryotes such as humans possess a single TOR protein []. They are central regulators of cellular metabolism, growth and survival in response to environmental signals [, , ]. In budding yeast, the Tor2 protein exists in two distinct multi-component complexes, TORC1 and TORC2. TORC1 regulates cell growth by regulating many growth-related processes and is rapamycin sensitive, while TORC2 regulates the cell cytoskeleton and is rapamycin insensitive. Budding yeast TORC1 consists of either Tor1 or Tor2 in complex with Kog1, Lst8 and Tco89, while TORC2 is composed of Avo1, Avo2, Tsc11, Lst8, Bit61, Slm1, Slm2 and Tor2 [, ]. In both yeast and mammals, FKBP12-rapamycin binds to Tor (Tor1, Tor2, or mTOR) in TORC1, but not to Tor (Tor2 or mTOR) in TORC2. It has been suggested that the architecture of TORC2 or its unique composition might be responsible for the observed rapamycin resistance [].
Expansins are unusual proteins that mediate cell wall extension in plants. They are believed to act as a sort of chemical grease, allowing polymers to slide past one another by disrupting the non-covalent hydrogen bonds that hold many wall polymers to one another. This process is not degradative and hence does not weaken the wall, which could otherwise rupture under internal pressure during growth. Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis thaliana. The proteins are highly conserved in size and sequence (75-95% amino acid sequence similarity in any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension []. It is thought that several highly conserved tryptophans may function in expansin binding to cellulose or other glycans. The high conservation of the family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure.
Condensin is a multi-subunit protein complex that acts as an essential regulator of chromosome condensation [, ]. It contains both SMC (structural maintenance of chromosomes) and non-SMC subunits. Condensin plays an important role during mitosis in the compaction and resolution of chromosomes to remove and prevent catenations that would otherwise inhibit segregation. This is thought to be achieved by the introduction of positive supercoils into relaxed DNA in the presence of type I topoisomerases and the conversion of nicked DNA into positively knotted forms in the presence of type II topoisomerases. During interphase, condensin promotes clustering of dispersed loci into subnuclear domains and inhibits associations between homologues. In meiosis, condensin has been shown to influence the number of crossover events by regulating programmed double-strand breaks. Roles in gene regulation and lymphocyte development have also been defined. Condensin subunit 1 (known as Cnd1 in Schizosaccharomyces pombe (fission yeast), and XCAP-D2 in Xenopus laevis) represents one of the non-SMC subunits in the complex. This subunit is phosphorylated at several sites by Cdc2. This phosphorylation increases the supercoiling activity of condensin [, ]. This entry represents the conserved N-terminal domain of Cnd1.
Synonym: dark protochlorophyllide reductase. Protochlorophyllide reductase catalyzes the reductive formation of chlorophyllide from protochlorophyllide during biosynthesis of chlorophylls and bacteriochlorophylls. Three genes, bchL, bchN and bchB, are involved in light-independent protochlorophyllide reduction in bacteriochlorophyll biosynthesis. In cyanobacteria, algae, and gymnosperms, three similar genes, chlL, chlN and chlB, are involved in protochlorophyllide reduction during chlorophyll biosynthesis. BchL/chlL, bchN/chlN and bchB/chlB exhibit significant sequence similarity to the nifH, nifD and nifK subunits of nitrogenase, respectively. Nitrogenase catalyzes the reductive formation of ammonia from dinitrogen []. The light-independent (dark) form of protochlorophyllide reductase plays a key role in the ability of gymnosperms, algae, and photosynthetic bacteria to form chlorophyll in the dark. Genetic and sequence analyses have indicated that dark protochlorophyllide reductase consists of three protein subunits that exhibit significant sequence similarity to the three subunits of nitrogenase, which catalyzes the reductive formation of ammonia from dinitrogen. Dark protochlorophyllide reductase activity was shown to be dependent on the presence of all three subunits, ATP, and the reductant dithionite. The BchL peptide (ChlL in chloroplasts and cyanobacteria) is an ATP-binding iron-sulphur protein of the dark form of protochlorophyllide reductase, an enzyme similar to nitrogenase [].
This entry contains members of the peptidase family S8 (subtilisin) []. Included in this entry are: PfSUB1 from the malaria parasite Plasmodium (MEROPS identifier S08.012), which activates the merozoite surface protein MSP1, allowing it to bind spectrin in the host erythrocyte membrane prior to egress []; perkinsin from the pathogenic marine protozoan Perkinsus (S08.041); and MCP-01 peptidase from the deep-sea bacterium Pseudoalteromonas (S08.130), which degrades insoluble collagen []. The subtilisin family is one of the largest serine peptidase families characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences []. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses []. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase [, ]. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity [, ]. Some subtilisins are mosaic proteins, while others contain N- and C-terminal extensions that show no sequence similarity to any other known protein [].
The Rap1 subgroup is part of the Rap subfamily of the Ras family. It can be further divided into the Rap1a and Rap1b isoforms. In humans, Rap1a and Rap1b share 95% sequence homology, but are products of two different genes located on chromosomes 1 and 12, respectively. Rap1a is sometimes called smg p21 or Krev1 in the older literature. Rap1 proteins are believed to perform different cellular functions, depending on the isoform, its subcellular localization, and the effector proteins it binds. For example, in rat salivary gland, neutrophils, and platelets, Rap1 localizes to secretory granules and is believed to regulate exocytosis or the formation of secretory granules [, ]. High expression of Rap1 has been observed in the nucleus of human oropharyngeal squamous cell carcinomas (SCCs) and cell lines; interestingly, in the SCCs, the active GTP-bound form localized to the nucleus, while the inactive GDP-bound form localized to the cytoplasm []. Rap1a, which is stimulated by T-cell receptor (TCR) activation, is a positive regulator of T cells, directing integrin activation and augmenting lymphocyte responses []. In murine hippocampal neurons, Rap1b determines which neurite will become the axon and directs the recruitment of Cdc42, which is required for formation of dendrites and axons []. In murine platelets, Rap1b is required for normal hemostasis in vivo and is involved in integrin activation [].
The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA []. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence [] has revealed that S17 has virtually no groups exposed on the ribosomal surface. This entry represents ribosomal S17 proteins from bacteria and chloroplasts [].
Sialidases (neuraminidases) hydrolyse the non-reducing, terminal sialic acid linkage in various natural substrates, such as glycoproteins, glycolipids, gangliosides, and polysaccharides []. In mammals, sialidases occur in the lysosome, in the cytosol, and in association with the plasma membrane. Sialidases have also been implicated in the pathogenesis of many diseases. For example, in viruses neuraminidases enable the transport of the virus through mucin, the eruption of the virus from the infected host cell, and the prevention of self-aggregation of virus particles through the destruction of the host cell receptor recognised by the virus []. Eukaryotic, bacterial and viral sialidases share highly conserved regions of β-sheet motifs. Bacterial sialidases often possess domains in addition to the catalytic sialidase domain; for instance, the sialidase from Micromonospora viridifaciens contains three domains, of which the catalytic domain described here is the N-terminal domain []. Similarly, leech sialidase is a multidomain protein, where the catalytic domain is the C-terminal domain []. In several paramyxoviruses, sialidase forms part of the multi-functional haemagglutinin-sialidase glycoprotein found on the viral envelope [].
This entry represents the RNA recognition motif 3 (RRM3) of heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3). hnRNP H3 (also termed hnRNP 2H9) is a nuclear RNA binding protein that belongs to the hnRNP H protein family, which also includes hnRNP H, hnRNP H2, and hnRNP F. Members of this family are involved in mRNA processing and exhibit extensive sequence homology. Little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock [, ]. The typical hnRNP H proteins contain three RNA recognition motifs (RRMs), except for hnRNP H3, in which RRM1 is absent. RRM1 and RRM2 are responsible for binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members of this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity of hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C terminus, which may allow it to homo- or heterodimerize [].
This entry represents the RNA recognition motif 2 (RRM2) of heterogeneous nuclear ribonucleoprotein H3 (hnRNP H3). hnRNP H3 (also termed hnRNP 2H9) is a nuclear RNA binding protein that belongs to the hnRNP H protein family, which also includes hnRNP H, hnRNP H2, and hnRNP F. Members of this family are involved in mRNA processing and exhibit extensive sequence homology. Little is known about the functions of hnRNP H3 except for its role in the splicing arrest induced by heat shock [, ]. The typical hnRNP H proteins contain three RNA recognition motifs (RRMs), except for hnRNP H3, in which RRM1 is absent. RRM1 and RRM2 are responsible for binding to the RNA at DGGGD motifs, and they play an important role in efficiently silencing the exon. Members of this family can regulate the alternative splicing of the fibroblast growth factor receptor 2 (FGFR2) transcripts, and function as silencers of FGFR2 exon IIIc through an interaction with the exonic GGG motifs. The lack of RRM1 could account for the reduced silencing activity of hnRNP H3. In addition, like other hnRNP H protein family members, hnRNP H3 has an extensive glycine-rich region near the C terminus, which may allow it to homo- or heterodimerize [].
This entry represents the RNA recognition motif 1 (RRM1) of the ubiquitously expressed protein nucleolin. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. [, ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
This entry represents the RNA recognition motif 2 (RRM2) of the ubiquitously expressed protein nucleolin. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. [, ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
This entry represents the RNA recognition motif 3 (RRM3) of the ubiquitously expressed protein nucleolin. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. [, ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
This entry represents the RNA recognition motif 4 (RRM4) of the ubiquitously expressed protein nucleolin. Nucleolin is a multifunctional major nucleolar phosphoprotein that has been implicated in various metabolic processes, such as ribosome biogenesis, cytokinesis, nucleogenesis, cell proliferation and growth, cytoplasmic-nucleolar transport of ribosomal components, transcriptional repression, replication, signal transduction, inducing chromatin decondensation, etc. [, ]. Nucleolin exhibits intrinsic self-cleaving, DNA helicase, RNA helicase and DNA-dependent ATPase activities []. It can be phosphorylated by many protein kinases, such as the major mitotic kinase Cdc2, casein kinase 2 (CK2), and protein kinase C-zeta [, ]. Nucleolin shares similar domain architecture with gar2 from Schizosaccharomyces pombe [] and NSR1 from Saccharomyces cerevisiae. The highly phosphorylated N-terminal domain of nucleolin is made up of highly acidic regions separated from each other by basic sequences, and contains multiple phosphorylation sites. The central domain of nucleolin contains four closely adjacent N-terminal RNA recognition motifs (RRMs), which suggests that nucleolin is potentially able to interact with multiple RNA targets. The C-terminal RGG (or GAR) domain of nucleolin is rich in glycine, arginine and phenylalanine residues, and contains high levels of NG,NG-dimethylarginines. RRM1, together with RRM2, binds specifically to RNA stem-loops containing the sequence (U/G)CCCG(A/G) in the loop [].
This entry represents the theta subunit of DNA polymerase III from bacteria, whose core structure consists of an irregular array of three helices []. DNA polymerase III (Pol III) is the primary enzyme responsible for replication of Escherichia coli chromosomal DNA. The holoenzyme consists of 17 proteins and contains two core polymerases. The Pol III catalytic core has three tightly associated subunits: alpha, epsilon and theta. The alpha subunit is responsible for the DNA polymerase activity, while the epsilon subunit is the 3'-5' proofreading exonuclease. The epsilon subunit binds to both the alpha and theta subunits in the linear order alpha-epsilon-theta. The theta subunit is the smallest, and may act to enhance the proofreading activity of epsilon, especially under extreme conditions []. This entry also includes a homologue of polymerase III theta called HOT (homologue of theta) from Bacteriophage P1. HOT contains three α-helices, as reported for theta, but the folding topology of the two is different, which could account for the suggested greater heat stability of HOT as compared to theta []. This superfamily also includes the uncharacterised protein YibT from Salmonella sp.
This superfamily represents the interlocking domain of various eukaryotic nuclear receptor coactivators, including CREBP, P300, Ncoa1, Ncoa2 and Ncoa3. The interlocking domain is a 3-helical non-globular array that forms interlocked heterodimers with its target. Nuclear receptors are ligand-activated transcription factors involved in the regulation of many processes, including development, reproduction and homeostasis. Nuclear receptor coactivators act to modulate the function of nuclear receptors. Coactivators associate with promoters and enhancers primarily through protein-protein contacts to facilitate the interaction between DNA-bound transcription factors and the transcription machinery. Many of these coactivators are structurally related, including CBP (CREB-binding protein), P300 and ACTR (activator for thyroid and retinoid receptors) []. CBP and P300 both have histone acetyltransferase activity. CBP/P300 proteins function synergistically to activate transcription, acting to remodel chromatin and to recruit RNA polymerase II and the basal transcription machinery. CBP is required for proper cell cycle control, differentiation and apoptosis. The interaction of CBP/P300 with transcription factors involves several small domains. The IBiD domain in the C-terminal region of CBP is responsible for CBP interaction with IRF-3, as well as with the adenoviral oncoprotein E1A, TIF-2 coactivator, and the IRF homologue KSHV IRF-1 []. Ncoa1, Ncoa2 and Ncoa3 are all coactivators of various nuclear receptors. In addition, Ncoa1 and Ncoa3 both have histone acetyltransferase activity, but Ncoa2 does not [, ].
TRADD is a signalling adaptor protein involved in tumour necrosis factor-receptor I (TNFR1)-associated apoptosis and cell survival. The decision between apoptosis and cell survival involves the interplay between two sequential signalling complexes. The plasma membrane-bound complex I comprises TNFR1, TRADD, the kinase RIP1, and TRAF2, which together mediate the activation of NF-kappaB. Subsequently, complex II is formed in the cytoplasm, where TRADD and RIP1 associate with FADD and caspase-8. If NF-kappaB is activated by complex I, then complex II will associate with the caspase-8 inhibitor FLIP(L) and the cell survives, while the failure to activate NF-kappaB leads to apoptosis []. TRADD contains two functionally separate domains, which allow the protein to couple to two distinct signalling pathways. The TRADD C-terminal death domain is responsible for its association with TNFR1, and with the death-domain proteins FADD and RIP1, which promote apoptosis. The TRADD N-terminal domain binds TRAF2 and promotes TRAF2 recruitment to TNFR1, thereby mediating the activation of NF-kappaB and JNK/AP1, which promote cell survival []. The N-terminal TRADD domain is composed of an α/β sandwich, where the β-strands form an antiparallel β-sheet.
Proteins containing this domain are highly conserved in species ranging from archaea to vertebrates and plants [], including several Shwachman-Bodian-Diamond syndrome (SBDS, OMIM 260400) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterised by bone marrow failure and leukemia predisposition. Members of this superfamily play a role in RNA metabolism [, ]. In yeast, Sdo1 is involved in the biogenesis of the 60S ribosomal subunit and translational activation of ribosomes. Together with the EF-2-like GTPase RIA1 (Efl1), it triggers the GTP-dependent release of TIF6 from 60S pre-ribosomes in the cytoplasm, thereby activating ribosomes for translation competence by allowing 80S ribosome assembly and facilitating TIF6 recycling to the nucleus, where it is required for 60S rRNA processing and nuclear export. These data link defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition []. A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include: mouse protein 22A3; Saccharomyces cerevisiae chromosome XII hypothetical protein YLR022c; Caenorhabditis elegans hypothetical protein W06E11.4; and Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ0592.
Gas vesicles are small, hollow, gas-filled protein structures found in several cyanobacterial and archaebacterial microorganisms []. They allow the organisms to position themselves at a favourable depth for growth. Gas vesicles are hollow cylindrical tubes, closed by a hollow, conical cap at each end. Both the conical end caps and the central cylinder are made up of 4-5 nm wide ribs that run at right angles to the long axis of the structure. Gas vesicles seem to be constituted of two different protein components, GVPa and GVPc. GVPa, a small protein of about 70 amino acid residues, is the main constituent of gas vesicles and forms the essential core of the structure. The sequence of GVPa is extremely well conserved. GvpJ and GvpM, two proteins encoded in the cluster of genes required for gas vesicle synthesis in the archaebacteria Halobacterium salinarium and Halobacterium mediterranei (Haloferax mediterranei), have been found [] to be evolutionarily related to GVPa. The exact function of these two proteins is not known, although they could be important for determining the shape of gas vesicles. The N-terminal domain of Aphanizomenon flos-aquae protein GvpA/J is also related to GVPa.
The α-helical ferredoxin domain contains two Fe4-S4 clusters, typical of bacterial ferredoxin. Iron-sulphur proteins play an important role in electron transfer processes and in various enzymatic reactions. In eukaryotes, the mitochondria are the major site of Fe-S cluster biosynthesis in the cell, used for the assembly of mitochondrial and non-mitochondrial Fe-S proteins. The α-helical ferredoxin domain is present in several proteins involved in redox reactions, including the C terminus of the respiratory proteins succinate dehydrogenase (SQR) in bacteria/mitochondria, and fumarate reductase (QFR) in bacteria. SQR is analogous to the mitochondrial respiratory complex II, and is involved in the electron transport pathway from succinate as a donor to the acceptor ubiquinone. SQR helps prevent the formation of reactive oxygen species and is used during aerobic respiration, whereas QFR does not and, consequently, is used to catalyse the final step of anaerobic respiration using the acceptor fumarate []. The α-helical ferredoxin domain is also present in the N terminus of the cytosolic protein dihydropyrimidine dehydrogenase (DPD), which catalyses the NADPH-dependent, rate-limiting step in pyrimidine degradation, converting pyrimidines to 5,6-dihydro compounds []. DPD catalysis involves electron transfer from NADPH to the substrate via the Fe4-S4 centre and FAD. In mammals, this pathway produces the neurotransmitter beta-alanine.
This entry represents the theta subunit of DNA polymerase III from bacteria, whose core structure consists of an irregular array of three helices []. DNA polymerase III (Pol III) is the primary enzyme responsible for replication of Escherichia coli chromosomal DNA. The holoenzyme consists of 17 proteins and contains two core polymerases. The Pol III catalytic core has three tightly associated subunits: alpha, epsilon and theta. The alpha subunit is responsible for the DNA polymerase activity, while the epsilon subunit is the 3'-5' proofreading exonuclease. The epsilon subunit binds to both the alpha and theta subunits in the linear order alpha-epsilon-theta. The theta subunit is the smallest, and may act to enhance the proofreading activity of epsilon, especially under extreme conditions []. This entry also includes a homologue of polymerase III theta called HOT (homologue of theta) from Bacteriophage P1. HOT contains three α-helices, as reported for theta, but the folding topology of the two is different, which could account for the suggested greater heat stability of HOT as compared to theta [].
Tumor necrosis factor receptor superfamily member 11A (TNFRSF11A), also known as RANK, FEO, OFE, ODFR, OSTS, PDB2, CD265, OPTB7, TRANCER, or LOH18CR1, induces the activation of NF-kappa B and MAPK8/JNK through interactions with various TRAF adaptor proteins []. This receptor and its ligand are important regulators of the interaction between T cells and dendritic cells. The receptor is also an essential mediator for osteoclast and lymph node development. Mutations at this locus have been associated with familial expansile osteolysis, autosomal recessive osteopetrosis, and juvenile Paget's disease (JPD) of bone [, , , ]. Alternatively spliced transcript variants have been described for this locus []. Mutation analysis may improve diagnosis, prognostication, recurrence risk assessment, and perhaps treatment selection among the monogenic disorders of RANKL/OPG/RANK activation. This entry represents the N-terminal domain of TNFRSF11A. TNF-receptors are modular proteins. The N-terminal extracellular part contains a cysteine-rich region responsible for ligand-binding. This region is composed of small modules of about 40 residues containing 6 conserved cysteines; the number and type of modules can vary in different members of the family [, , ].
TRADD is a signalling adaptor protein involved in tumour necrosis factor-receptor I (TNFR1)-associated apoptosis and cell survival. The decision between apoptosis and cell survival involves the interplay between two sequential signalling complexes. The plasma membrane-bound complex I is composed of TNFR1, TRADD, the kinase RIP1, and TRAF2, which together mediate the activation of NF-kappaB. Subsequently, complex II is formed in the cytoplasm, where TRADD and RIP1 associate with FADD and caspase-8. If NF-kappaB is activated by complex I, then complex II will associate with the caspase-8 inhibitor FLIP(L) and the cell survives, while the failure to activate NF-kappaB leads to apoptosis []. TRADD contains two functionally separate domains, which allow the protein to couple to two distinct signalling pathways. The TRADD C-terminal death domain is responsible for its association with TNFR1, and with the death-domain proteins FADD and RIP1, which promote apoptosis. The TRADD N-terminal domain binds TRAF2 and promotes TRAF2 recruitment to TNFR1, thereby mediating the activation of NF-kappaB and JNK/AP1, which promote cell survival []. The N-terminal TRADD domain is composed of an α/β sandwich, where the β-strands form an antiparallel β-sheet.
Tumor necrosis factor receptor superfamily member 4 (TNFRSF4), also known as OX40, ACT35, CD134, IMD16 or TXGP1L, activates NF-kappaB through its interaction with adaptor proteins TRAF2 and TRAF5 []. It also promotes the expression of apoptosis inhibitors BCL2 and BCL2L1/BCL-XL, and thus suppresses apoptosis []. It is primarily expressed on activated CD4+ and CD8+ T cells, where it is transiently expressed and upregulated on the most recently antigen-activated T cells within inflammatory lesions. This makes it an attractive target for modulating immune responses, for example with TNFRSF4 (OX40) blocking agents to inhibit adverse inflammation or agonists to enhance immune responses [, ]. An artificially created biologic fusion protein, OX40-immunoglobulin (OX40-Ig), prevents OX40 from reaching the T-cell receptors, thus reducing the T-cell response. Some single nucleotide polymorphisms (SNPs) of its natural ligand OX40 ligand (OX40L, CD252), which is also found on activated T cells, have been associated with systemic lupus erythematosus []. This entry represents the N-terminal domain of TNFRSF4. TNF-receptors are modular proteins. The N-terminal extracellular part contains a cysteine-rich region responsible for ligand-binding. This region is composed of small modules of about 40 residues containing 6 conserved cysteines; the number and type of modules can vary in different members of the family [, , ].
Breast cancer anti-estrogen resistance protein 3 (BCAR3) is an SH2-containing signal transducer that regulates proliferation in breast cancer cells []. BCAR3 binds to the adaptor molecule p130Cas (also known as BCAR1), which functions as a key signalling node with important regulatory roles in normal and pathological cells []. BCAR3 promotes cell motility by regulating actin cytoskeletal and adhesion remodeling in invasive breast cancer cells []. It also promotes interactions between p130Cas and the protein tyrosine kinase c-Src, leading to increased c-Src kinase activity and p130Cas phosphorylation []. This entry represents the SH2 domain found in SHEP1 (also known as SH2D3C), BCAR3 and NSP1 (also known as SH2D3A). SHEP1, BCAR3 and NSP1 are cytoplasmic proteins involved in cell adhesion/migration and antiestrogen resistance. All three proteins contain an SH2 domain and an exchange factor-like domain that binds both Ras GTPases and the scaffolding protein Cas []. In general, SH2 domains are involved in signal transduction. They typically bind pTyr-containing ligands via two surface pockets, a pTyr-binding pocket and a hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine-phosphorylated sites [].
Proteins in this entry are EutA ethanolamine utilization proteins, reactivating factors for ethanolamine ammonia lyase, encoded by the ethanolamine utilization eut operon []. The holoenzyme of adenosylcobalamin-dependent ethanolamine ammonia-lyase (EutBC, , ), which is part of the ethanolamine utilization pathway [, , ], undergoes suicidal inactivation during catalysis as well as inactivation in the absence of substrate. The inactivation involves the irreversible cleavage of the Co-C bond of the coenzyme. The inactivated holoenzyme undergoes rapid and continuous reactivation in the presence of ATP, Mg2+, and free adenosylcobalamin in permeabilised cells (in situ), homogenate, and cell extracts of Escherichia coli. The EutA protein is essential for reactivation. It was demonstrated with purified recombinant EutA that both the suicidally inactivated and O2-inactivated holoethanolamine ammonia lyase underwent rapid reactivation in vitro by EutA in the presence of adenosylcobalamin, ATP, and Mg2+ []. The inactive enzyme-cyanocobalamin complex was also activated in situ and in vitro by EutA under the same conditions. Thus EutA is believed to be the only component of the reactivating factor for ethanolamine ammonia lyase. Reactivation and activation occur through the exchange of modified coenzyme for free intact adenosylcobalamin []. Bacteria that harbor the ethanolamine utilization pathway can use ethanolamine as a source of carbon and nitrogen. For more information on the ethanolamine utilization pathway, please see , .
Methylpurine-DNA glycosylase (MPG), also known as alkyladenine DNA glycosylase (AAG), is a base excision repair protein that catalyzes the first step of the pathway by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. MPG bends DNA by intercalating between the base pairs, causing the damaged base to flip out of the double helix and into the enzyme active site for cleavage. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA [, , , , , , , , , , ]. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts []. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension []. Although AAG represents one of six DNA glycosylase classes, it lacks the helix-hairpin-helix active site motif associated with other base excision repair glycosylases and is structurally distinct from them.
The fat storage-inducing transmembrane protein family (FIT/FITM, also known as Acyl-coenzyme A diphosphatase) plays an important role in lipid droplet accumulation. Its members are endoplasmic reticulum (ER) resident membrane proteins that induce lipid droplet accumulation in cell culture and when expressed in mouse liver []; they hydrolyse fatty acyl-CoA to yield acyl-4'-phosphopantetheine and adenosine 3',5'-bisphosphate, with a preference for unsaturated long-chain acyl-CoA substrates in the ER []. The ability to store fat in the form of cytoplasmic triglyceride droplets is conserved from yeast to humans and is important for maintaining ER structure and for the biogenesis of lipid droplets (LDs), which are lipid storage organelles involved in maintaining lipid and energy homeostasis [, ]. The FIT family of proteins is not involved in triglyceride biosynthesis []. In mammals there are two FIT proteins: FIT1, which is muscle-specific, and FIT2, which is expressed in most other tissues [, ]. Yeast has two FIT2 orthologues, called Scs3p and Yft2p, but no FIT1 orthologue. FIT1 and FIT2 are six-transmembrane-domain proteins with both the N and C termini residing in the cytosol. FIT2 is the more ancient conserved homologue of the FIT family; this family of proteins does not share sequence similarity with known proteins or domains.
This family consists of several radial spoke protein 3 (RSP3) sequences. Eukaryotic cilia and flagella present in diverse types of cells perform motile, sensory, and developmental functions in organisms from protists to humans. They are centred by precisely organised, microtubule-based structures, the axonemes. The axoneme consists of two central singlet microtubules, called the central pair, and nine outer doublet microtubules. These structures are well conserved during evolution. The outer doublet microtubules, each composed of A and B sub-fibres, are connected to each other by nexin links, while the central pair is held at the centre of the axoneme by radial spokes. The radial spokes are T-shaped structures extending from the A-tubule of each outer doublet microtubule to the centre of the axoneme. Radial spoke protein 3 (RSP3), is present at the proximal end of the spoke stalk and helps in anchoring the radial spoke to the outer doublet. It is thought that radial spokes regulate the activity of inner arm dynein through protein phosphorylation and dephosphorylation [].
Phosphatidylinositol-specific phospholipase C (PI-PLC), a eukaryotic intracellular enzyme, plays an important role in signal transduction processes []. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins [, , ]. In mammals, there are at least 6 different isoforms of PI-PLC, which differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC. All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as the 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. By profile analysis, we could show that sequences with significant similarity to the X-box domain also occur in prokaryotic and trypanosome PI-specific phospholipases C. Apart from this region, the prokaryotic enzymes show no similarity to their eukaryotic counterparts.
The Herpesvirus major capsid protein (MCP) is the principal protein of the icosahedral capsid, forming the main component of the hexavalent and probably the pentavalent capsomeres. The capsid shell consists of 150 MCP hexamers and 12 MCP pentamers. One pentamer is found at each of the 12 apices of the icosahedral shell, and the hexamers form the edges and 20 faces []. The MCP can be considered as having three domains: floor, middle and upper. The floor domains form a thin largely continuous layer, or shell, and are the only parts that interact directly to form intercapsomeric connections. They also interact with the internal scaffolding protein during capsid assembly []. The remainder of the protein extends radially outward from the capsid producing the hexamer and pentamer capsomere structures. The middle domains are involved in binding to the triplexes that lie between and link adjacent capsomeres []. The upper domains form the tops of the hexamer and pentamer towers and are the binding sites for the small capsid protein VP26 in the hexons and for tegument proteins in the pentons.
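The capsomere counts in the entry above lend themselves to a quick arithmetic check. The Python sketch below tallies capsomeres and MCP copies from the stated 150 hexamers and 12 pentamers. The function names are illustrative only, and the triangulation number T = 16 used in the cross-check is an assumption not stated in this entry; it relies on the standard Caspar-Klug relation for icosahedral capsids, capsomeres = 10T + 2.

```python
# Capsomere counts as stated in the entry text.
HEXAMERS = 150
PENTAMERS = 12


def capsid_counts(hexamers: int = HEXAMERS, pentamers: int = PENTAMERS) -> tuple[int, int]:
    """Return (total capsomeres, total MCP copies) for the capsid shell."""
    capsomeres = hexamers + pentamers
    # Each hexamer contributes six MCP copies, each pentamer five.
    mcp_copies = 6 * hexamers + 5 * pentamers
    return capsomeres, mcp_copies


def capsomeres_for_t(t: int) -> int:
    """Caspar-Klug relation: an icosahedral capsid has 10T + 2 capsomeres."""
    return 10 * t + 2


if __name__ == "__main__":
    capsomeres, mcp_copies = capsid_counts()
    print(capsomeres, mcp_copies)
    # Assumption: T = 16 (not given in the entry); checks consistency only.
    print(capsomeres_for_t(16) == capsomeres)
```

Under these counts the shell holds 162 capsomeres and 960 copies of MCP, and the capsomere total agrees with what the Caspar-Klug relation gives for the assumed T = 16.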
This entry represents the haemagglutinin-esterase fusion glycoprotein (HEF) found specifically in infectious salmon anaemia virus (ISAV), an orthomyxovirus-type virus that is an important fish pathogen in marine aquaculture [, ]. Other viruses, such as influenza C virus, coronaviruses and toroviruses, also contain surface HEF proteins, but whereas they usually bind 9-O-acetylsialic acid receptors, ISAV HEF appears to bind 4-O-acetylsialic acid receptors []. Haemagglutinin-esterase fusion glycoprotein is a multi-functional protein embedded in the viral envelope of ISAV. HEF is required for infectivity, and functions to recognise the host cell surface receptor, to fuse the viral and host cell membranes, and to destroy the receptor upon host cell infection. The haemagglutinin region of HEF is responsible for receptor recognition and membrane fusion. The serine esterase region of HEF is responsible for the destruction of the receptor, though it appears to be distinct from the esterase domain found in influenza C virus. Haemagglutinin-esterase glycoproteins must usually be cleaved by the host's trypsin-like proteases to produce two peptides (HEF1 and HEF2) necessary for the virus to be infectious. The cleaved HEF protein can then fuse the viral envelope to the cellular membrane of the host cell, which allows the virus to infect the host cell.
This family includes FIT1. The fat storage-inducing transmembrane protein family (FIT/FITM, also known as Acyl-coenzyme A diphosphatase) plays an important role in lipid droplet accumulation. Its members are endoplasmic reticulum (ER) resident membrane proteins that induce lipid droplet accumulation in cell culture and when expressed in mouse liver []; they hydrolyse fatty acyl-CoA to yield acyl-4'-phosphopantetheine and adenosine 3',5'-bisphosphate, with a preference for unsaturated long-chain acyl-CoA substrates in the ER []. The ability to store fat in the form of cytoplasmic triglyceride droplets is conserved from yeast to humans and is important for maintaining ER structure and for the biogenesis of lipid droplets (LDs), which are lipid storage organelles involved in maintaining lipid and energy homeostasis [, ]. The FIT family of proteins is not involved in triglyceride biosynthesis []. In mammals there are two FIT proteins: FIT1, which is muscle-specific, and FIT2, which is expressed in most other tissues [, ]. Yeast has two FIT2 orthologues, called Scs3p and Yft2p, but no FIT1 orthologue. FIT1 and FIT2 are six-transmembrane-domain proteins with both the N and C termini residing in the cytosol. FIT2 is the more ancient conserved homologue of the FIT family; this family of proteins does not share sequence similarity with known proteins or domains.
This family includes FIT1 and FIT2 proteins. The fat storage-inducing transmembrane protein family (FIT/FITM, also known as Acyl-coenzyme A diphosphatase) plays an important role in lipid droplet accumulation. Its members are endoplasmic reticulum (ER) resident membrane proteins that induce lipid droplet accumulation in cell culture and when expressed in mouse liver []; they hydrolyse fatty acyl-CoA to yield acyl-4'-phosphopantetheine and adenosine 3',5'-bisphosphate, with a preference for unsaturated long-chain acyl-CoA substrates in the ER []. The ability to store fat in the form of cytoplasmic triglyceride droplets is conserved from yeast to humans and is important for maintaining ER structure and for the biogenesis of lipid droplets (LDs), which are lipid storage organelles involved in maintaining lipid and energy homeostasis [, ]. The FIT family of proteins is not involved in triglyceride biosynthesis []. In mammals there are two FIT proteins: FIT1, which is muscle-specific, and FIT2, which is expressed in most other tissues [, ]. Yeast has two FIT2 orthologues, called Scs3p and Yft2p, but no FIT1 orthologue. FIT1 and FIT2 are six-transmembrane-domain proteins with both the N and C termini residing in the cytosol. FIT2 is the more ancient conserved homologue of the FIT family; this family of proteins does not share sequence similarity with known proteins or domains.
This family represents Kae1 and its homologues from Archaea. They belong to the Kae1/TsaD family. Its partner kinase Bud32 is fused with it in about half of the known archaeal genomes []. The pair, which appears universal in the archaea, corresponds to the EKC/KEOPS complex in eukaryotes []. The first characterised member of the Kae1/TsaD family was annotated as Gcp for O-sialoglycoprotein endopeptidase [], but this activity could not be confirmed []. Later, its homologue, Kae1 from Pyrococcus abyssi, was shown to have DNA-binding properties and apurinic-endonuclease activity []. Members of this family have since been studied in yeast, archaea and bacteria, resulting in sometimes conflicting data and several proposed functions and annotations, but no definitive characterisation. For instance, some members have been linked to DNA maintenance in bacteria and mitochondria [] and to transcription regulation and telomere homeostasis in eukaryotes [, ], but their function remained unclear. Recent research indicates that this family is involved in the biosynthesis of N6-threonylcarbamoyl adenosine, a universal modification found at position 37 of tRNAs that read codons beginning with adenine [].
Neuroplastin and basigin are members of the immunoglobulin (Ig) superfamily. They have two Ig domains that project into the extracellular space. They are also present as isoforms containing an additional N-terminal Ig domain. They contain a glutamate (E) at exactly the same position in the transmembrane domain, which may be important for molecular interactions within the membrane region []. Basigin is present on the surface of tumour cells and stimulates nearby fibroblasts to synthesise matrix metalloproteases (MMPs), which play an important role in tumour invasiveness and metastasis []. Basigin has also been repeatedly implicated in the proper function of the blood-brain barrier [, ]. In addition, the protein is essential for fertility in both males and females []. In males, it is required for the completion of spermatogenesis, while in females, it is needed for maintaining normal reproductive functions. Basigins are highly glycosylated membrane proteins, the degree of glycosylation varying with tissue type. The extracellular region of basigin contains two randomly arranged Ig domains: an Ig-like C2-type domain and an Ig-like V-domain [].
Tyrosine-type site-specific recombinases mediate a wide range of important genetic rearrangement reactions. They catalyse functionally diverse processes, including integration and excision of phages from their host chromosomes, conjugative transposition, partitioning of phage, bacterial and plasmid genomes during cell division, antigenic phase variation, dissemination of antibiotic- and antiseptic-resistance gene cassettes, and relaxation of DNA supercoils. Tyrosine recombinases have two or three domains, depending upon whether the system includes regulated integration and excision reactions. The C-terminal catalytic domain contains six active site amino acids, RK(H/K)R(H/W)Y, required for catalysis. The core-binding (CB) domain, which interacts primarily with the major groove on the attachment site and facilitates binding to the core DNA sequence, is widely conserved among viral, eubacterial and archaeal recombinases. It is also involved in protein-protein interactions. Some recombinases, like lambda Int and IntDOT, have an additional amino-terminal domain that recognises sites, called arm-type sites, that flank the crossover region and gives directionality to the recombination reaction [, , ]. This entry represents the CB domain, which contains four major α-helices, arranged in an orthogonally crossed conformation [, ].
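The six-residue active site signature RK(H/K)R(H/W)Y can be expressed as a simple per-position check. The Python sketch below is illustrative only: the function name is hypothetical, and because this entry does not give the sequence positions of the six residues, the caller is assumed to supply them already extracted (for example from an alignment) in N- to C-terminal order.

```python
# Allowed residues at each of the six active-site positions, RK(H/K)R(H/W)Y.
ALLOWED = ("R", "K", "HK", "R", "HW", "Y")


def matches_catalytic_signature(residues: str) -> bool:
    """Return True if six pre-extracted residues fit RK(H/K)R(H/W)Y.

    `residues` is a six-letter string of one-letter amino acid codes; how the
    six positions are located in a given protein is outside this sketch.
    """
    residues = residues.upper()
    if len(residues) != len(ALLOWED):
        return False
    return all(res in allowed for res, allowed in zip(residues, ALLOWED))


if __name__ == "__main__":
    print(matches_catalytic_signature("RKHRHY"))  # all positions allowed
    print(matches_catalytic_signature("RKARHY"))  # third position not H or K
```

Note that the six residues are not contiguous in sequence, so a plain substring or regular-expression search over the full protein would not be an appropriate test.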
Retroviral matrix proteins (or major core proteins) are components of envelope-associated capsids, which line the inner surface of virus envelopes and are associated with viral membranes []. Matrix proteins are produced as part of Gag precursor polyproteins. During viral maturation, the Gag polyprotein is cleaved into major structural proteins by the viral protease, yielding the matrix (MA), capsid (CA), nucleocapsid (NC), and some smaller peptides. Gag-derived proteins govern the entire assembly and release of the virus particles, with matrix proteins playing key roles in Gag stability, capsid assembly, transport and budding. Although matrix proteins from different retroviruses appear to perform similar functions and can have similar structural folds which predominantly consist of four closely packed α-helices that are interconnected through loops, their primary sequences can be very different []. This superfamily represents structurally homologous matrix proteins from different retroviruses, their structure consisting of four-five alpha helices in a right-handed superhelix. Retroviral matrix proteins bearing this structure have been isolated from Human immunodeficiency virus (HIV), Simian immunodeficiency virus (SIV-cpz), Human T-lymphotropic virus 1, Human T-cell leukemia virus 2 (HTLV-2), Mason-Pfizer monkey virus (MPMV) (Simian Mason-Pfizer virus), Rous sarcoma virus (RSV), Equine infectious anemia virus (EIAV), and Moloney murine leukemia virus (MoMLV). This entry also identifies matrix proteins from several eukaryotic endogenous retroviruses, which arise when one or more copies of the retroviral genome becomes integrated into the host genome [].
During the development of the vertebrate nervous system, many neurons become redundant (because they have died, failed to connect to target cells, etc.) and are eliminated. At the same time, developing neurons send out axon outgrowths that contact their target cells []. Such cells control their degree of innervation (the number of axon connections) by the secretion of various specific neurotrophic factors that are essential for neuron survival. One of these is nerve growth factor (NGF), which is involved in the survival of some classes of embryonic neuron (e.g., peripheral sympathetic neurons) []. NGF is mostly found outside the central nervous system (CNS), but slight traces have been detected in adult CNS tissues, although a physiological role for this is unknown []; it has also been found in several snake venoms [, ]. Proteins similar to NGF include brain-derived neurotrophic factor (BDNF) and neurotrophins 3 to 7, all of which demonstrate neuron survival and outgrowth activities. This entry represents Neurotrophin-6 (NT-6), which has been identified in two species of platyfish []. It has been shown to have trophic effects on embryonic sympathetic neurons, similar to those of NGF [].
Pyridoxamine 5'-phosphate oxidase () is an enzyme that is involved in the de novo synthesis of pyridoxine (vitamin B6) and pyridoxal phosphate. It oxidizes pyridoxamine-5-P (PMP) and pyridoxine-5-P (PNP) to pyridoxal-5-P. The enzyme requires the presence of flavin mononucleotide (FMN) as a cofactor, although there is some evidence that coenzyme F420 may perform this role in some species []. The sequences of the enzyme from bacterial (genes pdxH or fprA) [] and fungal (gene PDX3) [] sources show that this protein has been highly conserved throughout evolution. PdxH is evolutionarily related [] to one of the enzymes in the phenazine biosynthesis protein pathway, phzD (also known as phzG). This entry represents one of the two dimerisation regions of the protein, located at the edge of the dimer interface, at the C terminus, being the last three beta strands, S6, S7, and S8, along with the last three residues to the end. In , S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. The extended loop of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule []. To date, functional oxidase or phenazine biosynthesis activities have been experimentally demonstrated only when the sequences contain both and . The role performed by each domain in bringing about the molecular functions of either oxidase or phenazine activity is unknown [].
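The residue ranges quoted above are 1-based and inclusive, which is easy to get wrong when slicing a sequence programmatically. The Python sketch below shows one way to pull out the S6, S7 and S8 strands and the extended loop. The function and dictionary names are hypothetical, and since the reference protein is not named in this entry, the sequence used in the example is a placeholder of arbitrary residues rather than real data.

```python
# Region boundaries as quoted in the entry (1-based, inclusive).
REGIONS = {
    "loop": (167, 177),
    "S6": (178, 192),
    "S7": (200, 206),
    "S8": (211, 215),
}


def extract_regions(sequence: str, regions: dict[str, tuple[int, int]] = REGIONS) -> dict[str, str]:
    """Slice each named region out of `sequence` using 1-based inclusive coordinates."""
    extracted = {}
    for name, (start, end) in regions.items():
        if start < 1 or end > len(sequence):
            raise ValueError(f"{name} ({start}-{end}) lies outside the sequence")
        # Python slices are 0-based and end-exclusive, hence start - 1.
        extracted[name] = sequence[start - 1:end]
    return extracted


if __name__ == "__main__":
    placeholder = "A" * 218  # placeholder sequence, not a real protein
    for name, segment in extract_regions(placeholder).items():
        print(name, len(segment))
```

With these boundaries the loop spans 11 residues and the three strands span 15, 7 and 5 residues respectively.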
The voltage-sensitive sodium channel consists of an ion-conducting, pore-forming alpha-subunit regulated by one or more non-pore-forming beta-subunits. There are five different beta-subunit proteins (beta-1, beta-1B, beta-2, beta-3, and beta-4) encoded by four genes (SCN1B-SCN4B; beta-1B is a splice variant of SCN1B) [, ]. Beta-subunits modulate the kinetics and voltage dependence of the alpha-subunits and they also affect the voltage-gated potassium channels. These subunits participate in nonconducting roles, including cell-cell and cell-matrix adhesion, directing neuronal proliferation, migration, and fasciculation, and modulating the effects of pharmacological compounds on voltage-gated sodium channels, playing important roles in development and disease [, ]. Subunit beta-1 is crucial in the assembly, expression, and functional modulation of the sodium channel, and can modulate multiple alpha-subunit isoforms from brain, skeletal muscle, and heart [, ]. Both beta-1 and beta-3 associate with neurofascin through their extracellular immunoglobulin-like domains. This association may target the sodium channels to the nodes of Ranvier of developing axons and retain these channels at the nodes in mature myelinated axons [].
During the development of the vertebrate nervous system, many neurons become redundant (because they have died, failed to connect to target cells, etc.) and are eliminated. At the same time, developing neurons send out axon outgrowths that contact their target cells []. Such cells control their degree of innervation (the number of axon connections) by the secretion of various specific neurotrophic factors that are essential for neuron survival. One of these is nerve growth factor (NGF), which is involved in the survival of some classes of embryonic neuron (e.g., peripheral sympathetic neurons) []. NGF is mostly found outside the central nervous system (CNS), but slight traces have been detected in adult CNS tissues, although a physiological role for this is unknown []; it has also been found in several snake venoms [, ]. Proteins similar to NGF include brain-derived neurotrophic factor (BDNF) and neurotrophins 3 to 7, all of which demonstrate neuron survival and outgrowth activities. In contrast to mammalian NGFs, which exist as multimeric complexes of alpha, beta and gamma subunits, snake venom NGFs exist almost exclusively as beta-chains []. They act as low-potency neurotrophic tyrosine kinase receptor type 1 (NTRK1; also called TrkA) agonists [], and have been shown to promote survival and differentiation of cultured cells [].
This entry represents the SET domain found in SETD2 from animals, ASHH2 from plants and Set2 from fungi. Proteins containing this domain are a group of histone methyltransferases that methylate histone H3 to form H3K36me [, ]. Yeast Set2 is involved in transcription elongation as well as in transcription repression []. The methyltransferase activity of budding yeast Set2 requires its recruitment to RNA polymerase II, which is CTK1 dependent [, , , , , , ]. Plant ASHH2 is required for the correct expression of genes essential to reproductive development []. SETD2 acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate [, ]. SETD2 is also required for DNA double-strand break repair and activation of the p53-mediated checkpoint []. SETD2 inactivation has been linked to tumour development []. SETD2 also methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy []. Moreover, SETD2 is also involved in interferon-alpha-induced antiviral defense, both by mediating monomethylation of STAT1 at 'Lys-525' and by catalyzing H3K36me3 on promoters of some interferon-stimulated genes (ISGs) to activate gene transcription []. SETD2 has been linked to several human diseases, including renal cell carcinoma (RCC) [], Luscan-Lumish syndrome (LLS) [], acute lymphoblastic leukemia (ALL) [] and acute myelogenous leukemia (AML) [, ].
Endothelins are small proteins that play an important role in the regulation of the cardiovascular system [, , ]. They are the most potent vasoconstrictors known; they stimulate cardiac constriction, regulate release of vasoactive substances, and stimulate mitogenesis in blood vessels in primary culture. They also stimulate contraction in almost all other smooth muscles (e.g., uterus, bronchus, vas deferens and stomach) and stimulate secretion in several tissues (e.g., kidney, liver and adrenals). Endothelin receptors have also been found in the brain, e.g. cerebral cortex, cerebellum and glial cells. Endothelins have been implicated in a variety of pathophysiological conditions associated with stress, including hypertension, myocardial infarction, subarachnoid haemorrhage and renal failure. Endothelins are synthesised by proteolysis of large preproendothelins, which are cleaved to 'big endothelins' before being processed to the mature peptide. Three distinct human endothelins encoded by separate genes have been identified: ET1, ET2 and ET3 are present in lung, kidney, adrenal gland, brain and other tissues. The sequences of the peptides contain 4 cysteine residues, which are involved in disulphide bond formation, and are highly similar to the SRTX family from snake venom.
This entry represents the most conserved part of the core region of ubiquitin conjugation factor E4 (or Ub elongating factor, or Ufd2P), running from helix α-11 to α-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination, as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C terminus (), which has ligase activity. Ubiquitin conjugation factor E4 is involved in the N-terminal ubiquitin fusion degradation proteolytic pathway (UFD pathway). E4 binds to the ubiquitin moieties of preformed conjugates and catalyses ubiquitin chain assembly in conjunction with E1, E2, and E3. E4 appears to influence the formation and topology of the multi-Ub chain, as it enhances ubiquitination at 'Lys-48' but not at 'Lys-29' of the N-terminal Ub moiety.
The LanC-like protein superfamily encompasses a highly divergent group of peptide-modifying enzymes, including the eukaryotic and bacterial lanthionine synthetase C-like proteins (LanC) [, , ]; subtilin biosynthesis protein SpaC from Bacillus subtilis [, ]; epidermin biosynthesis protein EpiC from Staphylococcus epidermidis []; nisin biosynthesis protein NisC from Lactococcus lactis [, , ]; GCR2 from Arabidopsis thaliana []; and many others. The 3D structure of the lantibiotic cyclase from L. lactis has been determined by X-ray crystallography to 2.5 Å resolution []. The globular structure is characterised by an all-alpha fold, in which an outer ring of helices envelops an inner toroid composed of 7 shorter, hydrophobic helices. This 7-fold hydrophobic periodicity has led several authors to claim various members of the family, including eukaryotic LanC-1 and GCR2, to be novel G protein-coupled receptors [, ]; some of these claims have since been corrected [, , ]. This entry represents nisin biosynthesis protein NisC [, , ], which is believed to be involved in the cyclisation of the lantibiotic nisin: specifically, nisin contains 5 cyclic thioethers, which are installed by the NisC enzyme.
This entry represents the PX domain found in Sorting nexin-6 (SNX6). SNX6 was found to interact with members of the transforming growth factor-beta family of receptor serine/threonine kinases. Strong heteromeric interactions were also seen among SNX1, -2, -4, and -6, suggesting the formation in vivo of oligomeric complexes. SNX6 is localized in the cytoplasm, where it is thought to target proteins to the trans-Golgi network []. In addition, SNX6 was found to be translocated from the cytoplasm to the nucleus by Pim-1, an oncogene-encoded serine/threonine kinase. This translocation is not affected by Pim-1-dependent phosphorylation, but the functional significance is unknown []. The Phox Homology (PX) domain is a phosphoinositide (PI) binding module present in many proteins with diverse functions. Sorting nexins (SNXs) make up the largest group among PX domain-containing proteins. They are involved in regulating membrane traffic and protein sorting in the endosomal system. The PX domain of SNXs binds PIs and targets the protein to PI-enriched membranes [, ]. SNXs differ from each other in PI-binding specificity and affinity, and in the presence of other protein-protein interaction domains, which help determine subcellular localization and specific function in the endocytic pathway [, , ].
This family represents ScpB, which along with ScpA () interacts with SMC in vivo, forming a complex that is required for chromosome condensation and segregation [, ]. The SMC-Scp complex appears to be similar to the MukB-MukE-MukF complex in Escherichia coli [], where MukB () is the homologue of SMC. Although ScpA and ScpB have little sequence similarity to MukE () or MukF (), they are predicted to be structurally similar, being predominantly α-helical with coiled-coil regions. In most bacterial genomes, scpA and scpB form an operon. Flanking genes are highly variable, suggesting that the operon has moved throughout evolution. Bacteria containing an smc gene also contain scpA or scpB, but not necessarily both. An exception is found in Deinococcus radiodurans, which contains scpB but neither smc nor scpA. In the archaea, the gene order SMC-ScpA is conserved in nearly all species, as is the very short distance between the two genes, indicating co-transcription of both in different archaeal genera and arguing that interaction of the gene products is not confined to the homologues in Bacillus subtilis. In light of all the studies, it seems probable that SMC, ScpA and ScpB proteins or homologues act together in chromosome condensation and segregation in all prokaryotes [].
Chromatin assembly factor 1 (CAF-1) consists of three evolutionarily conserved subunits, p150, p60, and p48 (yeast homologues Cac1, Cac2 and Cac3, respectively), and mediates the assembly of nucleosomes onto newly replicated DNA. The p150 subunit (CAF-1_p150, also known as subunit A) is the core component of the CAF-1 histone chaperone complex, which functions in depositing newly synthesised and acetylated histones H3/H4 into chromatin during DNA replication and repair [, ], and is essential for cell viability and efficient DNA replication. The p150 subunit contains the regions that interact with proliferating cell nuclear antigen (PCNA), heterochromatin protein 1 (HP1) and the CAF-1 p60 subunit, among other proteins []. It is thought that the DNA association with two histone-bound CAF-1 complexes may promote the formation of the (H3-H4)2 tetramer on DNA []. This entry represents the N-terminal region of the CAF-1 subunit p150, which contains one of the PCNA binding sites, designated PIP1, and the HP1-interacting domain MIR []. These domains are dispensable for the role of p150 in nucleosome assembly, and it is thought that this N-terminal region of CAF-1 might act as a regulatory domain contributing to the stability of the CAF-1-PCNA interaction or mediate other functions during DNA repair or heterochromatin maintenance [].
Torsin-1A belongs to the Torsin family. Torsin-1A inhibits cell adhesion and neurite extension through interference with cytoskeletal dynamics and is implicated in primary dystonia, an autosomal-dominant movement disorder [, ]. It interacts with lamina-associated polypeptide 1 (LAP1) in the nuclear envelope and luminal domain-like LAP1 (LULL1) in the endoplasmic reticulum [, ]. Torsin-1A does not display ATPase activity in isolation; its ATP hydrolysis function is induced upon association with LAP1 and LULL1 []. It may play an important role in protein folding or trafficking at the endoplasmic reticulum []. In humans, Torsin-1A is widely expressed and appears to have its most critical role in the central nervous system (CNS), where it is present at high levels during development []. It modulates synaptic vesicle recycling []. It is involved in several biological processes, such as nuclear envelope integrity, cytoskeleton organisation, nuclear polarity, chaperone functions and degradation of misfolded proteins []. Torsin family members are membrane-associated ATPases belonging to the AAA+ (ATPase associated with a variety of cellular activities) superfamily of ATPases. The AAA+ ATPases typically form oligomeric rings, obtaining energy from ATP hydrolysis and acting as chaperone-like modules [, ]. However, torsins lack conserved catalytic residues typically found in related ATPases and do not display ATPase activity unless they are engaged by their regulatory cofactors LAP1 or LULL1 [], which are type II transmembrane proteins located in the nuclear envelope and endoplasmic reticulum (ER) []. LAP1 and LULL1 integrate into the Torsin ring to produce the biologically active ATPase machine [].
This entry represents the HflX-type G domain. The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides [, , ]. Within the translation factor-related (TRAFAC) class of P-loop GTPases, the HflX-type is a widely distributed family of GTPases that interact with the large ribosomal subunit. The broad phylogenetic distribution pattern of HflX GTPases in Bacteria, Archaea, and Eukaryotes (including human) suggests a basic cellular function for this protein family. The HflX-type G domain is composed of six β-strands and five α-helices []. It consists of the following conserved sequence motifs: the G1 motif (or P-loop), consensus GX4GK(S/T), which is responsible for interacting with the alpha- and beta-phosphates of nucleotide di- and triphosphates; the G2 variable effector loop (DXnT); the G3 motif (DX2G), which interacts with the gamma-phosphate of nucleotide triphosphates; and the G4 motif (NKXD), which conveys specificity for guanine nucleotides through hydrogen bonding to the base [].
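The consensus motifs quoted above can be read as simple sequence patterns. The following is a minimal sketch, assuming that "X" stands for any residue and that "X4" and "X2" mean exactly four and two residues; the function name `find_g_motifs` and the toy sequence in the usage note are hypothetical and are not taken from any real HflX protein. The G2 loop (DXnT) is left out because the entry does not specify n. A literal pattern match of this kind is far less sensitive than the profile-based methods used to define database entries, so it should be seen as an illustration only.

```python
import re

# Regular-expression readings of the consensus motifs quoted in the entry.
# Assumption: "X" is any residue and a trailing number is an exact repeat count.
G_MOTIFS = {
    "G1": re.compile(r"G.{4}GK[ST]"),  # GX4GK(S/T), the P-loop
    "G3": re.compile(r"D.{2}G"),       # DX2G
    "G4": re.compile(r"NK.D"),         # NKXD
}


def find_g_motifs(sequence):
    """Return {motif name: [(0-based start, matched text), ...]}.

    Only non-overlapping matches are reported, scanning left to right.
    """
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group()) for m in pattern.finditer(seq)]
        for name, pattern in G_MOTIFS.items()
    }
```

For example, calling `find_g_motifs("MAGLPNAGKSTLLNALDTPGAAANKIDLL")` on this made-up sequence reports one candidate for each of G1, G3 and G4, in that order along the sequence, which matches the G1-G3-G4 ordering expected in a G domain.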
The P-loop guanosine triphosphatases (GTPases) control a multitude of biological processes, ranging from cell division, cell cycling, and signal transduction, to ribosome assembly and protein synthesis. GTPases exert their control by interchanging between an inactive GDP-bound state and an active GTP-bound state, thereby acting as molecular switches. The common denominator of GTPases is the highly conserved guanine nucleotide-binding (G) domain that is responsible for binding and hydrolysis of guanine nucleotides. The p47 or immunity-related GTPases (IRG) are at least as old as the vertebrates. The IRG proteins are an essential resistance system in the mouse for immunity against pathogens that enter the cell via a vacuole. Despite its importance for the mouse, the IRG resistance system is absent from humans because it has been lost during the divergent evolution of the primates. The IRG proteins appear to be accompanied phylogenetically by homologous proteins, named 'quasi IRG' (IRGQ) proteins, that probably lack nucleotide binding or hydrolysis function, and that may form regulatory heterodimers with functional IRG proteins. The region of lowest similarity is in the G domain, and conserved GTP-binding motifs are lacking [, , ].
DOCK family members are evolutionarily conserved guanine nucleotide exchange factors (GEFs) for Rho-family GTPases []. DOCK proteins are required during several cellular processes, such as cell motility and phagocytosis. The N-terminal SH3 domain of the DOCK proteins functions as an inhibitor of GEF activity; this inhibition can be relieved when the SH3 domain binds to the ELMO1-3 adaptor proteins, after their binding to active RhoG at the plasma membrane [, ]. DOCK family proteins are categorised into four subfamilies based on their sequence homology: DOCK-A subfamily (DOCK1/180, 2, 5), DOCK-B subfamily (DOCK3, 4), DOCK-C subfamily (DOCK6, 7, 8), DOCK-D subfamily (DOCK9, 10, 11) []. This entry represents the C2 domain found in the DOCK-C members. In addition to the C2 domain (also known as the DHR-1 domain) and the DHR-2 domain, DOCK-C members contain a functionally uncharacterised domain upstream of the C2 domain. DHR-2 has the catalytic activity for Rac and/or Cdc42, but is structurally unrelated to the DH domain. The C2/DHR-1 domains of DOCK1 (also known as DOCK180) and DOCK4 have been shown to bind phosphatidylinositol 3,4,5-trisphosphate (PtdIns(3,4,5)P3) [, , ].
T cell-dependent immune processes require cell-surface interactions that mediate the initiation, modulation and the ultimate course of the response. The specificity of T cell recognition is determined by the engagement of the T cell receptor (TCR) on T cells with cognate peptide-MHC complexes presented by antigen presenting cells (APCs). Additional signals are required to sustain and enhance T cell activity, the most important of which is provided by the engagement of CD28 on T cells with its ligands B7-1 (CD80) and B7-2 (CD86). By contrast, the interaction of B7 isoforms with cytotoxic T lymphocyte-associated molecule-4 (CTLA-4), a CD28 homologue receptor on T cells (31% identity), provides inhibitory signals required for down-regulation of the response, while it may also prevent T cell activation by weak TCR signals [, , , , ]. Sequence comparison between human CTLA-4 and CD28 proteins suggests they are homologous, with the highest degree of similarity being in the juxtamembrane and cytoplasmic regions. In addition, the cytoplasmic domains of human and murine CTLA-4 are identical, suggesting that this region has important functional properties [].
In eukaryotes and archaea, the e/aIF2 factor is involved in the initiation of protein biosynthesis. In its GTP-bound form, e/aIF2 delivers methionylated initiator tRNA to the small subunit of the ribosome. After the pairing between the AUG initiation codon on mRNA and the CAU anticodon of the initiator tRNA, GTP is hydrolysed and e/aIF2:GDP is released from the ribosome. In eukaryotes, eIF2B acts as the guanine nucleotide exchange factor for eIF2. Archaea have no equivalent of eIF2B, and the exchange between GDP and GTP is thought to be spontaneous []. eIF2 is composed of three subunits, alpha, beta and gamma. The gamma subunit forms the core of the heterotrimer and confers both tRNA binding and GTP/GDP binding []. This entry represents the C-terminal domain of the gamma subunit of eukaryotic translation initiation factor 2 (eIF2-gamma) found in Eukaryota and Archaea. This domain has a beta barrel structure with Greek key topology. It is required for formation of the ternary complex with GTP and initiator tRNA [].
Phosphoglycerate kinase () (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants. Trypanosoma brucei and Crithidia fasciculata both contain three different phosphoglycerate kinase (PGK) genes, A, B and C []. The genes B and C encode the major PGKs: the cytosolic and glycosomal PGKs, respectively. The PGK-A genes of both trypanosomatid species encode open reading frames related to PGK, which have most active site residues conserved, but contain an insert of 80 amino acids at approximately position 80 of the average 420-amino-acid PGK sequence []. PGK-A may be a minor PGK with a special function []. This entry includes phosphoglycerate kinase-A/B/C (PGK-A/B/C) from Euglenozoa.
The Bacteriophage T4 gene 59 helicase assembly protein (Gp59) is required for recombination-dependent DNA replication and repair, which is the predominant mode of DNA replication in the late stage of T4 infection. Gp59 accelerates the loading of the T4 gene 41 helicase during DNA synthesis by the T4 replication system in vitro. This protein binds to both T4 gene 41 helicase and T4 gene 32 single-stranded DNA binding protein, and to single- and double-stranded DNA []. The C-terminal domain of the T4 gene 59 helicase assembly protein consists of seven α-helices with short intervening loops and turns; the surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. The hydrophobic region on the 'bottom' surface of the domain near the C-terminal helix binds the leading strand DNA, whilst the hydrophobic region on the 'top' surface of the domain lies between the two arms of the fork DNA, allowing for T4 gene 41 helicase binding and assembly into a hexameric complex around the lagging strand [].
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks []. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis [, ]. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA []. Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils []. This entry represents subunit A (parC) of topoisomerase IV from Gram-positive bacteria. Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication []. Topoisomerase IV consists of two polypeptide subunits, parC (subunit A), which is homologous to gyrA of topoisomerase II, and parE (subunit B), which is homologous to gyrB of topoisomerase II.
DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks []. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis [, ]. DNA topoisomerases are divided into two classes: type I enzymes (; topoisomerases I, III and V) break single-strand DNA, and type II enzymes (; topoisomerases II, IV and VI) break double-strand DNA []. Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils []. This entry represents subunit A (parC) of topoisomerase IV from Gram-negative bacteria. Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication []. Topoisomerase IV consists of two polypeptide subunits, parC (subunit A), which is homologous to gyrA of topoisomerase II, and parE (subunit B), which is homologous to gyrB of topoisomerase II.
Sorbin is an active peptide present in the digestive tract, where it has pro-absorptive and anti-secretory effects in different parts of the intestine, including the ability to decrease VIP (vasoactive intestinal peptide) and cholera toxin-induced secretion. It is expressed in some intestinal and pancreatic endocrine tumours in humans []. Sorbin-homology (SoHo) domains are found in adaptor proteins such as vinexin, CAP/ponsin and ArgBP2, which regulate various cellular functions, including cell adhesion, cytoskeletal organisation, and growth factor signalling []. In addition to the sorbin domain, these proteins contain three SH3 (src homology 3) domains. The sorbin homology domain mediates the interaction of vinexin and CAP with flotillin, which is crucial for the localisation of SH3-binding proteins to the lipid raft, a region of the plasma membrane rich in cholesterol and sphingolipids that acts to concentrate certain signalling molecules. The sorbin homology domain of adaptor proteins may mediate interactions with the lipid raft that are crucial to intracellular communication []. Human sorbin is generated via splicing of an alternative transcript from the ArgBP2 gene locus [].
The clustered protocadherin (Pcdh) family is the largest subgroup of the cadherin superfamily. In mammals, the clustered Pcdh family consists of three gene clusters: Pcdh-alpha, Pcdh-beta, and Pcdh-gamma (Pcdhg) []. The genomic organisation of the human protocadherin gene clusters is remarkably similar to that of immunoglobulin and T-cell receptor genes. The extracellular and transmembrane domains of each protocadherin protein are encoded by an unusually large "variable" region exon, while the intracellular domains are encoded by three small "constant" region exons located downstream from a tandem array of variable region exons [, ]. The clustered Pcdh proteins are predominantly expressed in the brain []. Pcdh cluster genes may contribute to specifying the identity and diversity of individual neurons []. The 22 isoforms of the Pcdhg gene cluster are diversified into A-, B-, and C-types, and the C-type isoforms differ from all other clustered Pcdhs in sequence and expression. Mice lacking the three C-type isoforms (Pcdhgc3, Pcdhgc4, Pcdhgc5) display cellular and synaptic alterations resulting from neuronal apoptosis []. This entry represents protocadherin gamma-C3.
Protocadherin-19 (Pcdh19) belongs to the delta-2 subfamily of non-clustered protocadherins. The non-clustered PCDHs appear to have homophilic/heterophilic cell-cell adhesion properties [], and the delta-2 subfamily (comprising protocadherin-8, -10, -17, -18, and -19) is widely expressed in the nervous system [, ]. Pcdh19 is highly expressed during brain development, and could play significant roles in neuronal migration or establishment of synaptic connections. Pcdh19 mutations cause an unusual X-linked inheritance disorder resulting in epilepsy and mental retardation [, , ]. The cadherin family consists of a large group of cell adhesion proteins. It can be classified into three subfamilies: classical cadherins, desmosomal cadherins and protocadherins (PCDHs). Based on the genomic structure, the PCDH family can be divided into two groups, clustered PCDHs and non-clustered PCDHs. Non-clustered PCDHs can be further classified into three subgroups: delta1 (PCDH1, PCDH7, PCDH9, PCDH11 and PCDH20), delta2 (PCDH8, PCDH10, PCDH12, PCDH17, PCDH18 and PCDH19) and epsilon (PCDH15, PCDH16, PCDH21 and MUCDHL). Non-clustered PCDHs are expressed predominantly in the nervous system and have spatiotemporally diverse expression patterns [].
Methylpurine-DNA glycosylase (MPG, or alkyladenine DNA glycosylase (AAG)) is a base excision-repair protein, catalyzing the first step in base excision repair by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site. MPG bends DNA by intercalating between the base pairs, causing the damaged base to flip out of the double helix and into the enzyme active site for cleavage. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA [, , , , , , , , , , ]. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts []. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension []. Although AAG represents one of six DNA glycosylase classes, it lacks the helix-hairpin-helix active site motif associated with other base excision repair glycosylases and is structurally distinct from them.
This entry represents the pleckstrin homology (PH) domain found in the Pleckstrin homology domain-containing family A members 4-7 (PKHA4-7) from humans. This domain is involved in targeting these proteins to appropriate cellular compartments or enabling them to interact with other components of the signal transduction pathways. Some PH domains are responsible for the protein binding to phosphoinositide phosphates (PIPs) with high affinity and specificity; others display strong specificity in lipid binding. The specificity is usually determined by loop regions or insertions in the N terminus of the domain, which are not conserved across all PH domains. Proteins included in this entry are predominantly found in chordates. Some members also contain WW (also known as WWP) domains, which also occur in proteins involved in signal transduction processes [, , , ]. PKHA4 (PEPP-1) binds specifically to phosphatidylinositol 3-phosphate (PtdIns3P) and was reported to be involved in ubiquitination [, ]. In humans, PKHA6 (PEPP-3) has been related to the pathophysiology of schizophrenia and the therapy response towards antipsychotics []. PKHA7, required for zonula adherens biogenesis and maintenance, has been identified as one of the host factors mediating death by S. aureus alpha-toxin [] and has been related to hypertension, glaucoma and cancer [, , , , ].
Herpesviruses are large and complex DNA viruses, widely found in nature. Human cytomegalovirus (HCMV), an important human pathogen, defines the betaherpesvirus family. Mouse cytomegalovirus (MCMV) and rat cytomegalovirus serve as biological model systems for HCMV. HCMV, MCMV, and rat CMV display the largest genomes among the herpesviruses and are essentially co-linear over the central 180 kb of the 230-kb genomes. Betaherpesviruses, which include the CMVs as well as human herpesviruses 6 and 7, differ from alpha- and gammaherpesviruses by the presence of additional gene families such as the US22 gene family, which are mainly clustered at the ends of the genome. The US22 family was first described in HCMV. This gene family comprises 12 members in both HCMV and MCMV and 11 in rat CMV []. US22 proteins have been found across many animal DNA viruses and some vertebrates []. The namesake of this family, US22 (), is an early nuclear protein that is secreted from cells []. The US22 family may have a role in virus replication and pathogenesis []. Domain analysis showed that US22 proteins usually contain two copies of conserved modules that are homologous to several other families such as SMI1 and SYD (commonly called the SUKH superfamily). Bacterial operon analysis revealed that all bacterial SUKH members function as immunity proteins against various toxins. Thus, the US22 family is predicted to counter diverse anti-viral responses by interacting with specific host proteins [].
TIMP-3 (MEROPS identifier I35.003) is known to inhibit matrix metalloproteinases, aggrecanases, and tumour necrosis factor (TNF)-alpha-converting enzyme (TACE, ADAM17), and mutations in the human TIMP-3 gene cause a dominantly inherited, adult-onset blindness (Sorsby's fundus dystrophy or SFD) []. Deletion of the mouse Timp3 gene results in an increase in TNF-alpha converting enzyme activity, constitutive release of TNF and activation of TNF signalling in the liver []. The knockout animals also develop spontaneous air space enlargement in the lung that is evident at 2 weeks after birth and progresses with age, and they die as early as 13 months of age []. TIMP-3 has been shown to regulate agonist-induced vascular remodelling and hypertension []. Tissue inhibitors of metalloproteinases (TIMPs) are natural inhibitors of matrix metalloproteinases (MMPs) found in most tissues and body fluids. By inhibiting MMP activities, they participate in tissue remodeling of the extracellular matrix (ECM). The balance between MMP and TIMP activities is involved in both normal and pathological events such as wound healing, tissue remodeling, angiogenesis, invasion, tumourigenesis and metastasis []. TIMPs also exhibit functions that appear to be independent of their metalloproteinase inhibitory capacity []. There are four mammalian TIMPs (TIMP-1 to -4), and each TIMP has its own profile of metalloproteinase inhibition.
Guanylate cyclases () catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties [, , , ]. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits. This domain is also found in bacterial pyrimidine cyclases, which synthesize cyclic nucleotides in response to bacteriophage infection, providing immunity. These cyclic nucleotides serve as specific second messenger signals that activate the adjacent effector, leading to bacterial cell death and abortive phage infection []. In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.
The Bacteriophage T4 gene 59 helicase assembly protein (Gp59) is required for recombination-dependent DNA replication and repair, which is the predominant mode of DNA replication in the late stage of T4 infection. Gp59 accelerates the loading of the T4 gene 41 helicase during DNA synthesis by the T4 replication system in vitro. This protein binds to both T4 gene 41 helicase and T4 gene 32 single-stranded DNA binding protein, and to single- and double-stranded DNA []. The C-terminal domain of the T4 gene 59 helicase assembly protein consists of seven α-helices with short intervening loops and turns; the surface of the domain contains large regions of exposed hydrophobic residues and clusters of acidic and basic residues. The hydrophobic region on the 'bottom' surface of the domain near the C-terminal helix binds the leading strand DNA, whilst the hydrophobic region on the 'top' surface of the domain lies between the two arms of the fork DNA, allowing for T4 gene 41 helicase binding and assembly into a hexameric complex around the lagging strand [].
This entry represents the interlocking domain superfamily of the eukaryotic nuclear receptor coactivators CBP and p300. The interlocking domain forms a 3-helical non-globular array that forms interlocked heterodimers with its target. Nuclear receptors are ligand-activated transcription factors involved in the regulation of many processes, including development, reproduction and homeostasis. Nuclear receptor coactivators act to modulate the function of nuclear receptors. Coactivators associate with promoters and enhancers primarily through protein-protein contacts to facilitate the interaction between DNA-bound transcription factors and the transcription machinery. Many of these coactivators are structurally related, including CBP (CREB-binding protein) and p300 []. CBP and p300 both have histone acetyltransferase activity (). CBP/p300 proteins function synergistically to activate transcription, acting to remodel chromatin and to recruit RNA polymerase II and the basal transcription machinery. CBP is required for proper cell cycle control, differentiation and apoptosis. The interaction of CBP/p300 with transcription factors involves several small domains. The IBiD domain in the C-terminal region of CBP is responsible for CBP interaction with IRF-3, as well as with the adenoviral oncoprotein E1A, TIF-2 coactivator, and the IRF homologue KSHV IRF-1 [].
TonB-dependent transporters (TBDTs) are bacterial outer membrane (OM) proteins that bind and transport ferric chelates called siderophores. While iron complexes constitute the majority of substrates for TBDTs, others, like vitamin B12, are also transported by this mechanism []. These transporters show high affinity and specificity for siderophores and require energy derived from the proton motive force across the inner membrane to transport them. The energy is provided through interaction with an inner membrane protein complex consisting of TonB, ExbB, and ExbD []. The source of this energy is the ion electrochemical gradient of the cytoplasmic membrane, harvested by heteromultimeric complexes of ExbB and ExbD proteins, and transduced to the OM high-affinity siderophore transporters by the protein TonB []. TonB is composed of three domains. The N-terminal transmembrane helix anchors the protein to the inner membrane and makes contact with ExbB and ExbD to form an energy-transducing complex. The C-terminal globular domain directly contacts the transporters in the OM. These two domains are separated by a flexible, unstructured proline-rich domain that resides within the periplasm []. Escherichia coli has only one TonB protein, which is shared by different TBDTs involved in the acquisition of various substrates, but most bacteria have more than one tonB gene [].
Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal from the extracellular space of released transmitters, thereby terminating their actions []. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl- []. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters [, ].