This family of proteins is found in cyanobacteria and is functionally uncharacterised. Proteins in this family are approximately 90 amino acids in length. One member of this family, encoded by the ssr1528 gene, was found to be regulated by the sRNA NsiR4 []. In addition, expression of that gene was repressed under conditions of nitrogen depletion, suggesting it may have a role related to cellular nitrogen status [].
This domain is found in squalene epoxidase (SE) and related proteins, which are found in taxonomically diverse groups of eukaryotes and also in bacteria. SE was first cloned from Saccharomyces cerevisiae (Baker's yeast), where it was named ERG1. It contains a putative FAD binding site and is a key enzyme in the sterol biosynthetic pathway []. Putative transmembrane regions are found towards the protein's C terminus.
The Sec7 domain was named after the first protein found to contain such a region []. It has been shown to be linked with guanine nucleotide exchange function [, ]. The 3D structure of the domain displays several α-helices []. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian factors []. This superfamily represents the alpha orthogonal structural domain which is found at the C terminus of the Sec7 domain ().
Contractile injection systems (CISs) are cell-puncturing nanodevices that share ancestry with contractile tail bacteriophages. This entry represents the spike tip protein from an extracellular CIS present in bacteria. This protein family includes Pvc10 (the homologue of T4 gp5.4). This monomeric protein was part of a CryoEM structure and, although not resolved at atomic resolution in the map, it was observed to form the sharp conical tip on the Pvc8 spike [].
U15 is an ORF present in human herpesvirus 6 (HHV-6), which was initially isolated from patients with AIDS and lymphoproliferative disorders, but was subsequently shown to be responsible for the common childhood disease exanthema subitum (roseola). Several gene fragments of HHV-6 have been shown to activate the human immunodeficiency virus (HIV) type 1 long terminal repeat (LTR) []. The ORF U15 encodes a protein of 110 amino acids, whose function is unknown.
This entry includes MAT1-1-2, MatA-2 and Smr1 from Sordariomycetes. MatA-2 is encoded by the MAT1-1-2 gene, which is present in the mating types of Sordariomycetes. The most famous representative of this class of fungi is Neurospora crassa. MAT1-1-2 is the generic nomenclature for all mating-type genes encoding proteins with an HPG (also termed PPF) domain. This gene and its domain were first identified in Podospora anserina (where the protein is named Smr1) and Neurospora crassa (where it is named MatA-2) []. HPG was the first name proposed for the domain found in MAT1-1-2 proteins [], based on the most conserved residues (histidine, proline and glycine). PPF was a second name [] for the same domain, proposed by authors who identified different conserved residues (proline, proline and phenylalanine). Smr1 from Podospora anserina contains a putative acidic/hydrophobic α-helix, which has been proposed to be a feature common to transcriptional activators [].
This family includes WD repeat and coiled-coil-containing protein (WDCP, previously known as C2orf44), which is found in eukaryotes and consists of around 721 amino acids. The N terminus contains two WD (tryptophan-aspartic acid) repeats (WD1 and WD2). WD repeats may be involved in a range of biological functions including apoptosis, transcriptional regulation and signal transduction. The C terminus contains a proline-rich sequence (PPRLPQR) and is predicted to have a leucine-rich coiled-coil region (CC) []. WDCP was identified in a proteomic screen to find signalling components that interact with Hck (hematopoietic cell kinase), a non-receptor tyrosine kinase. WDCP was shown to bind tightly and specifically to the SH3 domain of Hck in U937 human monocytic cells. WDCP was also shown to exist as an oligomer when expressed in mammalian cells. While the function of WDCP is unknown, it has been identified in a gene fusion event with anaplastic lymphoma kinase (ALK) in colorectal cancer patients [].
This entry represents the hydroxylamine reductases (Hcp, also known as Prismane) and the Ni-containing CO dehydrogenases (CODH) (). Hydroxylamine reductases have been identified in bacteria, archaea and eukaryotic protozoa. They contain two Fe/S centres: a [4Fe-4S] cubane cluster and a hybrid [4Fe-2S-2O] cluster. The physiological role of this protein is as yet unknown, although a role in nitrate/nitrite respiration has been suggested []. The Hcp protein from Escherichia coli was shown to have hydroxylamine reductase activity (NH2OH + 2e- + 2H+ -> NH3 + H2O). This activity is rather low []. Hydroxylamine reductase activity was also found in CO dehydrogenase in which the active site Ni was replaced by Fe []. The CO dehydrogenase contains a Ni-3Fe-2S-3O centre. Ni-containing CO dehydrogenases allow bacterial growth in a CO-dependent manner in the dark. They oxidize carbon monoxide coupled, via CooF, to the reduction of a hydrogen cation by a hydrogenase (possibly CooH) [, ].
This family consists of several eukaryotic Organic-Anion-Transporting Polypeptides (OATPs). Several have been identified, mostly in human and rat. Different OATPs vary in tissue distribution and substrate specificity. Since the numbering of different OATPs in particular species was based originally on the order of discovery, similarly numbered OATPs in humans and rats did not necessarily correspond in function, tissue distribution and substrate specificity (in spite of the name, some OATPs also transport organic cations and neutral molecules), so a scheme of using digits for rat OATPs and letters for human ones was introduced []. Prostaglandin transporter (PGT) proteins are also considered to be OATP family members. In addition, the methotrexate transporter OATK is closely related to OATPs. This family also includes several predicted proteins from Caenorhabditis elegans and Drosophila melanogaster. This similarity was not previously noted. All characterized OATPs are predicted to have 12 transmembrane domains and are sodium-independent transport systems [].
MAP1 is one of the first described microtubule-associated proteins (MAPs). It was discovered through its association with tubulin, and was later resolved into three distinct proteins: MAP1A, MAP1B (also called MAP5) and MAP1C. MAP1A and MAP1B are structurally related and share light chains with each other, and MAP1C was later shown to be the heavy chain of brain cytoplasmic dynein []. MAP1A is a long, rod-shaped protein that is made up of a large heavy chain and three light chains, LC1, LC2 and LC3. Light chain binding appears to regulate the activity of MAP1A and MAP1B []. MAP1A binds to and stabilises microtubules and can promote microtubule assembly []. It is expressed in mature neurons and may serve a key function in synaptic plasticity [, ].
Paralemmin was identified in the chicken lens as a protein with a molecular weight of 65 kDa (isoform 1) and a splice variant of 60 kDa (isoform 2). Isoform 2 is predominant during infancy and levels of isoform 1 increase with age. Paralemmin is localised to the plasma membrane of fibre cells, and was not detected in the annular pad cells. Its localisation to the short side of the fibre cell and the sites of fibre cell interlocking suggests that paralemmin may play a role in the development of such interdigitating processes []. Palmitoylation is important for localising these proteins to the filopodia of dendritic cells, where they have been implicated in the regulation of membrane dynamics and process outgrowth []. Paralemmin-3 is an ATP-binding protein that was shown to interact with single immunoglobulin IL-1 receptor-related molecule (SIGIRR) and may act as an adapter in Toll-like receptor (TLR) signalling [].
Thymidylate synthase catalyses the reductive methylation of dUMP to dTMP using methylene tetrahydrofolate as a methyl donor, an essential step in DNA biosynthesis []. This entry represents a number of proteins that are predicted thymidylate synthases, though their function has not been proven. The Methanobacterium thermoautotrophicum protein was shown to catalyse the side reactions characteristic of thymidylate synthase, but conversion of dUTP to dTMP was not observed under the experimental conditions used []. Partial sequence data showed no similarity to known thymidylate synthases simply because the region sequenced was from a distinctive N-terminal region not found in other thymidylate synthases. Members of this protein family appear, therefore, to be a novel, tetrahydromethanopterin-dependent thymidylate synthase [].
This entry includes GCFC2 and PAXBP1 from humans, and ILP1 from Arabidopsis thaliana. GCFC2 is involved in pre-mRNA splicing through regulating spliceosome C complex formation []. ILP1 and its orthologue in mouse have been shown to be repressors of cyclin genes. In Arabidopsis, all members of the CYCLINA2 family are repressed, and in mouse, cyclin A2 is repressed. The CYCLINA2 cyclins control endoreduplication, in which DNA is replicated but the cell cycle continues without cell division, leading to polyploidy. ILP1 therefore regulates endoreduplication through control of CYCA2 expression in Arabidopsis []. ILP1 has been shown to be required for efficient splicing []. It is worth noting that GCFC2 was originally thought to be a DNA-binding transcriptional repressor. However, later work showed that the original sequence was a chimera and that the DNA-binding activity was derived from the incorrect N-terminal sequence [].
This entry represents the bifunctional enzyme CoaBC (gene name dfp), which catalyses the second and third steps (cysteine ligation, (), and decarboxylation, ()) in the biosynthesis of coenzyme A (CoA) from pantothenate in bacteria. This enzyme contains the FMN cofactor, but no FAD or pyruvoyl group. The amino-terminal region is responsible for the phosphopantothenoylcysteine decarboxylase activity []. The protein product of the dfp gene in Escherichia coli was found to be able to rescue a temperature-sensitive conditionally lethal mutation that resulted in a slow cessation of DNA synthesis []. The protein encoded by dfp was then found to affect both DNA and pantothenate metabolism []. Later, phosphopantothenate-cysteine ligase activity, which is involved in the biosynthesis of coenzyme A, was demonstrated in the same protein. Therefore this protein has been named bifunctional protein CoaBC [].
Claudins form the paracellular tight junction seal in epithelial tissues. In humans, 24 claudins (claudin-1 to claudin-24) have been identified. Their ability to polymerise and form strands is affected by the cell type [, , ]. They can also form heteropolymers with each other within and between tight junction strands []. Most of the claudins (claudin-12 being the exception) have a C-terminal PDZ-binding motif that can interact with other PDZ domain proteins, such as the scaffolding proteins ZO-1, -2 and -3 []. They also interact with non-tight junction proteins, such as the cell adhesion proteins EpCam and tetraspanins and the signaling proteins ephrin A and B and their receptors, EphA and EphB []. Claudin-7 was identified through searching expressed sequence tag (EST) databases for sequences similar to claudin-1 and -2. It was subsequently cloned and expressed in cells, where it was shown to concentrate at tight junctions [].
Members of this family are 18 kDa serine/threonine-rich polypeptides containing a P-loop motif and an SH3-binding region with phosphorylation sites for a variety of protein kinases (cdc2, CDK2, MAPK, CDK5, protein kinase C, Ca2+/calmodulin protein kinase 2, casein kinase 2) involved in cell proliferation and differentiation. Functional studies revealed that expression is associated with proliferating and migrating cells in developing brain. Furthermore, it has been suggested that CROC-4 participates in brain-specific c-fos signaling pathways involved in cellular remodeling of brain architecture []. C1orf61 expression was also found to be associated with the progression of liver disease as well as human embryogenesis. It was shown to be up-regulated in hepatic cirrhosis tissues and further up-regulated in primary hepatocellular carcinoma tumors, where it was suggested to play a role as a tumor activator [].
This entry includes INAVA and CCDC120. Coiled-coil domain-containing protein 120 (CCDC120) was first identified as a centrosome protein []. Later, it was found to interact with cytohesin-2 to regulate vesicular trafficking and neurite growth []. It is part of the centriole subdistal appendages (SDAs) that anchor microtubules in interphase cells []. Innate immunity activator protein (INAVA) is required for optimal MAPK and NF-kappaB activation, cytokine secretion, and intracellular bacterial clearance. Genetic variations in INAVA are associated with an increased risk of inflammatory bowel disease []. Homologues are known from chordates.
This entry represents fission yeast Dim1 and its homologues, including Dib1 from budding yeasts, YLS8 from plants and TXNL4 from animals. Dim1 was originally identified as a mitosis protein []. Later, it was found to interact with the spliceosome component Prp6, which is involved in pre-mRNA splicing []. Dim1 may act at the level of mRNA, which impacts the functioning of the APC/C, a critical complex in controlling mitotic progression []. It is worth noting that although the Dim proteins exhibit a thioredoxin-like fold, they lack the disulfide bond required for thioredoxin redox activity [].
Pappalysin-1 (also known as pregnancy-associated plasma protein-A (PAPP-A); MEROPS identifier M43.004) is a metalloendopeptidase that belongs to the MEROPS peptidase family M43B. It was first found in high concentrations in the blood of pregnant women. Later, it was identified as a proteinase responsible for cleavage of insulin-like growth factor binding protein (IGFBP)-4, an inhibitor of IGF action that mediates cell growth and survival signals. PAPP-A is expressed by different cell types, and thus can no longer be considered to be just "pregnancy-associated" []. It is involved in rapid yet strictly controlled growth and development, including wound healing, bone remodelling, folliculogenesis, placental development, and atherosclerosis [].
This repeat of unknown function was identified in Diptera proteins; later on, it was also found in a wide range of invertebrates and vertebrates [], in proteins such as natterin [, ]. It is often present in proteins comprising only two or four DM9 repeats and occasionally is linked to other domains at its N or C terminus. In the mannose-specific lectin CGL1 from Crassostrea gigas, DM9 exhibits high binding specificity and avidity toward D-mannose residues and serves as a pattern recognition receptor (PRR) that recognises a broad range of pathogen-associated molecular patterns []. It also mediates immune recognition and cellular encapsulation [].
Baculovirus occlusion-derived virus (ODV) derives its envelope from an intranuclear membrane source. The N-terminal amino acid sequence of the Autographa californica multicapsid nuclear polyhedrosis virus (AcMNPV) envelope protein ODV-E66 is highly hydrophobic. This defined hydrophobic domain was shown to direct the protein, E66, to induced membrane microvesicles within a baculovirus-infected cell nucleus and to the viral envelope. In addition, it was suggested that movement of this protein into the nuclear envelope may initiate through cytoplasmic membranes, such as endoplasmic reticulum, and that transport into the nucleus may be mediated through the outer and inner nuclear membrane [].
Nuclear factor interleukin-3-regulated protein (NFIL3, also known as E4BP4) was first identified as a transcriptional repressor capable of binding an activating transcription factor (ATF) DNA consensus sequence site in the adenovirus E4 promoter []. Later, it was independently identified as a transactivator of the IL3 promoter in human T cells []. E4BP4 is involved in several biological processes, including transcriptional control of the circadian clock, neuron growth and survival, osteoblast function, and regulation of ovulation []. It is essential for the development of NK cells and CD8alpha(+) conventional dendritic cells, and is also involved in macrophage activation, polarisation of CD4(+) T cell responses and B cell class switching to IgE [].
The transcription factor DP (dimerization partner) forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein, which binds to the E2F-DP heterodimer [] and negatively regulates the G1-S transition. Although the role of DP in transcriptional activation was originally thought to be facilitating the binding of E2F to target DNA, it was later shown that the C-terminal acidic region of DP1 binds strongly to the PH domain of p62 of TFIIH and acts as a transactivation domain [].
The TGFBI (transforming growth factor-beta-induced protein ig-h3) gene was originally identified as a gene induced by transforming growth factor-beta stimulation in adenocarcinoma cells. Later, its product was found to interact with a number of extracellular matrix (ECM) proteins, including fibronectin, biglycan, decorin, and several types of collagen. It also acts as a ligand for several integrins, including alpha3beta1, alphavbeta5, alphavbeta3, and alphambeta2. It may function as a secreted factor involved in cell adhesion, proliferation, and migration []. Mutations in the TGFBI gene cause several types of corneal dystrophy [, , , , ].
Members of this largely uncharacterised family share a motif approximating DXH(X25)GDXXD(X25)GNHD, as found in several phosphoesterases, including the nucleases SbcD and Mre11, and a family of uncharacterised archaeal putative phosphoesterases. In this family, the His residue in the GNHD portion of the motif is not conserved. The member MJ0936, one of two from Methanocaldococcus jannaschii (Methanococcus jannaschii), was shown [] to act on model phosphodiesterase substrates; a divalent cation was required. This entry also represents Vps29, which is part of the retromer complex in yeast []. Vacuolar protein sorting-associated protein 29 (Vps29) is an essential component of the retromer complex, a conserved complex required in endosome-to-Golgi retrograde transport [].
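As a rough illustration of how the approximate motif DXH(X25)GDXXD(X25)GNHD could be scanned for in a protein sequence, the following minimal Python sketch turns it into a regular expression. The function name motif_regex, the spacer tolerance of plus or minus 5 residues, and the option to relax the His position in GNHD are all assumptions made for this example; they are not taken from the entry or from any published motif definition.

```python
import re

def motif_regex(gap=25, slack=5, his_conserved=True):
    """Build a regex for the approximate motif DXH(X25)GDXXD(X25)GNHD.

    gap and slack are hypothetical parameters: each X25 spacer is allowed
    to span gap - slack to gap + slack residues, since the motif is only
    approximate. With his_conserved=False the His of GNHD may be any
    residue, as described for this family.
    """
    lo, hi = max(gap - slack, 0), gap + slack
    last_block = "GNHD" if his_conserved else "GN.D"
    return re.compile(rf"D.H.{{{lo},{hi}}}GD..D.{{{lo},{hi}}}{last_block}")

# Synthetic toy sequence (not a real protein) with Gln in place of His.
toy = "MK" + "DAH" + "A" * 25 + "GDAAD" + "A" * 25 + "GNQD" + "LL"
strict_hit = motif_regex(his_conserved=True).search(toy)
relaxed_hit = motif_regex(his_conserved=False).search(toy)
print(strict_hit is not None, relaxed_hit is not None)
```

A plain regular expression is used only because it is easy to read; real family assignment relies on profile-based models rather than exact pattern matching, so a hit or miss here says little about true membership.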
This family consists of proteins closely related to Ax21. Ax21 was thought to be secreted by a type I secretion system and to activate XA21-mediated immunity. Later, it was found that Ax21 secretion does not depend on the predicted type I secretion system and that it is processed by the general secretion (Sec) system. In fact, Ax21 is an outer membrane protein, secreted in association with outer membrane vesicles []. It does not trigger the plant XA21-mediated immune response, as previously thought []. This role is performed by RaxX [].
Sodium leak channel non-selective protein (NALCN) was first described as a voltage-independent, cation-nonselective channel which is permeable to sodium, potassium and calcium ions []. However, it was recently reported to be selective only for monovalent cations and to be blocked by extracellular divalent cations []. Coexpression of NALCN, UNC79, UNC80, and NALF1 results in voltage-dependent NALCN currents [, ]. It is responsible for the background sodium ion leak current in neurons and controls neuronal excitability. It is activated by either of the neuropeptides substance P or neurotensin []. NALCN is required for normal respiratory rhythm and neonatal survival [].
This entry represents the mitochondrial/chloroplastic transcription termination factors (MTERFs). In humans, four MTERFs have been identified (MTERF1-4). MTERF1 was first identified as a factor responsible for terminating heavy strand transcription at a specific site at the leu-tRNA, thereby modulating the ratio of mitochondrial ribosomal RNA to mRNA []. Later, MTERF1 was found to stimulate transcriptional initiation [] and appeared to be involved in the control of mitochondrial replication pausing []. A structural study showed that it binds to dsDNA containing the termination sequence and unwinds the DNA molecule, promoting base eversion, which is critical for transcription termination [].
These vegetative storage proteins are close relatives of the plant acid phosphatases () and are limited to members of the Phaseoleae, including Glycine max (Soybean) and Phaseolus vulgaris (Kidney bean). These proteins are highly expressed in the leaves of repeatedly depodded plants [, ]. Vegetative storage protein (VSP) differs most strikingly from the acid phosphatases in the lack of the conserved nucleophilic aspartate residue in the N terminus; thus, VSPs should be inactive as phosphatases. This issue was confused by the publication in 1992 of an article claiming activity for the G. max VSP. In 1994 this assertion was refuted by the separation of the activity from the VSP [].
This entry represents the central domain of archaeal protein HerA, which is a DNA helicase able to utilise either 3' or 5' single-stranded DNA extensions for loading and subsequent DNA duplex unwinding []. It forms a complex with the NurA nuclease; this complex has 5'-3' DNA end resection activity and is essential for cell viability in the crenarchaeon Sulfolobus islandicus []. Studies in Sulfolobus tokodaii revealed that HerA was able to unwind blunt-ended double-stranded DNA with low efficiency, and that it was also able to unwind Holliday junctions, splayed-arm DNA, and 5'- or 3'-overhangs with high efficiency []. This domain includes the central RecA-like catalytic core and a flanking four-helix bundle [].
The transcription factor DP (dimerization partner) forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein, which binds to the E2F-DP heterodimer [] and negatively regulates the G1-S transition. Although the role of DP in transcriptional activation was originally thought to be facilitating the binding of E2F to target DNA, it was later shown that the C-terminal acidic region of DP1 binds strongly to the PH domain of p62 of TFIIH and acts as a transactivation domain [].
These conserved animal proteins contain one copy of the calcineurin-like phosphoesterase domain () and possess motifs characteristic of a variety of enzymatically active phosphoesterases [], including acid and alkaline phosphatases, phosphoprotein phosphatases, 5'-nucleotidase, bis(5'-nucleosyl)-tetraphosphatase (symmetrical), sphingomyelin phosphodiesterase, 2',3'-cyclic-nucleotide 2'-phosphodiesterase, and 3',5'-nucleotide phosphodiesterase CpdA. Two human genes have been identified. One, expressed in fetal brain, was isolated from the chromosome 11p13 region associated with the mental retardation component of the WAGR (Wilms tumor, aniridia, genitourinary anomalies, mental retardation) syndrome. The other, expressed in adult tissues, was mapped to chromosome 22 [].
This entry represents the RNA recognition motif 1 (RRM1) of MARF1 (also known as Limkain-b1). MARF1 was first identified as a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 []. Later, it was found to be an essential protein for controlling meiosis and retrotransposon surveillance in mouse oocytes. It may function both as an adaptor to recruit specific RNA targets and as an effector to catalyse the specific cleavage of target RNAs []. MARF1 contains an N-terminal NYN domain, two central RRMs, and C-terminal OST-HTH/LOTUS domains [, ].
This entry represents the RNA recognition motif 2 (RRM2) of MARF1 (also known as Limkain-b1). MARF1 was first identified as a novel peroxisomal autoantigen that co-localizes with a subset of cytoplasmic microbodies marked by ABCD3 []. Later, it was found to be an essential protein for controlling meiosis and retrotransposon surveillance in mouse oocytes. It may function both as an adaptor to recruit specific RNA targets and as an effector to catalyse the specific cleavage of target RNAs []. MARF1 contains an N-terminal NYN domain, two central RRMs, and C-terminal OST-HTH/LOTUS domains [, ].
This entry represents the RNA recognition motif (RRM) of Acinus (called 'Apoptotic chromatin condensation inducer in the nucleus' or ACIN1). Acinus was first identified as a target of proteolytic cleavage during apoptosis and has been implicated in transcriptional control []. Later, it was found to be part of the ASAP complex (consisting of Acinus, RNPS1 and SAP18), which interacts with the exon-junction complex (EJC), a messenger ribonucleoprotein complex involved in post-transcriptional regulation. The ASAP complex serves as an auxiliary component of the EJC deposited at splice junctions on mRNAs [, ]. Acinus contains a P-loop motif and an RNA recognition motif (RRM).
WrbA (tryptophan (W) repressor-binding protein) was discovered in Escherichia coli, where it was proposed to play a role in regulation of the tryptophan operon [], a role that has since been called into question []. Instead, WrbA has been shown to have FMN-dependent NAD(P)H:quinone oxidoreductase activity [, ]. A role in quinone detoxification has been proposed, supported by evidence suggesting its involvement in oxidative defense and/or cell signaling [, ]. This entry also includes QR2 from Triphysaria versicolor. QR2 acts as an NAD(P)H:quinone oxidoreductase, reducing quinones by a two-electron transfer mechanism [].
Septin and tuftelin interacting proteins (STIPs) are G-patch domain proteins involved in spliceosome disassembly []. The mouse protein, known as TFT11, was originally identified as a protein interacting with tuftelin, one of the presumed enamel matrix proteins []. The Drosophila protein STP1 was originally identified as a septin-interacting protein []. In both cases these interactions were identified by a yeast two-hybrid system, and their function and direct physical association were not characterised. Subsequent studies showed that these proteins are widely expressed and function as splicing factors [, ]. STIP is essential for embryogenesis in Caenorhabditis elegans [].
This family contains the E. coli swarming motility protein YbiA. Mutations in YbiA cause defects in Escherichia coli swarming, but not necessarily in motility. This family was predicted to be involved in NAD-utilizing pathways, likely acting on ADP-ribose derivatives, and has been named NADAR (NAD and ADP-ribose) [, ]. More recently, YbiA has been shown to be involved in the disposal of riboflavin intermediates. It catalyzes the hydrolysis of the N-glycosidic bond in the first two intermediates of riboflavin biosynthesis, which are highly reactive metabolites, yielding relatively innocuous products [].
APM2 (adipose specific 2 or adipose most abundant 2), also known as adipogenesis regulatory factor, was so named because it was originally identified as the second most abundant transcript in adipose tissue []. It up-regulates the levels of CCAAT/enhancer binding protein alpha (C/EBP-alpha) and PPAR-gamma, and promotes adipogenic differentiation starting from the early stage of adipogenesis. It is thought to be an adipocyte lineage-specific nuclear factor that can modulate the master adipogenesis transcription factors early during differentiation []. It is also involved in regulating glucose transport in adipocytes, as well as regulating the number of preadipocytes []. APM2 is widely dysregulated in various cancers [].
This entry represents the C-terminal domain of Tab2 from plants and cyanobacteria. Tab2 was first identified in Chlamydomonas reinhardtii ([swissprot]:Q7X8Y6) as an RNA-binding protein required for translation of the chloroplast PsaB photosystem I subunit []. Later, the Tab2 homologue from Arabidopsis (ATAB2) was found to be involved in the signalling pathway of light-controlled synthesis of photosystem proteins during early plant development, presumably functioning as an activator of translation with targets at PSI and PSII [, ]. Directed mutagenesis experiments carried out in Tab2 from C. reinhardtii indicated the importance of a highly conserved C-terminal tripeptide WLL for normal psaB translation [].
This entry represents the alpha-ketoglutarate dehydrogenase component 4 (Kgd4, also referred to as Ymr31/MRPS36), which was originally identified as a subunit of the mitochondrial ribosome. Through biochemical studies, it was shown that this protein co-purifies with the oxoglutarate dehydrogenase complex (OGDC), also called alpha-ketoglutarate dehydrogenase complex (KGDH). The mitochondrial ribosome 28S subunit and OGDC have a similar size, and OGDC is highly abundant, so it was found to contaminate ribosomal preparations obtained by sequential centrifugation steps []. Kgd4 plays an evolutionarily conserved role in the organization of mitochondrial α-KGDH complexes of fungi and animals, acting as a molecular adaptor that is necessary to form a stable α-KGDH enzyme complex [, ].
This entry is represented by Bacteriophage KVP40, Orf299. The characteristics of the protein distribution suggest prophage matches in addition to the phage matches. This is a family of uncharacterised, mainly bacterial, proteins. While the functions of these proteins are unknown, an analysis has suggested that they may form a novel family within the RNase H-like superfamily []. These proteins appear to contain all the core secondary structural elements of the RNase H-like fold and share several conserved, possible active site, residues. It was suggested, therefore, that they function as nucleases. From the taxonomic distribution of these proteins it was further inferred that they may play a role in DNA repair under stressful conditions.
Members of this family are bacterial microcin-V peptides MccV, also known as colicin V. MccV was the first antibiotic substance reported to be produced by E. coli. This antibacterial agent was initially named colicin V (ColV). However, on account of several characteristics (low molecular mass, non-inducible production, and dedicated export system), it became classified within the microcins. The structural gene cvaC encodes the 103-aa MccV precursor. The dedicated export system of MccV has been well characterized and involves two genes that form the second operon. The MccV protein has an N-terminal double glycine motif which precedes the cleavage site for the precursor protein [].
Serglycin was first identified as an intracellular proteoglycan expressed by hematopoietic cells. All inflammatory cells synthesize serglycin at high levels and store it in granules, where it interacts with numerous inflammatory mediators, such as proteases, chemokines, cytokines, and growth factors. Later, serglycin was found to be expressed by various non-hematopoietic cell types, such as tumour cells. Serglycin promotes the aggressive phenotype of tumours and confers resistance against drugs and complement system attack []. Human serglycin consists of a small core protein containing eight serine/glycine repeats. Each serine of this repeat region is a potential GAG attachment site [].
This entry includes EAG2 (KCNH5, KV10.2) from mammals. Human EAG2 is expressed in the brain, but is also found in a range of tissues including skeletal muscle []. In medulloblastoma, it controls mitotic entry and tumour growth via regulating cell volume dynamics []. The voltage-gated potassium channel ether-a-go-go (eag) was first identified in Drosophila melanogaster based on the leg-shaking mutant phenotype [, ]. Later, the EAG (Ether-a-go-go) voltage-dependent potassium channel family was classified into 3 subfamilies: EAG, EAG-related gene (ERG), and EAG-like K+ channel (ELK). The mammalian EAG subfamily comprises two members, termed EAG1 (KCNH1, KV10.1) and EAG2 (KCNH5, KV10.2) [].
This entry represents the spermatogenesis-associated protein 7 (SPATA7, also known as HSD3). It was first identified in human spermatocytes. Later on, it was also found expressed in multiple layers of the mature mouse retina []. Mutations in SPATA7 cause Leber congenital amaurosis 3 (LCA3), which is a severe dystrophy of the retina, typically becoming evident in the first years of life [, ]. Mutations in SPATA7 also cause autosomal recessive retinitis pigmentosa (ARRP), which is a retinal dystrophy belonging to the group of pigmentary retinopathies [].
This entry represents a short conserved sequence region found in a family of short, hydrophilic proteins that are all known or suspected phosphodiesterases. The best characterised of these is YfcE from Escherichia coli []. The physiological substrate for this protein is not known, though it is capable of hydrolysing phosphodiester bonds in the artificial chromogenic substrate bis-p-nitrophenyl phosphate. YfcE is a tetrameric, manganese-binding protein where each monomer forms a β-sandwich fold similar to other metallophosphatases. The member MJ0936 from Methanocaldococcus jannaschii (Methanococcus jannaschii) was shown [] to act on model phosphodiesterase substrates; a divalent cation was required.
ER membrane protein complex subunit 7 (EMC7) is a component of the endoplasmic reticulum membrane protein complex (EMC). This complex was first found to interact with the degradation of misfolded proteins by the ubiquitin- and proteasome-dependent process known as ER-associated degradation (ERAD) [], thus suggesting a role in the biosynthesis of transmembrane proteins. More recently, it was shown that EMC acts as a conserved co- and post-translational insertase at the endoplasmic reticulum in an energy-independent manner [, ]. EMC7 has been misnamed C11orf3, but is more correctly named C15orf24 because the gene is actually on human chromosome 15.
SKIP (SKI-interacting protein) is an essential spliceosomal component and transcriptional coregulator, which may provide regulatory coupling of transcription initiation and splicing []. SKIP was identified in a yeast 2-hybrid screen, where it was shown to interact with both the cellular and viral forms of SKI through the highly conserved region on SKIP known as the SNW domain []. SKIP is now known to interact with a number of other proteins as well. SKIP potentiates the activity of important transcription factors, such as vitamin D receptor, CBF1 (RBP-Jkappa), Smad2/3, and MyoD. It works with Ski in overcoming pRb-mediated cell cycle arrest, and it is targeted by the viral transactivators EBNA2 and E7 [].
This entry represents the mitochondrial/chloroplastic transcription termination factors (MTERFs). In humans, four MTERFs have been identified (MTERF1-4). MTERF1 was first identified as a factor responsible for terminating heavy strand transcription at a specific site at the leu-tRNA, thereby modulating the ratio of mitochondrial ribosomal RNA to mRNA []. Later, MTERF1 was found to stimulate transcriptional initiation [] and appeared to be involved in the control of mitochondrial replication pausing []. A structural study showed that it binds to dsDNA containing the termination sequence and unwinds the DNA molecule, promoting base eversion, which is critical for transcription termination [].
This entry represents a Sec39 domain, which can be found in yeast Sec39, human NBAS (neuroblastoma-amplified sequence) [] and Arabidopsis MIP2 (MAG2-INTERACTING PROTEIN 2) []. They may be involved in Golgi-to-ER transport. Sec39 was originally identified as a protein involved in ER-Golgi transport in a large scale promoter shut down analysis of essential yeast genes []. A subsequent study found that Sec39p (Dsl3p) is required for Golgi-ER retrograde transport and is part of a very stable protein complex that also includes Dsl1p (in mammals ZW10), Tip20p (Rint-1) and the ER localized Q-SNARE proteins Ufe1p (syntaxin-18), Sec20p and Use1p []. This was confirmed in a genome-wide analysis of protein complexes [].
This superfamily contains E. coli swarming motility protein YbiA. Mutations in YbiA cause defects in Escherichia coli swarming, but not necessarily in motility. This family was predicted to be involved in NAD-utilizing pathways, likely to act on ADP-ribose derivatives, and has been named NADAR (NAD and ADP-ribose) [, ]. More recently, YbiA has been shown to be involved in the disposal of riboflavin intermediates. It catalyzes the hydrolysis of the N-glycosidic bond in the first two intermediates of riboflavin biosynthesis, which are highly reactive metabolites, yielding relatively innocuous products [].
These proteins are metallopeptidases belonging to MEROPS peptidase family M15 (clan MD), subfamily M15B (vanY D-Ala-D-Ala carboxypeptidase). Acquired VanA- and VanB-type glycopeptide resistance in enterococci is due to synthesis of modified peptidoglycan precursors terminating in D-lactate. As opposed to VanA-type strains which are resistant to both vancomycin and teicoplanin, VanB-type strains remain teicoplanin susceptible []. The vanY gene was necessary for synthesis of the vancomycin-inducible D,D-carboxypeptidase activity previously proposed to be responsible for glycopeptide resistance. However, this activity was not required for peptidoglycan synthesis in the presence of glycopeptides [].
Sperm-associated antigen 4 protein (SPAG4, also known as SUN4) is a SUN domain containing protein that was originally isolated from the testis []. It binds outer dense-fibre protein Odf1 and localises to microtubules of manchette and axoneme []. SPAG4 was later found expressed ubiquitously in various normal tissues and neoplastic tissues in humans at the mRNA level []. In renal cell carcinoma, SPAG4 is an independent prognostic factor and plays a crucial role in cytokinesis to defend against hypoxia-induced tetraploid formation []. It also promotes renal clear cell carcinoma migration and invasion in vitro []. This entry represents SPAG4 from mammals.
This entry represents polyisoprenyl-teichoic acid--peptidoglycan teichoic acid transferase TagU. TagU was previously known as bacterial transcriptional regulator LytR, and was originally described as a 35 kDa protein responsible for attenuating expression of both itself and the lytABC operon. LytABC form an N-acetylmuramoyl-L-alanine amidase, which is one of the two major autolysins of B. subtilis []. This protein has since been proposed to catalyse the transfer of the anionic cell wall polymers (APs) from their lipid-linked precursor to the cell wall peptidoglycan (PG) within cell wall teichoic acid biosynthesis, hence the new name of polyisoprenyl-teichoic acid--peptidoglycan teichoic acid transferase TagU [].
This family includes proteins from bacteria and fungi. It was originally identified as a protein of unknown function from Schizosaccharomyces pombe, which was isolated from a screen to identify novel genes required for meiosis []. Members of this family have since been characterised from Streptococcus pneumoniae (A0A0H2URZ6) and Clostridium perfringens (Q8XNB2), and represent a new glycoside hydrolase family, also known as GH125. This family is the first characterized metal-independent alpha-mannosidase family []. Proteins in this family have been reported to be strict exo-alpha-1,6-mannosidases that operate via a metal-independent inverting catalytic mechanism [].
ADAMDEC1 peptidase (also known as decysin; MEROPS identifier M12.219), unusually for an ADAM peptidase, lacks the disintegrin and ADAM-CR domains []. It is the only ADAM metallopeptidase in which the third zinc ligand is an aspartic acid rather than a histidine []. It is synthesized as a precursor and activated by furin []. The active enzyme has limited specificity for Leu in P1' []. ADAMDEC1 was initially thought to be involved in the immune response, because it was found in mature spleen follicular dendritic cells and germinal centres [], but it is also found in uterine stromal cells [] and transcripts are highly expressed in pulmonary sarcoidosis [].
This family consists of a number of eukaryotic proteins including CXXC motif containing zinc binding protein (previously known as UPF0587 protein C1orf123). The crystal structure reveals that the protein binds a Zn2+ ion in a tetrahedral coordination with four Cys residues from two CxxC motifs. CXXC motif containing zinc binding protein was initially identified as an interaction partner for the heavy metal-associated (HMA) domain of CCS (copper chaperone for superoxide dismutase). However, it was shown that only misfolded mutant forms, lacking part of the zinc-binding sites, interact with CCS [].
Genetic interactor of prohibitin 5 (Gep5) was identified as a mitochondrial genome maintenance protein from a genome-wide screen []. It has been shown to interact with prohibitin ring complexes in the mitochondrial inner membrane []. Gep5 is detected in highly purified mitochondria in high-throughput studies []. The function of Gep5 is not clear.
A single high-scoring gene was identified in the complete genome of P. falciparum as well as a single gene from Plasmodium chabaudi. There are no obvious homologues to these genes in any non-Plasmodium organism. These observations suggest an expansion of this family in Plasmodium yoelii from a common Plasmodium ancestor gene (present in a single copy in Plasmodium falciparum).
This entry consists of ABC transporter permease proteins associated with urea transport and metabolism. They are encoded in a conserved five-gene transport operon typically found adjacent to urease genes. It was shown in Cyanobacteria that disruption leads to the loss of high-affinity urea transport activity [].
This family is represented by BlyA, a small holin found in Borrelia circular plasmids that prove to be temperate phage []. This protein was previously proposed to be a haemolysin. BlyA is small (67 residues) and contains two largely hydrophobic helices and a highly charged C terminus.
This entry represents 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate phosphatase, MtnX, which is involved in methionine salvage and belongs to the HAD-superfamily hydrolases, subfamily IB []. This enzyme is found in Bacillus subtilis and related species, paired with MtnW (). In most species that recycle methionine from methylthioadenosine, the single protein MtnC replaces the MtnW/MtnX pair. In B. subtilis, mtnX was first known as ykrX.
The transcriptional corepressor CtBP is a dehydrogenase with sequence and structural similarity to the d2-hydroxyacid dehydrogenase family. CtBP was initially identified as a protein that bound the PXDLS sequence at the adenovirus E1A C-terminal, causing the loss of CR-1-mediated transactivation. CtBP binds NAD(H) within a deep cleft, undergoes a conformational change upon NAD binding, and has NAD-dependent dehydrogenase activity [, ].
These sequences represent sepiapterin reductase, a member of the short chain dehydrogenase/reductase family. The enzyme catalyzes the last step in the biosynthesis of tetrahydrobiopterin. A similar enzyme in Bacillus cereus was isolated for its ability to convert benzil to (S)-benzoin, a property sepiapterin reductase also shares.
This family of archaeal proteins exhibits NAD salvage biosynthesis enzyme nicotinamide-nucleotide adenylyltransferase () activity. In some cases, the enzyme was tested and found also to have the activity of nicotinate-nucleotide adenylyltransferase (), an enzyme of NAD de novo biosynthesis, although with a higher Km. In some archaeal species, a number of proteins which are uncharacterised with respect to activity, are also present.
This entry represents glucosamine kinase mostly from Actinobacteria, including GlcN kinase from S. jiangxiensis (known as SjGlcNK). SjGlcNK contains a fold similar to mycobacterial maltokinases. However, SjGlcNK was unable to phosphorylate maltose or aminoglycosides in vitro. Instead, it catalyses the ATP-dependent phosphorylation of D-glucosamine (GlcN) to D-glucosamine 6-phosphate [].
This domain is found mainly in plant proteins known as Casparian strip membrane proteins (CASPs). CASPs are four-membrane-span proteins that mediate the deposition of Casparian strips in the endodermis by recruiting the lignin polymerization machinery. Interestingly, the CASP first extracellular loop was found conserved in euphyllophytes but absent in plants lacking Casparian strips [, ].
This entry includes the EXORDIUM protein and related proteins. The EXO (EXORDIUM) gene was identified as a potential mediator of brassinosteroid (BR)-promoted growth []. It mediates cell expansion in Arabidopsis leaves []. This entry also includes PHI-1, a phosphate-induced protein of unknown function from Nicotiana tabacum [].
This entry represents the N-terminal domain of proteins of the FAM65 family (FAM65A, B and C). PL48 (FAM65B, C6orf32) is associated with cytotrophoblast and lineage-specific HL-60 cell differentiation []. The N-terminal part of FAM65B was found to induce the formation of filopodia [].
This entry represents the N-terminal domain of EP400, a component of the NuA4 histone acetyltransferase complex that was first identified through its ability to bind the adenovirus E1A protein [, ]. The exact function of this domain is not known. This domain is largely low-complexity residues.
Proteins in this entry, typified by YhbH from Bacillus subtilis, are found in the genomes of nearly every endospore-forming bacterium, and in no other genomes. The gene in B. subtilis was shown to be a member of the sigma-E regulon, with mutation leading to a sporulation defect [].
In archaea the enzyme tetrahydromethanopterin S-methyltransferase is composed of eight subunits, MtrA-H. The enzyme is a membrane-associated enzyme complex which catalyzes an energy-conserving, sodium-ion-translocating step in methanogenesis from hydrogen and carbon dioxide []. Subunit MtrH catalyzes the methylation reaction and was shown to exhibit methyltetrahydromethanopterin:cob(I)alamin methyltransferase activity []. CH3-H4MPT + cob(I)alamin --> H4MPT + CH3-cob(III)alamin (H4MPT = tetrahydromethanopterin)
The Epstein-Barr virus (strain GD1) nuclear antigen 1 (EBNA1) binds to and activates DNA replication from the latent origin of replication. The crystal structure of the DNA-binding and dimerization domains was solved [], and it was found that EBNA1 appears to bind DNA via two independent regions, the core and the flanking DNA-binding domains. This DNA-binding domain has a ferredoxin-like fold.
This entry represents a presumed domain which has been predicted to contain three alpha helices. It was named the WIYLD domain based on the pattern of the most conserved residues []. This domain appears to be specific to plant SET-domain proteins.
This is a family of Coronavirus nonstructural protein NS2. Phosphoamino acid analysis confirmed the phosphorylated nature of NS2 and identified serine and threonine as its phosphorylated amino acid residues []. It was also demonstrated that the ns2 gene product is not essential for Murine hepatitis virus replication in transformed murine cells [].
This family consists of Vpr-like accessory proteins from maedi-visna and caprine/ovine lentivirus. This small open reading frame (ORF) in maedi-visna virus (MVV) and caprine arthritis encephalitis virus (CAEV) was initially named "tat" by analogy with a similarly placed ORF in the primate lentiviruses [, ].
The ribosomal RNA large subunit methyltransferase E () methylates the 23S rRNA. It specifically methylates the uridine in position 2552 of 23S rRNA in the 50S particle using S-adenosyl-L-methionine as a substrate. It was previously known as cell division protein ftsJ.
MenD was thought to act as SHCHC synthase, but has recently been shown to act instead as SEPHCHC synthase. Conversion of SEPHCHC into SHCHC and pyruvate may occur spontaneously but is catalyzed efficiently, at least in some organisms, by MenH. 2-oxoglutarate decarboxylase/SHCHC synthase (menD) is a thiamine pyrophosphate enzyme involved in menaquinone biosynthesis [].
This entry represents proteins predicted to function as rRNA (guanine-N1-)-methyltransferases (). These enzymes specifically methylate a guanosine residue in 23S rRNA to form m1G. The rrmA gene was predicted to encode 23S rRNA m1G745 methyltransferase in Escherichia coli, and maps to the same locus as gene yebH [].
Trk transporters play crucial roles in K(+) transport in yeasts and filamentous fungi [, ]. A related transporter from plants is known as HKT1. It was identified in barley as a high-affinity potassium transporter []. In Arabidopsis, HKT1 functions as a salt tolerance determinant that controls Na+ entry into plant roots [].
Uncharacterised protein C6orf15 (also known as STG) was initially isolated in rhesus monkey taste buds []. The human homologue is also expressed in skin and tonsils []. In mice, C6orf15 has been shown to be secreted into the extracellular matrix where it binds to a number of different extracellular matrix proteins [].
AddAB is a system well described in the Firmicutes as a replacement for RecBCD in many prokaryotes for the repair of double stranded break DNA damage []. More recently, a distantly related gene pair conserved in many alphaproteobacteria was shown also to function in double-stranded break repair in Rhizobium etli. This family consists of AddB proteins of alphaproteobacterial types.
Ligand-dependent nuclear receptor-interacting factor 1 (LRIF1) was initially identified as a protein that interacts with retinoic acid receptors and other nuclear receptors []. The protein has been shown to repress the ligand-induced transcriptional activity of retinoic acid receptor alpha. Its repression activity is mediated at least in part through direct recruitment of histone deacetylases [].
This domain was first characterised as the C-terminal domain of Pab87 serine protease from Pyrococcus abyssi []. The domain is reported to play a crucial role in Pab87 octamerisation and active site compartmentalisation. Its up-and-down 8-stranded β-barrel 3D structure is reminiscent of the one found in lipocalins.
PLPPR4, also known as plasticity-related gene 1 protein (PRG-1), was originally identified as a 2-lysophosphatidate/LPA phosphatase []. Its activity was later questioned []. PLPPR4 facilitates axonal outgrowth during development and regenerative sprouting. In outgrowing axons, it acts as an ecto-enzyme, attenuates phospholipid-induced axon collapse in neurons, and facilitates outgrowth in the hippocampus [].
RhtX from Sinorhizobium meliloti 1021 and FptX from Pseudomonas aeruginosa appear to be single polypeptide transporters, from the major facilitator family for import of siderophores as a means to import iron. This function was suggested by proximity to siderophore biosynthesis genes and then confirmed by study of knockout and heterologous expression phenotypes.
The characterisation of two proteins from Streptomyces coelicolor has been presented []. The protein in this family was shown to have poly(A) polymerase activity and may be responsible for polyadenylating RNA in this species. It has also been shown that a nearly identical plasmid-encoded protein from Streptomyces antibioticus is a bifunctional enzyme that acts also as a guanosine pentaphosphate synthetase [].
This family of proteins contains 8 conserved cysteines. It has in the past been annotated as one of the proteins of the flagellar Fli complex. However, this was due to a mis-annotation of the original Salmonella LT2 GenBank entry of 'fliB'. Given its conserved cysteines, it is possibly a domain that chelates iron or zinc ions [, ].
This protein was previously designated yjbO in Escherichia coli. It is found only in genomes that have the phage shock operon (psp), but it is only rarely encoded near other psp genes. The psp regulon is upregulated in response to a number of stress conditions, including ethanol, expression of the filamentous phage secretin protein IV and other secretins, and heat shock.
Members of this protein family represent family 6 of the uracil-DNA glycosylase superfamily, where the five previously described families all act as uracil-DNA glycosylase () per se. This family, instead, acts as a hypoxanthine-DNA glycosylase, where hypoxanthine results from deamination of adenine. Activity was shown directly for members from Methanosarcina barkeri and Methanosarcina acetivorans [].
This domain, about 375 amino acids long on average, occurs mainly in Staphylococcus. It occurs as a non-repetitive N-terminal domain of LPXTG-anchored surface proteins, including SasC, Mrp, and FmtB. This region in SasC was shown to be involved in cell aggregation and biofilm formation, which may explain the methicillin resistance seen for Mrp and FmtB [].
Members of this entry represent the B subunit of dipicolinic acid synthetase, an enzyme that synthesizes a small molecule that appears to confer heat stability to bacterial endospores such as those of Bacillus subtilis. The A and B subunits are together in what was originally designated the spoVF locus for stage V of endospore formation.
Proteins in this entry, exemplified by YtfJ of Bacillus subtilis, are encoded by bacterial genomes if, and only if, the species are capable of endospore formation. YtfJ was confirmed in spores of B. subtilis; it appears to be expressed in the forespore under control of SigF [].
GTP-binding protein Rad (Ras associated with diabetes) belongs to the RGK family, a group of Ras-related GTPases that includes Rad, Gem, and Rem. Rad was first identified in the muscle of type II diabetic patients []. Rad is expressed in skeletal and cardiac muscle and may play an important role in cardiac antiarrhythmia via the strong suppression of voltage-gated L-type Ca2+ currents [, ].
RseC is thought to be a positive regulator of sigmaE activity []. However, an rseC mutant was reported to show wild-type sigmaE activity under inducing conditions or non-inducing conditions []. RseC plays a role in reduction of the SoxR iron-sulphur cluster, along with proteins encoded by the rsxABCDGE operon []. Salmonella typhimurium RseC is involved in thiamine synthesis [].
Proteins in this family are components of the mitochondrial ribosome small subunit (28S) which comprises a 12S rRNA and about 30 distinct proteins []. This protein was previously identified as Imogen 38 (NP_065585) which is a 38kDa mitochondrial autoantigen associated with type 1 diabetes []. Its relationship to the etiology of this disease remains to be clarified.
The IraM protein was originally named enhancing lycopene biosynthesis protein 1 on the basis of its ability to restore lycopene production in mutants []. It has subsequently been shown that this protein inhibits rpoS proteolysis by regulating rssB activity, thereby increasing the stability of the sigma stress factor rpoS during magnesium starvation [].
This family represents the PPP2R1A-PPP2R2A-interacting phosphatase regulators (PABIRs), also known as FAM122 which includes PABIR1, PABIR2 and PABIR3 family members (FAM122A, B and C, respectively). The most studied member is PABIR1/FAM122A, which was identified as an inhibitor of serine/threonine-protein phosphatase 2A (PP2A) activity [, ]. It potentiates ubiquitin-mediated proteasomal degradation of serine/threonine-protein phosphatase 2A catalytic subunit alpha (PPP2CA) [].
Strawberry notch proteins carry DExD/H-box groups and helicase C-terminal domains. These proteins promote the expression of diverse targets, potentially through interactions with transcriptional activator or repressor complexes []. Strawberry notch was first identified in Drosophila, where it functions downstream of Notch and regulates gene expression during development [, ]. Protein FORGETTER 1 (included in this entry) is the A. thaliana orthologue of Strawberry notch [].