This entry represents the PR/SET domain found in PR domain zinc finger protein 8 (PRDM8). PRDM8 is a transcription factor that plays an essential role in the development of the mammalian retina []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers, which confer DNA-binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin-modifying enzymes [].
This entry represents the second PHD domain of KMT2A. Histone-lysine N-methyltransferase 2A (KMT2A; ) is the catalytic subunit of the MLL1/MLL complex, mediating methylation of 'Lys-4' of histone H3 (H3K4me), a specific tag for epigenetic transcriptional activation []. KMT2A is processed by the threonine endopeptidase taspase, releasing products N320 and C180 []. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a bromodomain, an extended PHD (ePHD) finger (Cys2HisCys5HisCys2His), two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.
This entry represents the first PHD domain of KMT2A. Histone-lysine N-methyltransferase 2A (KMT2A; ) is the catalytic subunit of the MLL1/MLL complex, mediating methylation of 'Lys-4' of histone H3 (H3K4me), a specific tag for epigenetic transcriptional activation []. KMT2A is processed by the threonine endopeptidase taspase, releasing products N320 and C180 []. KMT2A contains a CxxC (x for any residue) zinc finger domain, three plant homeodomain (PHD) fingers, a bromodomain, an extended PHD (ePHD) finger (Cys2HisCys5HisCys2His), two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain.
The HRD4 gene is identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation following ubiquitination of target proteins but preceding their recognition by the 26S proteasome []. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing [].
The HRD4 gene is identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterised step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome []. This region of the protein possibly contains two zinc-binding motifs. Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing.
Sufu, encoding the human ortholog of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signalling []. It is a repressor of the Gli and Ci transcription factors of the Hedgehog signalling cascade [], and functions by binding these proteins and preventing their translocation to the nucleus. Sufu has been found to be a tumour-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signalling pathway []. Homologues of Sufu have been found in bacteria, though their function is not currently known. This entry represents a set of bacterial Sufu homologues which are predicted to function as transcriptional regulators.
Pre-rRNA-processing protein PNO1 is also known as Partner of NOB1 and ribosomal RNA-processing protein 20. NOB1 is a nuclear protein that forms a complex with the 19S regulatory particle of the 26S proteasome and PNO1, acting as a chaperone to join the 20S proteasome with the 19S regulatory particle. The NOB1 complex is then degraded by the mature 26S proteasome []. PNO1 is also a component of the pre-ribosomal particle, and strains lacking PNO1 are defective in ribosomal RNA processing []. PNO1 remains a component of the SSU RRP complex, in which pre-40S subunits are left associated with a limited set of proteins []. PNO1 contains a K Homology domain.
The box H/ACA ribonucleoproteins (RNPs) are protein-RNA complexes responsible for pseudouridylation, the most abundant post-transcriptional modification of cellular RNAs []. Each distinct H/ACA RNA assembles with a common set of four proteins, Cbf5 (NAP57 in rodents and dyskerin in humans), Nop10, Nhp2 (L7Ae in archaea) and Gar1 []. Shq1 is an essential assembly factor for H/ACA ribonucleoproteins (RNPs) required for ribosome biogenesis, pre-mRNA splicing, and telomere maintenance []. It interacts with Cbf5 and may function as an assembly chaperone that protects the Cbf5 protein complexes from non-specific RNA binding and aggregation before assembly of H/ACA RNA [].
This entry represents the ribonuclease Z (RNase Z) homologue, RNase BN, from bacteria. RNase BN was considered to be an exonuclease based on its ability to remove the 3'-terminal residue of tRNAs ending in CA, CU, CCU or even CCA []. In E. coli, a different set of enzymes is used for tRNA 3'-end processing, and RNase Z is thought to be less important than the other ribonucleases. However, E. coli cells lacking the four main enzymes of the exonucleolytic pathway (RNases T, PH, D and II) are viable, whereas further mutation of RNase BN/Z in this genetic context causes an unviable phenotype [].
Polycomb repressive complex 2 (PRC2) carries out the methylation of lysine 27 of histone H3, a hallmark of repressive chromatin. Three core subunits make up the catalytic core of PRC2: the SET domain-containing EZH2, the zinc finger-containing SUZ12 and the WD40 repeat protein EED. The complex forms a compact arrangement of three lobes. The middle lobe largely comprises two domains that mark the beginning of the carboxy (C)-terminal region of EZH2 (MCSS and SANT2) and the helical, C-terminal component of the Suz12 Vefs domain. This entry describes the MCSS (also known as SANT2L) domain. There is one zinc-binding site (Zn1Cys3His1), which is formed solely by MCSS [, ].
This entry represents a group of transcription factors, including ARID3A/B/C from humans and protein dead ringer (Retn) from Drosophila melanogaster. ARID3A (also known as E2FBP1 or Bright) is involved in the control of cell cycle progression by the RB1/E2F1 pathway and in B-cell differentiation [, ]. ARID3B has been linked to malignant neuroblastoma []. Retn is a regulator of the late development of longitudinal glia []. Mutations in the Drosophila retn gene lead to female behavioral defects and alter a limited set of neurons in the CNS. Retn also affects a male behavioral pathway activated by fruM [].
This set of sequences describes the alpha subunit of a family of known and putative heterotetrameric sarcosine oxidases. Five operons of such oxidases are found in Rhizobium loti (Mesorhizobium loti) and three in Agrobacterium tumefaciens, a high enough copy number to suggest that not all members share the same function. Sarcosine oxidase catalyzes the oxidative demethylation of sarcosine to glycine. The reaction converts tetrahydrofolate to 5,10-methylene-tetrahydrofolate []. Bacterial sarcosine oxidases have been isolated from over a dozen different organisms and fall into two major classes: (1) a monomeric form that contains only covalent flavin and (2) a heterotetrameric (alpha, beta, gamma, delta) form that contains a covalent and a noncovalent flavin. This entry represents the alpha subunit of the heterotetrameric form.
This family consists of several RIB43A-like eukaryotic proteins. Ciliary and flagellar microtubules contain a specialised set of protofilaments, termed ribbons, that are composed of tubulin and several associated proteins. RIB43A was first characterised in the unicellular biflagellate Chlamydomonas reinhardtii, although highly related sequences are present in several higher eukaryotes, including humans. The function of this protein is unknown, although the structure of RIB43A and its association with the specialised protofilament ribbons and with basal bodies are relevant to the proposed role of ribbons in forming and stabilising doublet and triplet microtubules and in organising their three-dimensional structure. Human RIB43A homologues could represent a structural requirement in centriole replication in dividing cells [].
This entry represents the ePHD finger of A. thaliana histone-lysine N-methyltransferase Arabidopsis trithorax-like proteins ATX1, -2, and similar proteins. The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. ATX1 and -2 are sister paralogs originating from a segmental chromosomal duplication; they are plant counterparts of the Drosophila melanogaster trithorax (TRX) and mammalian mixed-lineage leukemia (MLL1) proteins []. ATX1 is a methyltransferase that trimethylates histone H3 at lysine 4 (H3K4me3). It also acts as a histone modifier and as a positive effector of gene expression []. ATX1 regulates transcription from diverse classes of genes implicated in biotic and abiotic stress responses. It is involved in dehydration stress signaling in both abscisic acid (ABA)-dependent and ABA-independent pathways []. ATX2 is involved in dimethylating histone H3 at lysine 4 (H3K4me2) []. ATX1 and ATX2 are multi-domain proteins that consist of an N-terminal PWWP domain, FYRN- and FYRC (DAST, domain associated with SET in trithorax) domains, a canonical PHD finger, a non-canonical ePHD finger, and a C-terminal SET domain [].
This domain superfamily has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is found mainly in plants and animals, and also in bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING finger domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo [, ]. In plants, the SRA-YDG domain is associated with the SET domain, found in a family of histone methyltransferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif [, ].
This domain has been termed SRA-YDG, for SET and Ring finger Associated, and because of the conserved YDG motif within the domain. Further characteristics of the domain are the conservation of up to 13 evenly spaced glycine residues and a VRV(I/V)RG motif. The domain is found mainly in plants and animals, and also in bacteria. In animals, this domain is associated with the Np95-like ring finger protein and the related gene product Np97, which contains PHD and RING finger domains and which is an important determinant in cell cycle progression. Np95 is a chromatin-associated ubiquitin ligase; its binding to histones is direct and shows a remarkable preference for histone H3 and its N-terminal tail. The SRA-YDG domain contained in Np95 is indispensable both for the interaction with histones and for chromatin binding in vivo [, ]. In plants, the SRA-YDG domain is associated with the SET domain, found in a family of histone methyltransferases, and in bacteria it is found in association with HNH, a non-specific nuclease motif [, ].
Members of this family trimethylate 'Lys-9' of histone H3 using monomethylated H3 'Lys-9' as substrate. They also weakly methylate histone H1 (in vitro). H3 'Lys-9' trimethylation represents a specific tag for epigenetic transcriptional repression by recruiting HP1 (CBX1, CBX3 and/or CBX5) proteins to methylated histones. This enzyme mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions. H3 'Lys-9' trimethylation is also required to direct DNA methylation at pericentric repeats [, , ]. SUV39H1 (the human ortholog) is targeted to histone H3 via its interaction with RB1 and is involved in many processes, such as repression of MYOD1-stimulated differentiation [], regulation of the control switch for exiting the cell cycle and entering differentiation, repression by the PML-RARA fusion protein [], BMP-induced repression, repression of switch recombination to IgA [] and regulation of telomere length [, ]. SUV39H1 is a component of the eNoSC (energy-dependent nucleolar silencing) complex, a complex that mediates silencing of rDNA in response to intracellular energy status and acts by recruiting histone-modifying enzymes. The eNoSC complex is able to sense the energy status of the cell: upon glucose starvation, elevation of the NAD+/NADP+ ratio activates SIRT1, leading to histone H3 deacetylation followed by dimethylation of H3 at 'Lys-9' (H3K9me2) by SUV39H1 and the formation of silent chromatin in the rDNA locus []. The activity of this enzyme has been mapped to the SET domain and the adjacent cysteine-rich regions []. The SET domain was originally identified in Su(var)3-9, E(z) and Trithorax genes in Drosophila melanogaster (Fruit fly) []. The sequence conservation pattern and structure analysis of the SET domain provide clues regarding the possible active site residues of the domain. There are three conserved sequence motifs in most of the SET domains.
The N-terminal motif (I) has characteristic glycines. The central motif (II) has a distinct pattern of polar and charged residues (Asn, His). The C-terminal conserved motif (III) has a characteristic dyad of polar residues. It has been shown that deregulated SUV39H1 interferes at multiple levels with mammalian higher-order chromatin organisation [] and these properties depend primarily on the SET domain [, ]. Methyltransferases (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group, most frequently from S-adenosyl L-methionine (SAM or AdoMet), to a nucleophilic acceptor such as oxygen, leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule [, , ]. All these enzymes have in common a conserved region of about 130 amino acid residues that allows them to bind SAM []. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids [, , ]. Methyltransferases are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [, , ]. More than 230 families of methyltransferases have been described so far, of which more than 220 use SAM as the methyl donor.
This family is composed of a group of plant bZIP transcription factors with similarity to OsbZIP46, which regulates abscisic acid (ABA) signalling-mediated drought tolerance in rice [, ]. Plant bZIPs are involved in developmental and physiological processes in response to stimuli/stresses such as light, hormones, and temperature changes. bZIP factors act in networks of homo- and heterodimers in the regulation of a diverse set of cellular processes []. This entry also includes ABI5 from Arabidopsis. ABI5 is a transcription factor that participates in ABA-regulated gene expression during seed development and the subsequent vegetative stage by acting as the major mediator of ABA repression of growth [, ]. It is also involved in the sugar signalling response in plants [].
The nitrate reductase enzyme complex allows bacteria to use nitrate as an electron acceptor during anaerobic growth. The enzyme complex consists of a tetramer that has one alpha, one beta and two gamma subunits. The alpha and beta subunits have catalytic activity, and the gamma subunits attach the enzyme to the membrane and are b-type cytochromes that receive electrons from the quinone pool and transfer them to the beta subunit. The sequences in this family are the beta subunit for nitrate reductase I (narH) and nitrate reductase II (narY) from Gram-positive and Gram-negative bacteria. A few thermophiles and archaea also match the model. A number of the sequences in this set are experimentally characterised; these include E. coli NarH () and NarY () [, ], from Bacillus subtilis, and related proteins from Pseudomonas fluorescens, Paracoccus denitrificans, and Paracoccus halodenitrificans (Halomonas halodenitrificans).
The eIF5-mimic proteins 1 and 2 (also known as basic leucine zipper and W2 domain-containing proteins 2 and 1 (BZW2 and BZW1), respectively) are paralogous human proteins containing C-terminal HEAT domains that resemble the HEAT domain of eIF5 []. BZW1 plays an important role in the cell cycle and transcriptionally controls the histone H4 gene during G1/S phase []. The Drosophila ortholog, kra (krasavietz) or exba (extra bases), may be involved in translational inhibition in neural development. The structure of this C-terminal W2 domain resembles that of a set of concatenated HEAT repeats [, , ]. The W2 domain has a globular fold and is composed exclusively of α-helices [, , ]. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.
A deep split separates two related families of proteins, one of which includes experimentally characterised examples of nicotinate phosphoribosyltransferase (), the first enzyme of NAD salvage biosynthesis. This entry represents the other family. Members have a different (longer) spacing of several key motifs and have an additional C-terminal domain of up to 100 residues. One argument suggesting that this family represents the same enzyme is that no species has a member of both families. Another is that the gene encoding this protein is located near other NAD salvage biosynthesis genes in Nostoc and in at least four different Gram-positive bacteria. NAD and NADP are ubiquitous in life. Most members of this family are from Gram-positive bacteria. An additional set of mutually closely related archaeal sequences score between the trusted and noise cut-offs. This entry includes pncB1 and pncB2 from Mycobacterium tuberculosis, which also play a role in NAD salvage synthesis [].
This entry represents a family of NAD-protein ADP-ribosyltransferases mostly from Myoviridae, including ModB from the bacteriophage T4 [, ]. Bacteriophage T4 encodes three ADP-ribosyltransferases: Alt, ModA, and ModB. The ADP-ribosylating activity of each is directed to a specific set of host proteins. ModB ADP-ribosylates a number of host proteins including ribosomal protein S1 []. Protein ADP-ribosylation is an important posttranslational modification catalyzed by a group of enzymes known as ADP-ribosyltransferases (ADP-RTs) []. ADP-RTs transfer single or multiple ADP-ribose moieties from NAD to a specific amino acid residue within a target protein, forming mono ADP-ribosylation or poly ADP-ribosylation (PARylation) []. ADP-ribosylation changes the electrostatic potential of a target protein by introducing two phosphate groups and may affect protein-DNA as well as protein-protein interactions []. Protein ADP-ribosylation plays versatile roles in multiple biological processes.
This entry represents a family of NAD-protein ADP-ribosyltransferases found in Myoviridae, including ModA from the bacteriophage T4 []. Bacteriophage T4 codes for three ADP-ribosyltransferases: Alt, ModA, and ModB. The ADP-ribosylating activity of each is directed to a specific set of host proteins. ModA is known to modify subunits of RNA polymerase []. Protein ADP-ribosylation is an important posttranslational modification catalyzed by a group of enzymes known as ADP-ribosyltransferases (ADP-RTs) []. ADP-RTs transfer single or multiple ADP-ribose moieties from NAD to a specific amino acid residue within a target protein, forming mono ADP-ribosylation or poly ADP-ribosylation (PARylation) []. ADP-ribosylation changes the electrostatic potential of a target protein by introducing two phosphate groups and may affect protein-DNA as well as protein-protein interactions []. Protein ADP-ribosylation plays versatile roles in multiple biological processes.
This entry represents the Gyroviral VP2 protein and TT viral ORF2. Torque teno virus (TTV) is a nonenveloped, single-stranded DNA virus that was initially isolated from a Japanese patient with hepatitis of unknown aetiology, and which has since been found to infect both healthy and diseased individuals []. Numerous prevalence studies have raised questions about its role in unexplained hepatitis. ORF2 is a 150-residue protein of unknown function. Gyroviruses are small, circular, single-stranded viruses, such as the Chicken anaemia virus. The VP2 protein contains a set of conserved cysteine and histidine residues, suggesting a zinc-binding domain. VP2 may act as a scaffold protein in virion assembly and may also play a role in intracellular signaling during viral replication.
Histone-lysine N-methyltransferase EHMT2 () mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin []. H3K9me is a specific tag for epigenetic transcriptional repression, recruiting HP1 proteins to methylated histones. EHMT2 also monomethylates 'Lys-56' of histone H3 (H3K56me1) during the G1 phase of the cell cycle [], and methylates 'Lys-27' of histone H3 (H3K27me) []. EHMT2 also dimethylates other proteins [], including 'Lys-373' of p53 []. EHMT2 forms a heterodimer with EHMT1 [], and is a component of the E2F6.com-1 complex in G0 phase []. EHMT2 contains seven ankyrin repeats which bind the monomethylated RELA subunit of NF-kappa-B, a SET domain which interacts with WIZ [], and a pre-SET domain that binds three zinc ions via cysteine residues [].
This is the second TUDOR domain found in SETDB1 enzymes () from mammals, also known as Eggless in Drosophila []. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4 []. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts []. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions [].
This is the first TUDOR domain found in SETDB1 enzymes () from animals, also known as Eggless in Drosophila []. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4 []. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts []. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions [].
Proteins in this entry are mostly annotated as the ATP phosphoribosyltransferase regulatory subunit (HisZ). However, this entry also includes some histidine-tRNA ligases (HisS or HisRS), which are paralogues of HisZ. Despite their significant sequence and structural similarity, HisRS and HisZ have different functions []. HisRS is a class IIa aminoacyl-tRNA synthetase (ligase), while HisZ is a regulatory subunit of the hetero-octameric ATP phosphoribosyltransferase that regulates reactions initiating histidine biosynthesis [, ]. In phylogenetic analysis, HisZ proteins form a monophyletic group that attaches outside the predominant bacterial HisRS clade []. HisZ proteins are represented in a highly divergent set of bacteria (including an aquificale, cyanobacteria, firmicutes, and proteobacteria), but are missing from other bacteria, including mycobacteria and certain proteobacteria []. It has been suggested that the absence of HisZ from these bacteria is due to its loss during evolution [].
This entry represents histidine-tRNA ligase (HisS or HisRS) and its paralogue, ATP phosphoribosyltransferase regulatory subunit (HisZ). Despite their significant sequence and structural similarity, HisRS and HisZ have different functions []. HisRS is a class IIa aminoacyl-tRNA synthetase (ligase), while HisZ is a regulatory subunit of the hetero-octameric ATP phosphoribosyltransferase that regulates reactions initiating histidine biosynthesis [, ]. In phylogenetic analysis, HisZ proteins form a monophyletic group that attaches outside the predominant bacterial HisRS clade []. HisZ proteins are represented in a highly divergent set of bacteria (including an aquificale, cyanobacteria, firmicutes, and proteobacteria), but are missing from other bacteria, including mycobacteria and certain proteobacteria []. It has been suggested that the absence of HisZ from these bacteria is due to its loss during evolution [].
This entry represents a set of known and suspected archaeal N-glycosylase/DNA lyases. These DNA repair enzymes are part of the base excision repair (BER) pathway; they protect from oxidative damage by removing the major product of DNA oxidation, 8-oxoguanine (GO), from single- and double-stranded DNA substrates []. Cleavage of the N-glycosidic bond between the aberrant base and the sugar-phosphate backbone generates an apurinic (AP) site. Subsequently, the phosphodiester bond 3' from the AP site is cleaved by an elimination reaction, leaving a 3'-terminal unsaturated sugar and a product with a terminal 5'-phosphate. The protein contains two α-helical subdomains, with the 8-oxoguanine binding site located in a cleft at their interface. A helix-hairpin-helix (HhH) structural motif and a Gly/Pro-rich sequence followed by a conserved Asp (HhH-GPD motif) are present [].
This family is defined to identify a pair of paralogous 3'->5' exoribonucleases in Escherichia coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterised originally as required for the expression of virulence genes, but is now recognised as the exoribonuclease RNase R (Rnr). Its paralog in Escherichia coli and Haemophilus influenzae is designated exoribonuclease II (Rnb) []. Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cut-off to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3' exoribonucleases.
Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions []. These proteins are part of a set in Streptococcus pneumoniae that undergo late induction by competence pheromone []. There are currently no data that address their function, nor is it known whether induction occurs in other species. However, the proteins are predicted to be integral membrane proteins (with several transmembrane segments).
The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. This entry represents the ePHD finger of Histone-lysine N-methyltransferase 2A (KMT2A), which is a histone methyltransferase that belongs to the MLL subfamily of H3K4-specific histone lysine methyltransferases (KMT2). It regulates chromatin-mediated transcription through the catalysis of methylation of histone H3 lysine 4 (H3K4), and is frequently rearranged in acute leukemia []. KMT2A functions as the catalytic subunit in the MLL1 complex, which also contains WDR5, RbBP5, ASH2L and DPY30 as integral core subunits required for the efficient methylation activity of the complex [, ]. The MLL1 complex is highly active and specific for H3K4 methylation. KMT2A contains a CxxC (x for any residue) zinc finger domain, three PHD fingers, a bromodomain, this extended PHD finger, two FY (phenylalanine tyrosine)-rich domains, and a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain [].
This family includes Staphylococcus aureus sortase, a transpeptidase that attaches surface proteins by the Thr of an LPXTG motif to the cell wall. It also includes a protein required for correct assembly of an LPXTG-containing fimbrial protein, and a set of homologous proteins from Streptococcus pneumoniae, in which LPXTG proteins are common. However, related proteins are found in Bacillus subtilis and Methanobacterium thermoautotrophicum, in which LPXTG-mediated cell wall attachment is not known []. Sortase refers to a group of prokaryotic enzymes which catalyze the assembly of pilins into pili, and the anchoring of pili to the cell wall []. They act as both proteases and transpeptidases []. Sortase, a transpeptidase present in almost all Gram-positive bacteria, anchors a range of important surface proteins to the cell wall [, ]. The sortases are thought to be good targets for new antibiotics as they are important proteins for pathogenic bacteria [].
E or "early" set domains are associated with the catalytic domain of chitobiase and beta-hexosaminidases () at the C terminus. Chitobiase digests the beta-1,4 glycosidic bonds of the N-acetylglucosamine (NAG) oligomers found in chitin, an important structural element of fungal cell walls and arthropod exoskeletons. It is thought to proceed through an acid-base reaction mechanism, in which one protein carboxylate acts as the catalytic acid, while the nucleophile is the polar acetamido group of the sugar in a substrate-assisted reaction with retention of the anomeric configuration []. The C terminus of chitobiase is composed of a beta sandwich structure [] and may be related to the immunoglobulin and/or fibronectin type III superfamilies. E set domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions.
This entry represents a set of transcription factors found in some endospore-forming bacteria within the Firmicutes (low-GC Gram-positive bacteria). In some species these proteins are encoded multiple times. The best characterised protein in this entry is the prespore-specific transcription factor RsfA from Bacillus subtilis, previously known as YwfN. Expression of RsfA is controlled by sigma factor F, and it seems to improve the efficiency of sporulation by fine-tuning the expression of genes in the sigma F regulon, particularly the timing of their expression []. It also negatively regulates spoIIR and its own synthesis. A paralog in B. subtilis is designated YlbO, though its function is not known. A highly variable linker region separates two very strongly conserved sequence regions within the protein.
The FtsK domain is a hydrophilic domain of about 200 residues, which is found in: the bacterial cell division protein FtsK (known as sporulation protein SpoIIIE in Bacillus subtilis); and a set of conjugative plasmid- and conjugative transposon-encoded proteins, generally called Tra proteins. These proteins come from an extremely wide range of species, including Gram-positive and Gram-negative bacilli and cocci, Streptomyces species, Agrobacterium spp., and archaebacteria. In cases in which a function is known, the protein is required for intercellular DNA transfer. The FtsK domain contains a highly conserved putative ATP-binding P-loop motif and is assumed to be cytoplasmic. It can be found in one to three copies and is thought to be involved in DNA translocation by coupling ATP hydrolysis to movement relative to the long axis of DNA [, , ].
This family includes DUF34/metal-binding proteins from bacteria, NIF3 from budding yeasts and NIF3-like proteins from animals. The DUF34/metal-binding protein/NIF3 proteins are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2 []; however, a comprehensive literature review and integrative bioinformatic analyses recently revealed that these annotations are misleading, as they were based on a single set of in vitro results examining the NIF3 homolog of Helicobacter pylori []. Instead, members show varied phenotypes, with a unifying functional role as metal-binding proteins []. NIF3 interacts with the yeast transcriptional coactivator Ngg1p, which is part of the ADA complex; the exact function of this interaction is unknown [, ]. The structure of the Methanocaldococcus jannaschii MJ0927 NIF3 protein has been determined [, ]. It binds to both single-stranded and double-stranded DNA [].
This family represents DUF34/metal-binding proteins (previously known as GTP cyclohydrolase 1 type 2) from bacteria. The DUF34/metal-binding protein/NIF3 proteins are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2 []; however, a comprehensive literature review and integrative bioinformatic analyses recently revealed that these annotations are misleading, as they were based on a single set of in vitro results examining the NIF3 homolog of Helicobacter pylori []. Instead, members show varied phenotypes, with a unifying functional role as metal-binding proteins []. NIF3 interacts with the yeast transcriptional coactivator Ngg1p, which is part of the ADA complex; the exact function of this interaction is unknown [, ].
This entry represents DUF34/metal-binding proteins (also referred to as NIF3-like protein 1) from animals. They share protein sequence similarity with budding yeast NIF3, which interacts with the yeast transcriptional coactivator Ngg1p that is part of the ADA complex [, ]. The DUF34/metal-binding protein/NIF3 proteins are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2 []; however, a comprehensive literature review and integrative bioinformatic analyses recently revealed that these annotations are misleading, as they were based on a single set of in vitro results examining the NIF3 homologue of Helicobacter pylori []. Instead, members show varied phenotypes, with a unifying functional role as metal-binding proteins [].
Members of this group are predicted to be metal-dependent hydrolases based on sequence analysis. They are related to Mg-dependent DNases and contain a TatD DNase domain. However, the similarity is not strong enough to confidently predict that these proteins are necessarily DNases and not some other type of metal-dependent hydrolase. Another related group is the TatD deoxyribonuclease family. Members of this group may be distantly related to a large 3D fold-based domain superfamily of metalloenzymes []. The description of this fold superfamily was based on an analysis of conservation patterns in three dimensions, and the discovery that the same active-site architecture occurs in a large set of enzymes involved primarily in nucleotide metabolism. The group is thought to include urease, dihydroorotase, allantoinase, hydantoinase, AMP-, adenine- and cytosine-deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolase, formylmethanofuran dehydrogenase, and other enzymes [].
The band-7 protein family comprises a diverse set of membrane-bound proteins characterised by the presence of a conserved domain, the band-7 domain, also known as SPFH or PHB domain []. The exact function of the band-7 domain is not known, but examples from animal and bacterial stomatin-type proteins demonstrate binding to lipids and the ability to assemble into membrane-bound oligomers that form putative scaffolds []. A variety of proteins belong to this family. These include the prohibitins, which are cytoplasmic anti-proliferative proteins, and stomatin, an erythrocyte membrane protein. Bacterial HflC protein also belongs to this family. Note: Band 4.1 and Band 7 proteins refer to human erythrocyte membrane proteins separated by SDS polyacrylamide gels and stained with Coomassie blue [].
This superfamily includes DUF34/metal-binding proteins (also known as GTP cyclohydrolase 1 type 2 proteins) from bacteria, NIF3 from budding yeasts and NIF3-like proteins from animals. The DUF34/metal-binding protein/NIF3 proteins are widely distributed across superkingdoms. They were previously annotated as GTP cyclohydrolase 1 type 2 []; however, a comprehensive literature review and integrative bioinformatic analyses recently revealed that these annotations are misleading, as they were based on a single set of in vitro results examining the NIF3 homolog of Helicobacter pylori []. Instead, members show varied phenotypes, with a unifying functional role as metal-binding proteins []. NIF3 interacts with the yeast transcriptional coactivator Ngg1p, which is part of the ADA complex; the exact function of this interaction is unknown [, ].
Proteins containing this domain include vertebrate dual specificity protein phosphatase Laforin and plant starch excess4 (SEX4). Laforin (encoded by the EPM2A gene) is a dual-specificity phosphatase that dephosphorylates complex carbohydrates. Mutations in the EPM2A gene cause Lafora disease (LD), a fatal autosomal recessive neurodegenerative disorder characterised by progressive neurological deterioration, myoclonus, and epilepsy []. Pathologically, LD is characterised by distinctive polyglucosans, which are formations of abnormal glycogen []. Laforin prevents LD by at least two mechanisms: by preventing hyperphosphorylation of glycogen by dephosphorylating it, allowing proper glycogen formation, and by promoting the ubiquitination of proteins involved in glycogen metabolism via its interaction with malin. Laforin contains an N-terminal CBM20 (carbohydrate-binding module, family 20) domain and a C-terminal catalytic dual specificity phosphatase (DSP) domain []. Plant SEX4 (also known as DSP4) regulates starch metabolism by selectively dephosphorylating glucose moieties within starch glucan chains. It contains an N-terminal catalytic DSP domain and a C-terminal Early (E) set domain [].
Teashirt 3 (TSHZ3) is a transcriptional regulator involved in developmental processes []. It functions in association with APBB1, SET and HDAC factors as a transcriptional repressor that inhibits the expression of CASP4. TSHZ3-mediated transcriptional repression involves the recruitment of histone deacetylases HDAC1 and HDAC2. It associates with chromatin in a region surrounding the CASP4 transcriptional start site(s). It regulates the development of neurons involved in both respiratory rhythm and airflow control []. It promotes maintenance of nucleus ambiguus (nA) motoneurons, which govern upper airway function, and establishes a respiratory rhythm generator (RRG) activity compatible with survival at birth. It is involved in the differentiation of the proximal ureteric smooth muscle cells during development []. It is also involved in the up-regulation of myocardin, which directs the differentiation of smooth muscle cells in the proximal ureter [].
This entry represents the RNA recognition motif (RRM) of Setd1A (also known as Set1A), a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1A is localized to euchromatic nuclear speckles and associates with a complex containing six human homologs of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2) []. Set1A contains an N-terminal RNA recognition motif (RRM), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain. In contrast to Set1B, Set1A additionally contains an HCF-1 binding motif that interacts with HCF-1 in vivo.
This entry represents the RNA recognition motif (RRM) of Setd1B (also known as Set1B), a ubiquitously expressed vertebrate histone methyltransferase that exhibits high homology to yeast Set1. Set1B is localised to euchromatic nuclear speckles and associates with a complex containing six human homologues of the yeast Set1/COMPASS complex, including CXXC finger protein 1 (CFP1; homologous to yeast Spp1), Rbbp5 (homologous to yeast Swd1), Ash2 (homologous to yeast Bre2), Wdr5 (homologous to yeast Swd3), and Wdr82 (homologous to yeast Swd2). The Set1B complex is a histone methyltransferase that produces trimethylated histone H3 at Lys4 [, ]. Set1B contains an N-terminal RNA recognition motif (RRM), an N-SET domain, and a C-terminal catalytic SET domain followed by a post-SET domain.
This entry represents a subfamily of the FGGY family of carbohydrate kinases, including FGGY carbohydrate kinase domain-containing proteins from mammals and D-ribulokinase YDR109C from S. cerevisiae, which catalyse ATP-dependent phosphorylation of D-ribulose at C-5 to form D-ribulose 5-phosphate []. These proteins may function in a metabolite repair mechanism by preventing toxic accumulation of free D-ribulose formed by non-specific phosphatase activities, and may also play a role in regulating D-ribulose 5-phosphate recycling in the pentose phosphate pathway. This subfamily is closely related to a set of ribulose kinases, and many members are designated ribitol kinase. However, the member from Klebsiella pneumoniae, rbtK, from a ribitol catabolism operon, accepts D-ribulose and to a lesser extent D-arabinitol and ribitol [].
This entry represents a set of known and predicted TetR-like transcriptional regulators associated with operons encoding PEP-dependent dihydroxyacetone (Dha) kinases. The Lactococcus lactis DhaS protein has been shown to interact with the Dha-binding protein DhaQ to form a stable complex []. In the presence of Dha this complex activates transcription of the Dha kinase operon, leading to the utilisation of this substrate. Thus, unlike most TetR-like regulators, DhaS acts as an activator of gene expression rather than a repressor. The DhaS protein is composed of two monomers which fold into a wedge-shaped homodimer. Each monomer is composed of nine alpha helices and is divided into two domains: an N-terminal DNA-binding domain with a typical TetR fold, and a core domain involved in dimerisation.
This entry represents a family of TonB-dependent outer-membrane receptors which are found mainly in Xanthomonas and Caulobacter. These appear to represent the expansion of a paralogous family in that the 22 Xanthomonas axonopodis (21 in Xanthomonas campestris) and 18 Caulobacter crescentus (Caulobacter vibrioides) sequences are more closely related to each other than to any of the many TonB-dependent receptors found in other species. In fact, the C. crescentus and Xanthomonas sequences are inseparable on a phylogenetic tree using a PAM-weighted neighbour-joining method, indicating that one of the two genera may have acquired this set of receptors from the other. The mechanism by which this family is shared between Xanthomonas, a gamma proteobacterial plant pathogen, and Caulobacter, an alpha proteobacterial aquatic organism, is unclear.
This homology domain, GlyGly-CTERM, shares a species distribution with rhombosortase (), a subfamily of rhomboid-like intramembrane serine proteases []. It is probably a recognition sequence for protein sorting and then cleavage by rhombosortase. Shewanella species have the largest number of target proteins per genome, up to thirteen. The domain occurs at the extreme carboxyl-terminus of a diverse set of proteins, most of which are enzymes with conventional signal sequences and with hydrolytic activities: nucleases, proteases, agarases, etc. The agarase AgaA from Vibrio sp. strain JT0107 is secreted into the medium, while the same protein heterologously expressed in E. coli is retained in the cell fraction []. This suggests cleavage and release in species with this domain. Both this suggestion and the chemical structure of the domain (motif, hydrophobic predicted transmembrane helix, cluster of basic residues) closely parallel those of the LPXTG/sortase system and the PEP-CTERM/exosortase (EpsH) system. For this reason, the putative processing enzyme is designated rhombosortase.
Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane []. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region []. This entry represents the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members is unknown, but transport seems likely. The crystal structure of the C terminus of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 5 (PRDM5). PRDM5 is a sequence-specific DNA-binding transcription factor that represses transcription at least in part by recruitment of the histone methyltransferase EHMT2/G9A and histone deacetylases such as HDAC1 []. Mutation of the PRDM5 gene has been linked to Brittle cornea syndrome 2 (BCS2) []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 1 (PRDM1, also known as BLIMP-1). PRDM1 is a transcriptional repressor that is essential for cellular development. It is essential for the differentiation of B and T cells []. In Caenorhabditis elegans, it regulates the spatiotemporal cell migration pattern []. The degradation of PRDM1 by a DRE-1/FBXO11-dependent mechanism regulates C. elegans developmental timing and maturation []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 16 (PRDM16, also known as MEL1). PRDM16 and PRDM3 share protein sequence similarity and have functions in hematopoietic and neuronal stem cells and in adipose tissue differentiation. They also act as oncoproteins (Mel1 and Evi1) involved in translocation-induced leukemia. PRDM3 and PRDM16 are H3K9me1 methyltransferases required for mammalian heterochromatin integrity [, , ]. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 3 (PRDM3, also known as MECOM or EVI-1). PRDM3 and PRDM16 share protein sequence similarity and have functions in hematopoietic and neuronal stem cells and in adipose tissue differentiation. They also act as oncoproteins (Mel1 and Evi1) involved in translocation-induced leukemia. PRDM3 and PRDM16 are H3K9me1 methyltransferases required for mammalian heterochromatin integrity []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 13 (PRDM13). PRDM13 mediates the balance of inhibitory and excitatory neurons in somatosensory circuits []. It is also required to achieve precise neuronal specification during mouse development []. Duplication of PRDM13 causes North Carolina macular dystrophy []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 14 (PRDM14). PRDM14 regulates epigenetic modifications in cells, playing a key role in the regulation of cell pluripotency, epigenetic reprogramming, differentiation and development [, ]. Aberrant PRDM14 expression is associated with tumorigenesis, cell migration and resistance to chemotherapeutic drugs [, ]. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
This entry represents the PR/SET domain found in PR domain zinc finger protein 12 (PRDM12). PRDM12 is essential for human pain perception []. It is also a key regulator of sensory neuronal specification in Xenopus []. It has been shown to initiate and maintain the expression of TrkA in developing nociceptors []. The PRDM family members are characterised by the presence of an N-terminal PR (PRDI-BF1 and RIZ1 homology) domain followed by multiple zinc fingers which confer DNA binding activity. PR domains are only distantly related to the classical SET methyltransferase domains []. They are involved in epigenetic regulation of gene expression through their intrinsic histone methyltransferase activity or via interactions with other chromatin modifying enzymes [].
Eukaryotic protein kinases [, , , ] are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common to both serine/threonine and tyrosine protein kinases. Rho kinases are serine/threonine kinases that are important in cell migration, cell proliferation and cell survival. Disorders of the central nervous system including stroke, inflammatory and demyelinating diseases, Alzheimer's disease and neuropathic pain may be linked to abnormal activation of Rho kinases []. This entry represents a set of Rho-associated, coiled-coil-containing protein kinases. They phosphorylate a large number of important signalling proteins and help regulate the assembly of the actin cytoskeleton. Proteins in this entry have been shown to play a role in smooth muscle formation, and promote the formation of stress fibres and of focal adhesion complexes [, ].
Stn1 is a component of the CST complex, a complex that binds to single-stranded DNA and is required to protect telomeres from DNA degradation. The CST complex binds single-stranded DNA with high affinity in a sequence-independent manner, while isolated subunits bind DNA with low affinity by themselves. In addition to telomere protection, the CST complex has probably a more general role in DNA metabolism at non-telomeric sites [, ]. The C-terminal domain of Stn1 has two winged helix-turn-helix (wHTH) motifs, wHTH1 and wHTH2. This superfamily represents the wHTH1 motif, which is structurally similar to that in RPA32 with an additional large insertion between helices α2 and α3, unique to Stn1 [, ]. This additional wHTH1 motif may allow interaction with a different set of proteins that function at telomeres such as Ctc1 [].
TALPID3 is the name of a classical chicken mutant with abnormal limb patterning and malformations in other regions of the embryo. It was so named because the paddle-shaped limbs of the mutants resemble those of the mole (Talpa) []. Besides the limbs, there is a set of malformations including face, skeleton, and vascular defects []. The TALPID3 gene was later identified and found to be a target for Hedgehog signalling []. In chickens, TALPID3 protein is a centrosomal protein required for the function of both Gli repressor and activator in the intracellular Hedgehog pathway []. Similar to the chicken TALPID3 mutants, the mouse mutants also lack primary cilia and have face and neural tube defects. Moreover, they have defects in left/right asymmetry [].
This entry represents a domain of about 143 amino acids that may occur singly or in up to 23 tandem repeats in very large proteins in the genus Vibrio, and in related species such as Legionella pneumophila, Photobacterium profundum, Rhodopseudomonas palustris, Shewanella pealeana, and Aeromonas hydrophila. Proteins with these domains represent a subset of a broader set of proteins with a particular signal for type 1 secretion, consisting of several glycine-rich repeats modeled by , followed by a C-terminal domain modeled by . Proteins with this domain tend to share several properties with the RtxA (Repeats in Toxin) protein of Vibrio cholerae, including a large size often containing tandemly repeated domains and a C-terminal signal for type 1 secretion.
Stn1 is a component of the CST complex, a complex that binds to single-stranded DNA and is required to protect telomeres from DNA degradation. The CST complex binds single-stranded DNA with high affinity in a sequence-independent manner, while isolated subunits bind DNA with low affinity by themselves. In addition to telomere protection, the CST complex has probably a more general role in DNA metabolism at non-telomeric sites [, ]. This entry represents the C-terminal domain of Stn1, which has two winged helix-turn-helix (wHTH) motifs, wHTH1 and wHTH2 [, ]. wHTH1 is structurally similar to that in RPA32 with an additional large insertion between helices alpha2 and alpha3, unique to Stn1, and it may allow interaction with a different set of proteins that function at telomeres such as Ctc1 []. wHTH2 is most similar to the DNA-binding wHTH motifs of the pur operon repressor and RepE replication initiator, but it does not bind double-stranded DNA [].
Genome duplication is precisely regulated by cyclin-dependent kinases (CDKs), which bring about the onset of S phase by activating replication origins and then prevent relicensing of origins until mitosis is completed. The optimum sequence motif for CDK phosphorylation is S/T-P-K/R-K/R, and Drc1-Sld2 is found to have at least 11 potential phosphorylation sites. Drc1 is required for DNA synthesis and S-M replication checkpoint control. Drc1 associates with Cdc2 and is phosphorylated at the onset of S phase when Cdc2 is activated. Thus Cdc2 promotes DNA replication by phosphorylating Drc1 and regulating its association with Cut5 []. Sld2 and Sld3 represent the minimal set of S-CDK substrates required for DNA replication []. This entry also includes ATP-dependent DNA helicase Q4, which may be involved in chromosome segregation and has been associated with various diseases [, ].
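The optimum CDK motif quoted above, S/T-P-K/R-K/R, can be expressed as a regular expression. The following is a minimal sketch, assuming a one-letter amino-acid string as input; the function name and the example peptide are hypothetical, and the scan finds only full-consensus matches, which need not coincide with the set of "potential phosphorylation sites" counted in the text.

```python
import re

# Full consensus from the text: S/T-P-K/R-K/R.
# The lookahead lets adjacent or overlapping sites be reported.
FULL_CDK_SITE = re.compile(r"(?=([ST]P[KR][KR]))")

def find_cdk_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the S/T, matched motif) for each site."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in FULL_CDK_SITE.finditer(seq)]

# Made-up peptide, not a real Drc1 or Sld2 sequence.
example = "MASPKRLLTPRKAGSPAK"
print(find_cdk_sites(example))
```

In the made-up peptide, the final "SPAK" is not reported because the third position is neither K nor R.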
The protozoan parasite that causes Chagas' disease, Trypanosoma cruzi, contains a 24kDa protein that is recognised by antisera from both humans and experimental animals infected with this organism. Near its C terminus are two regions that have sequence similarity with E-F hand Ca2+-binding proteins []. Indeed, the native trypanosome protein exhibits low Ca2+-binding capacity and high Ca2+-binding affinity, consistent with binding via E-F hand structures. Immunofluorescence assays have suggested that the protein is localised to the trypanosome's flagellum. This observation, coupled with the protein's Ca2+-binding properties, suggests that it may participate in molecular processes associated with the high motility of the parasite []. A set of similar 24kDa proteins, termed calflagins, are contained within the flagellum of Trypanosoma brucei. These contain three EF-hand Ca2+-binding domains and one degenerate EF-hand motif [].
This entry represents the EF-hand domain found in 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase delta-3 (PLC-delta3). PLC-delta3 is essential for trophoblast and placental development []. It localises to the cleavage furrow, where it may participate in cytokinesis []. PI-PLC-delta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. In addition, PI-PLC-delta3 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF hand motifs, which may be responsible for transporting PI-PLC-delta3 from the cell nucleus [].
The band-7 protein family comprises a diverse set of membrane-bound proteins characterised by the presence of a conserved domain, the band-7 domain, also known as SPFH or PHB domain []. The exact function of the band-7 domain is not known, but examples from animal and bacterial stomatin-type proteins demonstrate binding to lipids and the ability to assemble into membrane-bound oligomers that form putative scaffolds []. A variety of proteins belong to this family. These include the prohibitins, which are cytoplasmic anti-proliferative proteins, and stomatin, an erythrocyte membrane protein. Bacterial HflC protein also belongs to this family. Note: Band 4.1 and Band 7 proteins refer to human erythrocyte membrane proteins separated by SDS polyacrylamide gels and stained with Coomassie blue [].
This entry represents a domain found in enoyl-CoA hydratase/isomerase HIBYL-CoA-H and related proteins. The enoyl-CoA hydratase/isomerase family contains a diverse set of enzymes including: enoyl-CoA hydratase, naphthoate synthase, carnitine racemase, 3-hydroxybutyryl-CoA dehydratase and dodecanoyl-CoA delta-isomerase. This entry represents a subset of the enoyl-CoA hydratase/isomerase family. Proteins in this entry include 3-hydroxyisobutyryl-CoA hydrolases (HIBYL-CoA-H) from eukaryotes and their homologues from bacteria. Human HIBYL-CoA-H is a mitochondrial enzyme that catalyses the fifth step in the valine catabolic pathway in eukaryotes, namely the conversion of 3-hydroxyisobutyryl-CoA to free 3-hydroxyisobutyrate [, ]. It also hydrolyses 3-hydroxypropionyl-CoA, giving it a dual role in a secondary pathway of propionate metabolism []. Deficiency of this enzyme is associated with Leigh-like disease [, ].
This NYN domain is found in Meiosis regulator and mRNA stability factor 1 (MARF1, also known as limkain-b1) [, ] and in uncharacterised proteins. The NYN domains are found in the eukaryotic proteins typified by the Nedd4-binding protein 1 and the bacterial YacP-like proteins. The NYN (for Nedd4-BP1, YacP-like Nuclease) domain shares a common protein fold with two other previously characterised groups of nucleases, namely the PIN (PilT N-terminal) and FLAP/5'→3' exonuclease superfamilies. These proteins share a common set of 4 acidic conserved residues that are predicted to constitute their active site. Based on the conservation of the acidic residues and structural elements it has been suggested that PIN and NYN domains are likely to bind only a single metal ion, unlike the FLAP/5'→3' exonuclease superfamily, which binds two metal ions []. Based on conserved gene neighbourhoods the bacterial members are likely to be components of the processome/degradosome that process tRNAs or ribosomal RNAs.
This entry includes UHRF1/2 from animals and ORTHRUS 1-5 from Arabidopsis. They are ubiquitin-like proteins with PHD and RING finger domains. UHRF1, also known as ICBP90, is a transcription and cell cycle regulator and a methyl K9 H3-specific binding protein []. UHRF2 is a ubiquitin E3 ligase for cell cycle proteins, such as CCND1 and CCNE1 []. It can also act as a SUMO E3 ligase for ZNF131 []. This family also includes UHRF1-like protein from Cryptococcus neoformans, which binds hemimethylated DNA and is involved in DNA methylation maintenance. UHRF1-like lacks the Tudor H3K9me reader and RING E3 ligase domains found in its human orthologue []. In plants, ORTHRUS family members are E3 ligases mediating DNA methylation status in vivo. ORTH1-ORTH5 are predicted to encode proteins that contain one plant homeodomain (PHD), two really interesting new gene (RING) domains, and one SET and RING associated (SRA) domain [].
Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains []. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [, ]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in the growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of HrcA from the CIRCE element with the concomitant activation of transcription of the groE and dnaK operons.
This domain represents the winged helix-turn-helix DNA-binding domain which is located close to the N terminus of HrcA. This domain is also found at the N terminus of a set of uncharacterised proteins that have two C-terminal CBS domains.
This is a hydrophobic pore-forming domain found towards the N terminus of RTX toxins []. Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior [, ]. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the Type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C terminus of the exotoxin. The RTX (repeats in toxin) family of cytolytic toxins belongs to the Type I secretion system, and its members are important virulence factors in Gram-negative bacteria, such as Escherichia coli (), Actinobacillus pleuropneumoniae () and Kingella kingae (). They consist of a hydrophobic pore-forming domain at the N terminus that harbors four putative transmembrane α-helices, a typical glycine-rich repeat segment and a C-terminal signal sequence []. The glycine-rich repeats are essential for binding calcium, and are critical for the biological activity of the secreted toxins []. RTX toxins can be divided into two groups: (i) hemolysins, which cause the lysis of erythrocytes and exhibit toxicity towards a wide range of cell types from various species; and (ii) leukotoxins, which exhibit narrow cell type and species specificity due to cell-specific binding through the beta2-integrins expressed on the cell surface of leukocytes [].
All RTX toxin operons exist in the order rtxCABD: RtxA is the structural component of the exotoxin, RtxB and RtxD are both required for its export from the bacterial cell, and RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme required to convert RtxA to its active form []. Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX toxins. Recent work on the corresponding rtxC gene product, HlyC [], has revealed that it carries out the post-translational acylation of two internal lysine residues in the HlyA protein. To cause pathogenicity, the HlyA toxin must first bind Ca2+ ions to the set of glycine-rich repeats and then be activated by HlyC []. This has been demonstrated both in vitro and in vivo.
PRY is a domain of 50-60 amino acids that is found immediately N-terminal to SPRY domains. The SPRY domain () is a protein-protein interaction module involved in many important signaling pathways [, ]. Distant homologues are domains in butyrophilin/marenostrin/pyrin, evolutionarily more ancient than their SPRY/B30.2 counterpart. PRY and SPRY domains are structurally very similar and consist of a beta sandwich fold [, ]. Ca2+-release from the sarcoplasmic or endoplasmic reticulum, the intracellular Ca2+ store, is mediated by the ryanodine receptor (RyR) and/or the inositol trisphosphate receptor (IP3R). The proteins identified by the PRY domain clearly fall into three sets, which can be defined by their combination of signatures. The first group contains an immunoglobulin domain N-terminal to the PRY and butyrophilin domains. Butyrophilins are glycoproteins that are expressed on the apical surfaces of secretory cells in lactating mammary tissue and which may function in the secretion of milk-fat droplets. The second group contains a RING-finger domain N-terminal to the PRY domain. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type, sometimes referred to as the 'RING-H2 finger', is not found associated with this group of proteins. This set of proteins is described as TRIM (TRIpartite Motif) family members, which are involved in cellular compartmentalisation []. The TRIM family sequences are defined by a RING finger domain, a B-box type 1 (B1) and a B-box type 2 (B2) followed by a coiled-coil (CC) region [].
Genes belonging to this family are implicated in a variety of processes such as development and cell growth and are involved in human disease. Many of these PRY domain-containing proteins have a number of C-terminal signatures: SPRY, RFP-like (also known as the B30.2 domain or PRYSPRY) and butyrophilin domains []. The third set of proteins have the C-terminal signatures but have no N-terminal RING-finger or immunoglobulin domain signatures. These proteins have not been functionally described.
The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes []. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the precursor CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader [, , ]. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability []. This entry is encoded within the CRISPR-associated RAMP module, a set of six genes found together in prokaryotic genomes []. This gene cluster is found only in species with CRISPR repeats, usually near the repeats themselves.
Because most of the six genes (but not the one encoding this entry) contain RAMP domains, and because the appearance of the set in a genome appears to depend on other CRISPR-associated Cas genes, the set is designated the CRISPR RAMP module. This entry, typified by TM1794 from Thermotoga maritima, is designated Cmr2.
Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior [, ]. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the Type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C terminus of the exotoxin. The RTX (repeats in toxin) family of cytolytic toxins belongs to the Type I secretion system, and its members are important virulence factors in Gram-negative bacteria, such as Escherichia coli (), Actinobacillus pleuropneumoniae () and Kingella kingae (). They consist of a hydrophobic pore-forming domain at the N terminus that harbors four putative transmembrane α-helices, a typical glycine-rich repeat segment and a C-terminal signal sequence []. The glycine-rich repeats are essential for binding calcium, and are critical for the biological activity of the secreted toxins []. RTX toxins can be divided into two groups: (i) hemolysins, which cause the lysis of erythrocytes and exhibit toxicity towards a wide range of cell types from various species; and (ii) leukotoxins, which exhibit narrow cell type and species specificity due to cell-specific binding through the beta2-integrins expressed on the cell surface of leukocytes [].
All RTX toxin operons exist in the order rtxCABD: RtxA is the structural component of the exotoxin, RtxB and RtxD are both required for its export from the bacterial cell, and RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme required to convert RtxA to its active form []. Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX toxins. Recent work on the corresponding rtxC gene product, HlyC [], has revealed that it carries out the post-translational acylation of two internal lysine residues in the HlyA protein. To cause pathogenicity, the HlyA toxin must first bind Ca2+ ions to the set of glycine-rich repeats and then be activated by HlyC []. This has been demonstrated both in vitro and in vivo.
This domain describes the C-terminal region of RTX toxins, which contains a secretion signal []. RTX toxins may interact with lipopolysaccharide (LPS) to functionally impair and eventually kill leukocytes []. This region is found in association with the RTX N-terminal domain () and multiple hemolysin-type calcium-binding repeats (). Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior [, ]. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the Type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C terminus of the exotoxin. The RTX (repeats in toxin) family of cytolytic toxins belongs to the Type I secretion system, and its members are important virulence factors in Gram-negative bacteria, such as Escherichia coli (), Actinobacillus pleuropneumoniae () and Kingella kingae (). They consist of a hydrophobic pore-forming domain at the N terminus that harbors four putative transmembrane α-helices, a typical glycine-rich repeat segment and a C-terminal signal sequence []. The glycine-rich repeats are essential for binding calcium, and are critical for the biological activity of the secreted toxins [].
RTX toxins can be divided into two groups: (i) hemolysins, which cause the lysis of erythrocytes and exhibit toxicity towards a wide range of cell types from various species; and (ii) leukotoxins, which exhibit narrow cell type and species specificity due to cell-specific binding through the beta2-integrins expressed on the cell surface of leukocytes []. All RTX toxin operons exist in the order rtxCABD: RtxA is the structural component of the exotoxin, RtxB and RtxD are both required for its export from the bacterial cell, and RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme required to convert RtxA to its active form []. Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX toxins. Recent work on the corresponding rtxC gene product, HlyC [], has revealed that it carries out the post-translational acylation of two internal lysine residues in the HlyA protein. To cause pathogenicity, the HlyA toxin must first bind Ca2+ ions to the set of glycine-rich repeats and then be activated by HlyC []. This has been demonstrated both in vitro and in vivo.
Peroxisome proliferator-activated receptor (PPAR) gamma coactivator 1 alpha (PGC-1 alpha) and PGC-1 beta are two members of the PGC-1 family and important regulators of mitochondrial metabolism. They are expressed at high levels in heart and skeletal muscle and can induce mitochondrial biogenesis and oxidative capacity. PGC-1 alpha and PGC-1 beta regulate DNA-binding transcription factors, such as estrogen-related receptors (ERRs). ERR beta and ERR gamma are orphan nuclear receptors that act both downstream of and in parallel to PGC-1 coactivators to control the expression of a broad set of genes important for energy homeostasis. PGC-1/ERR-induced regulator in muscle 1 (Perm1) is a downstream effector of PGC-1 and ERRs [], regulating muscle-specific pathways important for energy metabolism and contractile function. Perm1 does not function as a classical coactivator, but has been suggested to act by regulating signalling pathways in a tissue-selective manner to enable PGC-1/ERRs to induce specific sets of genes. It has been shown that Perm1 is also required for the efficient expression of the mitochondrial creatine kinase Ckmt2 and of the glucose transporter Glut4, and may therefore affect glucose uptake [].
Endocytosis and intracellular transport involve several mechanistic steps: (1) for the internalisation of cargo molecules, the membrane needs to bend to form a vesicular structure, which requires membrane curvature and a rearrangement of the cytoskeleton; (2) following its formation, the vesicle has to be pinched off the membrane; (3) the cargo has to be subsequently transported through the cell and the vesicle must fuse with the correct cellular compartment. Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis. They are involved in the formation of clathrin-coated vesicles by promoting the assembly of a protein complex at the plasma membrane, and they directly assist in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs) domain, which is required for their in vivo function and their ability to tubulate membranes []. The crystal structure of these proteins suggests that the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases.
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, the tyrosine phosphatase SptP, reverses the actions brought about by SopE []. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell []. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the N-terminal domain of SopE and SopE2. The function of this domain is unknown.
The type III secretion system of Gram-negative bacteria is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. Effector proteins secreted by the type III system do not possess a secretion signal, and are considered unique because of this. Salmonella spp. secrete an effector protein called SopE that is responsible for stimulating the reorganisation of the host cell actin cytoskeleton, and ruffling of the cellular membrane []. It acts as a guanyl-nucleotide-exchange factor on Rho-GTPase proteins such as Cdc42 and Rac. As it is imperative for the bacterium to revert the cell back to its "normal" state as quickly as possible, another effector, the tyrosine phosphatase SptP, reverses the actions brought about by SopE []. Recently, it has been found that SopE and its protein homologue SopE2 can activate different sets of Rho-GTPases in the host cell []. Far from being a redundant set of two similar type III effectors, they act in unison to specifically activate different Rho-GTPase signalling cascades in the host cell during infection. This entry represents the guanine nucleotide exchange factor domain of SopE. This domain has an α-helical structure consisting of two three-helix bundles arranged in a lambda shape [, ].
Clusterin is a vertebrate glycoprotein [], the exact function of which is not yet clear. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60 kDa pre-sCLU protein is further glycosylated and proteolytically cleaved into alpha- and beta-subunits, held together by disulphide bonds. External sCLU is an 80 kDa protein and may act as a molecular chaperone, scavenging denatured proteins outside cells following specific stress-induced injury such as heat shock. sCLU possesses nonspecific binding activity to hydrophobic domains of various proteins in vitro []. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. Clusterin is synthesized as a precursor polypeptide of about 400 amino acids which is post-translationally cleaved to form two subunits of about 200 amino acids each. The two subunits are linked by five disulphide bonds to form an antiparallel ladder-like structure []. In each of the mature subunits the five cysteines that are involved in disulphide bonds are clustered in domains of about 30 amino acids located in the central part of the subunits. This entry represents the N-terminal domain of the clusterin precursor.
Clusterin (Clu), also known as apolipoprotein J, is a vertebrate glycoprotein []. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60 kDa pre-sCLU protein is proteolytically cleaved into alpha- and beta-subunits and further glycosylated to form mature disulfide-linked heterodimeric secretory CLU (sCLU). sCLU is an 80 kDa protein and acts as a molecular chaperone, scavenging denatured proteins outside cells [, ]. sCLU possesses nonspecific binding activity to hydrophobic domains of various non-native proteins [], binds to some bacteria and bacterial proteins [], and interacts with different immune molecules []. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. The sCLU protein is cytoprotective and anti-apoptotic, whereas the nCLU protein is pro-apoptotic [, , ].
Clusterin (Clu), also known as apolipoprotein J, is a vertebrate glycoprotein []. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60 kDa pre-sCLU protein is proteolytically cleaved into alpha- and beta-subunits and further glycosylated to form mature disulfide-linked heterodimeric secretory CLU (sCLU). sCLU is an 80 kDa protein and acts as a molecular chaperone, scavenging denatured proteins outside cells [, ]. sCLU possesses nonspecific binding activity to hydrophobic domains of various non-native proteins [], binds to some bacteria and bacterial proteins [], and interacts with different immune molecules []. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. The sCLU protein is cytoprotective and anti-apoptotic, whereas the nCLU protein is pro-apoptotic [, , ]. This family also includes clusterin-like protein 1 (CLUL1), which is expressed specifically in cone photoreceptor cells [] and is likely to be necessary for normal cone function [].
Clusterin is a vertebrate glycoprotein [], the exact function of which is not yet clear. Clusterin expression is complex, appearing as different forms in different cell compartments. One set of proteins is directed for secretion, and other clusterin species are expressed in the cytoplasm and nucleus. The secretory form of the clusterin protein (sCLU) is targeted to the ER by an initial leader peptide. This ~60 kDa pre-sCLU protein is further glycosylated and proteolytically cleaved into alpha- and beta-subunits, held together by disulphide bonds. External sCLU is an 80 kDa protein and may act as a molecular chaperone, scavenging denatured proteins outside cells following specific stress-induced injury such as heat shock. sCLU possesses nonspecific binding activity to hydrophobic domains of various proteins in vitro []. A specific nuclear form of CLU (nCLU) acts as a pro-death signal, inhibiting cell growth and survival. The nCLU protein has two coiled-coil domains, one at its N terminus that is unable to bind Ku70, and a C-terminal coiled-coil domain that is uniquely able to associate with Ku70 and is minimally required for cell death. Clusterin is synthesized as a precursor polypeptide of about 400 amino acids which is post-translationally cleaved to form two subunits of about 200 amino acids each. The two subunits are linked by five disulphide bonds to form an antiparallel ladder-like structure []. In each of the mature subunits the five cysteines that are involved in disulphide bonds are clustered in domains of about 30 amino acids located in the central part of the subunits. This entry represents the C-terminal domain of the clusterin precursor.
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named owing to nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Myotubularin-related protein 13 (MTMR13), also known as SET-binding factor 2 (SBF2), belongs to the myotubularin family. It may function as a guanine nucleotide exchange factor (GEF) that activates Rab28 (a Rab GTPase) []. Loss of MTMR13 leads to the Charcot-Marie-Tooth 4B (CMT4B) peripheral neuropathy, which is a recessive demyelinating form of Charcot-Marie-Tooth disease, a disorder of the peripheral nervous system [, ]. The family of myotubularin (MTM) phosphoinositide phosphatases includes catalytically inactive members, or pseudophosphatases, which contain inactivating substitutions in the phosphatase domain. MTMR13 is one of them. MTMR13 exists in cells independently as a homodimer, as well as in complex with a homodimer of MTMR2. Association with MTMR13 dramatically increases MTMR2 enzymatic activity []. MTMR13 contains an N-terminal DENN domain, a PH-GRAM domain, an inactive PTP domain, a SET interaction domain, a coiled-coil domain, and a C-terminal PH domain. The GRAM domain, found in myotubularins, glucosyltransferases, and other putative membrane-associated proteins, is part of a larger motif with a pleckstrin homology (PH) domain fold. This entry represents the PH-GRAM domain of MTMR13.
Endocytosis and intracellular transport involve several mechanistic steps: (1) for the internalisation of cargo molecules, the membrane needs to bend to form a vesicular structure, which requires membrane curvature and a rearrangement of the cytoskeleton; (2) following its formation, the vesicle has to be pinched off the membrane; (3) the cargo has to be subsequently transported through the cell and the vesicle must fuse with the correct cellular compartment. Members of the Amphiphysin protein family are key regulators in the early steps of endocytosis. They are involved in the formation of clathrin-coated vesicles by promoting the assembly of a protein complex at the plasma membrane, and they directly assist in the induction of the high curvature of the membrane at the neck of the vesicle. Amphiphysins contain a characteristic domain, known as the BAR (Bin-Amphiphysin-Rvs) domain, which is required for their in vivo function and their ability to tubulate membranes []. The crystal structure of these proteins suggests that the domain forms a crescent-shaped dimer of a three-helix coiled coil with a characteristic set of conserved hydrophobic, aromatic and hydrophilic amino acids. Proteins containing this domain have been shown to homodimerise, heterodimerise or, in a few cases, interact with small GTPases. This entry identifies several fungal BAR domain-containing proteins, such as Gvp36, that are not detected by [].
This entry represents a subset of the YdjC-like family of uncharacterised proteins. The Acidithiobacillus ferrooxidans ATCC 23270 protein (AFE_0976) is encoded in the same locus as the genes for squalene-hopene cyclase (SHC, ) and other proteins associated with the biosynthesis of hopanoid natural products. Similarly, in Ralstonia eutropha (strain JMP134) (Alcaligenes eutrophus) this protein (Reut_B4902) is encoded adjacent to the genes for HpnAB, IspH and HpnH (), although SHC itself is encoded elsewhere in the genome. Notably, this protein (here named HpnK) and three others form a conserved set (HpnIJKL) which occurs in a subset of all genomes containing the SHC enzyme. This relationship was discerned using the method of partial phylogenetic profiling []. This group includes Zymomonas mobilis, the organism where the initial hopanoid biosynthesis locus was described consisting of the genes for HpnA-E and SHC (HpnF) []. Continuing past SHC are found genes encoding a phosphorylase enzyme (ZMO0873, i.e. HpnG, ) and a radical SAM enzyme (ZMO0874), HpnH. Although discontinuous in Z. mobilis, we continue the gene symbol sequence with HpnIJKL.
This entry represents a family of NAD-protein ADP-ribosyltransferases found in Myoviridae (phages with contractile tails). It includes the Alt protein from bacteriophage T4. Protein ADP-ribosylation is an important posttranslational modification catalyzed by a group of enzymes known as ADP-ribosyltransferases (ADP-RTs) []. ADP-RTs transfer single or multiple ADP-ribose moieties from NAD to a specific amino acid residue within a target protein, resulting in mono-ADP-ribosylation or poly-ADP-ribosylation (PARylation) []. ADP-ribosylation changes the electrostatic potential of a target protein by introducing two phosphate groups and may affect protein-DNA as well as protein-protein interactions []. Protein ADP-ribosylation plays versatile roles in multiple biological processes. Bacteriophage T4 codes for three ADP-ribosyltransferases: Alt, ModA, and ModB. The ADP-ribosylating activity of each is directed to a specific set of host proteins. Among the three phage-encoded T4 mono-ADP-RTs, the Alt protein has the broadest range of target proteins, which include one of the two alpha subunits of host RNA polymerase []. The T4 Alt protein initially acts as a structural component of the phage head. At the time of infection, it enters the host cell with phage DNA and immediately displays enzymatic activity [].
Phosphoinositide-specific phospholipase C (PI-PLC), also known as 1-phosphatidylinositol 4,5-bisphosphate phosphodiesterase, plays a role in inositol phospholipid signaling by hydrolysing phosphatidylinositol-4,5-bisphosphate to produce the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). These cause an increase in intracellular calcium concentration and the activation of protein kinase C (PKC), respectively. The PLC family in murine or human species comprises multiple subtypes. On the basis of their structure, they have been divided into six classes: beta (beta-1, 2, 3 and 4), gamma (gamma-1 and 2), delta (delta-1, 3 and 4), epsilon, zeta, and eta types [, ]. PLC-delta-3 is essential for trophoblast and placental development []. It localises to the cleavage furrow, where it may participate in cytokinesis []. PI-PLC-delta3 contains a core set of domains, including an N-terminal pleckstrin homology (PH) domain, four atypical EF-hand motifs, a PLC catalytic core, and a single C-terminal C2 domain. The PLC catalytic core domain is a TIM barrel with two highly conserved regions (X and Y) split by a highly degenerate linker sequence. In addition, PI-PLC-delta3 possesses a classical leucine-rich nuclear export sequence (NES) located in the EF-hand motifs, which may be responsible for transporting PI-PLC-delta3 out of the cell nucleus [].
This family is defined to identify a pair of paralogous 3' exoribonucleases in Escherichia coli, plus the set of proteins apparently orthologous to one or the other in other eubacteria. VacB was characterised originally as required for the expression of virulence genes, but is now recognised as the exoribonuclease RNase R (Rnr). Its paralog in Escherichia coli and Haemophilus influenzae is designated exoribonuclease II (Rnb). Both are involved in the degradation of mRNA, and consequently have strong pleiotropic effects that may be difficult to disentangle. Both these proteins share domain-level similarity (RNB, S1) with a considerable number of other proteins, and full-length similarity scoring below the trusted cut off to proteins associated with various phenotypes but uncertain biochemistry; it may be that these latter proteins are also 3' exoribonucleases. Competence is the ability of a cell to take up exogenous DNA from its environment, resulting in transformation. It is widespread among bacteria and is probably an important mechanism for the horizontal transfer of genes. DNA usually becomes available by the death and lysis of other cells. Competent bacteria use components of extracellular filaments called type 4 pili to create pores in their membranes and pull DNA through the pores into the cytoplasm. This process, including the development of competence and the expression of the uptake machinery, is regulated in response to cell-cell signalling and/or nutritional conditions [].
KRIT1, also known as CCM1, is a Rap1-binding protein expressed in endothelial cells, where it is present in cell-cell junctions and associated with junctional proteins []. Together with CCM2/MGC4607 and CCM3/PDCD10, KRIT1 constitutes a set of proteins, mutations of which are found in cerebral cavernous malformations, which are characterized by cerebral hemorrhages and vascular malformations in the central nervous system. KRIT1 possesses four ankyrin repeats, a FERM domain, and multiple NPXY sequences, one of which is essential for integrin cytoplasmic domain-associated protein-1alpha (ICAP1alpha) binding and all of which mediate binding of CCM2. KRIT1 localization is mediated by its FERM domain []. The FERM domain has a cloverleaf tripartite structure composed of: (1) FERM_N (A-lobe or F1); (2) FERM_M (B-lobe or F2); and (3) FERM_C (C-lobe or F3). The C-lobe/F3 within the FERM domain is part of the PH domain family. Like most other ERM members they have a phosphoinositide-binding site in their FERM domain. The FERM C domain is the third structural domain within the FERM domain. The FERM domain is found in cytoskeletal-associated proteins such as ezrin, moesin, radixin, 4.1R, and merlin. These proteins provide a link between the membrane and cytoskeleton and are involved in signal transduction pathways. The FERM domain is also found in protein tyrosine phosphatases (PTPs), the tyrosine kinases FAK and JAK, and other proteins involved in signaling. This domain is structurally similar to the PH and PTB domains and consequently is capable of binding to both peptides and phospholipids at different sites [, ].
DNA-directed DNA polymerase () catalyzes DNA-template-directed extension of the 3'-end of a DNA strand by one nucleotide at a time. DNA polymerase III is a complex, multi-chain enzyme responsible for most of the replicative synthesis in bacteria. The enzyme also has 3' to 5' exonuclease activity. It has a core composed of alpha, epsilon and theta chains, which associates with a tau subunit that allows the core to dimerise, forming the PolIII' complex. PolIII' associates with the gamma complex (gamma, delta, delta', psi and chi chains) and with the beta chain. This domain is the N-terminal half of the delta' subunit of DNA polymerase III. Delta' is homologous to the gamma and tau subunits, which form an outgroup for phylogenetic comparison. The gamma/tau branch of the tree is much more tightly conserved than the delta' branch, and some members of that branch score more highly against this model than some proteins classified as delta'. The noise cut-off is set to detect weakly scoring delta' subunits rather than to exclude gamma/tau subunits.
Salmonella and related proteobacteria secrete large amounts of proteins into the culture media. The major secreted proteins are either flagellar proteins or virulence factors [], secreted through the flagellar or virulence export structures respectively. Both secretion systems penetrate the inner and outer membranes and their components bear substantial sequence similarity. The flagellum and the needle-like pilus also look fairly similar to each other []. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. It is believed that the family of type III flagellar and pilus inner membrane proteins are used as structural moieties in a complex with several other subunits []. One such set of inner membrane proteins, labeled "S" here for nomenclature purposes, includes the Salmonella and Shigella SpaS, the Yersinia YscU, Rhizobium Y4YO, and the Erwinia HrcU genes, Salmonella FlhB and Escherichia coli EscU [, , , ]. Many of the proteins in this entry undergo autocatalytic cleavage promoted by cyclization of a conserved asparagine. These proteins belong to the MEROPS peptidase family N6.
Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior []. There have been four secretion systems described in animal enteropathogens such as Salmonella and Yersinia, with further sequence similarities in plant pathogens like Ralstonia and Erwinia []. The type III secretion system is of great interest, as it is used to transport virulence factors from the pathogen directly into the host cell [] and is only triggered when the bacterium comes into close contact with the host. The protein subunits of the system are very similar to those of bacterial flagellar biosynthesis []. However, while the latter forms a ring structure to allow secretion of flagellin and is an integral part of the flagellum itself [], type III subunits in the outer membrane translocate secreted proteins through a channel-like structure. It is believed that the family of type III inner membrane proteins are used as structural moieties in a complex with several other subunits []. One such set of inner membrane proteins, labeled "R" here for nomenclature purposes, includes the Salmonella and Shigella SpaR, the Yersinia YscT, Rhizobium Y4YN, and the Erwinia HrcT genes []. The flagellar protein FliR also shares similarity, probably due to evolution of the type III secretion system from the flagellar biosynthetic pathway.
This entry represents the C-terminal domain present in the transcriptional regulator KstR, which regulates a large set of genes responsible for cholesterol catabolism. This is important for Mycobacterium tuberculosis during infection, both at an early stage in the macrophage phagosome and later within the necrotic granuloma []. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity []. The TetR proteins identified in multiple genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds the Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response []. TetR regulates the expression of the membrane-associated tetracycline resistance protein, TetA, which exports the tetracycline antibiotic out of the cell before it can attach to the ribosomes and inhibit protein synthesis []. TetR blocks transcription from the genes encoding both TetA and TetR in the absence of antibiotic. The C-terminal domain is multi-helical and is interlocked in the homodimer with the helix-turn-helix (HTH) DNA-binding domain [].
This entry represents the second ePHD finger of Histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. KMT2C, also known as MLL3, is a histone H3 lysine 4 (H3K4) methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription []. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP) []. KMT2C is also part of the activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM), which contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4 []. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is downregulated in cholestasis []. KMT2C contains several PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains.
This entry represents the first ePHD finger of Histone-lysine N-methyltransferase 2C (KMT2C). The extended plant homeodomain (ePHD) zinc finger is characterized as Cys2HisCys5HisCys2His. KMT2C, also known as MLL3, is a histone H3 lysine 4 (H3K4) methyltransferase that functions as a circadian factor contributing to genome-scale circadian transcription []. It is a component of a large complex that acts as a coactivator of multiple transcription factors, including the bile acid (BA)-activated nuclear receptor, farnesoid X receptor (FXR), a critical player in BA homeostasis. The MLL3 complex is essential for p53 transactivation of small heterodimer partner (SHP) []. KMT2C is also part of the activating signal cointegrator-2 (ASC-2)-containing complex (ASCOM), which contains the transcriptional coactivator nuclear receptor coactivator 6 (NCOA6), KMT2C and its paralog MLL4 []. The ASCOM complex is critical for nuclear receptor (NR) activation of bile acid transporter genes and is downregulated in cholestasis []. KMT2C contains several PHD fingers, two ePHD fingers, an ATPase alpha beta signature, a high mobility group (HMG)-1 box, a SET (Suppressor of variegation, Enhancer of zeste, Trithorax) domain and two FY (phenylalanine tyrosine)-rich domains.
Within the bacterial flagellum, the basal-body rod, the hook, the hook-associated proteins (HAPs), and the helical filament together constitute an axial substructure whose elements share structural features and a common export pathway []. This entry represents the hook-associated protein 1 (HAP1, also known as FlgK) []. The structure of FlgK from Burkholderia pseudomallei has been revealed []. The amino acid sequences of the hook protein and of the three hook-associated proteins of Salmonella typhimurium have been deduced from the DNA sequences of their structural genes (flgE, flgK, flgL and fliD respectively). These sequences have been compared with each other and with those for the filament protein (flagellin) and four rod proteins. The hook protein was found to be most similar to the distal rod protein (FlgG) and the proximal hook-associated protein (HAP1), which are thought to be attached to the proximal and distal ends of the hook, the similarities being most pronounced near the N- and C-termini. It is thought that the axial proteins may adopt amphipathic α-helical conformations at their N- and C-termini. These regions of the filament and hook are believed to be responsible for quaternary interactions between subunits. Interaction between N- and C-terminal α-helices may be important in the formation of the axial structures of the flagellum. Although consensus sequences have been noted, no consensus extends to the entire set of axial proteins. Thus the basis for recognition of a protein for export by the flagellum-specific pathway remains to be identified.