Search results 1 to 100 out of 464 for Set

Category restricted to ProteinDomain (x)

Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Type: Domain
Description: The SET domain is a 130 to 140 amino acid, evolutionarily well-conserved sequence motif that was initially characterised in the Drosophila proteins Su(var)3-9, Enhancer-of-zeste and Trithorax. In addition to these chromosomal proteins modulating gene activities and/or chromatin structure, the SET domain is found in proteins of diverse functions ranging from yeast to mammals, but also including some bacteria and viruses [, ]. The SET domains of mammalian SUV39H1 and 2 and fission yeast clr4 have been shown to be necessary for the methylation of lysine-9 in the histone H3 N terminus []. However, this histone methyltransferase (HMTase) activity is probably restricted to a subset of SET domain proteins as it requires the combination of the SET domain with the adjacent cysteine-rich regions, one located N-terminally (pre-SET) and the other posterior to the SET domain (post-SET). Post- and pre-SET regions thus seem to play a crucial role in substrate recognition and enzymatic activity [, ]. The structures of the SET domain and the two adjacent regions, pre-SET and post-SET, have been solved [, , ]. The SET structure is all beta, but consists only of sets of a few short strands composing no more than a couple of small sheets. Consequently, the SET structure is mostly defined by turns and loops. An unusual feature is that the SET core is made up of two discontinuous segments of the primary sequence forming an approximate L shape [, , ]. Two of the most conserved motifs in the SET domain are (1) a stretch at the C terminus containing a strictly conserved tyrosine residue and (2) a preceding loop through which the C-terminal segment passes, forming a knot-like structure, but not quite a true knot. These two regions have been proven to be essential for SAM binding and catalysis, particularly the invariant tyrosine where in all likelihood catalysis takes place [, ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in Set3 and Set4 from fungi. They contain both SET and PHD domains. In budding yeasts, Set3 forms a single complex, Set3C, with Snt1, YIL112w, Sif2, Cpr1, and two putative histone deacetylases, Hos2 and NAD-dependent Hst1. Set3C is the yeast analog of the mammalian HDAC3/SMRT complex [].
Protein Domain
Type: Homologous_superfamily
Description: The SET domain is a 130 to 140 amino acid, evolutionarily well-conserved sequence motif that was initially characterised in the Drosophila proteins Su(var)3-9, Enhancer-of-zeste and Trithorax [, ]. In eukaryotic organisms, it appears in proteins with an important role in regulating chromatin-mediated gene transcriptional activation and silencing. In viruses, bacteria and archaea, its function is not yet clear []. This superfamily includes eukaryotic proteins with histone methyltransferase activity, which requires the combination of the SET domain with the adjacent cysteine-rich regions, one located N-terminally (pre-SET) and the other posterior to the SET domain (post-SET). Post- and pre-SET regions thus seem to play a crucial role in substrate recognition and enzymatic activity [, ]. Other SET domain-containing proteins function as transcription factors (such as PR domain zinc finger protein 1 from humans []). The structures of the SET domain and the two adjacent regions, pre-SET and post-SET, have been solved [, , ]. The SET domain structure is all-β, but consists only of sets of a few short strands composing no more than a couple of small sheets. Consequently, the SET structure is mostly defined by turns and loops. An unusual feature is that the SET core is made up of two discontinuous segments of the primary sequence forming an approximate L-shape [, , ]. Two of the most conserved motifs in the SET domain are a stretch at the C terminus containing a strictly conserved tyrosine residue and a preceding loop through which the C-terminal segment passes, forming a knot-like structure, but not quite a true knot. These two regions have been proven to be essential for SAM binding and catalysis, particularly the invariant tyrosine where in all likelihood catalysis takes place [, ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain of SET and MYND domain-containing protein 1 (SMYD1). SMYD1 functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and seems able to perform mono-, di-, and trimethylation. SMYD1 plays a critical role in cardiomyocyte differentiation, cardiac morphogenesis and myofibril organisation, as well as in the regulation of endothelial cells (ECs) [, ]. It is expressed in vascular endothelial cells, and knockdown of SMYD1 in endothelial cells has been shown to impair EC migration and tube formation []. The SMYD family consists of five members including SMYD1/2/3/4/5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. They are essential in several mammalian developmental pathways [, , , ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in budding yeast Efm1 and fission yeast Set10. They are S-adenosyl-L-methionine-dependent protein-lysine N-methyltransferases. Efm1 monomethylates elongation factor 1-alpha (TEF1/TEF2) at 'Lys-30' [, ], while Set10 methylates ribosomal protein L23 (rpl23a and rpl23b) [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain of SET and MYND domain-containing protein 4 (SMYD4). SMYD4 functions as a potential tumour suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha []. In zebrafish, SMYD4 is ubiquitously expressed in early embryos and becomes enriched in the developing heart; mutants show a strong defect in cardiomyocyte proliferation, which leads to a severe cardiac malformation []. The SMYD family consists of five members including SMYD1/2/3/4/5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. They are essential in several mammalian developmental pathways [, , , ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain of SET and MYND domain-containing protein 5 (SMYD5, also termed protein NN8-4AG, or retinoic acid-induced protein 15). SMYD5 functions as a histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions []. It plays an important role in chromosome integrity by regulating heterochromatin and repressing endogenous repetitive DNA elements during differentiation []. In zebrafish embryogenesis, it plays pivotal roles in both primitive and definitive hematopoiesis []. The SMYD family consists of five members including SMYD1/2/3/4/5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. They are essential in several mammalian developmental pathways [, , , ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain of SET and MYND domain-containing protein 3 (SMYD3). SMYD3 functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4 []. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex []. It is overexpressed in colorectal, breast, prostate, and hepatocellular tumours, and has been implicated as an oncogene in human malignancies []. Methylation of MEKK2 by SMYD3 is important for regulation of the MEK/ERK pathway, suggesting the possibility of selectively targeting SMYD3 in RAS-driven cancers []. The SMYD family consists of five members including SMYD1/2/3/4/5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. They are essential in several mammalian developmental pathways [, , , ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD6 and related proteins. SET domain methyltransferases can be involved both in translational and transcriptional roles. N-lysine methyltransferase SETD6 is a SET domain protein that specifically monomethylates 'Lys-310' of the RELA subunit of NF-kappa-B complex, leading to down-regulation of NF-kappa-B transcription factor activity []. Homologues in yeast monomethylate 60S ribosomal protein L42 (RPL42A and RPL42B) at 'Lys-55' [, ].
Protein Domain
Type: Domain
Description: LegAS4 is a type IV secretion system effector of Legionella pneumophila. It contains a SET domain that is involved in the modification of Lys4 of histone H3 (H3K4) in the nucleolus of the host cell, thereby enhancing heterochromatic rDNA transcription. It also contains an ankyrin repeat domain of unknown function at its C-terminal region []. This entry represents the SET domain found in LegAS4 and related proteins.
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in a group of [ribulose-bisphosphate carboxylase]-lysine N-methyltransferases (RBCMTs) and related proteins from plants. In pea (Pisum sativum), the protein-lysine methyltransferase (PsLSMT, also known as RBCMT) catalyses the trimethylation of Lys-14 in the large subunit (LS) of ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) []. The Arabidopsis homologue of RBCMT, LSMT, is a protein-lysine methyltransferase that methylates chloroplastic fructose 1,6-bisphosphate aldolases []. The sequence conservation pattern and structure analysis of the SET domain provide clues regarding the possible active site residues of the domain. There are three conserved sequence motifs in most SET domains. The N-terminal motif (I) has characteristic glycines. The central motif (II) has a distinct pattern of polar and charged residues (Asn, His). The C-terminal conserved motif (III) has a characteristic dyad of polar residues and the hydrophobic residue tyrosine.
Protein Domain
Type: Family
Description: The ribosomal protein L12ab (Rpl12ab) in Saccharomyces cerevisiae is modified by methylation at both arginine and lysine residues. Rkm2 (ribosomal lysine methyltransferase 2) is responsible for the predominant epsilon-trimethylation at lysine 10 of Rpl12ab []. This entry includes Rkm2 and other SET domain proteins that may also be lysine methyltransferases.
Protein Domain
Type: Domain
Description: This entry represents the SET domain of SET and MYND domain-containing protein 2 (SMYD2). SMYD2 functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1 [, ]. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). It plays a role in myofilament organisation in both skeletal and cardiac muscles via Hsp90 methylation []. SMYD2 overexpression is associated with tumour cell proliferation and a worse outcome in human papillomavirus-unrelated nonmultiple head and neck carcinomas []. It regulates leukemia cell growth such that diminished SMYD2 expression upregulates SET7/9, thereby possibly shifting leukemia cells from growth to a quiescence state associated with resistance to DNA damage associated with Acute Myeloid Leukemia (AML) []. The SMYD family consists of five members including SMYD1/2/3/4/5. They contain two highly conserved structural and functional domains, the SET and MYND domains. The SET domain is involved in lysine methylation, while the MYND domain is involved in protein-protein interaction. They are essential in several mammalian developmental pathways [, , , ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD5, which is a chromatin regulator required for brain development []. SETD5 is essential for regulating histone acetylation during gene transcription []. Haploinsufficiency of SETD5 is implicated in syndromic autism spectrum disorder (ASD) [].
Protein Domain
Type: Domain
Description: The SET domain is a protein-protein interaction domain found in protein lysine methyltransferase enzymes. This entry represents a domain of unknown function which is associated with the SET domain and found in histone lysine methyltransferases [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD4, which is a cytosolic and nuclear functional lysine methyltransferase that plays a crucial role in breast carcinogenesis []. However, its specific substrates and modification sites remain to be disclosed. Proteins containing this domain also include budding yeast Rkm2, which is a ribosomal protein lysine methyltransferase responsible for trimethylation of the lysine residue at position 3 of Rpl12A and Rpl12B [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD3 and related proteins. SETD3 is a protein-histidine N-methyltransferase that specifically mediates methylation of actin at 'His-73' []. It was initially reported to have histone methyltransferase activity and methylate 'Lys-4' and 'Lys-36' of histone H3 (H3K4me and H3K36me). However, this conclusion was based on mass spectrometry data wherein mass shifts were inconsistent with a bona fide methylation event. In vitro, the protein-lysine methyltransferase activity is weak compared to the protein-histidine methyltransferase activity [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD7, an enzyme that specifically monomethylates Lys-4 of histone H3, thereby creating a specific tag for epigenetic transcriptional activation. Methylation of lysine residues in the N-terminal tails of histones is thought to represent an important component of the mechanism that regulates chromatin structure. SETD7 plays a central role in the transcriptional activation of genes such as collagenase and insulin. It is recruited by IPF1/PDX-1 to the insulin promoter, leading to activation of transcription. SETD7 also has methyltransferase activity toward non-histone proteins, including TAF10 and p53/TP53. SETD7 monomethylates Lys-189 of TAF10, which increases the affinity of TAF10 for RNA polymerase II. SETD7 monomethylates Lys-372 of p53/TP53, which stabilises p53/TP53 and increases p53/TP53-mediated transcriptional activation [, ]. SETD7 also methylates other non-histone proteins, including estrogen receptor alpha (ERa), suggesting it has a role in diverse biological processes. ERa methylation by Set7/9 stabilises ERa and activates its transcriptional activities, which are involved in the carcinogenesis of breast cancer. In a high-throughput screen, treatment of human breast cancer cells (MCF7 cells) with cyproheptadine, a Set7/9 inhibitor, decreased the expression and transcriptional activity of ERa, thereby inhibiting estrogen-dependent cell growth [, ]. These enzymes contain a SET domain, which is necessary but not sufficient for histone methyltransferase activity []. Human SETD7 contains an N-terminal β-sheet domain in addition to the conserved SET domain []. Mutagenesis studies identified two residues in the C terminus of the protein that appear essential for catalytic activity toward lysine-4 of histone H3; cofactor AdoMet binds to this domain [].
Protein Domain
Type: Family
Description: This entry represents histone-lysine N-methyltransferase (SETD7 or SET7/9), which contains a SET domain []. This enzyme specifically monomethylates Lys-4 of histone H3, thereby creating a specific tag for epigenetic transcriptional activation. Methylation of lysine residues in the N-terminal tails of histones is thought to represent an important component of the mechanism that regulates chromatin structure. As such, SETD7 plays a central role in the transcriptional activation of genes such as collagenase and insulin. It is recruited by IPF1/PDX-1 to the insulin promoter, leading to activation of transcription. SETD7 also has methyltransferase activity toward non-histone proteins, including TAF10 and p53/TP53. SETD7 monomethylates Lys-189 of TAF10, which increases the affinity of TAF10 for RNA polymerase II. SETD7 monomethylates Lys-372 of p53/TP53, which stabilises p53/TP53 and increases p53/TP53-mediated transcriptional activation []. These enzymes contain a SET domain, which is necessary but not sufficient for histone methyltransferase activity []. Human SETD7 contains an N-terminal β-sheet domain in addition to the conserved SET domain []. Mutagenesis studies [] identified two residues in the C terminus of the protein that appear essential for catalytic activity toward lysine-4 of histone H3; cofactor AdoMet binds to this domain.
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in EZH1. The Polycomb Repressive Complex 2 (PRC2) is a chromatin modifying complex that consists of three core components: EED, SUZ12 and one of the two histone H3K27 methyltransferases, EZH1 or EZH2 []. The PRC2 complex catalyses di- and trimethylation of histone H3 lysine 27 (H3K27me2/3), which has a repressive role. Even though EZH1 and EZH2 form similar PRC2 complexes, they exhibit contrasting repressive roles. In terms of their expression in mice, EZH1 is more abundant in nonproliferative adult organs, while EZH2 expression is tightly associated with proliferation [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in EZH2. The Polycomb Repressive Complex 2 (PRC2) is a chromatin modifying complex that consists of three core components: EED, SUZ12 and one of the two histone H3K27 methyltransferases, EZH1 or EZH2 []. The PRC2 complex catalyses di- and trimethylation of histone H3 lysine 27 (H3K27me2/3), which has a repressive role. Even though EZH1 and EZH2 form similar PRC2 complexes, they exhibit contrasting repressive roles. In terms of their expression in mice, EZH1 is more abundant in nonproliferative adult organs, while EZH2 expression is tightly associated with proliferation [].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in KMT2E. The protein known as histone-lysine N-methyltransferase 2E (KMT2E, mixed-lineage leukemia 5 or MLL5) has been shown to lack key residues and an essential loop in the SET-I subdomain required for binding substrate and cofactors []. MLL5 had been thought to be specific for mono- and dimethylation of 'Lys-4' of histone H3 (H3K4me1 and H3K4me2; which are specific tags for epigenetic transcriptional activation), and a component of an MLL5 complex, but the paper describing this alleged complex has been retracted by the authors []. MLL5 mediates hematopoietic cell homeostasis, spermatogenesis, cell cycle progression, and survival [, ]. It is recruited to gene-rich euchromatic regions via the interaction of its plant homeodomain finger with the histone mark H3K4me3 []. Overexpression of the MLL5 gene induced cell cycle arrest in G(1) phase [].
Protein Domain
Type: Domain
Description: Post-translational modifications of the core histones H2A, H2B, H3 and H4 play an important part in chromatin biology. Histone H4 lysine 20 methylation (H4K20me) is critical for the biological processes that ensure genome integrity, such as DNA damage repair, DNA replication and chromatin compaction []. Suv4-20 family members are a group of histone H4K20 methyltransferases that play an important role in epigenetic regulation. Lower eukaryotes have a single Suv4-20, while mammals have two closely related Suv4-20 paralogs, SUV420H1 and SUV420H2 []. In mammals H4K20me2 is generated primarily by SUV420H1/KMT5B (which can also produce H4K20me3), while H4K20me3 is mostly generated by SUV420H2/KMT5C []. This entry represents the SET domain found in KMT5C (also known as SUV420H2). KMT5C acts as an upstream epigenetic regulator of epithelial/mesenchymal state control [].
Protein Domain
Type: Domain
Description: Post-translational modifications of the core histones H2A, H2B, H3 and H4 play an important part in chromatin biology. Histone H4 lysine 20 methylation (H4K20me) is critical for the biological processes that ensure genome integrity, such as DNA damage repair, DNA replication and chromatin compaction []. Suv4-20 family members are a group of histone H4K20 methyltransferases that play an important role in epigenetic regulation. Lower eukaryotes have a single Suv4-20, while mammals have two closely related Suv4-20 paralogs, SUV420H1 and SUV420H2 []. In mammals H4K20me2 is generated primarily by SUV420H1/KMT5B (which can also produce H4K20me3), while H4K20me3 is mostly generated by SUV420H2/KMT5C []. This entry represents the SET domain found in Suv4-20 from lower eukaryotes [].
Protein Domain
Type: Domain
Description: Post-translational modifications of the core histones H2A, H2B, H3 and H4 play an important part in chromatin biology. Histone H4 lysine 20 methylation (H4K20me) is critical for the biological processes that ensure genome integrity, such as DNA damage repair, DNA replication and chromatin compaction []. Suv4-20 family members are a group of histone H4K20 methyltransferases that play an important role in epigenetic regulation. Lower eukaryotes have a single Suv4-20, while mammals have two closely related Suv4-20 paralogs, SUV420H1 and SUV420H2 []. In mammals H4K20me2 is generated primarily by SUV420H1/KMT5B (which can also produce H4K20me3), while H4K20me3 is mostly generated by SUV420H2/KMT5C []. This entry represents the SET domain found in KMT5B (also known as SUV420H1). Suv420h1 knockout mice die perinatally and show overall growth retardation []. In humans KMT5B may affect neurodevelopment and has been linked to Mental retardation, autosomal dominant 51 [, ].
Protein Domain
Type: Domain
Description: This entry represents the SET domain found in SETD2 from animals, ASHH2 from plants and Set2 from fungi. Proteins containing this domain are a group of histone methyltransferases that methylate histone H3 to form H3K36me [, ]. Yeast Set2 is involved in transcription elongation as well as in transcription repression []. The methyltransferase activity of budding yeast Set2 requires recruitment to RNA polymerase II, which is CTK1 dependent [, , , , , , ]. Plant ASHH2 is required for the correct expression of genes essential to reproductive development []. SETD2 acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate [, ]. SETD2 is also required for DNA double-strand break repair and activation of the p53-mediated checkpoint []. SETD2 inactivation has been linked to tumour development []. SETD2 also methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy []. Moreover, SETD2 is also involved in interferon-alpha-induced antiviral defense by mediating both monomethylation of STAT1 at 'Lys-525' and catalyzing H3K36me3 on promoters of some interferon-stimulated genes (ISGs) to activate gene transcription []. SETD2 has been linked to several human diseases, including Renal cell carcinoma (RCC) [], Luscan-Lumish syndrome (LLS) [], Leukemia, acute lymphoblastic (ALL) [] and Leukemia, acute myelogenous (AML) [, ].
Protein Domain
Type: Family
Description: The function of SETD9 is not clear. Sequence and structure-based models suggest it could be a methyltransferase [].
Protein Domain
Type: Domain
Description: E or 'early' set domains are associated with the catalytic domain of galactose oxidase at the C-terminal end. Galactose oxidase is an extracellular monomeric enzyme which catalyzes the stereospecific oxidation of a broad range of primary alcohol substrates, and possesses a unique mononuclear copper site essential for catalyzing a two-electron transfer reaction during the oxidation of primary alcohols to corresponding aldehydes. The second redox active centre necessary for the reaction was found to be situated at a tyrosine residue. The C-terminal domain of galactose oxidase may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end, and may be involved in homodimeric/tetrameric/dodecameric interactions. Members of this family include members of the alpha amylase family, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase, among others [, , , , ].
Protein Domain
Type: Domain
Description: Histone-lysine N-methyltransferase ATXR3, also known as protein SET DOMAIN GROUP 2 (SDG2), is a histone methyltransferase specifically required for trimethylation of Lys-4 of histone H3 (H3K4me3) and is crucial for both sporophyte and gametophyte development [, ]. This domain includes the POST_SET domain located at the C terminus of SDG2, although there is no significant sequence homology with other Arabidopsis SDGs outside of the SET domain [].
Protein Domain
Type: Family
Description: This entry represents a group of yeast lysine methyltransferases, including Efm1 and Rkm1. They are S-adenosylmethionine-dependent lysine methyltransferases. Rkm1 monomethylates ribosomal protein S18 (RPS18A and RPS18B) at 'Lys-48' and dimethylates ribosomal protein L23 (RPL23A and RPL23B) at 'Lys-106' and 'Lys-110' []. Efm1 monomethylates elongation factor 1-alpha (TEF1/TEF2) at 'Lys-30' [, ].
Protein Domain
Type: Domain
Description: This is the C-terminal domain found in PvuRts1I, a modification-dependent restriction endonuclease that recognizes 5-hydroxymethylcytosine (5hmC) as well as 5-glucosylhydroxymethylcytosine (5ghmC) in double-stranded DNA in bacteria. Structural analysis indicates that it has the typical SRA (SET and RING associated) domain fold [].
Protein Domain
Type: Domain
Description: This entry represents the N-terminal Early set domain found in GlgB. GlgB is a glycogen branching enzyme responsible for the transfer of chains of approximately seven alpha-(1,4)-linked glucosyl residues to other similar chains (in new alpha-(1,6) linkages) in the biosynthesis of glycogen []. The branching enzyme is responsible for the degree of alpha-(1,6) branch linkages found in polysaccharides []. These E or "early" set domains are associated with the catalytic domain of glycogen branching enzymes at the N-terminal end. The N-terminal domain of the 1,4 alpha glucan branching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions [].
Protein Domain
Type: Domain
Description: This entry represents the N-terminal Early set domain found in bacterial glycogen debranching enzymes and isoamylases from bacteria and plants. Glycogen debranching enzymes have 4-alpha-glucanotransferase activity, which transfers a segment of a 1,4-alpha-D-glucan to a new 4-position in an acceptor, which may be glucose or another 1,4-alpha-D-glucan, and amylo-1,6-glucosidase activity, which catalyses the endohydrolysis of 1,6-alpha-D-glucoside linkages at points of branching in chains of 1,4-linked alpha-D-glucose residues []. Isoamylase is one of the starch-debranching enzymes that catalyse the hydrolysis of alpha-1,6-glucosidic linkages specific in alpha-glucans such as amylopectin or glycogen [, ]. These E or "early" set domains are associated with the catalytic domain of glycogen branching enzymes at the N-terminal end. The N-terminal domain of the 1,4 alpha glucan branching enzyme may be related to the immunoglobulin and/or fibronectin type III superfamilies. These domains are associated with different types of catalytic domains at either the N-terminal or C-terminal end and may be involved in homodimeric/tetrameric/dodecameric interactions [].
Protein Domain
Type: Homologous_superfamily
Description: This superfamily contains a set of hypothetical bacterial proteins.
Protein Domain
Type: Family
Description: This family represents a set of probable methyltransferases.
Protein Domain
Type: Family
Description: This subgroup is named after the two known reaction types in this set of proteins.
Protein Domain
Type: Family
Description: This entry represents a set of uncharacterised proteins found in bacteria and archaea.
Protein Domain
Type: Family
Description: This set of hypothetical proteins is produced by prokaryotes belonging to the genus Bacillus.
Protein Domain
Type: Domain
Description: This domain is found in a set of hypothetical bacterial proteins.
Protein Domain
Type: Family
Description: This entry represents a set of hypothetical bacterial and archaeal proteins.
Protein Domain
Type: Family
Description: This domain is found in a set of hypothetical bacterial proteins.
Protein Domain
Type: Family
Description: This entry is a set of hypothetical proteins. Their function has not been described yet.
Protein Domain
Type: Family
Description: This entry represents a set of uncharacterised bacterial proteins that are likely integral membrane proteins.
Protein Domain
Type: Domain
Description: SPK is a domain of unknown function that is found in some SET and PHD domain-containing proteins and protein kinases.
Protein Domain
Type: Repeat
Description: This repeat is found in tandem in a set of lipoproteins. The alignment contains a Y-X4-D motif.
Protein Domain
Type: Domain
Description: Members of this entry are a set of hypothetical archaeal proteins. Their exact function has not, as yet, been defined.
Protein Domain
Type: Family
Description: This family is a set of glycosyl hydrolase enzymes including cycloisomaltooligosaccharide glucanotransferase (EC 2.4.1.-) and dextranase activities.
Protein Domain
Type: Domain
Description: This entry represents a set of SLT-like domains found in proteobacteria. SLT domains containing proteins are enzymes that degrade murein.
Protein Domain
Type: Family
Description: This entry represents a set of protein sequences found in Plasmodium species. An interesting feature is five perfectly conserved Trp residues.
Protein Domain
Type: Domain
Description: This entry represents a domain found at the C terminus of a set of single-stranded DNA-specific exonucleases, including RecJ. Its function has not, as yet, been determined.
Protein Domain
Type: Domain
Description: This domain is found N-terminal to the SET domain in the EZH2 protein []. It is a zinc-binding domain.
Protein Domain
Type: Homologous_superfamily
Description: This superfamily represents the C-terminal of a presumed intracellular domain found in a set of bacterial presumed transporter proteins. The region is about 160 amino acids in length.
Protein Domain
Type: Domain
Description: This domain is likely to be the 'Class I' region just N-terminal to the first set of transmembrane helices that is involved in 1,3-beta-glucan synthesis itself [].
Protein Domain
Type: Family
Description: This family contains a diverse set of enzymes including: enoyl-CoA hydratase, 1,4-dihydroxy-2-naphthoyl-CoA synthase (napthoate synthase), carnitinyl-CoA dehydratase (carnitine racemase), 3-hydroxybutyryl-CoA dehydratase and enoyl-CoA delta isomerase (dodecanoyl-CoA delta-isomerase).
Protein Domain
Type: Domain
Description: This entry represents a domain found in uncharacterised fungal proteins that contain a set of C-terminal transmembrane helices. This entry may be related to .
Protein Domain
Type: Repeat
Description: This motif occurs in a small set of bacterial proteins. It has two transmembrane regions, and often occurs as tandem repeats. There are no conserved catalytic residues.
Protein Domain
Type: Domain
Description: This conserved region identifies a set of hypothetical protein sequences from the Metazoa and Ascomycota which include SHQ1 from Saccharomyces cerevisiae.
Protein Domain
Type: Family
Description: This set of conserved hypothetical proteins has a phylogenetic range that closely matches that of , and has a putative C-terminal protein targeting signal.
Protein Domain
Type: Domain
Description: This entry represents a predicted glycosyltransferase domain found in a set of prokaryotic proteins that includes putative glucosyltransferases involved in bacterial capsule biosynthesis [, ].
Protein Domain
Type: Family
Description: This set of bacteriophage proteins has no known function, though it has been suggested that they may function as tail proteins [].
Protein Domain
Type: Family
Description: This entry represents a small set of proteins with weak similarity to the sequences in Pfam family , which describes the cytochrome C oxidase subunit IV [].
Protein Domain
Type: Family
Description: This family belongs to the larger set of probable enzymes modeled by . Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterised.
Protein Domain
Type: Family
Description: This family belongs to the larger set of probable enzymes defined by . Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterised.
Protein Domain
Type: Family
Description: This family belongs to the larger set of probable enzymes defined by . Members are found primarily in the Actinobacteria (Mycobacterium, Streptomyces, etc.). The family is uncharacterised.
Protein Domain
Type: Domain
Description: This domain is currently found in Streptomyces bacteria, in a set of bacterial proteins with no known function. Most proteins contain two copies of this domain [].
Protein Domain
Type: Family
Description: Members of this family are found in a set of hypothetical bacterial proteins. Their exact function has not, as yet, been determined.
Protein Domain
Type: Domain
Description: This domain of unknown function is found in a limited set of Bradyrhizobium proteins. There appears to be a periodic -DG- motif in the domain.
Protein Domain
Type: Domain
Description: This region is a presumed intracellular domain found in a set of bacterial presumed transporter proteins. The region is about 160 amino acids in length.
Protein Domain
Type: Homologous_superfamily
Description: This entry is a set of hypothetical proteins. Their function has not been described yet. The structure contains two similar beta-x-beta(4) motifs swapped with the first strands.
Protein Domain
Type: Family
Description: This entry includes ASHR2 from Arabidopsis. ASHR2 (also known as SDG39) contains a disrupted SET domain in which the N-terminal one-third of the SET domain is separated from the C-terminal two-thirds of the domain by 50 to 120 amino acids [].
Protein Domain
Type: Family
Description: Synonym: UDP-galactose 4-epimerase. UDP-glucose 4-epimerase () interconverts UDP-glucose and UDP-galactose, which are precursors of glucose- and galactose-containing exopolysaccharides (EPS) []. Arabidopsis thaliana has five genes encoding functional UDP-D-glucose/UDP-D-galactose 4-epimerase []. A set of related proteins, some of which are tentatively identified as UDP-glucose-4-epimerase in Thermotoga maritima, Bacillus halodurans, and several archaea, but deeply branched from this set and lacking experimental evidence, are not included in this family.
Protein Domain
Type: Family
Description: Phr peptides are short peptides, best conserved in their amino-terminal regions, that are almost always encoded immediately downstream of a Rap phosphatase. A portion of the Phr peptide is secreted, enters another cell, and forms a quorum-sensing system by inhibiting its Rap phosphatase partner. The set of Phr peptides recognised by this entry is different from the PhrC/PhrF set recognised by [].
Protein Domain
Type: Family
Description: SET domain methyltransferases can be involved both in translational and transcriptional roles. N-lysine methyltransferase SETD6 is a SET domain protein that specifically monomethylates 'Lys-310' of the RELA subunit of NF-kappa-B complex, leading to down-regulation of NF-kappa-B transcription factor activity []. Homologues in yeast monomethylate 60S ribosomal protein L42 (RPL42A and RPL42B) at 'Lys-55' [, ].
Protein Domain
Type: Domain
Description: This entry comprises a diverse set of domains related to the Glyoxalase domain. The exact specificity of these proteins is uncertain. Proteins containing this domain include CFP32 from Mycobacterium tuberculosis [] and DnrV from Streptomyces peucetius [].
Protein Domain
Type: Family
Description: This entry describes a set of dehydrogenases belonging to the glucose-methanol-choline oxidoreductase (GMC oxidoreductase) family. Members of the family are restricted to Actinobacterial genome contexts containing also members of families and , and are proposed to be uniform in function.
Protein Domain
Type: Domain
Description: This entry represents the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have a RhoGEF domain () at their C-terminal end.
Protein Domain
Type: Repeat
Description: This motif is found singly or as up to five tandem repeats in a small set of bacterial proteins. There are two or three α-helices, and possibly a β-strand.
Protein Domain
Type: Family
Description: This family represents the phage-tail-tube protein from a set of Siphoviridae from Gammaproteobacteria. Tail tube proteins polymerise with the assistance of the Tail-tip complex, a tape measure protein and two chaperones. Infectivity of host is delivered through the tube [].
Protein Domain
Type: Repeat
Description: This entry represents a set of repeats named ALF (Alanine-rich, AL - conserved Phenylalanine, F), which are found in a small family of secreted proteins of no known function. They may be involved in signal transduction [].
Protein Domain
Type: Family
Description: This entry represents a set of hypothetical bacterial proteins containing a core of six α-helices, where one central helix is surrounded by the other five. The exact function of this family has not, as yet, been determined [].
Protein Domain
Type: Family
Description: This protein family was first noted as a paralogous set in Porphyromonas gingivalis, but it is more widely distributed among the Bacteroidetes. The protein family is now renamed GLPGLI after its best-conserved motif.
Protein Domain
Type: Domain
Description: This entry represents the pseudonuclease domain (PND) from the FANCM protein []. This domain is part of the PD(D/E)XK superfamily but does not appear to have a full set of catalytic residues [].
Protein Domain
Type: Homologous_superfamily
Description: This domain is found in a set of hypothetical bacterial proteins that have an N-terminal domain related to the glycoside hydrolase family 57 (). The exact function of this domain has not, as yet, been defined.
Protein Domain
Type: Domain
Description: This domain is often found in the N-terminal region of proteins carrying the SET domain, such as the SETDB1 protein from humans. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3 [].
Protein Domain
Type: Repeat
Description: This entry represents a 40 residue repeat that is often found in tandem in a small set of bacterial cell surface proteins. The function of this region is not known.
Protein Domain
Type: Family
Description: This entry represents a group of fungal mitochondrial proteins, known as Pet127, that stimulate mitochondrial RNA degradation [, ]. Pet127 has been classified as part of the PD-(D/E)XK nuclease superfamily including a full set of active site residues [].
Protein Domain
Type: Family
Description: This family consists of a set of at least 17 paralogous proteins in Myxococcus xanthus (strain DK 1622). Members are about 200 amino acids in length. No other homologues are known; the function is unknown.
Protein Domain
Type: Family
Description: This set of proteins includes PP_3335 from Pseudomonas putida and AZL_007950, a member of a putative biosynthetic cluster from Azospirillum sp. B510. The function of these proteins is unknown.
Protein Domain
Type: Family
Description: 4-aminobutyrate aminotransferase eukaryotic () is a class III pyridoxal-phosphate-dependent aminotransferase. The enzyme catalyses the conversion of 4-aminobutanoate and 2-oxoglutarate into succinate semialdehyde and L-glutamate. The degree of sequence difference between this set and known bacterial examples is greater than the distance between either set and the most similar enzyme with distinct function, and so the prokaryotic and eukaryotic sets have been placed into separate families. This family describes known eukaryotic examples of the enzyme. Alternate names include GABA transaminase, gamma-amino-N-butyrate transaminase, and beta-alanine--oxoglutarate aminotransferase.
Protein Domain
Type: Domain
Description: This region is found in a number of histone lysine methyltransferases (HMTase), N-terminal to the SET domain; it is generally described as the pre-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils and stabilising the SET domain. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site [] when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [].
Protein Domain
Type: Family
Description: This entry includes a group of histone methyltransferases. The enzyme activity has been mapped to the SET domain [], originally identified in Drosophila melanogaster (Fruit fly) Su(var)3-9, E(z) and Trithorax proteins []. The mixed lineage leukemia (MLL) gene encodes a very large nuclear protein homologous to Drosophila trithorax (Trx). MLL is required for the proper maintenance of HOX gene expression during development and hematopoiesis []. The Trithorax group of proteins has been implicated in a variety of processes, including centromeric and telomeric silencing and cell-cycle regulation []. The sequence conservation pattern and structural analysis of the SET domain provide clues regarding possible active site residues. There are three conserved sequence motifs in most of the SET domain, two of which had been reported earlier. The N-terminal motif (I) has characteristic glycines. The central motif (II) has a distinct pattern of polar and charged residues (Asn, His). The C-terminal motif (III) has a characteristic dyad of polar residues and the hydrophobic residue tyrosine.
Protein Domain
Type: Domain
Description: This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain. Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception [], the HMTases belong to the SET family that can be classified according to the sequences surrounding the SET domain [, ]. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities []. The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils. The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site []. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity [].
Protein Domain
Type: Family
Description: Members of this entry represent a set of proteins related to, yet architecturally different from, the activating protein for the glycine radical-containing, oxygen-sensitive ribonucleoside-triphosphate reductase (RNR, see ). Members of this entry are found paired with members of a similarly divergent set of anaerobic ribonucleoside-triphosphate reductases. Identification of these proteins as RNR activating proteins is partly from pairing with the candidate RNR and further supported by the finding that upstream of these operons are examples of a conserved regulatory element that is found in nearly all bacteria and that occurs specifically upstream of operons for all three classes of RNR genes [].
Protein Domain
Type: Family
Description: The proteins in this set represent a paralogous family of Plasmodium yoelii genes preferentially located in the subtelomeric regions of the chromosomes. These genes are generally very short (ca. 50 residues). There are no obvious homologues to these genes in any other organism.
Protein Domain
Type: Family
Description: This is a set of proteins that share low levels of sequence similarity but similar lengths and similar patterns of charged, hydrophobic, and Gly/Pro residues. Most members belong to phage of Gram-positive bacteria. Several are identified as phage major tail proteins.
Protein Domain
Type: Family
Description: This set of sequences describes a small family of uncharacterised proteins only found so far in alpha and gamma proteobacteria and in the cyanobacterium Anabaena sp. (strain PCC 7120). The gene for this protein is associated with nitrogenase genes. This family shows sequence similarity to glutaredoxin-dependent arsenate reductase that converts arsenate to arsenite for disposal.
Protein Domain
Type: Family
Description: This entry represents a set of phage proteins that are typically about 400-500 amino acids in length, although some members are considerably shorter. An article on Methanobacterium phage psiM2 () calls the member from that phage, ORF9, a putative large terminase subunit, and ORF8 a candidate terminase small subunit. Most proteins in this family have an apparent P-loop nucleotide-binding sequence toward the N terminus.
Protein Domain
Type: Domain
Description: This domain, Associated With SET, of unknown function is found in eukaryotic proteins of unknown function. This domain, as the name suggests, is often found in association with the SET domain (), suggesting a role in gene regulation by methylation of lysine residues in histones and other proteins.
Protein Domain
Type: Family
Description: This family represents one of two closely related orthologous sets of proteins that, so far, are found only in, but are universal among, the Archaea. This orthologue set includes MJ1210 from Methanocaldococcus jannaschii (Methanococcus jannaschii) and AF0525 from Archaeoglobus fulgidus, but not MJ0106 or AF1251. The proteins are of unknown function.
Protein Domain
Type: Family
Description: This set of DNA-binding proteins is found exclusively in the archaea and shows homology to the origin recognition complex subunit 1/cell division control protein 6 (Orc1/Cdc6) family in eukaryotes. Several members may be found in a genome and interact with each other. The Cdc6/Orc1 protein from the archaeon Pyrococcus furiosus specifically binds to the oriC region [].