
Examples

  • Search this entire website. Enter identifiers, names or keywords for genes, diseases, strains, ontology terms, etc. (e.g. Pax6, Parkinson, ataxia)
  • Use OR to search for either of two terms (e.g. OR mus) or quotation marks to search for phrases (e.g. "dna binding").
  • Boolean search syntax is supported, e.g. Balb* for partial matches, or mus AND NOT embryo to exclude a term.

Search results 401 to 464 out of 464 for Set

Category restricted to ProteinDomain (x)


Categories

Category: ProteinDomain
Type Details Score
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. NGFI-B is an early response protein and orphan member of the steroid hormone receptor superfamily. It is expressed in the lung, brain and superior cervical ganglia, and high levels are also seen in adrenal tissue. While members of this superfamily typically bind DNA as dimers, NGFI-B binds as a monomer. A domain separate from the NGFI-B zinc fingers (the so-called A box) has been identified and is required for recognition of two adenine-thymidine base pairs at the 5' end of the NGFI-B DNA binding element.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. Human NOR-1 mRNA has been detected in adult heart and skeletal muscle, as well as in foetal brain, indicating that its expression is not restricted to events that occur during neural development. It has been shown that in a skeletal myxoid chondrosarcoma, the EWS gene becomes fused to NOR1. The chimaeric EWS-NOR gene encodes an EWS-NOR fusion protein in which the C-terminal RNA-binding domain of EWS is replaced by the entire NOR protein, comprising a long N-terminal domain, a central DNA binding domain and a C-terminal ligand-binding/dimerisation domain.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. Retinoic acid-related orphan receptors (RORs) are orphan NRs related to retinoic acid receptors and include ROR-alpha, ROR-beta and ROR-gamma, which are also referred to as RORA, RORB and RORC. ROR-alpha, ROR-beta and ROR-gamma regulate circadian rhythms, with ROR-alpha playing the central role. ROR-alpha has a key role in the development of the cerebellum. ROR-beta is necessary for the proliferation and differentiation of retinal cells. ROR-gamma is required for lymph-node organogenesis.
Protein Domain
Type: Family
Description: The CRISPR-Cas system is a prokaryotic defense mechanism against foreign genetic elements. The key elements of this defense system are the Cas proteins and the CRISPR RNA. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a family of DNA direct repeats separated by regularly sized non-repetitive spacer sequences that are found in most bacterial and archaeal genomes. CRISPRs appear to provide acquired resistance against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). The defense reaction is divided into three stages. In the adaptation stage, the invader DNA is cleaved, and a piece of it is selected to be integrated as a new spacer into the CRISPR locus, where it is stored as an identity tag for future attacks by this invader. During the second stage (the expression stage), the CRISPR RNA (pre-crRNA) is transcribed and subsequently processed into the mature crRNAs. In the third stage (the interference stage), Cas proteins, together with crRNAs, identify and degrade the invader. The CRISPR-Cas systems have been sorted into three major types. In CRISPR-Cas types I and III, the mature crRNA is generally generated by a member of the Cas6 protein family. Whereas in type III systems the Cas6 protein acts alone, in some type I systems it is part of a complex of Cas proteins known as Cascade (CRISPR-associated complex for antiviral defense). The Cas6 protein is an endoribonuclease necessary for crRNA production, whereas the additional Cas proteins that form the Cascade complex are needed for crRNA stability. This entry represents a Cas family of proteins that includes TM1792 from Thermotoga maritima. It is part of the broad RAMP superfamily collection of CRISPR-associated proteins. It is the fourth of a recurring set of six proteins, four of which are in the RAMP superfamily, that we designate the CRISPR RAMP module.
Protein Domain
Type: Family
Description: This entry includes the dishevelled (Dsh) proteins and dixin (also known as coiled-coil-DIX1, Ccd1). The Wnt signal transduction mechanism requires dishevelled protein (Dsh), a cytoplasmic phosphoprotein that acts directly downstream of frizzled. In addition to its role in Wnt signalling, Dsh is also involved in generating planar polarity in Drosophila and has been implicated in the Notch signal transduction cascade. Furthermore, Dsh plays a vital molecular role in neural tube closure. Disruption of dishevelled signalling in Xenopus and inactivation of the genes encoding dishevelled 1 and dishevelled 2 (Dvl1 and Dvl2) in the mouse yield similar phenotypes, in which the neural tube fails to close. Three human and mouse homologues of Dsh have been cloned (DVL-1 to 3); it is believed that these proteins, like their Drosophila counterpart, are involved in signal transduction. Human and murine orthologues share more than 95% sequence identity and are each 40-50% identical to Drosophila Dsh. Sequence similarity amongst Dsh proteins is concentrated around three conserved domains: at the N terminus lies a DIX domain (mutations mapping to this region reduce or completely disrupt Wg signalling); a PDZ (or DHR) domain, often found in proteins involved in protein-protein interactions, lies within the central portion of the protein (point mutations within this module have been shown to have little effect on Wg-mediated signal transduction); and a DEP domain is located towards the C terminus and is conserved among a set of proteins that regulate various GTPases. Whilst genetic and molecular assays have shown the DEP module to be dispensable for Wg signalling, it is thought to be important in planar polarity signalling in flies. Ccd1 is another DIX domain-possessing protein that forms complexes with the Dishevelled homologue Dvl and Axin. Ccd1 is a positive regulator of Wnt signalling. It regulates JNK activation by AXIN1 and DVL2.
Protein Domain
Type: Homologous_superfamily
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists). Annexins are absent from yeasts and prokaryotes. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long. Each individual annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication.
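The 'type 2' calcium-binding motif quoted above, 'GxGT-[38 residues]-D/E', can be written as a regular expression. The following is a minimal sketch, assuming that 'x' stands for any single residue and that the 38-residue gap is exact; the function name and the demo sequence are hypothetical and made up for illustration, not taken from a real annexin.

```python
import re

# 'GxGT-[38 residues]-D/E' as a regex: G, any residue, G, T,
# exactly 38 residues of any kind, then D or E.
# The lookahead lets overlapping hits be reported as well.
TYPE2_CA_MOTIF = re.compile(r"(?=(G.GT.{38}[DE]))")


def find_type2_motifs(sequence):
    """Return (start, matched_text) for every hit, using 0-based positions."""
    return [(m.start(), m.group(1))
            for m in TYPE2_CA_MOTIF.finditer(sequence.upper())]


# Synthetic sequence built only to exercise the pattern (assumption).
demo = "MK" + "GLGT" + "A" * 38 + "D" + "KK"
for start, hit in find_type2_motifs(demo):
    print(start, len(hit), hit)
```

A pattern like this only flags candidate sites; it says nothing about whether a matched stretch actually binds calcium.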
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important super-family of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed "orphan" receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The glucocorticoid receptor consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain. The N-terminal domain is unique to the glucocorticoid receptors; it spans the first 440 residues, and is primarily responsible for transcriptional activation. The smaller (around 65 residues), highly-conserved central portion of the protein is the DNA binding domain, which plays a role in DNA binding specificity, homo-dimerisation and in interactions with other proteins. The hormone binding domain comprises approximately 250 residues at the C terminus of the receptor. This domain mediates receptor activity via interaction with heat shock proteins and cyclophilins, or with hormone.
Protein Domain
Type: Family
Description: Secretion of virulence factors in Gram-negative bacteria involves transportation of the protein across two membranes to reach the cell exterior. Four principal exotoxin secretion systems have been described. In the type II and IV secretion systems, toxins are first exported to the periplasm by way of a cleaved N-terminal signal sequence; a second set of proteins is used for extracellular transport (type II), or the C terminus of the exotoxin itself is used (type IV). Type III secretion involves at least 20 molecules that assemble into a needle; effector proteins are then translocated through this without need of a signal sequence. In the Type I system, a complete channel is formed through both membranes, and the secretion signal is carried on the C terminus of the exotoxin. The RTX (repeats in toxin) family of cytolytic toxins belong to the Type I secretion system, and are important virulence factors in Gram-negative bacteria. As well as the C-terminal signal sequence, several glycine-rich repeats are also found. These are essential for binding calcium, and are critical for the biological activity of the secreted toxins. All RTX toxin operons exist in the order rtxCABD, RtxA protein being the structural component of the exotoxin, both RtxB and D being required for its export from the bacterial cell; RtxC is an acyl-carrier-protein-dependent acyl-modification enzyme, required to convert RtxA to its active form. Escherichia coli haemolysin (HlyA) is often quoted as the model for RTX toxins. Recent work on its relative rtxC gene product HlyC has revealed that it provides the acylation aspect for post-translational modification of two internal lysine residues in the HlyA protein. Other residues, including His23 and two conserved tyrosine residues, also appear to be important.
Protein Domain
Type: Domain
Description: The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins that share common structural features and have been shown to be guanine nucleotide exchange factors (GEFs) for Rab GTPases, which are regulators of practically all membrane trafficking events in eukaryotes. The tripartite DENN domain is composed of three distinct modules which are always associated due to functional and/or structural constraints: upstream DENN or uDENN, the better conserved central or core or cDENN, and downstream or dDENN regions. The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. Some proteins known to contain a tripartite DENN domain are listed below:
  • Rat Rab3 GDP/GTP exchange protein (Rab3GEP).
  • Human mitogen-activated protein kinase activating protein containing death domain (MADD). It is orthologous to Rab3GEP.
  • Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP.
  • Mouse Rab6 interacting protein 1 (Rab6IP1).
  • Human SET domain-binding factor 1 (SBF1).
  • Human suppressor of tumorigenicity 5 (ST5).
  • Human C-MYC promoter-binding protein IRLB.
The DENN domain forms a heart-shaped structure, with the N-terminal residues forming one lobe and the C-terminal residues forming the second one. The N-terminal half forms the uDENN domain and consists of a central antiparallel β-sheet layered between one helix and two helices. A long random-coil region links the two lobes. The C-terminal lobe is composed of the cDENN and dDENN domains. The cDENN domain is an alpha/beta three layered sandwich domain with a central sheet of 5-strands. The dDENN domain is an all-alpha helical domain, whose core contains two alpha-hairpins which diverge rapidly in sequence. This domain represents the entire tripartite DENN domain.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. Liver X receptors (LXRs) are nuclear receptors that regulate the metabolism of several important lipids, including oxysterols. There are two LXR isoforms, termed alpha and beta, which, upon activation, form heterodimers with retinoid X receptors and bind to an LXR response element found in the promoter region of their target genes. In addition to their involvement in lipid metabolism, LXRs also act as key regulators of macrophage function, and have roles in inflammation and immunity.
Protein Domain
Type: Family
Description: NR0B1 (also known as DAX-1) is an orphan nuclear receptor involved in the development and maintenance of the steroid hormone pathway. It also plays a role in the development of the embryo and maintenance of pluripotent embryonic stem cells. Mutations of the DAX-1 gene cause X-linked adrenal hypoplasia congenita (XL-AHC), a developmental disorder of the adrenal gland that results in profound hormonal deficiencies and is lethal if untreated. NR0B2 lacks a conventional DNA binding domain (DBD) and represses the transcriptional activity of various nuclear receptors. Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Protein Domain
Type: Homologous_superfamily
Description: Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in the growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells. Any increase in temperature causes the dissociation of the HrcA from the CIRCE complex with the concomitant activation of transcription of the groE and dnaK operons. This superfamily represents the inserted dimerising domain of HrcA.
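The CIRCE element is described above as a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. A minimal sketch of how such an element could be located in a DNA string follows; it assumes that "inverted repeat" means the right arm is the reverse complement of the left arm, and the function names and the demo sequence are hypothetical, constructed only for illustration.

```python
# Map each base to its complement (plain A/C/G/T only; an assumption).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(dna, arm=9, spacer=9):
    """Return (start, left_arm, right_arm) for every perfect inverted repeat.

    A hit is a left arm of `arm` bases, any `spacer` bases, and a right arm
    that equals the reverse complement of the left arm.
    """
    dna = dna.upper()
    total = 2 * arm + spacer
    hits = []
    for i in range(len(dna) - total + 1):
        left = dna[i:i + arm]
        right = dna[i + arm + spacer:i + total]
        if right == reverse_complement(left):
            hits.append((i, left, right))
    return hits


# Illustrative sequence: flank + 9 bp arm + 9 bp spacer + matching arm + flank.
demo = "GG" + "TTAGCACTC" + "ACGTACGTA" + "GAGTGCTAA" + "CC"
print(find_inverted_repeats(demo))
```

The scan requires a perfect match between the arms, as the description states for B. subtilis; tolerating mismatches would need a different comparison.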
Protein Domain
Type: Family
Description: Like all apoptotic cell death, T cell receptor (TCR)-mediated death can be divided into two phases: an inductive phase and an effector phase. The effector phase includes a sequence of steps that are common to apoptosis in many cell types, which, if not interrupted, will lead to cell death. The induction phase, which often requires the expression of new genes, consists of a set of signals that activate the effector phase. Outside the thymus, most, if not all, of the TCR-mediated apoptosis of mature T cells (sometimes referred to as activation-induced cell death (AICD)) is induced through the surface antigen Fas pathway: activation through the TCR induces expression of the Fas (CD95) ligand (FasL); the expression of FasL on either a neighbouring cell, or on the Fas-bearing cell, induces trimerisation of Fas, which then initiates a signal-transduction cascade, leading to apoptosis of the Fas-bearing cell. This commitment stage requires the activation of key death-inducing enzymes, termed caspases, which act by cleaving proteins that are essential for cell survival and proliferation. Fas is also known to be essential in the death of hyperactivated peripheral CD4+ cells: in the absence of Fas, mature peripheral T cells do not die, but the activated cells continue to proliferate, producing cytokines that lead to grossly enlarged lymph nodes and spleen. Fas belongs to the tumour necrosis factor receptor (TNFR) family of cysteine-rich type I membrane receptors; its ligand (FasL) is expressed on activated lymphocytes, NK cells, platelets, certain immune-privileged cells and some tumour cells. Defects in the Fas-FasL system are associated with various disease syndromes. Mice with non-functional Fas or FasL display characteristics of lymphoproliferative disorder, such as lymphadenopathy, splenomegaly, and elevated secretion of IgM and IgG. These mice also secrete anti-DNA autoantibodies and rheumatoid factor.
Protein Domain
Type: Family
Description: The bacterial core RNA polymerase complex, which consists of five subunits, is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. RNA polymerase recruits alternative sigma factors as a means of switching on specific regulons. Most bacteria express a multiplicity of sigma factors. Two of these factors, sigma-70 (gene rpoD), generally known as the major or primary sigma factor, and sigma-54 (gene rpoN or ntrA) direct the transcription of a wide variety of genes. The other sigma factors, known as alternative sigma factors, are required for the transcription of specific subsets of genes. With regard to sequence similarity, sigma factors can be grouped into two classes, the sigma-54 and sigma-70 families. Sequence alignments of the sigma-70 family members reveal four conserved regions that can be further divided into subregions, e.g. sub-region 2.2, which may be involved in the binding of the sigma factor to the core RNA polymerase; and sub-region 4.2, which seems to harbor a DNA-binding 'helix-turn-helix' motif involved in binding the conserved -35 region of promoters recognised by the major sigma factors. The plastids of higher plants originating from an ancestral cyanobacterial endosymbiont also contain sigma factors that are encoded by a small family of nuclear genes. All plastid sigma factors belong to the superfamily of sigmaA/sigma70 and have sequences homologous to the conserved regions 1.2, 2, 3, and 4 of bacterial sigma factors. This entry describes sigma-70 factors in Myxococcus xanthus (strain DK 1622) and in other members of the Myxococcales. Each of the six members in M. xanthus is encoded near a gene for a predicted serine/threonine kinase. Members of this family show sequence similarity to members of Pfam family (region 4 of sigma-70 like sigma-factors), a helix-turn-helix family in which trusted and noise cutoffs deliberately are set artificially high and which therefore has many false negatives.
Protein Domain
Type: Domain
Description: Zinc finger (Znf) domains are relatively small protein motifs which contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not; instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target. This entry represents a region, which contains a CXXCX(19)CXXC motif suggestive of both zinc fingers and thioredoxin, usually found at the N terminus of prokaryotic proteins. One partially characterised gene, agmX, is among a large set in Myxococcus whose interruption affects adventurous gliding motility.
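The CXXCX(19)CXXC motif mentioned above reads as two Cys-X-X-Cys pairs separated by exactly 19 residues. As a rough sketch, assuming 'X' means any residue, the positions of the four cysteines in each candidate site can be listed as below; the function name and the demo sequence are hypothetical and invented for illustration.

```python
import re

# CXXCX(19)CXXC: C, two residues, C, nineteen residues, C, two residues, C.
# Each cysteine sits in its own group so its position can be reported.
CXXC_PAIR = re.compile(r"(?=(C)..(C).{19}(C)..(C))")


def cysteine_positions(sequence):
    """Return one tuple of four 1-based cysteine positions per motif hit."""
    return [tuple(m.start(g) + 1 for g in range(1, 5))
            for m in CXXC_PAIR.finditer(sequence.upper())]


# Synthetic N-terminal stretch used only to exercise the pattern (assumption).
demo = "MS" + "CPHC" + "G" * 19 + "CAKC" + "LL"
print(cysteine_positions(demo))
```

Reporting residue numbers rather than the matched text makes it easier to compare hits against an alignment or a structure.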
Protein Domain
Type: Family
Description: Galanin is a peptide hormone that controls various biological activities []. Galanin-like immuno-reactivity has been found in the central and peripheral nervous systems of mammals, with high concentrations demonstrated in discrete regions of the central nervous system, including the median eminence, hypothalamus, arcuate nucleus, septum, neuro-intermediate lobe of the pituitary, and the spinal cord. Its localisation within neurosecretory granules suggests that galanin may function as a neurotransmitter, and it has been shown to coexist with a variety of other peptide and amine neurotransmitters within individual neurons [].Although the precise physiological role of galanin is uncertain, it has a number of pharmacological properties: it stimulates food intake, when injected into the third ventricle of rats; it increases levels of plasma growth hormone and prolactin, and decreases dopamine levels in the median eminence []; and infusion into humans results in hyperglycemia and glucose intolerance, and inhibits pancreatic release of insulin, somatostatin and pancreatic peptide. Galanin also modulates smooth muscle contractility within the gastro-intestinal and genito-urinary tracts, all such activities suggesting that the hormone may play an important role in the nervous modulation of endocrine and smooth muscle function [].This family represents the 124 amino acid precursor protein to galanin. The precursor includes a signal peptide, galanin (29 amino acids), and a 60-amino acid galanin mRNA-associated peptide. In the precursor, galanin includes a C-terminal glycine and is flanked on each side by dibasic tryptic cleavage sites. The deduced amino acid sequence of rat galanin is 90% similar to porcine galanin, with all three amino acid differences in the C-terminal heptapeptide. The predicted galanin mRNA-associated peptide includes a 35-amino acid sequence that is 78% similar to the previously reported porcine analogue. 
This sequence is set off by a single basic tryptic cleavage site and includes a 17-amino acid region that is nearly identical to the porcine counterpart. The high interspecies conservation suggests a biological role for this putative peptide.
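As a rough consistency check on the figures quoted in this entry, three differing residues in the 29-residue galanin peptide correspond to roughly 90%. This assumes that "similar" here means position-by-position identity, which the entry does not state:

```latex
\[
\frac{29 - 3}{29} = \frac{26}{29} \approx 0.897 \approx 90\%
\]
```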
Protein Domain
Type: Domain
Description: Interferon (IFN)-gamma is a dimeric glycoprotein produced by activated T cells and natural killer cells. Although originally isolated based on its antiviral activity, IFN-gamma also displays powerful anti-proliferative and immunomodulatory activities, which are essential for developing appropriate cellular defences against a variety of infectious agents. The first step in eliciting these responses is the specific high affinity interaction of IFN-gamma with its cell-surface receptor (IFN-gammaRalpha); the complex then interacts with at least one of a family of additional species-specific accessory factors (AF-1 or IFN-gammabeta), which convey different cellular responses. One such response is the association and phosphorylation of two protein tyrosine kinases (Jak-1 and Jak-2), which in turn stimulate nuclear transcription activators []. This entry includes: The human IFN-gamma receptor 1 (IFN-gammaR1), a member of the hematopoietic cytokine receptor superfamily. It is expressed in a membrane-bound form in many cell types, and is over-expressed in tumour cells. It comprises an extracellular portion of 229 residues, a single transmembrane region, and a cytoplasmic domain of 221 residues. As with other members of its superfamily, the cytokine-binding sites are formed by a small set of closely-spaced surface loops that extend from a β-sheet core, much like antigen-binding sites on antibodies. The extracellular IFN-gammaR monomer comprises two domains (D1 and D2 domains), each resembling an Ig-like fold with fibronectin type III topology [, , ]. The signalling complex comprises two IFN-gammaR1 chains and two IFN-gammaR2 chains, which dimerise in an IFN-gamma-driven fashion []. The vaccinia virus interferon (IFN)-gamma receptor (IFN-gammaR) is a 43kDa soluble glycoprotein that is secreted from infected cells early during infection. 
IFN-gammaR from vaccinia virus, cowpox virus and camelpox virus exist naturally as homodimers, whereas the cellular IFN-gammaR dimerizes only upon binding the homodimeric IFN-gamma. The existence of the virus protein as a dimer in the absence of ligand may provide an advantage to the virus in efficient binding and inhibition of IFN-gamma in solution []. This is the D2 domain, which is involved in forming receptor-receptor contacts [].
Protein Domain
Type: Domain
Description: Prokaryotic cells have a defence mechanism against a sudden heat-shock stress. Commonly, they induce a set of proteins that protect cellular proteins from being denatured by heat. Among such proteins are the GroE and DnaK chaperones whose transcription is regulated by a heat-shock repressor protein HrcA. HrcA is a winged helix-turn-helix repressor that negatively regulates the transcription of dnaK and groE operons by binding the upstream CIRCE (controlling inverted repeat of chaperone expression) element. In Bacillus subtilis this element is a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. The crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima has been reported at 2.2 Å resolution. HrcA is composed of three domains: an N-terminal winged helix-turn-helix domain (WHTH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel β-sheet composed of three β-strands sided by four α-helices. HrcA crystallises as a dimer, which is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains []. The structural studies suggest that the inactive form of HrcA is the dimer and this is converted to its DNA-binding form by interaction with GroEL, which binds to a conserved C-terminal sequence region [, ]. Comparison of the HrcA-CIRCE complexes from B. subtilis and Bacillus thermoglucosidasius (Geobacillus thermoglucosidasius), which grow at vastly different ranges of temperature, shows that the thermostability profiles were consistent with the difference in the growth temperatures, suggesting that HrcA can function as a thermosensor to detect temperature changes in cells []. Any increase in temperature causes the dissociation of the HrcA from the CIRCE complex with the concomitant activation of transcription of the groE and dnaK operons. 
This entry represents the C terminus of HrcA, consisting of the GAF-like domain with the inserted dimerising domain.
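The CIRCE element is described in this entry as a perfect 9 base pair inverted repeat separated by a 9 base pair spacer. That means the right-hand arm is the reverse complement of the left-hand arm. The sketch below checks this property for a candidate 27-nucleotide site. The function names are hypothetical, and the example site uses the commonly cited CIRCE consensus arms (TTAGCACTC and GAGTGCTAA) with an unspecified spacer, which is an assumption not taken from this page.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_perfect_inverted_repeat(site, arm=9, spacer=9):
    """True if the last `arm` bases are the reverse complement of the first `arm` bases."""
    site = site.upper()
    if len(site) != 2 * arm + spacer:
        return False
    left, right = site[:arm], site[-arm:]
    return right == reverse_complement(left)


# Assumed consensus arms; the spacer is left as N because its sequence is not specified here.
circe_like = "TTAGCACTC" + "N" * 9 + "GAGTGCTAA"
result = is_perfect_inverted_repeat(circe_like)
```

Only the two arms are compared, so the spacer content does not affect the result. A single mismatch in either arm makes the check fail, which matches the word "perfect" in the description.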
Protein Domain
Type: Homologous_superfamily
Description: Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while iron-only (FeFe) hydrogenases tend to be more involved in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters []. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are about 1.2 nm apart, but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore, hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold []. The localisation of iron hydrogenases can be cytoplasmic or periplasmic. 
Periplasmic iron hydrogenases in Desulfovibrio consist of a large subunit (HydA) and a small subunit (HydB) [].
Protein Domain
Type: Domain
Description: Basement membranes are sheet-like extracellular matrices found at the basal surfaces of epithelia and condensed mesenchyma. By preventing cell mixing and providing a cell-adhesive substrate, they play crucial roles in tissue development and function. Basement membranes are composed of an evolutionarily ancient set of large glycoproteins, which includes members of the laminin family, collagen IV, perlecan and nidogen/entactin []. Nidogen/entactin is an important basement membrane component, which promotes cell attachment, neutrophil chemotaxis, trophoblast outgrowth, and angiogenesis. It interacts with many other basement membrane proteins, such as collagen, perlecan and laminin, and has a potential role in the assembly and connection of networks. It consists of three globular regions, G1-G3. G1 and G2 are connected by a thread-like structure, whereas that between G2 and G3 is rod-like [, ]. The nidogen G2 region binds to collagen IV and perlecan. The nidogen G2 structure is composed of two domains, an N-terminal EGF-like domain and a much larger β-barrel domain of ~230 residues. The nidogen G2 β-barrel consists of an 11-stranded β-barrel of complex topology, the interior of which is traversed by the hydrophobic, predominantly alpha helical segment connecting strands C and D. The N-terminal half of the barrel comprises two β-meanders (strands A-C and D-F) linked by the buried α-helical segment. The polypeptide chain then crosses the bottom of the barrel and forms a five-stranded Greek key motif in the C-terminal half of the domain. Helix alpha3 caps the top of the barrel and forms the interface to the EGF-like domain. The nidogen G2 β-barrel domain has unexpected structural similarity to green fluorescent proteins of Cnidaria, suggesting that they derive from a common ancestor. A large surface patch on the barrel surface is strikingly conserved in all metazoan nidogens. 
Site-directed mutagenesis demonstrates that the residues in the conserved patch are involved in the binding of perlecan, and possibly also of collagen IV []. A similar domain is also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions [].
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The oestrogen receptors (ERs) are steroid or nuclear hormone receptors that act as transcription regulators involved in diverse physiological functions. Oestrogen receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. 
The ER consists of three functional and structural domains: an N-terminal modulatory domain, a highly conserved DNA-binding domain that recognises specific sequences, and a C-terminal ligand-binding domain. This entry represents oestrogen receptors and oestrogen-related receptors, which are members of the subfamily 3 of nuclear receptors []. Oestrogen-related receptors (ERR-alpha, ERR-beta, and ERR-gamma) are orphan nuclear receptors whose physiological ligands have not yet been identified. Although ERRs are closely related to oestrogen receptors (ERs), they do not respond to oestrogens [].
Protein Domain
Type: Family
Description: The band-7 protein family comprises a diverse set of membrane-bound proteins characterised by the presence of a conserved domain, the band-7 domain, also known as SPFH or PHB domain. The exact function of the band-7 domain is not known, but examples from animal and bacterial stomatin-type proteins demonstrate binding to lipids and the ability to assemble into membrane-bound oligomers that form putative scaffolds []. A variety of proteins belong to the band-7 family. These include the stomatins, prohibitins, flotillins and the HflK/C bacterial proteins. Eukaryotic band 7 proteins tend to be oligomeric and are involved in membrane-associated processes. Stomatins are involved in ion channel function, prohibitins are involved in modulating the activity of a membrane-bound FtsH protease and the assembly of mitochondrial respiratory complexes, and flotillins are involved in signal transduction and vesicle trafficking []. Stomatin, also known as human erythrocyte membrane protein band 7.2b [], was first identified in the band 7 region of human erythrocyte membrane proteins. It is an oligomeric, monotopic membrane protein associated with cholesterol-rich membranes/lipid rafts. Human stomatin is ubiquitously expressed in all tissues; highly in hematopoietic cells, relatively low in brain. It is associated with the plasma membrane and cytoplasmic vesicles of fibroblasts, epithelial and endothelial cells []. Stomatin is believed to be involved in regulating monovalent cation transport through lipid membranes. Absence of the protein in hereditary stomatocytosis is believed to be the reason for the leakage of Na+ and K+ ions into and from erythrocytes []. Stomatin is also expressed in mechanosensory neurons, where it may interact directly with transduction components, including cation channels []. Stomatin proteins have been identified in various organisms, including Caenorhabditis elegans. There are nine stomatin-like proteins in C. elegans, MEC-2 being the one best characterised []. In mammals, other stomatin family members are stomatin-like proteins SLP1, SLP2 and SLP3, and NPHS2 (podocin), which display selective expression patterns []. Stomatin family members are oligomeric, mostly localise to membrane domains, and in many cases have been shown to modulate ion channel activity. The stomatins and prohibitins, and to a lesser extent flotillins, are highly conserved protein families and are found in a variety of organisms ranging from prokaryotes to higher eukaryotes, whereas HflK and HflC homologues are only present in bacteria []. This entry represents the stomatins and stomatin-like proteins, including podocin, from a wide range of eukaryotes, bacteria, archaea and viruses. It excludes the HflK and HflC proteins, prohibitins and flotillins.
Protein Domain
Type: Family
Description: This group represents QueF-like proteins, closely related to QueF/YkvM but containing an additional N-terminal domain. They are predicted to function as NADPH-dependent nitrile oxidoreductases based on sequence similarity, and to catalyse the NADPH-dependent reduction of 7-cyano-7-deazaguanine to 7-aminomethyl-7-deazaguanine, a late step in the biosynthesis of queuosine, a 7-deazaguanine modified nucleoside found in tRNA(GUN) of bacteria and eukaryotes. Queuosine (Q) is an example of a highly modified nucleoside located in the anticodon wobble position 34 of tRNAs specific for Tyr, His, Asp, and Asn. With few exceptions (such as yeast and mycoplasma), it is widely distributed in most prokaryotic and eukaryotic phyla []. Q is based on a very unusual 7-deazaguanosine core, which is further modified by addition of a cyclopentendiol ring []. This group of proteins belongs to the T fold structural superfamily and is related to GTP cyclohydrolase FolE. QueF-like proteins form two groups: type I proteins, exemplified by Bacillus subtilis YkvM, and type II proteins, exemplified by Escherichia coli YqcD. The type I proteins are comparable in size with bacterial and mammalian FolE, whereas the type II proteins are larger and are predicted to be comprised of two domains, similar to plant FolE []. In members of this entry, the N-terminal domain has often been annotated as a membrane-spanning domain, but transmembrane prediction programs run on YqcD do not detect any transmembrane segments []. Instead, the QueF motif can be easily detected in this domain, whereas the flanking and invariant cysteine and glutamate residues (Cys-190 and Glu-230 in E. coli YqcD) are only present in the C-terminal domain. The splitting of active-site residues between the two domains of YqcD is very similar to that seen in two-domain FolE, in which neither domain contains the full set of active site residues nor is active when expressed separately. 
Further, the pattern of active-site splitting is the same in both proteins, with a similarly located conserved central sequence motif split from two flanking sequences, which are 40 residues apart. The splitting of the YqcD active site suggests that a gene duplication occurred, with each domain retaining some of the residues of the putative active site []. As in two-domain FolE, such a duplication event and redistribution of active-site residues could allow the YqcD proteins to evolve a simpler quaternary structure than the QueF proteins [].
Protein Domain
Type: Homologous_superfamily
Description: Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries: equine arteritis virus (EAV), porcine reproductive and respiratory syndrome virus (PRRSV), mouse lactate dehydrogenase-elevating virus (LDV) and simian hemorrhagic fever virus. The arterivirus replicase gene is composed of two open reading frames (ORFs). ORF1a is translated directly from the genomic RNA, whereas ORF1b can be expressed only by ribosomal frameshifting, yielding a 1ab fusion protein. Both replicase gene products are multidomain precursor proteins which are proteolytically processed into functional nonstructural proteins (nsps) by a complex proteolytic cascade that is directed by four (PRRSV/LDV) or three (EAV) proteinase domains encoded in ORF1a. The arterivirus replicase processing scheme involves the rapid autoproteolytic release of two or three N-terminal nsps (nsp1 (or nsp1alpha/1beta) and nsp2) and the subsequent processing of the remaining polyproteins by the "main protease" residing in nsp4, together resulting in a set of 13 or 14 individual nsps. The arterivirus nsp1 region contains a tandem of papain-like cysteine autoprotease domains (PCPalpha and PCPbeta), but in EAV PCPalpha has lost its enzymatic activity, resulting in the 'merge' of nsp1alpha and nsp1beta into a single nsp1 subunit. Thus, instead of three self-cleaving N-terminal subunits, EAV has two: nsp1 and nsp2. The PCPalpha and PCPbeta domains mediate the nsp1alpha|1beta and nsp1beta|2 cleavages, respectively. The catalytic dyad of PCPalpha and PCPbeta domains is composed of Cys and His residues. In EAV, a Lys residue is found in place of the catalytic Cys residue, which explains the proteolytic deficiency of the EAV PCPalpha domain [, , , ]. 
The PCPalpha and PCPbeta domains form peptidase families C31 and C32, respectively. The PCPalpha and PCPbeta domains have a typical papain fold, which consists of a compact global region containing sequentially connected left (L) and right (R) parts in a so-called standard orientation. The L subdomain of PCPalpha consists of four α-helices, while the R subdomain is formed by three antiparallel β-strands []. The L subdomain of PCPbeta consists of three α-helices, while the R subdomain is formed by four antiparallel β-strands []. The Cys and His residues face each other at the L-R interface and form the catalytic centre of the PCPalpha and PCPbeta domains [, ]. This entry represents the PCPbeta domain (peptidase C32) superfamily.
Protein Domain
Type: Homologous_superfamily
Description: Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries: equine arteritis virus (EAV), porcine reproductive and respiratory syndrome virus (PRRSV), mouse lactate dehydrogenase-elevating virus (LDV) and simian hemorrhagic fever virus. The arterivirus replicase gene is composed of two open reading frames (ORFs). ORF1a is translated directly from the genomic RNA, whereas ORF1b can be expressed only by ribosomal frameshifting, yielding a 1ab fusion protein. Both replicase gene products are multidomain precursor proteins which are proteolytically processed into functional nonstructural proteins (nsps) by a complex proteolytic cascade that is directed by four (PRRSV/LDV) or three (EAV) proteinase domains encoded in ORF1a. The arterivirus replicase processing scheme involves the rapid autoproteolytic release of two or three N-terminal nsps (nsp1 (or nsp1alpha/1beta) and nsp2) and the subsequent processing of the remaining polyproteins by the "main protease" residing in nsp4, together resulting in a set of 13 or 14 individual nsps. The arterivirus nsp1 region contains a tandem of papain-like cysteine autoprotease domains (PCPalpha and PCPbeta), but in EAV PCPalpha has lost its enzymatic activity, resulting in the 'merge' of nsp1alpha and nsp1beta into a single nsp1 subunit. Thus, instead of three self-cleaving N-terminal subunits, EAV has two: nsp1 and nsp2. The PCPalpha and PCPbeta domains mediate the nsp1alpha|1beta and nsp1beta|2 cleavages, respectively. The catalytic dyad of PCPalpha and PCPbeta domains is composed of Cys and His residues. In EAV, a Lys residue is found in place of the catalytic Cys residue, which explains the proteolytic deficiency of the EAV PCPalpha domain [, , , ]. 
The PCPalpha and PCPbeta domains form MEROPS peptidase families C31 and C32, respectively. The PCPalpha and PCPbeta domains have a typical papain fold, which consists of a compact global region containing sequentially connected left (L) and right (R) parts in a so-called standard orientation. The L subdomain of PCPalpha consists of four α-helices, while the R subdomain is formed by three antiparallel β-strands []. The L subdomain of PCPbeta consists of three α-helices, while the R subdomain is formed by four antiparallel β-strands []. The Cys and His residues face each other at the L-R interface and form the catalytic centre of the PCPalpha and PCPbeta domains [, ]. This entry represents the PCPalpha domain (peptidase C31) superfamily.
Protein Domain
Type: Family
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents Type VI annexins that are found in various secretory cells, e.g. 
B- and T-cells (where it is found in greater concentrations in mature cells), and the lactation ducts of non-lactating human breasts. The observation that the protein is absent in lactating breasts suggests that it inhibits secretion. The type VI class may also play a part in the regulation of some calcium channels, and its presence may cause arrest of cell growth, before the DNA-replication stage, in cells growing at low serum concentrations. This annexin class is unusual in containing eight repeats of the conserved domain rather than the usual four. It is thus believed that the protein has arisen from a gene duplication event.
Protein Domain
Type: Family
Description: Wnt proteins constitute a large family of secreted signalling molecules that are involved in intercellular signalling during development. The name derives from the first 2 members of the family to be discovered: int-1 (mouse) and wingless (Wg) (Drosophila) []. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer's disease []. Wnt signal transduction proceeds initially via binding to their cell surface receptors - the so-called frizzled proteins. This activates the signalling functions of B-catenin and regulates the expression of specific genes important in development []. More recently, however, several non-canonical Wnt signalling pathways have been elucidated that act independently of B-catenin. In both cases, the transduction mechanism requires dishevelled protein (Dsh), a cytoplasmic phosphoprotein that acts directly downstream of frizzled []. In addition to its role in Wnt signalling, Dsh is also involved in generating planar polarity in Drosophila and has been implicated in the Notch signal transduction cascade. Three human and mouse homologues of Dsh have been cloned (DVL-1 to 3); it is believed that these proteins, like their Drosophila counterpart, are involved in signal transduction. 
Human and murine orthologues share more than 95% sequence identity and are each 40-50% identical to Drosophila Dsh. Sequence similarity amongst Dsh proteins is concentrated around three conserved domains: at the N terminus lies a DIX domain (mutations mapping to this region reduce or completely disrupt Wg signalling); a PDZ (or DHR) domain, often found in proteins involved in protein-protein interactions, lies within the central portion of the protein (point mutations within this module have been shown to have little effect on Wg-mediated signal transduction); and a DEP domain is located towards the C terminus and is conserved among a set of proteins that regulate various GTPases (whilst genetic and molecular assays have shown this module to be dispensable for Wg signalling, it is thought to be important in planar polarity signalling in flies []). Therefore the requirement of these domains for distinct signalling pathways varies: the DIX domain is essential for B-catenin activation, the DEP domain is implicated in the activation of the JNK pathway, while the PDZ domain is required for both [].
Protein Domain
Type: Homologous_superfamily
Description: The immunoglobulin (Ig) like fold, which consists of a β-sandwich of seven or more strands in two sheets with a greek-key topology, is one of the most common protein modules found in animals. Many different unrelated proteins share an Ig-like fold, which is often involved in interactions, commonly with other Ig-like domains via their β-sheets []. Of these, the "early" set (E set) domains are possibly related to the immunoglobulin and/or fibronectin type III Ig-like protein superfamilies. Ig-like E set domains include:
  • C-terminal domain of certain transcription factors, such as the pro-inflammatory transcription factor NF-kappaB, and the T-cell transcription factors NFAT1 and NFAT5 [].
  • Ig-like domains of sugar-utilising enzymes, such as galactose oxidase (C-terminal domain), sialidase (linker domain), and maltogenic amylase (N-terminal domain).
  • C-terminal domain of arthropod haemocyanin, where many loops are inserted into the fold. These proteins act as dioxygen-transporting proteins.
  • C-terminal domain of class II viral fusion proteins. These envelope glycoproteins are responsible for membrane fusion with target cells during viral invasion.
  • Cytomegaloviral US (unique short) proteins. These type I membrane proteins help suppress the host immune response by modulating surface expression of MHC class I molecules [].
  • Molybdenum-containing oxidoreductase-like dimerisation domain found in enzymes such as sulphite reductase.
  • ML domains found in cholesterol-binding epididymal secretory protein E1, and in a major house-dust mite allergen; ML domains are implicated in lipid recognition, particularly the recognition of pathogen-related products.
  • Rho-GDI-like signalling proteins, which regulate the activity of small G proteins [].
  • Cytoplasmic domain of inward rectifier potassium channels such as Girk1 and Kirbac1.1. These channels act as regulators of excitability in eukaryotic cells.
  • N-terminal domain of transglutaminases, including coagulation factor XIII; many loops are inserted into the fold in these proteins. These proteins act to catalyse the cross-linking of various protein substrates [].
  • Filamin repeat rod domain found in proteins such as the F-actin cross-linking gelation factor ABP-120. These proteins interact with a variety of cellular proteins, acting as signalling scaffolds [].
  • Arrestin family of proteins, which contain a tandem repeat of two elaborated Ig-like domains contacting each other head-to-head. These proteins are key to the redirection of GPCR signals to alternative pathways [].
  • C-terminal domain of arginine-specific cysteine proteases, such as Gingipain-R, which act as major virulence factors of Porphyromonas gingivalis (Bacteroides gingivalis).
  • Copper-resistance proteins, such as CopC, which act as copper-trafficking proteins [].
  • Cellulosomal scaffoldin proteins, such as CipC module x2.1. These proteins act as scaffolding proteins of cellulosomes, which contain cellulose-degrading enzymes [].
  • Quinohaemoprotein amine dehydrogenases (A chain), which contain a tandem repeat of two Ig-like domains. These proteins function in electron transfer reactions.
  • Internalin Ig-like domains, which are truncated and fused to a leucine-rich repeat domain. These proteins are required for host cell invasion of Listeria species.
Protein Domain
Type: Family
Description: Pseudouridine synthases catalyse the isomerisation of uridine to pseudouridine (Psi) in a variety of RNA molecules, and may function as RNA chaperones. Pseudouridine is the most abundant modified nucleotide found in all cellular RNAs. There are five distinct families of pseudouridine synthases that share no global sequence similarity, but which do share the same fold of their catalytic domain(s) and uracil-binding site and are descended from a common molecular ancestor. The catalytic domain consists of two subdomains, each of which has an α+β structure that has some similarity to the ferredoxin-like fold (note: some pseudouridine synthases contain additional domains). The active site is the most conserved structural region of the superfamily and is located between the two homologous domains. These families are [, ]:
  • Pseudouridine synthase I, TruA.
  • Pseudouridine synthase II, TruB, which contains an additional C-terminal PUA domain.
  • Pseudouridine synthase RsuA. RluB, RluE and RluF are also part of this family.
  • Pseudouridine synthase RluA. TruC, RluC and RluD belong to this family.
  • Pseudouridine synthase TruD, which has a natural circular permutation in the catalytic domain, as well as an insertion of a family-specific α+β subdomain.
TruB is responsible for the pseudouridine residue present in the T loops of virtually all tRNAs. TruB recognises the preformed 3-D structure of the T loop primarily through shape complementarity. It accesses its substrate uridyl residue by flipping out the nucleotide and disrupts the tertiary structure of tRNA []. This model is built on a seed alignment of bacterial proteins only. Saccharomyces cerevisiae protein YNL292w (Pus4) has been shown to be the pseudouridine 55 synthase of both cytosolic and mitochondrial compartments, active at no other position on tRNA and the only enzyme active at that position in the species. 
A distinct yeast protein, YLR175w (centromere/microtubule-binding protein CBF5), is an rRNA pseudouridine synthase, and the archaeal set is much more similar to CBF5 than to Pus4. It is unclear whether the archaeal proteins found by this model are tRNA pseudouridine 55 synthases like TruB, rRNA pseudouridine synthases like CBF5, or (as suggested by the absence of paralogs in the Archaea) both. CBF5 likely has additional, eukaryotic-specific functions.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The oestrogen receptors (ERs) are steroid or nuclear hormone receptors that act as transcription regulators involved in diverse physiological functions. Oestrogen receptors function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. 
The ER consists of three functional and structural domains: an N-terminal modulatory domain, a highly conserved DNA-binding domain that recognises specific sequences, and a C-terminal ligand-binding domain. The N-terminal modulatory domain spans the first 180 residues and contains the activation function 1 (AF1) region. Nuclear receptors differ considerably with respect to AF1 activity and regulation, as it is a poorly conserved region []. There is another activation function region, namely AF2, which resides in the C-terminal end of the ligand-binding domain. Transcription activation is facilitated by both AF1 and AF2, which appear to act synergistically in the ER complex [, ]. For example, the ER can recruit TIF2 (transcription intermediary factor 2) via the AF1 and AF2 regions, whose synergistic action results in the activation of transcription.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands []. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner [, ]. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly-conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The retinoic acid (retinoid X) receptor consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone binding domain. 
The N-terminal domain differs between retinoic acid isoforms; the small highly-conserved DNA-binding domain (~65 residues) occupies the central portion of the protein; and the ligand binding domain lies at the receptor C terminus. This entry represents retinoid X receptors. It also represents hepatocyte nuclear factor 4 (HNF4), which is a nuclear receptor protein expressed in the liver and kidney, and functions as a key regulator of many metabolic pathways. HNF4 was originally classified as an orphan receptor. Linoleic acid has now been identified as the endogenous ligand for HNF4 in mammalian cells [].
Protein Domain
Type: Family
Description: The lipocalins are a diverse, interesting, yet poorly understood family of proteins composed, in the main, of extracellular ligand-binding proteins displaying high specificity for small hydrophobic molecules [, ]. Functions of these proteins include transport of nutrients, control of cell regulation, pheromone transport, cryptic colouration and the enzymatic synthesis of prostaglandins. The crystal structures of several lipocalins have been solved and show a novel 8-stranded anti-parallel β-barrel fold well conserved within the family. Sequence similarity within the family is at a much lower level and would seem to be restricted to conserved disulphides and 3 motifs, which form a juxtaposed cluster that may act as a common cell surface receptor site []. By contrast, at the more variable end of the fold are found an internal ligand binding site and a putative surface for the formation of macromolecular complexes []. The anti-parallel β-barrel fold is also exploited by the fatty acid-binding proteins (which function similarly by binding small hydrophobic molecules), by avidin and the closely related metalloprotease inhibitors, and by triabin. Similarity at the sequence level, however, is less obvious, being confined to a single short N-terminal motif. The lipocalin family can be subdivided into kernel and outlier sets. The kernel lipocalins form the largest self-consistent group, comprising the subfamily of tick histamine-binding proteins. The outlier lipocalins form several smaller distinct subgroups: the OBPs, the von Ebner's gland proteins, alpha-1-acid glycoproteins, tick histamine binding proteins and the nitrophorins. The tick histamine binding proteins are the most recently identified set of outlier lipocalins. The structure of one tick histamine binding protein has been solved [] and has shown the proteins to have the characteristic lipocalin fold but without any appreciable sequence similarity. 
The tick histamine binding proteins are secreted into the saliva of the ixodid tick Rhipicephalus appendiculatus and share functional similarity with the nitrophorins, sequestering histamine at the wound site. Because the tick histamine binding proteins outcompete histamine receptors, they are able to overcome host inflammatory and immune responses. This enables the ticks to feed for extended periods, lasting from days to several weeks, and to gorge themselves on large blood meals, increasing their body mass 100-fold. Unlike nitrophorins, the tick proteins do not bind haem (or other cofactor), but ligate histamine directly in two rigid orthogonally-arranged binding sites, at opposing ends of the lipocalin anti-parallel β-barrel, which have an unusually polar character.
Protein Domain
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups []. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence []. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF) [, , , , ]. GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice []. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs []. The rhodopsin-like GPCRs (GPCRA) represent a widespread protein family that includes hormone, neurotransmitter and light receptors, all of which transduce extracellular signals through interaction with guanine nucleotide-binding (G) proteins. Although their activating ligands vary widely in structure and character, the amino acid sequences of the receptors are very similar and are believed to adopt a common structural framework comprising 7 transmembrane (TM) helices [, , ]. The human APJ gene, which encodes this receptor, was originally cloned in 1993 using a set of primers based on the 7 conserved TM domains. 
The putative sequence is closest in terms of identity (40-50% in the TM regions) to the angiotensin receptor (AT1); however, angiotensin II shows no affinity for the receptor []. It is a receptor for the hormones apelin receptor early endogenous ligand (APELA) and apelin (APLN), and is coupled to G proteins that inhibit adenylate cyclase activity []. The mature transcript encodes a preproprotein that yields a 13 amino acid active peptide from the C-terminal end. Apelin has a similar mRNA distribution to angiotensin II and the active peptides share some similarity. It plays a role in regulation of blood vessel formation, blood pressure, heart contractility and heart failure [, , ].
Protein Domain
Type: Domain
Description: Wnt proteins constitute a large family of secreted signalling molecules that are involved in intercellular signalling during development. The name derives from the first 2 members of the family to be discovered: int-1 (mouse) and wingless (Wg) (Drosophila) []. It is now recognised that Wnt signalling controls many cell fate decisions in a variety of different organisms, including mammals. Wnt signalling has been implicated in tumourigenesis, early mesodermal patterning of the embryo, morphogenesis of the brain and kidneys, regulation of mammary gland proliferation and Alzheimer's disease []. Wnt signal transduction proceeds initially via binding to their cell surface receptors, the so-called frizzled proteins. This activates the signalling functions of β-catenin and regulates the expression of specific genes important in development []. More recently, however, several non-canonical Wnt signalling pathways have been elucidated that act independently of β-catenin. In both cases, the transduction mechanism requires dishevelled protein (Dsh), a cytoplasmic phosphoprotein that acts directly downstream of frizzled []. In addition to its role in Wnt signalling, Dsh is also involved in generating planar polarity in Drosophila and has been implicated in the Notch signal transduction cascade. Three human and mouse homologues of Dsh have been cloned (DVL-1 to 3); it is believed that these proteins, like their Drosophila counterpart, are involved in signal transduction. 
Human and murine orthologues share more than 95% sequence identity and are each 40-50% identical to Drosophila Dsh. Sequence similarity amongst Dsh proteins is concentrated around three conserved domains: at the N terminus lies a DIX domain (mutations mapping to this region reduce or completely disrupt Wg signalling); a PDZ (or DHR) domain, often found in proteins involved in protein-protein interactions, lies within the central portion of the protein (point mutations within this module have been shown to have little effect on Wg-mediated signal transduction); and a DEP domain is located towards the C terminus and is conserved among a set of proteins that regulate various GTPases (whilst genetic and molecular assays have shown this module to be dispensable for Wg signalling, it is thought to be important in planar polarity signalling in flies []). The requirement of these domains therefore varies between signalling pathways: the DIX domain is essential for β-catenin activation, the DEP domain is implicated in the activation of the JNK pathway, while the PDZ domain is required for both []. This entry represents a domain found at the C terminus of Dsh proteins.
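The identity figures quoted above (more than 95% between human and murine orthologues, 40-50% against Drosophila Dsh) are percentages of matching residues in an alignment. The sketch below shows one common way such a figure can be computed. It assumes the two sequences are already aligned to equal length with "-" for gaps and that gapped columns are excluded from the denominator; other conventions exist, and the toy sequences are invented for illustration, not real Dsh or DVL sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of two pre-aligned sequences.

    Assumption: both inputs have the same length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Invented toy sequences (hypothetical, for illustration only).
toy_a = "MAETKIIYHL"
toy_b = "MAETKVIYHL"
toy_c = "MSE-KLIYQL"

identity_ab = percent_identity(toy_a, toy_b)
identity_ac = percent_identity(toy_a, toy_c)
```

Here `toy_a` and `toy_b` differ at one of ten columns, while `toy_a` and `toy_c` match at six of the nine ungapped columns.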
Protein Domain
Type: Family
Description: The SMC (structural maintenance of chromosomes) family of proteins exists in virtually all organisms, including bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. They function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression [, ]. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal domains separated by a long (circa 100 nm or 900 residues) coiled-coil segment, in the centre of which is a globular 'hinge' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which has been shown by mutational studies to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and a LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases []. All SMC proteins appear to form dimers, either forming homodimers, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers form core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMC dimers are arranged in an antiparallel alignment. 
This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. Proteins in this entry include SMC1/2/3/4 from Saccharomyces cerevisiae. SMC1-SMC3 heterodimer is part of the cohesin complex, which is required for sister chromatid cohesion in mitosis and meiosis []. SMC2-SMC4 heterodimer is part of the condensin complex, which is required for chromosome condensation during both mitosis and meiosis [, ].
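The sequence motifs named in this description can be written as simple patterns. The sketch below translates the Walker A pattern (GxxGxGKS/T), the Walker B-like DA-box (XXXXD) and the LSGG signature into regular expressions and reports where each occurs. The hydrophobic alphabet used for "X" is an assumption (the description does not define it), the function name is hypothetical, and the toy sequence is invented rather than taken from a real SMC protein.

```python
import re

# Walker A as written in the description: G-x-x-G-x-G-K-[S or T].
WALKER_A = re.compile(r"G..G.GK[ST]")

# Walker B-like DA-box: four hydrophobic residues followed by D.
# Assumption: this hydrophobic alphabet is illustrative, not from the source.
HYDROPHOBIC = "AVILMFWY"
WALKER_B = re.compile("[" + HYDROPHOBIC + "]{4}D")

# ABC-family signature mentioned in the description.
LSGG = re.compile(r"LSGG")


def find_smc_motifs(sequence):
    """Return 0-based start positions of each motif (non-overlapping scan)."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() for m in WALKER_B.finditer(seq)],
        "lsgg": [m.start() for m in LSGG.finditer(seq)],
    }


# Invented toy sequence (hypothetical, for illustration only).
toy = "MSTGPNGSGKSNLLDAIRFVLGEKKLSGGQRALVALD"
hits = find_smc_motifs(toy)
```

A pattern match only flags a candidate site; short motifs like these also occur by chance, so real annotation relies on profile models and structural context.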
Protein Domain
Type: Family
Description: The SMC (structural maintenance of chromosomes) family of proteins exists in virtually all organisms, including bacteria and archaea. The SMC proteins are essential for successful chromosome transmission during replication and segregation of the genome in all organisms. They function together with other proteins in a range of chromosomal transactions, including chromosome condensation, sister-chromatid cohesion, recombination, DNA repair and epigenetic silencing of gene expression []. SMCs are generally present as single proteins in bacteria, and as at least six distinct proteins in eukaryotes. The proteins range in size from approximately 110 to 170 kDa, and share a five-domain structure, with globular N- and C-terminal domains separated by a long (circa 100 nm or 900 residues) coiled-coil segment, in the centre of which is a globular 'hinge' domain, characterised by a set of four highly conserved glycine residues that are typical of flexible regions in a protein. The amino-terminal domain contains a 'Walker A' nucleotide-binding domain (GxxGxGKS/T), which has been shown by mutational studies to be essential in several proteins. The carboxy-terminal domain contains a sequence (the DA-box) that resembles a 'Walker B' motif (XXXXD, where X is any hydrophobic residue), and a LSGG motif with homology to the signature sequence of the ATP-binding cassette (ABC) family of ATPases []. All SMC proteins appear to form dimers, either forming homodimers, as in the case of prokaryotic SMC proteins, or heterodimers between different but related SMC proteins. The dimers form core components of large multiprotein complexes. The best known complexes are cohesin, which is responsible for sister-chromatid cohesion, and condensin, which is required for full chromosome condensation in mitosis. SMC dimers are arranged in an antiparallel alignment. 
This orientation brings the N- and C-terminal globular domains (from either different or identical protomers) together, which unites an ATP binding site (Walker A motif) within the N-terminal domain with a Walker B motif (DA box) within the C-terminal domain, to form a potentially functional ATPase. Protein interaction and microscopy data suggest that SMC dimers form a ring-like structure which might embrace DNA molecules. Non-SMC subunits associate with the SMC amino- and carboxy-terminal domains. This entry represents the SMC protein from bacteria and archaea [, , ].
Protein Domain
Type: Family
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner []. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists) []. Annexins are absent from yeasts and prokaryotes []. Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long []. Each individual annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition. Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. 
The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication []. This entry represents human annexin type VIII, which was first identified in 1989 []; the 327 amino acid protein was designated vascular anticoagulant (VAC) beta. It was found to be expressed in placenta, lung endothelia, skin, liver and kidney. It shares four internal repeats with other annexin family members, but its 5' coding DNA sequence and untranslated region are unique []. The type VIII gene was found to be selectively over-expressed in acute promyelocytic leukaemia, and differentially expressed by chondrocytes, suggesting a putative link with longitudinal growth of the vertebrate skeleton [].
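The 'type 2' calcium-binding motif quoted in this description, 'GxGT-[38 residues]-D/E', has a fixed spacing and so maps directly onto a regular expression. The following is a minimal sketch of such a scan; the function name is hypothetical and the toy sequence is constructed artificially to contain exactly one site, so it says nothing about real annexin VIII.

```python
import re

# G-x-G-T, then exactly 38 residues of any kind, then an acidic D or E.
TYPE2_MOTIF = re.compile(r"G.GT.{38}[DE]")


def type2_sites(sequence):
    """Return (motif_start, acidic_residue_index) pairs, 0-based."""
    seq = sequence.upper()
    return [(m.start(), m.end() - 1) for m in TYPE2_MOTIF.finditer(seq)]


# Artificial toy sequence (hypothetical): 2-residue prefix, GLGT,
# 38 filler residues, the acidic residue, and a short suffix.
toy = "MK" + "GLGT" + "A" * 38 + "E" + "LL"
sites = type2_sites(toy)
```

Because the description says repeats "usually" carry this motif, a sequence with no hit is not necessarily outside the family.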
Protein Domain
Type: Family
Description: The band-7 protein family comprises a diverse set of membrane-bound proteins characterised by the presence of a conserved domain, the band-7 domain, also known as SPFH or PHB domain. The exact function of the band-7 domain is not known, but examples from animal and bacterial stomatin-type proteins demonstrate binding to lipids and the ability to assemble into membrane-bound oligomers that form putative scaffolds []. A variety of proteins belong to the band-7 family. These include the stomatins, prohibitins, flotillins and the HflK/C bacterial proteins. Eukaryotic band 7 proteins tend to be oligomeric and are involved in membrane-associated processes. Stomatins are involved in ion channel function, prohibitins are involved in modulating the activity of a membrane-bound FtsH protease and the assembly of mitochondrial respiratory complexes, and flotillins are involved in signal transduction and vesicle trafficking []. Stomatin, also known as human erythrocyte membrane protein band 7.2b [], was first identified in the band 7 region of human erythrocyte membrane proteins. It is an oligomeric, monotopic membrane protein associated with cholesterol-rich membranes/lipid rafts. Human stomatin is expressed in all tissues, at high levels in hematopoietic cells and at relatively low levels in brain. It is associated with the plasma membrane and cytoplasmic vesicles of fibroblasts, epithelial and endothelial cells []. Stomatin is believed to be involved in regulating monovalent cation transport through lipid membranes. Absence of the protein in hereditary stomatocytosis is believed to be the reason for the leakage of Na+ and K+ ions into and from erythrocytes []. Stomatin is also expressed in mechanosensory neurons, where it may interact directly with transduction components, including cation channels []. Stomatin proteins have been identified in various organisms, including Caenorhabditis elegans. 
There are nine stomatin-like proteins in C. elegans, MEC-2 being the one best characterised []. In mammals, other stomatin family members are stomatin-like proteins SLP1, SLP2 and SLP3, and NPHS2 (podocin), which display selective expression patterns []. Stomatin family members are oligomeric; they mostly localise to membrane domains, and in many cases have been shown to modulate ion channel activity. The stomatins and prohibitins, and to a lesser extent flotillins, are highly conserved protein families and are found in a variety of organisms ranging from prokaryotes to higher eukaryotes, whereas HflK and HflC homologues are only present in bacteria []. This entry matches Stomatin, HflK and HflC proteins.
Protein Domain
Type: Family
Description: This family consists of (uracil-5-)-methyltransferases from bacteria, archaea and eukaryotes. They are class I-like SAM-binding methyltransferases. Methyltransferases (MTs) (EC 2.1.1.-) constitute an important class of enzymes present in every life form. They transfer a methyl group most frequently from S-adenosyl L-methionine (SAM or AdoMet) to a nucleophilic acceptor such as nitrogen, oxygen, sulfur or carbon leading to S-adenosyl-L-homocysteine (AdoHcy) and a methylated molecule. The substrates that are methylated by these enzymes cover virtually every kind of biomolecule, ranging from small molecules to lipids, proteins and nucleic acids. MTs are therefore involved in many essential cellular processes including biosynthesis, signal transduction, protein repair, chromatin regulation and gene silencing [, , ]. More than 230 different enzymatic reactions of MTs have been described so far, of which more than 220 use SAM as the methyl donor. A review published in 2003 [] divides all MTs into 5 classes based on the structure of their catalytic domain (fold):
  • class I: Rossmann-like α/β
  • class II: TIM beta/α-barrel α/β
  • class III: tetrapyrrole methylase α/β
  • class IV: SPOUT α/β
  • class V: SET domain all β
A more recent paper [] based on a study of the Saccharomyces cerevisiae methyltransferome argues for four more folds:
  • class VI: transmembrane all α
  • class VII: DNA/RNA-binding 3-helical bundle all α
  • class VIII: SSo0622-like α+β
  • class IX: thymidylate synthetase α+β
The vast majority of MTs belong to the Rossmann-like fold (Class I), which consists of a seven-stranded β-sheet adjoined by α-helices. The β-sheet contains a central topological switch-point resulting in a deep cleft in which SAM binds. 
Class I MTs display two conserved positions: the first is a GxGxG motif (or at least a GxG motif) at the end of the first β-strand, which is characteristic of a nucleotide-binding site and is hence used to bind the adenosyl part of SAM; the second is an acidic residue at the end of the second β-strand that forms one hydrogen bond to each hydroxyl of the SAM ribose part. The core of these enzymes is composed of about 150 amino acids that show very strong spatial conservation. Catechol O-MT (EC 2.1.1.6) is the canonical Class I MT, considering that it consists of the exact consensus structural core with no extra domain [].
Protein Domain
Type: Family
Description: The sequences in this clade are the closest homologues to the PhnX enzyme, phosphonoacetaldehyde (Pald) hydrolase (phosphonatase). This phosphonatase-like enzyme and PhnX itself are members of the haloacid dehalogenase (HAD) superfamily and have a number of distinctive features that set them apart from typical HAD enzymes. The typical HAD N-terminal motif DxDx(T/V) here is DxAGT and the usual conserved lysine prior to the C-terminal motif is instead an arginine. Also distinctive of phosphonatase, and particular to its bi-catalytic mechanism, is a conserved lysine in the variable "cap" domain []. This lysine forms a Schiff base with the aldehyde of phosphonoacetaldehyde, providing, through the resulting positive charge, a polarization of the C-P bond necessary for cleavage as well as a route to the initial product of cleavage, an ene-amine. The conservation of these elements in this phosphonatase-like enzyme suggests that the substrate is also, like Pald, a 2-oxo-ethylphosphonate. Despite this, the genomic context of members of this family is quite distinct from that of PhnX, which is almost invariably associated with the 2-aminoethylphosphonate transaminase PhnW, the source of the substrate Pald. Members of this clade are never associated with PhnW, but rather associate with families of FAD-dependent oxidoreductases related to deaminating amino acid oxidases as well as zinc-dependent dehydrogenases. Notably, family members from Arthrobacter aurescens TC1 and Nocardia farcinica IFM 10152 are adjacent to the PhnCDE ABC cassette phosphonates transporter typically found in association with the phosphonates C-P lyase system. These observations suggest two possibilities. First, the substrate for this enzyme family is also Pald, the non-association with PhnW notwithstanding. Alternatively, the substrate is something very closely related such as hydroxyphosphonoacetaldehyde (Hpald). 
Hpald could come from oxidative deamination of 1-hydroxy-2-aminoethylphosphonate (HAEP) by the associated oxidase. HAEP would not be a substrate for PhnW due to its high specificity for AEP. HAEP has been shown to be a constituent of the sphingophosphonolipid of Bacteriovorax stolpii [], and presumably has other natural sources. If Hpald is the substrate, the product would be glycolaldehyde (hydroxyacetaldehyde), and the associated alcohol dehydrogenase may serve to convert this to glycol.
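The distinction drawn in this description between the typical HAD N-terminal motif, DxDx(T/V), and the phosphonatase-like variant, DxAGT, can be illustrated with two patterns. The sketch below is only a toy classifier under stated assumptions: the function name and return labels are hypothetical, the input is assumed to be the N-terminal region of a protein, and the example sequences are invented. It does not check the lysine/arginine substitution or the cap-domain lysine also mentioned above.

```python
import re

# Typical HAD N-terminal motif: D-x-D-x-[T or V].
TYPICAL_HAD = re.compile(r"D.D.[TV]")
# Variant described for PhnX and this clade: D-x-A-G-T.
PHOSPHONATASE_LIKE = re.compile(r"D.AGT")


def classify_n_terminal_motif(n_terminal_region):
    """Label a sequence region by which motif pattern it contains."""
    seq = n_terminal_region.upper()
    if PHOSPHONATASE_LIKE.search(seq):
        return "phosphonatase-like (DxAGT)"
    if TYPICAL_HAD.search(seq):
        return "typical HAD (DxDx(T/V))"
    return "no motif found"


# Invented toy N-terminal regions (hypothetical, for illustration only).
label_variant = classify_n_terminal_motif("MKIDAVIFDWAGTTVD")
label_typical = classify_n_terminal_motif("MKLIAFDLDGTLIN")
label_none = classify_n_terminal_motif("MKLLAAGG")
```

The two patterns cannot match at the same position, since the third residue must be D in one and A in the other.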
Protein Domain
Type: Family
Description: Two lysine biosynthesis pathways evolved separately in organisms, the diaminopimelic acid (DAP) and aminoadipic acid (AAA) pathways. The DAP pathway synthesizes L-lysine from aspartate and pyruvate, and diaminopimelic acid is an intermediate. This pathway is utilised by most bacteria, some archaea, some fungi, some algae, and plants. The AAA pathway synthesizes L-lysine from alpha-ketoglutarate and acetyl coenzyme A (acetyl-CoA), and alpha-aminoadipic acid is an intermediate. This pathway is utilised by most fungi, some algae, the bacterium Thermus thermophilus, and probably some archaea, such as Sulfolobus, Thermoproteus, and Pyrococcus. No organism is known to possess both pathways []. There are four known variations of the DAP pathway in bacteria: the succinylase, acetylase, aminotransferase, and dehydrogenase pathways. These pathways share the steps converting L-aspartate to L-2,3,4,5-tetrahydrodipicolinate (THDPA), but the subsequent steps leading to the production of meso-diaminopimelate, the immediate precursor of L-lysine, are different []. The succinylase pathway acylates THDPA with succinyl-CoA to generate N-succinyl-LL-2-amino-6-ketopimelate and forms meso-DAP by subsequent transamination, desuccinylation, and epimerization. This pathway is utilised by proteobacteria and many firmicutes and actinobacteria. The acetylase pathway is analogous to the succinylase pathway but uses N-acetyl intermediates. This pathway is limited to certain Bacillus species, in which the corresponding genes have not been identified. The aminotransferase pathway converts THDPA directly to LL-DAP by diaminopimelate aminotransferase (DapL) without acylation. This pathway is shared by cyanobacteria, Chlamydia, the archaeon Methanothermobacter thermautotrophicus, and the plant Arabidopsis thaliana. The dehydrogenase pathway forms meso-DAP directly from THDPA, NADPH, and NH4+ by using diaminopimelate dehydrogenase (Ddh). 
This pathway is utilised by some Bacillus and Brevibacterium species and Corynebacterium glutamicum. Most bacteria use only one of the four variants, although certain bacteria, such as C. glutamicum and Bacillus macerans, possess both the succinylase and dehydrogenase pathways. This family of actinobacterial proteins is involved in the biosynthesis of the tetracycline antibiotic, oxytetracycline. The minimum set of enzymes required for the biosynthesis of anhydrotetracycline, the first intermediate in the synthesis of oxytetracycline, comprises OxyL, OxyQ, and OxyT. OxyQ catalyzes the conversion of 4-dedimethylamino-4-oxoanhydrotetracycline to yield 4-amino-4-de(dimethylamino)anhydrotetracycline (4-amino-ATC) [].
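The four DAP pathway variants and the organisms this description associates with them can be held in a small lookup table, which makes the "one variant versus two" statement easy to check. This is only an illustrative sketch: the dictionary and function names are hypothetical, the entries simply restate the groups named in the text (qualifiers such as "many" or "some" are kept in the labels), and string matching on names is an assumption made for simplicity.

```python
# DAP pathway variants and the organisms or groups named for each in the text.
DAP_VARIANTS = {
    "succinylase": [
        "proteobacteria",
        "many firmicutes",
        "many actinobacteria",
        "Corynebacterium glutamicum",
        "Bacillus macerans",
    ],
    "acetylase": ["certain Bacillus species"],
    "aminotransferase": [
        "cyanobacteria",
        "Chlamydia",
        "Methanothermobacter thermautotrophicus",
        "Arabidopsis thaliana",
    ],
    "dehydrogenase": [
        "some Bacillus species",
        "some Brevibacterium species",
        "Corynebacterium glutamicum",
        "Bacillus macerans",
    ],
}


def variants_for(organism):
    """Return the DAP variants listed for an exact organism or group name."""
    return [name for name, users in DAP_VARIANTS.items() if organism in users]


both = variants_for("Corynebacterium glutamicum")
single = variants_for("Arabidopsis thaliana")
```

With these entries, `both` contains the succinylase and dehydrogenase variants, matching the statement that C. glutamicum possesses both, while `single` contains only the aminotransferase variant.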
Protein Domain
Type: Domain
Description: This entry represents the dynamin-type guanine nucleotide-binding (G) domain. Members of the dynamin GTPase family appear to be ubiquitous. They catalyze diverse membrane remodelling events in endocytosis, cell division, and plastid maintenance. Their functional versatility also extends to other core cellular processes, such as maintenance of cell shape or centrosome cohesion. Members of the dynamin family are characterised by their common structure and by conserved sequences in the GTP-binding domain. The minimal architectural features that are common to all dynamins and distinguish them from other GTPases are the structure of the large GTPase domain (~280 amino acids) and the presence of two additional domains, the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity. In many dynamin family members, the basic set of domains is supplemented by targeting domains, such as the pleckstrin-homology (PH) domain or proline-rich domains (PRDs), or by sequences that target dynamins to specific organelles, such as mitochondria and chloroplasts.
The dynamin-type G domain consists of a central eight-stranded β-sheet surrounded by seven alpha helices and two one-turn helices. It contains the five canonical guanine nucleotide-binding motifs (G1-G5). The P-loop (G1) motif (GxxxxGKS/T) is also present in ATPases (Walker A motif) and coordinates the phosphate groups of the bound nucleotide. A conserved threonine in switch-I (G2) and the conserved residues DxxG of switch-II (G3) are involved in Mg(2+) binding and GTP hydrolysis. The nucleotide binding affinity of dynamins is typically low, with specificity for GTP provided by the mostly conserved N/TKxD motif (G4). The G5 or G-cap motif is involved in binding the ribose moiety.
Some proteins containing a dynamin-type G domain are listed below:
  • Animal dynamin, the prototype for this family. The role of dynamin in endocytosis is well established. Additional roles were proposed in vesicle budding from the trans-Golgi network (TGN) and the budding of caveolae from the plasma membrane.
  • Vertebrate Mx proteins, a group of interferon (IFN)-induced GTPases involved in the control of intracellular pathogens.
  • Eukaryotic Drp1 (Dnm1 in yeast), which mediates mitochondrial and peroxisomal fission.
  • Eukaryotic Eps15 homology (EH)-domain-containing proteins (EHDs), ATPases implicated in clathrin-independent endocytosis and recycling from endosomes. The dynamin-type G domains of EHDs bind adenine rather than guanine nucleotides.
  • Yeast to human OPA1/Mgm1 proteins. They are found between the inner and outer mitochondrial membranes and are involved in mitochondrial fusion.
  • Yeast to human mitofusin/fuzzy onions 1 (Fzo1) proteins, involved in mitochondrial dynamics.
  • Yeast vacuolar protein sorting-associated protein 1 (Vps1), involved in vesicle trafficking from the Golgi.
  • Escherichia coli clamp-binding protein CrfC (or YjdA), important for the colocalization of sister nascent DNA strands after replication fork passage during DNA replication, and for positioning and subsequent partitioning of sister chromosomes.
  • Nostoc punctiforme bacterial dynamin-like protein (BDLP).
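The G1, G3 and G4 motifs quoted in the dynamin-type G domain description above can be written as simple regular expressions. The following is only a minimal sketch: it assumes that "x" means "any residue", the function name `find_g_motifs` and the toy sequence are hypothetical, and short patterns such as DxxG match by chance in many proteins, so a hit is not evidence of a G domain.

```python
import re

# Motif strings are taken from the description; "x" is assumed to mean any residue.
G_MOTIFS = {
    "G1": re.compile(r"G.{4}GK[ST]"),  # P-loop / Walker A: GxxxxGKS/T
    "G3": re.compile(r"D.{2}G"),       # switch-II: DxxG
    "G4": re.compile(r"[NT]K.D"),      # N/TKxD
}

def find_g_motifs(seq):
    """Return {motif name: [0-based start positions]} for a protein sequence.

    Hypothetical helper for illustration; matches are non-overlapping.
    """
    seq = seq.upper()
    return {name: [m.start() for m in pat.finditer(seq)]
            for name, pat in G_MOTIFS.items()}

# Made-up toy sequence, not a real dynamin.
toy = "MAGQSSAGKSVLDLPGAATKVDL"
hits = find_g_motifs(toy)
```

For real annotation work, profile-based signatures such as those behind the database entry are far more reliable than bare pattern matching.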
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.
NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named because of nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Peroxisome proliferator-activated receptors (PPAR) are ligand-activated transcription factors that belong to the nuclear hormone receptor superfamily. Three cDNAs encoding PPARs have been isolated from Xenopus laevis: xPPAR alpha, beta and gamma. All three xPPARs appear to be activated by both synthetic peroxisome proliferators and naturally occurring fatty acids, suggesting a common mode of action for all members of this subfamily of receptors. Furthermore, the multiplicity of the receptors suggests the existence of hitherto unknown cellular signalling pathways for xenobiotics and putative endogenous ligands. A PPAR alpha-related cDNA from mouse (designated PPAR delta, and subsequently renamed beta) has been cloned and characterised. The alpha, beta and gamma PPAR isoforms display widely divergent patterns of expression during embryogenesis and in the adult. PPAR gamma and beta are not activated by pirinixic acid, a potent peroxisome proliferator and activator of PPAR alpha; they are, however, activated by the structurally distinct peroxisome proliferator LY-171883 and linoleic acid, respectively, indicating that each isoform can act as a regulated activator of transcription. Thus tissue-specific responsiveness to peroxisome proliferators, including certain fatty acids, may be partly a consequence of differential expression of multiple, pharmacologically distinct PPAR isoforms.
Protein Domain
Type: Family
Description: Peroxisome proliferator-activated receptors (PPAR) are ligand-activated transcription factors that belong to the nuclear hormone receptor superfamily. Three subtypes of this receptor have been discovered: PPAR alpha, beta and gamma. They control a variety of target genes involved in lipid homeostasis, diabetes and cancer. A human cognate of the mouse PPAR-gamma (hPPAR gamma) has been cloned from a placental cDNA library. Sequence analysis reveals a high degree of similarity to the mouse receptor (mPPAR) and, like other PPARs, hPPAR gamma forms heterodimers with RXR alpha. hPPAR gamma is expressed strongly in adipose tissue, but significant levels are also detectable in placenta, lung and ovary. In vitro trans-activation data suggest hPPAR gamma is only poorly activated by xenobiotic peroxisome proliferators, although certain fatty acids and eicosanoids are potent activators of this receptor. Both mPPAR and hPPAR gamma may be activated by thiazolidinedione drugs, although the receptors appear to differ in their sensitivity to these compounds. These data suggest a high degree of structural and functional similarity between mPPAR and hPPAR gamma, and provide evidence for variation in human receptor structure that may result in differential sensitivity to activators.
Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.
NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named because of nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.
NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named because of nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
A rat orphan nuclear hormone receptor, designated Nurr1 (Nur-related factor 1), has been isolated from a brain cortex cDNA library. The protein contains 598 amino acids and has a predicted molecular mass of 65 kDa. The deduced sequence shows strong similarity to the mouse Nurr1 and human NOT1 orphan nuclear hormone receptors of the NGFI-B/Nur77/NAK1 gene subfamily. Rat nurr1 is thought to be an immediate-early gene that is differentially induced by electroconvulsive seizure vs. kindled seizures. As Nurr1 appears to be predominantly located in brain tissue, it may have a role in regulation of gene expression in the central nervous system. Moreover, given that Nurr1 is prominently expressed in specific brain sites associated with memory acquisition and consolidation, it may be involved in memory processing.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.
NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named because of nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
In Drosophila melanogaster, the steroid hormone ecdysone triggers larval-to-adult metamorphosis, a process in which the hormone induces imaginal tissues to generate adult structures, and larval tissues to degenerate. The ecdysone receptor (EcR) binds DNA with high specificity at ecdysone response elements. EcR is nuclear and is found in larval wing discs, pupal wings and in prothoracic glands. In the mosquito Aedes aegypti, 20-hydroxyecdysone plays an important role in the regulation of egg maturation. There are three EcR transcripts (of 4.2 kb, 6 kb and 11 kb) in adult mosquitoes; the 4.2 kb mRNA is predominantly expressed in female mosquitoes during vitellogenesis. In both the fat body and ovaries of the female mosquito, the level of EcR mRNA is high at the previtellogenic period and after the onset of vitellogenesis.
Synonym(s): 1H nuclear receptor
Protein Domain
Type: Family
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists). Annexins are absent from yeasts and prokaryotes.
Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition.
Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication.
Human annexin A10 (annexin 14) was first identified in silico by searches of dbEST with a number of divergent annexins. The analysis revealed single human and mouse ESTs corresponding to a novel and rarely expressed annexin in which three of the four tetrad core repeats lack the calcium-binding domain. It was proposed that this subtype, together with A5 annexin, gave rise to the Type VI octad through a process of gene duplication and fusion in early chordate evolution.
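The 'type 2' calcium-binding motif quoted in the annexin description above ('GxGT-[38 residues]-D/E') can be expressed as a regular expression. This is a rough sketch only: it assumes "x" means any residue and that the spacer is exactly 38 residues, and the function name `find_type2_sites` and the toy sequence are hypothetical.

```python
import re

# 'GxGT-[38 residues]-D/E' from the description; "x" assumed to be any residue.
TYPE2_CA_MOTIF = re.compile(r"G.GT.{38}[DE]")

def find_type2_sites(seq):
    """Return 0-based start positions of candidate type 2 motifs (hypothetical helper)."""
    return [m.start() for m in TYPE2_CA_MOTIF.finditer(seq.upper())]

# Made-up toy sequence: GAGT, a 38-residue spacer, then an acidic residue.
toy = "MK" + "GAGT" + "A" * 38 + "D" + "LL"
sites = find_type2_sites(toy)

# A spacer that is one residue short should not match.
short = "GAGT" + "A" * 37 + "K"
no_sites = find_type2_sites(short)
```

Applied per repeat, a scan like this could flag repeats that lack the motif, as described for annexin A10, although real annexins may deviate from the strict pattern.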
Protein Domain
Type: Family
Description: The annexins (or lipocortins) are a family of proteins that bind to phospholipids in a calcium-dependent manner. The 12 annexins common to vertebrates are classified in the annexin A family and named as annexins A1-A13 (or ANXA1-ANXA13), leaving A12 unassigned in the official nomenclature. Annexins outside vertebrates are classified into families B (in invertebrates), C (in fungi and some groups of unicellular eukaryotes), D (in plants), and E (in protists). Annexins are absent from yeasts and prokaryotes.
Most eukaryotic species have 1-20 annexin (ANX) genes. All annexins share a core domain made up of four similar repeats, each approximately 70 amino acids long. Each annexin repeat (sometimes referred to as an endonexin fold) is folded into five α-helices, which in turn are wound into a right-handed super-helix; the repeats usually contain a characteristic 'type 2' motif for binding calcium ions with the sequence 'GxGT-[38 residues]-D/E'. Animal and fungal annexins also have variable amino-terminal domains. The core domains of most vertebrate annexins have been analysed by X-ray crystallography, revealing conservation of their secondary and tertiary structures despite only 45-55% amino-acid identity among individual members. The four repeats pack into a structure that resembles a flattened disc, with a slightly convex surface on which the Ca2+-binding loops are located and a concave surface at which the amino and carboxyl termini come into close apposition.
Annexins are traditionally thought of as calcium-dependent phospholipid-binding proteins, but recent work suggests a more complex set of functions. The family has been linked with inhibition of phospholipase activity, exocytosis and endocytosis, signal transduction, organisation of the extracellular matrix, resistance to reactive oxygen species and DNA replication.
This entry represents Annexin type XI. The N-terminal region of annexin XI is hydrophobic, rich in glycine, tyrosine, and proline residues, and is larger than that of the other annexin members. Annexin XI is ubiquitously expressed in a variety of tissues and cell types of eukaryotes, but its subcellular distribution varies considerably. Some growth and differentiation conditions favour the presence of annexin XI in the nucleus, whereas others favour either a cytoplasmic distribution or both. Annexin XI is upregulated in mitotic cells and stains mitotic spindles. It is required for cytokinesis completion. Ca2+ was found to influence both the association of annexin XI with tubulin and the nuclear or cytoplasmic subcellular localisation of annexin XI.
The human orthologue of annexin XI was found to be identical to the 56K autoantigen, found in individuals with a range of autoimmune diseases such as rheumatoid arthritis.
Protein Domain
Type: Domain
Description: The tripartite DENN (after differentially expressed in neoplastic versus normal cells) domain is found in several proteins that share common structural features and have been shown to be guanine nucleotide exchange factors (GEFs) for Rab GTPases, which are regulators of practically all membrane trafficking events in eukaryotes. The tripartite DENN domain is composed of three distinct modules which are always associated due to functional and/or structural constraints: upstream DENN or uDENN, the better conserved central or core or cDENN, and downstream or dDENN regions. The tripartite DENN domain is found associated with other domains, such as RUN, PLAT, PH, PPR, WD-40, GRAM or C1. The function of the DENN domain remains unclear to date, although it appears to represent a good candidate for a GTP/GDP exchange activity.
Some proteins known to contain a tripartite DENN domain are listed below:
  • Rat Rab3 GDP/GTP exchange protein (Rab3GEP).
  • Human mitogen-activated protein kinase activating protein containing death domain (MADD). It is orthologous to Rab3GEP.
  • Caenorhabditis elegans regulator of presynaptic activity aex-3, the ortholog of Rab3GEP.
  • Mouse Rab6 interacting protein 1 (Rab6IP1).
  • Human SET domain-binding factor 1 (SBF1).
  • Human suppressor of tumorigenicity 5 (ST5).
  • Human C-MYC promoter-binding protein IRLB.
The DENN domain forms a heart-shaped structure, with the N-terminal residues forming one lobe and the C-terminal residues forming the second. The N-terminal half forms the uDENN domain and consists of a central antiparallel β-sheet layered between one helix and two helices. A long random-coil region links the two lobes. The C-terminal lobe is composed of the cDENN and dDENN domains. The cDENN domain is an alpha/beta three-layered sandwich domain with a central sheet of 5 strands. The dDENN domain is an all-alpha helical domain, whose core contains two alpha-hairpins which diverge rapidly in sequence.
Divergent types of the tripartite DENN domain have also been detected in other protein families:
  • Folliculin (FLCN), a tumor suppressor protein disrupted in various cancers and the Birt-Hogg-Dube syndrome, and Smith-Magenis syndrome chromosomal region candidate eight protein (SMCR8), which has been implicated in autophagy.
  • FLCN-interacting proteins (FNIP1 and FNIP2), which interact with FLCN and function in conjunction with it to regulate cellular energy metabolism both in the kidney tissue and B-cells.
  • C9ORF72 protein; expansions of the hexanucleotide GGGGCC in the first intron of its gene have been implicated in amyotrophic lateral sclerosis (ALS) and fronto-temporal dementia (FTD).
This entry represents the FNIP1/FNIP2-type divergent tripartite DENN domain.
Protein Domain
Type: Domain
Description: The CRM domain is an ~100-amino acid RNA-binding domain. The name chloroplast RNA splicing and ribosome maturation (CRM) has been suggested to reflect the functions established for the four characterised members of the family: Zea mays (maize) CRS1, CAF1 and CAF2 proteins and the Escherichia coli protein YhbY. The CRM domain is found in eubacteria, archaea, and plants. The CRM domain is represented as a stand-alone protein in archaea and bacteria, and in single- and multi-domain proteins in plants. It has been suggested that prokaryotic CRM proteins existed as ribosome-associated proteins prior to the divergence of archaea and bacteria, and that they were co-opted in the plant lineage as RNA binding modules by incorporation into diverse protein contexts. Plant CRM domains are predicted to reside not only in the chloroplast, but also in the mitochondrion and the nucleo/cytoplasmic compartment. The diversity of the CRM domain family in plants suggests a diverse set of RNA targets.
The CRM domain is a compact alpha/beta domain consisting of a four-stranded beta sheet and three alpha helices with an α-β-α-β-α-β-β topology. The beta sheet face is basic, consistent with a role in RNA binding. Proximal to the basic beta sheet face is another moiety that could contribute to nucleic acid recognition. Connecting strand beta1 and helix alpha2 is a loop with a six amino acid motif, GxxG flanked by large aliphatic residues, within which one 'x' is typically a basic residue. Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants. A CRM domain-containing protein in plant chloroplasts has been shown to function in group I and II intron splicing. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome.
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity.
NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans that cannot easily be named because of nomenclature confusion in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members.
In common with other members of the steroid hormone receptor family, thyroid hormone receptors (TRs) contain 2 major highly conserved domains, involved in DNA- and ligand-binding respectively. Except for a conserved 12-residue motif adjacent to the DNA-binding domain, the N-terminal domains are divergent between alpha- and beta-TR isoforms (but are conserved within isoforms). The DNA-binding domain is the most highly conserved feature of the family; it contains 2 zinc-binding modules, which are sometimes referred to as zinc fingers. The ligand-binding domain includes a number of conserved motifs, parts of which are thought to be involved in dimerisation. The hinge region between these 2 domains is thought to contain the binding site for transcriptional co-repressor proteins that mediate the transcriptional repression function of unliganded receptors.
Protein Domain
Type: Domain
Description: This entry represents the DOT1 domain.
The Dot1 protein (Dot1p) is a histone-lysine N-methyltransferase (EC 2.1.1.43) that methylates lysine 79 (Lys-79) of histone H3. It was first identified as a Disruptor Of Telomeric silencing in yeast, where Dot1p is implicated in gene silencing and localization of the Silent Information Regulator (SIR) complex; in higher eukaryotes the methylation carried out by this enzyme may be used for differentiating chromatin domains. Unlike other histone-lysine methyltransferases (HKMTs), which contain the SET domain and hence belong to a wholly different structural class, Dot1p displays a Rossmann-like (Class I) S-adenosyl-L-methionine (SAM)-dependent methyltransferase (MT) fold.
Whereas most HKMTs, such as Suvar3-9, methylate Lys on the N-terminal tails of histones that stick out from the nucleosome, the Dot1p substrate (Lys-79 of histone H3) is located in the conserved histone core, in a short turn connecting the first and second helices, exposed on the nucleosome disk surface. In order for Lys-79 of H3 to be methylated by Dot1p, another lysine, Lys-123 of histone H2B, needs to be ubiquitinated. A possible reason put forward for this requirement is that the ubiquitination may create a space between adjacent nucleosomes, permitting access of Dot1p to its substrate. In yeast, different states of methylation on Lys-79 of histone H3 (unmodified, mono-, di- and trimethylated) co-exist at the same time, but no clear function is associated with these different methylation states.
The structure of the evolutionarily conserved core of Dot1p, the DOT1 domain, was first described for the yeast Dot1p in complex with S-adenosyl-L-homocysteine (AdoHcy) and then for the human Dot1-like protein (Dot1Lp) in complex with SAM. The DOT1 domain is about 300-350 amino acids long and is usually located at either of the extremities of the protein sequence: it stands at the C terminus of the yeast Dot1p and at the N terminus of the human Dot1Lp. DOT1 displays a rather elongated structure and can be subdivided into two parts: the N- and the C-terminal subdomains. The N-terminal part is made up of five alpha helices and two pairs of short beta strand hairpins. The C-terminal part displays a Rossmann-like fold: it consists of a seven-stranded beta sheet flanked by five alpha helices (three helices on one side of the sheet and two on the other); the sheet contains a central topological switchpoint resulting in a deep pocket where SAM is bound. The two subdomains are linked covalently by a loop. Altogether the SAM binding pocket is formed by five segments of the DOT1 domain, of which four are located in the C-terminal substructure of the DOT1 domain and one in the loop connecting both parts; two of these segments are conserved across different Class I SAM-dependent MTs.
Protein Domain
Type: Domain
Description: Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries:
  • Equine arteritis virus (EAV).
  • Porcine reproductive and respiratory syndrome virus (PRRSV).
  • Mouse lactate dehydrogenase-elevating virus (LDV).
  • Simian hemorrhagic fever virus.
The arterivirus replicase gene is composed of two open reading frames (ORFs). ORF1a is translated directly from the genomic RNA, whereas ORF1b can be expressed only by ribosomal frameshifting, yielding a 1ab fusion protein. Both replicase gene products are multidomain precursor proteins which are proteolytically processed into functional nonstructural proteins (nsps) by a complex proteolytic cascade that is directed by four (PRRSV/LDV) or three (EAV) proteinase domains encoded in ORF1a. The arterivirus replicase processing scheme involves the rapid autoproteolytic release of two or three N-terminal nsps (nsp1 (or nsp1alpha/1beta) and nsp2) and the subsequent processing of the remaining polyproteins by the "main protease" residing in nsp4, together resulting in a set of 13 or 14 individual nsps. The arterivirus nsp1 region contains a tandem of papain-like cysteine autoprotease domains (PCPalpha and PCPbeta), but in EAV PCPalpha has lost its enzymatic activity, resulting in the 'merge' of nsp1alpha and nsp1beta into a single nsp1 subunit. Thus, instead of three self-cleaving N-terminal subunits, EAV has two: nsp1 and nsp2. The PCPalpha and PCPbeta domains mediate the nsp1alpha|1beta and nsp1beta|2 cleavages, respectively. The catalytic dyad of PCPalpha and PCPbeta domains is composed of Cys and His residues. In EAV, a Lys residue is found in place of the catalytic Cys residue, which explains the proteolytic deficiency of the EAV PCPalpha domain. The PCPalpha and PCPbeta domains form peptidase families C31 and C32, respectively.
The PCPalpha and PCPbeta domains have a typical papain fold, which consists of a compact global region containing sequentially connected left (L) and right (R) parts in a so-called standard orientation. The L subdomain of PCPalpha consists of four α-helices, while the R subdomain is formed by three antiparallel β-strands. The L subdomain of PCPbeta consists of three α-helices, while the R subdomain is formed by four antiparallel β-strands. The Cys and His residues face each other at the L-R interface and form the catalytic centre of the PCPalpha and PCPbeta domains.
This entry represents the PCPbeta domain (peptidase C32).
Protein Domain
Type: Family
Description: Like all apoptotic cell death, T cell receptor (TCR)-mediated death can bedivided into two phases: an inductive phase and an effector phase. The effector phase includes a sequence of steps that are common to apoptosis inmany cell types, which, if not interrupted, will lead to cell death. Theinduction phase, which often requires the expression of new genes, consistsof a set of signals that activate the effector phase. Outside the thymus,most, if not all, of the TCR-mediated apoptosis of mature T cells (sometimesreferred to as activation-induced cell death (AICD)) is induced through thesurface antigen Fas pathway: activation through the TCR induces expressionof the Fas (CD95) ligand (FasL); the expression of FasL on either aneighbouring cell, or on the Fas-bearing cell, induces trimerisation of Fas,which then initiates a signal-transduction cascade, leading to apoptosis of the Fas-bearing cell. This commitment stage requires the activation of keydeath-inducing enzymes, termed caspases, which act by cleaving proteins thatare essential for cell survival and proliferation[, ]. However whathappens to FasL itself remains unknown. It is possible that it is cleavedfrom the effector cells and internalised into the target cells; it may bedownregulated in the effector cells; or it may be phagocytosed by the targetcells.Fas is also known to be essential in the death of hyperactivated peripheralCD4 cells: in the absence of Fas, mature peripheral T cells do not die, butthe activated cells continue to proliferate, producing cytokines that leadto grossly enlarged lymph nodes and spleen. Defects in the Fas-FasL systemare associated with various disease syndromes. Mice with non-functional Fasor FasL display characteristics of lymphoproliferative disorder, such as lymphadenopathy, splenomegaly, and elevated secretion of IgM and IgG. 
These mice also secrete anti-DNA autoantibodies and rheumatoid factor. FasL (also known as tumor necrosis factor ligand superfamily member 6) is a 40 kDa type II membrane protein belonging to the tumour necrosis factor (TNF) family. Its binding to the cognate Fas receptor triggers the apoptosis that plays a pivotal role in the maintenance of immune system homeostasis. It is expressed on activated lymphocytes, NK cells, platelets, certain immune-privileged cells and some tumour cells. The cell death-inducing property of FasL has been associated with its extracellular domain, which can be cleaved off by metalloprotease activity to produce soluble FasL. Human and mouse FasL induce apoptosis in cells expressing either mouse or human Fas with the same specificity. Although the amino acid sequence of FasL is highly conserved between human and mouse, the similarity between human and murine Fas is much less pronounced. Greater conservation of the ligand than the receptor is also observed in other members of the TNF family. By comparison with other TNF family members, FasL has a long N-terminal intracellular region rich in proline residues, which is known to bind to SH3 domains. SH3 domains play important roles in mediating specific protein-protein interactions, specifically in the cytoskeleton.
Protein Domain
Type: Family
Description: Small GTPases form an independent superfamily within the larger class of regulatory GTP hydrolases. This superfamily contains proteins that control a vast number of important processes and possess a common, structurally preserved GTP-binding domain. Sequence comparisons of small G proteins from various species have revealed that they are conserved in primary structure at the level of 30-55% similarity. Crystallographic analysis of various small G proteins revealed the presence of a 20 kDa catalytic domain that is unique for the whole superfamily. The domain is built of five α-helices (A1-A5), six β-strands (B1-B6) and five polypeptide loops (G1-G5). A structural comparison of the GTP- and GDP-bound forms allows one to distinguish two functional loop regions, switch I and switch II, that surround the gamma-phosphate group of the nucleotide. The G1 loop (also called the P-loop), which connects the B1 strand and the A1 helix, is responsible for the binding of the phosphate groups. The G3 loop provides residues for Mg2+ and phosphate binding and is located at the N terminus of the A2 helix. The G1 and G3 loops are sequentially similar to the Walker A and Walker B boxes that are found in other nucleotide-binding motifs. The G2 loop connects the A1 helix and the B2 strand and contains a conserved Thr residue responsible for Mg2+ binding. The guanine base is recognised by the G4 and G5 loops. The consensus sequence NKXD of the G4 loop contains Lys and Asp residues directly interacting with the nucleotide. Part of the G5 loop located between B6 and A5 acts as a recognition site for the guanine base. The small GTPase superfamily can be divided into at least 8 different families, including: Arf small GTPases. GTP-binding proteins involved in protein trafficking by modulating vesicle budding and uncoating within the Golgi apparatus. Ran small GTPases. GTP-binding proteins involved in nucleocytoplasmic transport. 
Required for the import of proteins into the nucleus and also for RNA export. Rab small GTPases. GTP-binding proteins involved in vesicular traffic. Rho small GTPases. GTP-binding proteins that control cytoskeleton reorganisation. Ras small GTPases. GTP-binding proteins involved in signalling pathways. Sar1 small GTPases. Small GTPase component of the coat protein complex II (COPII), which promotes the formation of transport vesicles from the endoplasmic reticulum (ER). Mitochondrial Rho (Miro). Small GTPase domain found in mitochondrial proteins involved in mitochondrial trafficking. Roc small GTPase domain. Small GTPase domain always found associated with the COR domain. Ras proteins are small GTPases that regulate cell growth, proliferation and differentiation. The different Ras isoforms (H-ras, N-ras and K-ras) generate distinct signal outputs, despite interacting with a common set of activators and effectors. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterised are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1. Ras proteins are synthesized as cytosolic precursors that undergo post-translational processing to be able to associate with cell membranes. First, protein farnesyl transferase, a cytosolic enzyme, attaches a farnesyl group to the cysteine residue of the CAAX motif. Second, the farnesylated CAAX sequence targets Ras to the cytosolic surface of the ER, where an endopeptidase removes the AAX tripeptide. Third, the α-carboxyl group on the now carboxy-terminal farnesylcysteine is methylated by isoprenylcysteine carboxyl methyltransferase. 
Finally, after methylation, Ras proteins take one of two routes to the cell surface, which is dictated by a second targeting signal that is located immediately amino-terminal to the farnesylated cysteine. N-ras and H-ras are expressed stably on the plasma membrane, on Golgi in transfected cells, and at least transiently on the ER. Ras has also been visualized on endosomes.
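The two sequence features named in the description above, the NKXD consensus of the G4 loop and the C-terminal CAAX motif, can be located with simple pattern matching. The following is only a minimal sketch: the function names and the toy sequence are hypothetical, and the CAAX check is deliberately simplified to "a cysteine at the fourth position from the C terminus", which is an assumption rather than a rule stated in the entry.

```python
import re

# NKXD consensus of the G4 loop; X is treated as "any residue".
G4_MOTIF = re.compile(r"NK.D")


def find_g4_motifs(seq: str) -> list[int]:
    """Return 0-based start positions of NKXD matches in seq."""
    return [m.start() for m in G4_MOTIF.finditer(seq)]


def has_caax_box(seq: str) -> bool:
    """Simplified check (assumption): Cys at the fourth-from-last position."""
    return len(seq) >= 4 and seq[-4] == "C"


def trim_aax(seq: str) -> str:
    """Mimic removal of the AAX tripeptide, leaving the Cys as C terminus."""
    if not has_caax_box(seq):
        raise ValueError("sequence has no C-terminal CAAX box")
    return seq[:-3]


# Made-up toy sequence for illustration only; not a real Ras protein.
toy = "MTEYKLVVNKCDLAAGCMSCKCVLS"
g4_positions = find_g4_motifs(toy)
processed = trim_aax(toy)
```

Real motif annotation would normally rely on curated profiles rather than a bare regular expression, so this is best read as an illustration of what the consensus strings mean.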
Protein Domain
Type: Family
Description: N-Acetylglutamate (NAG) fulfils distinct biological roles in lower and higher organisms. In prokaryotes, lower eukaryotes and plants it is the first intermediate in the biosynthesis of arginine, whereas in ureotelic (excreting nitrogen mostly in the form of urea) vertebrates, it is an essential allosteric cofactor for carbamyl phosphate synthetase I (CPSI), the first enzyme of the urea cycle. The pathway that leads from glutamate to arginine in lower organisms employs eight steps, starting with the acetylation of glutamate to form NAG. In these species, NAG can be produced by two enzymatic reactions: one catalysed by NAG synthase (NAGS) and the other by ornithine acetyltransferase (OAT). In ureotelic species, NAG is produced exclusively by NAGS. In lower organisms, NAGS is feedback-inhibited by L-arginine, whereas mammalian NAGS activity is significantly enhanced by this amino acid. The NAGS genes of bacteria, fungi and mammals are more diverse than other arginine-biosynthesis and urea-cycle genes. The evolutionary relationship between the distinctly different roles of NAG and its metabolism in lower and higher organisms remains to be determined. The pathway from glutamate to arginine is: NAGS, N-acetylglutamate synthase (glutamate to N-acetylglutamate); NAGK, N-acetylglutamate kinase (N-acetylglutamate to N-acetylglutamate-5P); N-acetyl-gamma-glutamyl-phosphate reductase (N-acetylglutamate-5P to N-acetylglutamate semialdehyde); acetylornithine aminotransferase (N-acetylglutamate semialdehyde to N-acetylornithine); acetylornithine deacetylase (N-acetylornithine to ornithine); arginase (ornithine to arginine). N-acetyl-gamma-glutamyl-phosphate reductase (AGPR, NAGSA dehydrogenase) is the enzyme that catalyses the third step in the biosynthesis of arginine from glutamate, the NADP-dependent reduction of N-acetyl-5-glutamyl phosphate into N-acetylglutamate 5-semialdehyde. 
In bacteria it is a monofunctional protein of 35 to 38 kDa (gene argC), while in fungi it is part of a bifunctional mitochondrial enzyme (gene ARG5,6, arg11 or arg-6) which contains an N-terminal acetylglutamate kinase domain and a C-terminal AGPR domain. In the Escherichia coli enzyme, a cysteine has been shown to be implicated in the catalytic activity, and the region around this residue is well conserved. This entry represents the more common of two related families of N-acetyl-gamma-glutamyl-phosphate reductase, an enzyme catalyzing the third step of Arg biosynthesis from Glu. The two families differ by phylogeny, similarity clustering, and the gap architecture in a multiple sequence alignment. Bacterial members of this family tend to be found within Arg biosynthesis operons. This family also includes LysY (LysW-L-2-aminoadipate/LysW-L-glutamate phosphate reductase), which is involved in both the arginine and lysine biosynthetic pathways. Several bacteria and archaea utilize the amino group-carrier protein, LysW, for lysine biosynthesis from alpha-aminoadipate (AAA). In some cases, such as Sulfolobus, LysW is also used to protect the amino group of glutamate in arginine biosynthesis. After LysW modification, AAA and glutamate are converted to lysine and ornithine, respectively, by a single set of enzymes with dual functions. LysY is the third enzyme in lysine biosynthesis from AAA. LysY shows high sequence identity and functional similarities with ArgC, and they are considered to have evolved from a common ancestor.
Protein Domain
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Glycosyltransferase family 35 comprises enzymes with only one known activity: glycogen and starch phosphorylase. The main role of glycogen phosphorylase (GPase) is to provide phosphorylated glucose molecules (G-1-P). GPase is a highly regulated allosteric enzyme. The net effect of the regulatory site allows the enzyme to operate at a variety of rates; the enzyme is not simply regulated as "on" or "off", but rather it can be thought of as being set to operate at an ideal rate based on changing conditions in the cell. The most important allosteric effector is the phosphate molecule covalently attached to Ser14. This switches GPase from the b (inactive) state to the a (active) state. Upon phosphorylation, GPase attains about 80% of its Vmax. When the enzyme is not phosphorylated, GPase activity is practically non-existent at low AMP levels. There is some apparent controversy as to the structure of GPase. All sources agree that the enzyme is multimeric, but they differ as to whether the enzyme is a tetramer or a dimer. Apparently, GPase (in the a form) forms tetramers in the crystal form. 
The consensus seems to be that, regardless of the a or b form, GPase functions as a dimer in vivo. The GPase monomer is best described as consisting of two domains, an N-terminal domain and a C-terminal domain. The C-terminal domain is often referred to as the catalytic domain. It consists of a β-sheet core surrounded by layers of helical segments. The vitamin cofactor pyridoxal phosphate (PLP) is covalently attached to the amino acid backbone. The N-terminal domain also consists of a central β-sheet core and is surrounded by layers of helical segments. The N-terminal domain contains different allosteric effector sites to regulate the enzyme. Bacterial phosphorylases follow the same catalytic mechanisms as their plant and animal counterparts, but differ considerably in terms of their substrate specificity and regulation. The catalytic domains are highly conserved while the regulatory sites are only poorly conserved. For maltodextrin phosphorylase from Escherichia coli the physiological role of the enzyme in the utilisation of maltodextrins is known in detail; that of all the other bacterial phosphorylases is still unclear. Roles in regulation of endogenous glycogen metabolism in periods of starvation, and sporulation, stress response or quick adaptation to changing environments are possible.
Protein Domain
Type: Family
Description: The biosynthesis of disaccharides, oligosaccharides and polysaccharides involves the action of hundreds of different glycosyltransferases. These enzymes catalyse the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. A classification of glycosyltransferases using nucleotide diphospho-sugar, nucleotide monophospho-sugar and sugar phosphates (EC 2.4.1.-) and related proteins into distinct sequence-based families has been described. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. The same three-dimensional fold is expected to occur within each of the families. Because 3-D structures are better conserved than sequences, several of the families defined on the basis of sequence similarities may have similar 3-D structures and therefore form 'clans'. Proteins in this entry are glycosyltransferases with phosphorylase activities. Members use phosphate to break alpha 1,4 linkages between pairs of glucose residues at the end of long glucose polymers, releasing alpha-D-glucose 1-phosphate. The nomenclature convention is to preface the name according to the natural substrate, as in glycogen phosphorylase, starch phosphorylase, maltodextrin phosphorylase, etc. The main role of glycogen phosphorylase (GPase) is to provide phosphorylated glucose molecules (G-1-P). GPase is a highly regulated allosteric enzyme. The net effect of the regulatory site allows the enzyme to operate at a variety of rates; the enzyme is not simply regulated as "on" or "off", but rather it can be thought of as being set to operate at an ideal rate based on changing conditions in the cell. The most important allosteric effector is the phosphate molecule covalently attached to Ser14. This switches GPase from the b (inactive) state to the a (active) state. Upon phosphorylation, GPase attains about 80% of its Vmax. 
When the enzyme is not phosphorylated, GPase activity is practically non-existent at low AMP levels. There is some apparent controversy as to the structure of GPase. All sources agree that the enzyme is multimeric, but they differ as to whether the enzyme is a tetramer or a dimer. Apparently, GPase (in the a form) forms tetramers in the crystal form. The consensus seems to be that, regardless of the a or b form, GPase functions as a dimer in vivo. The GPase monomer is best described as consisting of two domains, an N-terminal domain and a C-terminal domain. The C-terminal domain is often referred to as the catalytic domain. It consists of a β-sheet core surrounded by layers of helical segments. The vitamin cofactor pyridoxal phosphate (PLP) is covalently attached to the amino acid backbone. The N-terminal domain also consists of a central β-sheet core and is surrounded by layers of helical segments. The N-terminal domain contains different allosteric effector sites to regulate the enzyme. Bacterial phosphorylases follow the same catalytic mechanisms as their plant and animal counterparts, but differ considerably in terms of their substrate specificity and regulation. The catalytic domains are highly conserved while the regulatory sites are only poorly conserved. For maltodextrin phosphorylase from Escherichia coli the physiological role of the enzyme in the utilisation of maltodextrins is known in detail; that of all the other bacterial phosphorylases is still unclear. Roles in regulation of endogenous glycogen metabolism in periods of starvation, and sporulation, stress response or quick adaptation to changing environments are possible.
Protein Domain
Type: Domain
Description: This entry represents LIM-type zinc finger (Znf) domains. LIM domains coordinate one or more zinc atoms, and are named after the three proteins (LIN-11, Isl1 and MEC-3) in which they were first found. They consist of two zinc-binding motifs that resemble GATA-like Znfs; however, the residues holding the zinc atom(s) are variable, involving Cys, His, Asp or Glu residues. LIM domains are found in proteins with differing functions, including gene expression, and cytoskeleton organisation and development. Proteins containing LIM Znf domains include: Caenorhabditis elegans mec-3, a protein required for the differentiation of the set of six touch receptor neurons in this nematode. C. elegans lin-11, a protein required for the asymmetric division of vulval blast cells. Vertebrate insulin gene enhancer binding protein isl-1. Isl-1 binds to one of the two cis-acting protein-binding domains of the insulin gene. Vertebrate homeobox proteins lim-1, lim-2 (lim-5) and lim3. Vertebrate lmx-1, which acts as a transcriptional activator by binding to the FLAT element, a beta-cell-specific transcriptional enhancer found in the insulin gene. Mammalian LH-2, a transcriptional regulatory protein involved in the control of cell differentiation in developing lymphoid and neural cell types. Drosophila melanogaster (Fruit fly) protein apterous, required for the normal development of the wing and haltere imaginal discs. Vertebrate protein kinases LIMK-1 and LIMK-2. Mammalian rhombotins. Rhombotin 1 (RBTN1 or TTG-1) and rhombotin-2 (RBTN2 or TTG-2) are proteins of about 160 amino acids whose genes are disrupted by chromosomal translocations in T-cell leukemia. Mammalian and avian cysteine-rich protein (CRP), a 192 amino-acid protein of unknown function. 
Seems to interact with zyxin. Mammalian cysteine-rich intestinal protein (CRIP), a small protein which seems to have a role in zinc absorption and may function as an intracellular zinc transport protein. Vertebrate paxillin, a cytoskeletal focal adhesion protein. Mus musculus (Mouse) testin, which should not be confused with rat testin, which is a thiol protease homologue. Helianthus annuus (Common sunflower) pollen-specific protein SF3. Chicken zyxin. Zyxin is a low-abundance adhesion plaque protein which has been shown to interact with CRP. Yeast protein LRG1, which is involved in sporulation. Saccharomyces cerevisiae (Baker's yeast) rho-type GTPase activating protein RGA1/DBM1. C. elegans homeobox protein ceh-14. C. elegans homeobox protein unc-97. S. cerevisiae hypothetical protein YKR090w. C. elegans hypothetical protein C28H8.6. These proteins generally contain two tandem copies of the LIM domain in their N-terminal section. Zyxin and paxillin are exceptions in that they contain respectively three and four LIM domains at their C-terminal extremity. In apterous, isl-1, LH-2, lin-11, lim-1 to lim-3, lmx-1, ceh-14 and mec-3 there is a homeobox domain some 50 to 95 amino acids after the LIM domains. LIM domains contain seven conserved cysteine residues and a histidine. The arrangement followed by these conserved residues is: C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]. LIM domains bind two zinc ions. LIM does not bind DNA; rather, it seems to act as an interface for protein-protein interaction.
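The residue arrangement quoted above is written in PROSITE-style pattern syntax, where `x(2)` means any two residues, `x(16,23)` means 16 to 23 residues, and `[CH]` means either Cys or His. As a rough illustration, the sketch below translates that small subset of the syntax into a Python regular expression and scans a sequence with it. The helper names and the test sequence are hypothetical; the sequence is synthetic filler built only to satisfy the spacing rules and is not a real LIM protein.

```python
import re

# LIM consensus exactly as quoted in the description above.
LIM_PATTERN = (
    "C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD]"
)

# One pattern element: x, a single residue, or a [..] set, with an
# optional (n) or (n,m) repeat count.
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a small subset of PROSITE syntax into a regex string."""
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, lo, hi = m.groups()
        token = "." if residue == "x" else residue
        if lo is not None:
            token += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(token)
    return "".join(parts)


def find_lim_like(seq: str) -> list[tuple[int, int]]:
    """Return (start, end) spans in seq that match the LIM consensus."""
    regex = re.compile(prosite_to_regex(LIM_PATTERN))
    return [m.span() for m in regex.finditer(seq)]


# Synthetic sequence for illustration only (alanine used as filler).
synthetic = (
    "C" + "AA" + "C" + "A" * 16 + "H" + "AA" + "C" + "AA" + "C"
    + "AA" + "C" + "A" * 16 + "C" + "AA" + "C"
)
lim_regex = prosite_to_regex(LIM_PATTERN)
spans = find_lim_like("MK" + synthetic)
```

A regular expression like this only captures the spacing of the conserved residues; domain databases typically use profiles or hidden Markov models, so a match here is a hint rather than an annotation.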
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in widely diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members of the superfamily include the steroid hormone receptors and receptors for thyroid hormone, retinoids, 1,25-dihydroxy-vitamin D3 and a variety of other ligands. The proteins function as dimeric molecules in nuclei to regulate the transcription of target genes in a ligand-responsive manner. In addition to C-terminal ligand-binding domains, these nuclear receptors contain a highly conserved, N-terminal zinc-finger that mediates specific binding to target DNA sequences, termed ligand-responsive elements. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes, hormone resistance syndromes, etc. While several NRs act as ligand-inducible transcription factors, many do not yet have a defined ligand and are accordingly termed 'orphan' receptors. During the last decade, more than 300 NRs have been described, many of which are orphans, which cannot easily be named due to current nomenclature confusions in the literature. However, a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The retinoic acid receptors (RAR) belong to the large family of ligand-responsive gene regulatory proteins that includes receptors for steroid and thyroid hormones. These proteins contain two highly conserved domains that are involved in determining their DNA and ligand-binding activities. 
Three distinct RARs have been identified (termed RAR alpha, beta, and gamma) and are encoded by genes on separate chromosomes. Additional isoforms of the receptors have been described, all of which differ in the N-terminal regions. Comparison of the amino acid sequences of human and mouse RARs indicates that interspecies conservation in members of the RAR subfamily (either alpha, beta or gamma) is much higher than conservation of the receptors within species. These observations indicate that RAR-alpha, -beta and -gamma may perform specific functions. hRAR-gamma RNA has been shown to be the predominant RAR RNA species in human skin, suggesting that hRAR-gamma mediates some of the retinoid effects in this tissue. The crystal structure of the ligand-binding domain (LBD) of hRAR-gamma bound to all-trans retinoic acid has been determined to 2.0 Å resolution. Overall, the fold is similar to that of the human RXR-alpha apo-LBD, except for the C-terminal portion, which folds back towards the LBD core, contributing to the hydrophobic ligand pocket and 'sealing' its entry site. A 'mouse trap' mechanism is thus proposed, whereby a ligand-induced conformational transition re-positions the amphipathic α-helix of the activating domain and forms a transcriptionally active receptor. Synonym(s): 1B nuclear receptor
Protein Domain
Type: Family
Description: Steroid or nuclear hormone receptors (NRs) constitute an important superfamily of transcription regulators that are involved in diverse physiological functions, including control of embryonic development, cell differentiation and homeostasis. Members include the steroid hormone receptors and receptors for thyroid hormone, retinoids and 1,25-dihydroxy-vitamin D3. The proteins function as dimeric molecules in the nucleus to regulate the transcription of target genes in a ligand-responsive manner. NRs are extremely important in medical research, a large number of them being implicated in diseases such as cancer, diabetes and hormone resistance syndromes. Many do not yet have a defined ligand and are accordingly termed "orphan" receptors. More than 300 NRs have been described to date and a new system has recently been introduced in an attempt to rationalise the increasingly complex set of names used to describe superfamily members. The androgen receptor (AR) consists of 3 functional and structural domains: an N-terminal (modulatory) domain; a DNA-binding domain that mediates specific binding to target DNA sequences (ligand-responsive elements); and a hormone-binding domain. The N-terminal domain (NTD) is unique to the androgen receptors and spans approximately the first 530 residues; the highly conserved DNA-binding domain is smaller (around 65 residues) and occupies the central portion of the protein; and the hormone ligand-binding domain (LBD) lies at the receptor C terminus. In the absence of ligand, steroid hormone receptors are thought to be weakly associated with nuclear components; hormone binding greatly increases receptor affinity. The LBDs of steroid hormone receptors fold into 12 helices that form a ligand-binding pocket. When an agonist is bound, helix 12 folds over the pocket to enclose the ligand. 
When an antagonist is bound, helix 12 is positioned away from the pocket in a way that interferes with the binding of coactivators to a groove in the hormone-binding domain formed after ligand binding. In AR, ligand binding that induces folding of helix 12 to overlie the pocket discloses a groove that binds a region of the NTD. Coactivator molecules can also bind to this groove, but the predominant site for coactivator binding to AR is in the NTD. AR ligand resides in a pocket and primarily contacts helices 4, 5, and 10. The DNA-binding region includes eight cysteine residues that form two coordination complexes, each composed of four cysteines and a Zn2+ ion. These two zinc fingers form the structure that binds to the major groove of DNA. The second zinc finger stabilises the binding complex by hydrophobic interactions with the first finger and contributes to specificity of receptor DNA binding. It is also necessary for the receptor dimerisation that occurs during DNA binding. Defects in the androgen receptor cause testicular feminisation syndrome, also known as androgen insensitivity syndrome (AIS). AIS may be complete (CAIS), where external genitalia are phenotypically female; partial (PAIS), where genitalia are substantively ambiguous; or mild (MAIS), where external genitalia are normal male, or nearly so. Defects in the receptor also cause X-linked spinal and bulbar muscular atrophy (also known as Kennedy's disease).
Protein Domain
Type: Family
Description: G protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a wide range of functions, including various autocrine, paracrine and endocrine processes. They show considerable diversity at the sequence level, on the basis of which they can be separated into distinct groups. The term clan can be used to describe the GPCRs, as they embrace a group of families for which there are indications of evolutionary relationship, but between which there is no statistically significant similarity in sequence. The currently known clan members include rhodopsin-like GPCRs (Class A, GPCRA), secretin-like GPCRs (Class B, GPCRB), metabotropic glutamate receptor family (Class C, GPCRC), fungal mating pheromone receptors (Class D, GPCRD), cAMP receptors (Class E, GPCRE) and frizzled/smoothened (Class F, GPCRF). GPCRs are major drug targets, and are consequently the subject of considerable research interest. It has been reported that the repertoire of GPCRs for endogenous ligands consists of approximately 400 receptors in humans and mice. Most GPCRs are identified on the basis of their DNA sequences, rather than the ligand they bind; those that are unmatched to known natural ligands are designated as orphan GPCRs, or unclassified GPCRs. Fungal pheromone mating factor receptors form a distinct family of G-protein-coupled receptors, and are also known as Class D GPCRs. The fungal pheromone mating factor receptors STE2 and STE3 are integral membrane proteins that may be involved in the response to mating factors on the cell membrane. The amino acid sequences of both receptors contain high proportions of hydrophobic residues grouped into 7 domains, in a manner reminiscent of the rhodopsins and other receptors believed to interact with G-proteins. 
However, while a similar 3D framework has been proposed to account for this, there is no significant sequence similarity either between STE2 and STE3, or between these and the rhodopsin-type family: the receptors therefore bear their own unique '7TM' signatures, which is why they have been given their own GPCR group, Class D Fungal mating pheromone receptors. The STE3 gene in Saccharomyces cerevisiae encodes the cell-surface receptor that binds the 13-residue lipopeptide a-factor. Several related fungal pheromone receptor sequences are known: these include pheromone B alpha 1 and B alpha 3, and pheromone B beta 1 receptors from Schizophyllum commune; pheromone receptor 1 from Ustilago hordei; and pheromone receptors 1 and 2 from Ustilago maydis. Members of the family share about 20% sequence identity. U. maydis, a tetrapolar fungal species, has two genetically unlinked loci that encode the distinct mating functions of cell fusion (the a locus) and subsequent sexual development and pathogenicity (the b locus). The a locus exists in two alleles, the mating type in each of which is determined by a set of two genes; one encodes a precursor for a lipopeptide mating factor, while the other specifies the receptor for the pheromone secreted by cells of opposite mating type. U. maydis thus employs a novel strategy to determine its mating type by providing the primary determinants of cell-cell recognition directly from the mating type locus. The bipolar species, U. hordei, contains both a and b loci; physical linkage of these loci in this bipolar fungus accounts for its distinct mating system. This entry represents mating-type a receptors.
Protein Domain
Type: Family
Description: This entry represents the accessory gene regulator protein B (AgrB) family. Proteins in this family include AgrB from Staphylococcus aureus and FsrB from Enterococcus faecalis. The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein involved in the proteolytic processing of AgrD, and may have both proteolytic and transporter activities, facilitating the export of the processed AgrD peptide. FsrB may be involved in the proteolytic processing of a quorum sensing system signal molecule precursor required for the regulation of the virulence genes for gelatinase (gelE) and a serine protease (sprE). A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. 
In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase) an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. 
The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.
Protein Domain
Type: Domain
Description: Arteriviruses are enveloped, positive-stranded RNA viruses and include pathogens of major economic concern to the swine- and horse-breeding industries: Equine arteritis virus (EAV); Porcine reproductive and respiratory syndrome virus (PRRSV); Mouse lactate dehydrogenase-elevating virus (LDV); Simian hemorrhagic fever virus. The arterivirus replicase gene is composed of two open reading frames (ORFs). ORF1a is translated directly from the genomic RNA, whereas ORF1b can be expressed only by ribosomal frameshifting, yielding a 1ab fusion protein. Both replicase gene products are multidomain precursor proteins which are proteolytically processed into functional nonstructural proteins (nsps) by a complex proteolytic cascade that is directed by four (PRRSV/LDV) or three (EAV) proteinase domains encoded in ORF1a. The arterivirus replicase processing scheme involves the rapid autoproteolytic release of two or three N-terminal nsps (nsp1 (or nsp1alpha/1beta) and nsp2) and the subsequent processing of the remaining polyproteins by the "main protease" residing in nsp4, together resulting in a set of 13 or 14 individual nsps. The arterivirus nsp1 region contains a tandem of papain-like cysteine autoprotease domains (PCPalpha and PCPbeta), but in EAV PCPalpha has lost its enzymatic activity, resulting in the 'merge' of nsp1alpha and nsp1beta into a single nsp1 subunit. Thus, instead of three self-cleaving N-terminal subunits, EAV has two: nsp1 and nsp2. The PCPalpha and PCPbeta domains mediate the nsp1alpha|1beta and nsp1beta|2 cleavages, respectively. The catalytic dyad of PCPalpha and PCPbeta domains is composed of Cys and His residues. In EAV, a Lys residue is found in place of the catalytic Cys residue, which explains the proteolytic deficiency of the EAV PCPalpha domain. 
The PCPalpha and PCPbeta domains form MEROPS peptidase families C31 and C32, respectively. The PCPalpha and PCPbeta domains have a typical papain fold, which consists of a compact global region containing sequentially connected left (L) and right (R) parts in a so-called standard orientation. The L subdomain of PCPalpha consists of four α-helices, while the R subdomain is formed by three antiparallel β-strands. The L subdomain of PCPbeta consists of three α-helices, while the R subdomain is formed by four antiparallel β-strands. The Cys and His residues face each other at the L-R interface and form the catalytic centre of the PCPalpha and PCPbeta domains. This entry represents the PCPalpha domain (peptidase C31). A cysteine peptidase is a proteolytic enzyme that hydrolyses a peptide bond using the thiol group of a cysteine residue as a nucleophile. Hydrolysis usually involves a catalytic triad consisting of the thiol group of the cysteine, the imidazolium ring of a histidine, and a third residue, usually asparagine or aspartic acid, to orientate and activate the imidazolium ring. In only one family of cysteine peptidases is the role of the general base assigned to a residue other than a histidine: in peptidases from family C89 (acid ceramidase), an arginine is the general base. Cysteine peptidases can be grouped into fourteen different clans, with members of each clan possessing a tertiary fold unique to the clan. Four clans of cysteine peptidases share structural similarities with serine and threonine peptidases and asparagine lyases. From sequence similarities, cysteine peptidases can be clustered into over 80 different families. Clans CF, CM, CN, CO, CP and PD contain only one family. Cysteine peptidases are often active at acidic pH and are therefore confined to acidic environments, such as the animal lysosome or plant vacuole. Cysteine peptidases can be endopeptidases, aminopeptidases, carboxypeptidases, dipeptidyl-peptidases or omega-peptidases. 
They are inhibited by thiol chelators such as iodoacetate, iodoacetic acid, N-ethylmaleimide or p-chloromercuribenzoate. Clan CA includes proteins with a papain-like fold. There is a catalytic triad which occurs in the order: Cys/His/Asn (or Asp). A fourth residue, usually Gln, is important for stabilising the acyl intermediate that forms during catalysis, and this precedes the active site Cys. The fold consists of two subdomains with the active site between them. One subdomain consists of a bundle of helices, with the catalytic Cys at the end of one of them, and the other subdomain is a β-barrel with the active site His and Asn (or Asp). There are over thirty families in the clan, and tertiary structures have been solved for members of most of these. Peptidases in clan CA are usually sensitive to the small molecule inhibitor E64, which is ineffective against peptidases from other clans of cysteine peptidases. Clan CD includes proteins with a caspase-like fold. Proteins in the clan have an α/β/α sandwich structure. There is a catalytic dyad which occurs in the order His/Cys. The active site His occurs in a His-Gly motif and the active site Cys occurs in an Ala-Cys motif; both motifs are preceded by a block of hydrophobic residues. Specificity is predominantly directed towards residues that occupy the S1 binding pocket, so that caspases cleave aspartyl bonds, legumains cleave asparaginyl bonds, and gingipains cleave lysyl or arginyl bonds. Clan CE includes proteins with an adenain-like fold. The fold consists of two subdomains with the active site between them. One subdomain is a bundle of helices, and the other a β-barrel. The subdomains are in the opposite order to those found in peptidases from clan CA, and this is reflected in the order of active site residues: His/Asn/Gln/Cys. 
This has prompted speculation that proteins in clans CA and CE are related, and that members of one clan are derived from a circular permutation of the structure of the other. Clan CL includes proteins with a sortase B-like fold. Peptidases in the clan hydrolyse and transfer bacterial cell wall peptides. The fold shows a closed β-barrel decorated with helices with the active site at one end of the barrel. The active site consists of a His/Cys catalytic dyad. Cysteine peptidases with a chymotrypsin-like fold are included in clan PA, which also includes serine peptidases. Cysteine peptidases that are N-terminal nucleophile hydrolases are included in clan PB. Cysteine peptidases with a tertiary structure similar to that of the serine-type aspartyl dipeptidase are included in clan PC. Cysteine peptidases with an intein-like fold are included in clan PD, which also includes asparagine lyases.