MouseMine now includes new marker/marker and allele/marker relationships from MGI.
Recently, MGI started maintaining a number of new relationships among mouse markers and alleles that capture important biological information. Previously, for example, there was no explicit relationship between a transgenic allele and the gene(s) it expressed, or an miRNA cluster and its individual members. MGI now maintains several new types of such relationships, and MouseMine now loads this additional information from MGI. These data are stored in MouseMine in a way that closely mimics their organization in MGI, and they are segregated into their own classes.
The new relationships in MGI (and MouseMine) are represented as a combination of a pair of objects (e.g., the Hoxb cluster and the Hoxb1 gene), an ontology term describing a directed relationship (e.g., “has member”), plus a qualifier, evidence code, and reference. Specific types of relationship may add other attributes. For example, the relationship between an miRNA and a predicted target includes a numeric confidence score.
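To make the shape of such a record concrete, here is a minimal Python sketch. It is only an illustration of the description above, not MouseMine's actual data model: the class name `Relationship` and every field name are assumptions, and the score shown is a placeholder rather than a real MGI value.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Relationship:
    """One directed relationship between two objects (illustrative names only)."""
    subject: str                         # e.g. a cluster marker or an allele
    term: str                            # ontology term for the directed relationship
    obj: str                             # e.g. a member gene or a target gene
    qualifier: Optional[str] = None
    evidence_code: Optional[str] = None
    reference: Optional[str] = None
    # Attributes that only some relationship types carry, such as a
    # confidence score for a predicted miRNA target.
    extras: Dict[str, Any] = field(default_factory=dict)

    def describe(self) -> str:
        """Render the relationship as 'subject --[term]--> object'."""
        return f"{self.subject} --[{self.term}]--> {self.obj}"


# The cluster example from the text.
hox = Relationship(subject="Hoxb", term="has member", obj="Hoxb1")

# A type-specific attribute; 0.9 is a placeholder, not a real MGI score.
mirna = Relationship(subject="Mir421", term="interacts with", obj="Pax6",
                     extras={"score": 0.9})

print(hox.describe())
```

Keeping the type-specific attributes in a separate mapping mirrors the idea that every relationship shares a common core while individual types add their own fields.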
There are four new types of relationship:
- Cluster Has Member. Relationship between markers that represent clusters/complexes/regions and the markers that are members of those clusters. These relationships are represented by class MGIClusterHasMember and are displayed on marker report pages. See also: ComplexClusterRegion.members and SequenceFeature.genomicClusters. Example: The Hoxb cluster contains 10 members (see “Members” on its report page); conversely, Hoxb1 is a member of the Hoxb cluster (see “Genomic Clusters” on its report page).
- Expresses Component. Relationship between transgenic alleles and the genes that they express. Represented by class MGIExpressesComponent. See also: Allele.expresses and SequenceFeature.transgenicExpressors. Example: Col1a1<tm1(tetO-Lin28a)Gqda> is a transgenic knock-in expressing Lin28a. Note the “Expresses” section on the allele’s report page, and the “Transgenic Expressors” section on the gene’s page. One subtlety in MGI’s (and therefore MouseMine’s) representation concerns transgenic alleles that express non-mouse genes. In these cases, the relationship points to the mouse ortholog and the relationship term is “expresses ortholog of”. Additional attributes record the non-mouse gene and species. Example: The mouse transgenic allele Actb<tm1(INSR)Dac> expresses the human ortholog (INSR) of mouse Insr.
- Mutation Involves. Relationship between alleles representing genomic rearrangements (e.g. inversions or deletions) and the markers that are involved. Represented by class MGIMutationInvolves. See also: Allele.mutationInvolves and SequenceFeature.involvedInMutations. Example: The coloboma deletion region, Cm, involves 29 markers, including Hao1. Note the “Mutation Involves” section on the deletion’s report page, and the “Involved In Mutations” section on the gene’s report page.
- Interacts With. Relationship between microRNAs (miRNAs) and their targets. Additional attributes include whether the interaction is validated or predicted and, if predicted, the quality score, among others. Represented by class MGIInteractsWith. See also: SequenceFeature.targets and SequenceFeature.targetedBy. Example: Mir421 is predicted (with high confidence) to interact with Pax6. Note the “Targets” section on Mir421’s report page and the “Targeted By” section on Pax6’s. NOTE: unlike the other relationship types, there are millions of these due to the huge number of predicted interactions. Be sure to limit your queries appropriately.
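One way to keep an Interacts With query small is to constrain it to a single gene and cap the number of rows requested. The sketch below only builds a request URL and sends nothing. It assumes the standard InterMine query results endpoint with its `start` and `size` paging parameters, and the base URL is an assumption; the view path under `targetedBy` is hypothetical, so check the model browser for the real attribute names.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr


def build_query_xml(view, constraints):
    """Build a minimal PathQuery XML string.

    view: list of paths to return.
    constraints: list of (path, op, value) tuples.
    """
    view_attr = quoteattr(" ".join(view))
    parts = [f'<query model="genomic" view={view_attr}>']
    for path, op, value in constraints:
        # quoteattr adds the surrounding quotes and escapes characters like "<".
        parts.append(
            f"<constraint path={quoteattr(path)} op={quoteattr(op)} "
            f"value={quoteattr(value)}/>"
        )
    parts.append("</query>")
    return "".join(parts)


def build_results_url(base, query_xml, start=0, size=100):
    """Return a results URL that asks for at most `size` rows from `start`."""
    params = {"query": query_xml, "format": "tab", "start": start, "size": size}
    return f"{base}/query/results?{urlencode(params)}"


BASE = "https://www.mousemine.org/mousemine/service"  # assumed service root

xml = build_query_xml(
    # Second path is hypothetical; only SequenceFeature.targetedBy is from the text.
    view=["Gene.symbol", "Gene.targetedBy.participant.symbol"],
    constraints=[("Gene.symbol", "=", "Pax6")],
)
url = build_results_url(BASE, xml, start=0, size=50)
print(url)
```

Constraining on one gene and requesting 50 rows at a time keeps each response small even though the underlying class holds millions of records.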
We’ve upgraded our InterMine instance to version 1.5.3.
This update fixes numerous bugs, in particular, the broken login-from-Google functionality. In addition, there are several improvements in the functionality and layout of the “Manage Columns” and “Manage Filter” dialogs accessed from a results table. For details see https://github.com/intermine/intermine/releases.
Now you can read all about it!
MouseMine: A New Data Warehouse for MGI
Mammalian Genome. 2015 Aug;26(7-8):325-30
MouseMine now includes interaction data! We have started regular loads of gene-gene interactions and associated information from BioGrid and IntAct. Each interaction record includes the two genes involved, the role of each (e.g. bait or prey), the type of experiment and detection method (e.g. chromatin immunoprecipitation or affinity chromatography), publication, and more. You will find interaction data included in a new section called “Interactions” (!) on each gene detail page. For example, here’s the detail page for Pax6; the new Interactions section is about halfway down the page.
In addition, we have defined several template queries for your convenience. One template supports the basic use case: show me the interactions for a given gene or list of genes. Another template (lifted and adapted from FlyMine – thanks guys!) finds genes that have a specific function and are known to interact with a specified gene or genes. Two other templates perform similar queries based on phenotypes and expression: one finds genes that are associated with a phenotype and have been shown to interact with the specified gene(s); the other finds genes expressed in specific tissues (and times) that interact with the specified gene(s). Finally, if you want to build your own queries, we are using the standard InterMine data model and data loaders, so interaction data in MouseMine has the same organization as in other mines.
The 28th International Mammalian Genome Conference (IMGC) was recently held in Bar Harbor, Maine. We presented this poster showing one example of mining MGI data using MouseMine. The poster illustrates how lists and templates synergize to enable powerful work flows.
MouseMine poster presented at the International Mammalian Genome Conference 2014.
MouseMine celebrates its first anniversary with more updates to its data and infrastructure!
- Data Updates:
- Incorporating the new anatomy ontology EMAPA for gene expression
- Loading symbols and genome coordinates from ‘friendly mines’ (FlyMine, RatMine, WormMine, YeastMine, ZebrafishMine)
- Loading Panther data for: fruit fly (D. melanogaster), rat (R. norvegicus), zebrafish (D. rerio), and yeast (S. cerevisiae)
- Loading Homologene data for: fruit fly (D. melanogaster), rat (R. norvegicus), zebrafish (D. rerio), yeast (S. cerevisiae), and worm (C. elegans)
- Infrastructure Updates:
- Powered by the latest version of InterMine, 1.3
- Running on Tomcat 7.0
Expression. Mouse developmental gene expression data are now being loaded from GXD!
- GXD Expression Data. The Gene eXpression Database (GXD) at MGI is a curated database of endogenous gene expression in wild type and mutant mice. It includes detailed expression information from a wide variety of assay types, including RNA in situ, Immunohistochemistry, Northern blots, Western blots, and more. These expression data are imported into MouseMine in a simple, highly denormalized form. Each expression object is a statement of the form: an expression product of a particular gene (based on a particular probe in a particular assay, etc.) is or is not detected in a specific mouse (strain, mutations, sex, age), in a particular anatomical location.
- Anatomy. The mouse anatomy is currently represented by EMAP, a standardized ontology of mouse developmental anatomy. Each of the Theiler developmental stages is represented by a separate hierarchical description of the mouse at that stage. In the future, we will be migrating our annotations to use the newer EMAPA. This should have little effect on the user interface.
- Templates. Three new templates have been defined to handle common queries. As always, the results can be dynamically filtered, sorted, customized, etc.
Phenotypes. Additional details are now being loaded for phenotype data and backing resources.
- Some phenotype data (for example from the IMPC) may be specific to one sex or the other. Sex-specificity is now represented in MouseMine as annotation extension values of the form “specific_to(female)” or “specific_to(male)”. Phenotype-related templates return this value by default, so you can easily sort/filter the results.
- Knowing the specific cell line(s) used to create the mouse that models a disease or shows specific phenotypes can be important. These relationships are now being loaded into MouseMine; each Genotype object has a collection of CellLines (Genotype.cellLines). There are currently about 800 mouse lines with this level of detail, but that number will grow due to international phenotyping efforts such as the IMPC.
- Fixed: the “withText” field of GO annotations now links to the correct places, e.g., a Genbank id in this field links to GenBank, a UniProt id links to UniProt, etc. Formerly, this field would link to the MouseMine detail page for an ontology evidence object – technically correct, but not useful.
- Fixed: browser history bug. If you went from a query results table to a detail page and then hit the back button, you didn’t return to the results.
- Updated InterMine to version 1.2.2.
- Upgraded to Tomcat 7.
- Additional testing. We enhanced our automated build process with an additional layer of acceptance testing, leading to greater assurance that MGI data are being correctly represented.
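Returning to the sex-specificity values described under Phenotypes above: if you download results and want to filter them yourself, the “specific_to(female)” / “specific_to(male)” strings are easy to parse. The following is a minimal sketch; the function name and the sample rows are made up for illustration and are not part of MouseMine.

```python
import re
from typing import Optional

# Matches the two documented forms: specific_to(female) and specific_to(male).
_SPECIFIC_TO = re.compile(r"specific_to\((female|male)\)")


def sex_specificity(extension: Optional[str]) -> Optional[str]:
    """Return 'female' or 'male' if the annotation extension is sex-specific."""
    if not extension:
        return None
    match = _SPECIFIC_TO.search(extension)
    return match.group(1) if match else None


# Placeholder rows: (phenotype label, annotation extension value).
rows = [
    ("phenotype A", "specific_to(female)"),
    ("phenotype B", None),
    ("phenotype C", "specific_to(male)"),
]

female_only = [label for label, ext in rows if sex_specificity(ext) == "female"]
print(female_only)
```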
MouseMine has been updated! Well…the data are updated every week. (But you knew that.) This is a software update and is mostly bugfixes with a couple of new features as well. Here are the highlights:
- Updated code base to InterMine 1.2.1 (some of the following improvements are due to this upgrade).
- Lists: You can now do simple (asymmetric) list subtraction. Prior to this, only symmetric difference was supported.
- Results tables: Snappier paging performance. Improved layout of dialogs (Manage Columns, Manage Filters, Download).
- Ability to import templates and queries from files, in addition to textbox entry.
- Added “orientation” attribute to SyntenicRegion. This has a value of “+” or “-” and indicates whether the region and its partner have the same or opposite orientation.
- BUGFIX: creating a list from the keyword search would only pick up the first page of results.
- BUGFIX: special characters (like “<” in allele symbols) are now escaped correctly everywhere. Previously, in some places (like the preview popups in the results tables), this was not the case.
- BUGFIX: when forwarding results to Galaxy, the display no longer says “FlyMine”.
- BUGFIX: some parent/child relationships were being lost in loading MEDIC.
- BUGFIX: preview popups in results tables would sometimes extend beyond the screen.
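For readers unsure about the difference between the new (asymmetric) list subtraction and the symmetric difference mentioned in the Lists item above, Python sets behave the same way. This is just an analogy using made-up list contents, not MouseMine code.

```python
# Two hypothetical gene lists.
list_a = {"Pax6", "Hoxb1", "Lin28a"}
list_b = {"Hoxb1", "Insr", "Hao1"}

# Asymmetric subtraction: everything in A that is not in B.
subtraction = list_a - list_b

# Symmetric difference: everything that is in exactly one of the two lists.
symmetric = list_a ^ list_b

print(sorted(subtraction))  # ['Lin28a', 'Pax6']
print(sorted(symmetric))    # ['Hao1', 'Insr', 'Lin28a', 'Pax6']
```

Subtraction depends on the order of the operands (A minus B differs from B minus A), whereas symmetric difference does not.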
MouseMine now includes mouse/human orthology and paralogy relationships loaded from Panther, as well as Homologene. For the user, this means the ability to cast a wider net when doing comparative queries, as well as increased consistency across the mines in the InterMOD consortium.
To see an example, go to the template Gene -> Homologs at MouseMine and click “Show Results”. As you can see, the results include associations (orthologies and paralogies) from both Homologene and Panther. You can easily limit the view to one data set or the other: open the column summary for the dataset names, select one of the two data set names, then click “Filter”. (Can’t find the column summary? Mouse over the ‘Name’ column header and click the histogram icon.)
Other updates we just released:
- To help users interpret ontology annotations, evidence codes now include the full name. When a result table includes evidence codes, you can mouse over the code to see it.
- Report pages for GO, MP, and Genotypes now include external links to MGI. For GO and MP, there are links to MGI’s vocabulary browser. (Examples: B-cell homeostasis (GO), abnormal hemoglobin (MP)). For Genotypes, the links go to MGI’s phenotype and disease model report page (Example: Pax6<Sey>/Pax6<Sey>)
- The value “Not applicable” no longer displays. Many fields in MGI allow the value “Not applicable”. These have been changed to nulls in MouseMine. As such, they no longer display on report pages. In tables, they are displayed with the greyed-out “no value” text.
- Disease synonyms are now loaded from MGI. For example, searching by either “ALPS” or “Canale-Smith Syndrome” will find OMIM:601859 (Autoimmune Lymphoproliferative Syndrome).
We are pleased to announce the first general public release of MouseMine!! MouseMine offers powerful new ways to access mouse data from MGI. For more information, see About MouseMine. To start exploring, go to the MouseMine home page.
Since the time of Caesar, the Ides of March has had a bad reputation. We aim to change all that with the first general public release of MouseMine, slated for March 15, 2013!