Help & Documentation

Non-programmatic data access

All data collected by the IMPC is freely available. Besides viewing it in the web portal, you can download it for independent analysis or use it as is. Several channels are available, each suited to accessing data for individual items, for small sets, or in bulk.

If you would like to access data programmatically, please see the next section.

Data access through the web portal

Data on individual items, such as genes or experiments, are summarized on their dedicated pages in the web portal (gene or chart pages, respectively). Tables on those pages have export links that create files you can open in a spreadsheet application. Similarly, images and charts can be downloaded and saved to your computer.

Data access using the batch query tool

To obtain summary data for small sets of genes, the batch query tool provides a convenient web form whose output can be customized. Use it to download phenotyping status, significant phenotypes, and other fields.

Please note that our batch query tool is currently under construction and will soon be replaced by an improved version.

Reports and data in bulk (FTP)

Snapshots of the entire dataset, captured at the time of each data release, are available for bulk download via FTP. Starting with DR12, we have changed the structure of the FTP site to accommodate requests made by our users.

To go to the latest data release, navigate to all-data-releases, then select the latest data release. Releases are ordered from oldest to most recent. Alternatively, go to all-data-releases/latest, which takes you directly to the latest data release. The latest data and reports are in all-data-releases/latest/results.
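For readers who prefer to script the navigation described above, the following is a minimal sketch using Python's standard ftplib module. It assumes the directory names given in the text are relative to the FTP root; the host name in the usage comment is a placeholder, not the real server address, and the helper names results_path and list_results are made up for illustration.

```python
import posixpath
from ftplib import FTP

# Directory name taken from the text above; assumed to sit at the FTP root.
RELEASES_DIR = "all-data-releases"


def results_path(release="latest"):
    """Build the path to the results folder of a given release."""
    return posixpath.join(RELEASES_DIR, release, "results")


def list_results(ftp, release="latest"):
    """Return the entries of a release's results folder.

    `ftp` is an already connected and logged-in ftplib.FTP object.
    """
    return ftp.nlst(results_path(release))


# Usage (the host below is a placeholder, replace it with the real FTP host):
#
# with FTP("ftp.example.org") as ftp:
#     ftp.login()  # anonymous login
#     for name in list_results(ftp):
#         print(name)
```

Passing the connection in as an argument keeps the path logic separate from the network call, so the same helper works for any release folder name you find under all-data-releases.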

The README file within all-data-releases/latest/results shows how the old reports map to the new ones. We have also added or updated README files at various levels throughout the FTP directory. Where a formatted version of a README is provided and you can open it, it is easier to read; otherwise use README.txt. The content of both is identical.

The reports in the results folder are purpose-built and intended for general use. The genotype-phenotype-assertions reports contain all significant genotype-to-phenotype associations, ascribed to Mammalian Phenotype (MP) ontology terms, based on the statistical analysis performed (using the OpenStats R package starting with DR12, previously the PhenStat R package). “ALL”, “IMPC”, “EuroPhenome”, “MGP” and “3I” in the file names indicate different projects. The “ALL” file encompasses all data in the “IMPC”, “EuroPhenome”, “MGP” and “3I” files.
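As an illustration of the naming scheme described above, the sketch below picks out the project label from a report file name. The exact file names and extensions are not given here, so the names in the usage comment are hypothetical; the only assumption is that the project label appears as a separate token in the name, delimited by characters such as "-" or ".".

```python
import re

# Project labels as listed in the text above.
PROJECTS = ("ALL", "IMPC", "EuroPhenome", "MGP", "3I")


def project_of(filename):
    """Return the project label found in a report file name, or None.

    The name is split on non-alphanumeric characters and each token is
    compared (case-sensitively) with the known project labels.
    """
    tokens = re.split(r"[^A-Za-z0-9]+", filename)
    for token in tokens:
        if token in PROJECTS:
            return token
    return None


def reports_for(filenames, project):
    """Keep only the file names that carry the given project label."""
    return [name for name in filenames if project_of(name) == project]


# Usage with hypothetical file names:
#
# names = ["genotype-phenotype-assertions-ALL.csv.gz",
#          "genotype-phenotype-assertions-IMPC.csv.gz"]
# reports_for(names, "IMPC")
```

Because the “ALL” file already includes the data of the other four, downloading it together with a per-project file would duplicate rows.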

For programmatic data access, see the next section.