HelpView

Ensembl release 50 - Jul 2008

Ensembl GeneSNPView

Introduction

Ensembl 'GeneSNPView' provides detailed information about Single Nucleotide Polymorphisms (SNPs), and genetic variation in general, together with Pfam domains in and around the exons of a particular Ensembl gene model prediction.

Genetic variation annotation includes location, alleles, classification, and effects on individual transcripts predicted for a gene model.

Ensembl Gene Variation Report

The 'Ensembl Gene Variation Report' at the top of the page provides the following information about the gene.

  • Gene - The gene name is obtained by comparing the gene's translations to entries in UniProt/Swiss-Prot, NCBI RefSeq and UniProt/TrEMBL. For further details see the 'Similarity Matches' section of 'TranscriptReports' on Ensembl 'GeneView' pages. A single name is selected for display, and its source is shown. Preferentially, a gene symbol approved by the species-specific nomenclature committee is chosen (e.g. the appropriate HUGO HGNC symbol for human). Failing that, a UniProt/Swiss-Prot identifier, an NCBI RefSeq accession number or a UniProt/TrEMBL accession number is assigned.

  • Ensembl Gene ID - Ensembl stable gene identifiers are mapped between releases. If a gene model changes dramatically, the old stable identifier may be retired and a new one assigned. However, the Ensembl Archive tracks all stable identifiers and should provide mappings to the current gene predictions.

  • Genomic Location

    • Top-level - The top-level sequence (e.g. chromosome) on which this gene has been annotated. A link to 'ContigView' zooms into the region corresponding to the gene.

    • Sequence-level - The sequence-level entity (e.g. BAC clone) on which the start of this gene has been annotated. A gene prediction may span more than one sequence-level entity. A link to 'ContigView' zooms into the region covered by the sequence-level entity.

    In either case a green background in the 'Overview' and 'Detailed View' panels will highlight the gene.

  • Description - A short description of the gene, taken from the UniProt/Swiss-Prot entry the predicted gene is mapped to or, if none is available, from an NCBI RefSeq or UniProt/TrEMBL entry. If a predicted gene has not been mapped to any such entry, no description is given. However, if its translation is part of an Ensembl protein family cluster, the consensus annotation of that family may be informative. See the links to Ensembl 'FamilyView' in the 'Protein Family' section on Ensembl 'ProteinView' pages.

Gene SNP Graph

A graphical display shows genetic variation information in its genomic context. Ensembl known and novel transcript model predictions are displayed in red and black, respectively. Pfam domains are shown in blue, and SNP information is colour coded as indicated by the key provided.

Whenever a SNP affects a STOP codon, either by introducing a new one (e.g. A->*) or by removing an existing one (e.g. *->Q), this is indicated in red. STOP codons are denoted by asterisks (*). Amino acid changes for non-synonymous SNPs are indicated with a forward slash (e.g. S/C denotes a change from Serine to Cysteine).
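
The notation described above can be read mechanically. The following Python sketch is illustrative only and is not part of Ensembl: the function name classify_aa_change is hypothetical, and it assumes that both the 'A->*' and the 'S/C' spellings carry a single reference residue followed by a single variant residue.

```python
STOP = "*"  # the help text denotes STOP codons by an asterisk


def classify_aa_change(change):
    """Classify an amino acid change string such as 'S/C', 'A->*' or '*->Q'.

    Hypothetical helper: accepts either '->' or '/' as the separator.
    """
    ref, sep, alt = change.replace("->", "/").partition("/")
    ref, alt = ref.strip(), alt.strip()
    if not sep or not ref or not alt:
        raise ValueError(f"cannot parse amino acid change: {change!r}")
    if ref == alt:
        return "synonymous"
    if alt == STOP:
        return "stop gained"   # a new STOP codon is introduced
    if ref == STOP:
        return "stop lost"     # an existing STOP codon is removed
    return "non-synonymous"
```

For example, classify_aa_change("S/C") gives "non-synonymous", while "A->*" and "*->Q" give "stop gained" and "stop lost", the two cases the display marks in red.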

A menu bar allows for customisation of the graphical display. Several menus are available.

  • Features

    • Vega Genes

    • Ensembl Genes

    • ncRNA Genes

    • EST Genes

    • RefSeq Proteins

  • Source - The source database the genetic variation information has been imported from. One of the sources will be selected as default for each species.

    • Sanger

    • dbSNP

  • SNP class

    • In-dels - Deletion/insertion polymorphisms (DIPs)

    • SNPs - Single nucleotide polymorphisms (SNPs)

    • Mixed variations - Reference SNP clusters that contain submissions from two or more allelic classes.

    • Micro-satellite repeats - Short tandem repeat (microsatellite) polymorphisms (STRs).

    • Named variations - Insertion/deletion polymorphisms of named repetitive elements.

    • MNPs - Multiple nucleotide polymorphisms with alleles of common length greater than one base pair.

    • Heterozygous variations - Variable genetic variations, but undefined at the nucleotide level.

  • Validation

    • By frequency - At least one submitted SNP in a reference SNP cluster has frequency or genotype data submitted.

    • By cluster - For reference SNP (RefSNP, rs-prefixed accession number) clusters: the cluster consists of more than two submitted SNPs (SubSNPs, ss-prefixed accession number) and more than one submission was assayed with a non-computational method. For submitted SNPs: the method is non-computational.

    • By 2 hit 2 allele - Every allele has been observed in at least two chromosomes.

    • By other population -

    • No information

  • SNP type

    • Non-synonymous SNPs - SNPs that are located in the coding sequence and result in an amino acid change in the encoded peptide sequence.

    • Synonymous SNPs - In coding sequence, not resulting in an amino acid change (i.e. silent mutation).

    • Frameshift variations - In coding sequence, resulting in a frameshift.

    • Stop lost - In coding sequence, resulting in the loss of a stop codon.

    • Stop gained - In coding sequence, resulting in the gain of a stop codon (i.e. leading to a shortened peptide sequence).

    • Essential splice site - In the first 2 or the last 2 base pairs of an intron.

    • Splice site - 1-3 bp into an exon or 3-8 bp into an intron.

    • Upstream variations - Within 5 kb upstream of the 5'-end of a transcript.

    • Regulatory region variations - In regulatory region annotated by Ensembl.

    • 5' UTR variations - In 5' UTR (untranslated region).

    • Intronic variations - In intron.

    • 3' UTR variations - In 3' UTR.

    • Downstream variations - Within 5 kb downstream of the 3'-end of a transcript.

    • Intergenic variations - More than 5 kb either upstream or downstream of a transcript.

  • Context - The 'Context' menu adjusts the sequence context around exons and UTRs within which variations are displayed. All variations mapped to the genome sequence region and selected by criteria in the 'SNP class', 'Validation' and 'SNP type' menus above are displayed in the SNPs track. Only those variations that fall into the sequence context window are drawn below the gene track, appear in the transcript model ideogram and have alleles annotated in colour-coded boxes. The settings in this 'Context' menu also affect the number of intron variations reported in the transcript-specific tables below. To select, display and list all intron variations, 'Full introns' should be selected from the 'Context' menu.

    • 20, 50, 100, 200, 500, 1000, 2000, 5000 bp
    • Full introns
  • Export - Ensembl 'GeneSNPView' graph panels can be exported in several graphics file formats. Selecting an option from this menu adds a link for downloading the image in the corresponding format.

    • PDF - Portable Document Format

    • SVG - Scalable Vector Graphics

    • PostScript - PostScript page description language

  • Image size - Several image widths in pixels can be selected to optimally display the 'Gene SNP Graph' panel on the user's output device. Settings can be adjusted to accommodate the screen width or printer resolution.

    • 700, 900, 1200, 1500, 2000 px
  • Help - A link to the Ensembl 'GeneSNPView' on-line documentation - this document.
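
The positional rules listed under the 'SNP type' menu above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Ensembl's implementation: the function names are hypothetical, coordinates are taken to be 1-based and inclusive on the forward strand, strand is +1 or -1, and intron offsets are counted from the nearest exon boundary (offset 1 is the first intron base next to the exon).

```python
FLANK_BP = 5000  # the 5 kb upstream/downstream window from the help text


def classify_flank(pos, tx_start, tx_end, strand):
    """Classify a position outside a transcript spanning tx_start..tx_end.

    Returns None for positions inside the transcript span.
    """
    if tx_start <= pos <= tx_end:
        return None
    if pos < tx_start:
        distance = tx_start - pos
        # Lower coordinates are 5' of a forward-strand transcript
        # and 3' of a reverse-strand transcript.
        side = "upstream" if strand == 1 else "downstream"
    else:
        distance = pos - tx_end
        side = "downstream" if strand == 1 else "upstream"
    # More than 5 kb away in either direction counts as intergenic.
    return side if distance <= FLANK_BP else "intergenic"


def classify_intron_offset(offset):
    """Classify an intronic position by its distance from the nearest exon."""
    if offset < 1:
        raise ValueError("offset must be >= 1")
    if offset <= 2:
        return "essential splice site"  # first or last 2 bp of the intron
    if offset <= 8:
        return "splice site"            # 3-8 bp into the intron
    return "intronic"
```

With a forward-strand transcript at 1000..2000, position 900 is classified as "upstream" and position 7001 (5001 bp past the 3'-end) as "intergenic". The same position 900 becomes "downstream" if the transcript lies on the reverse strand.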

SNPs for Transcript (Peptide)

Depending on the number of transcripts predicted for the underlying gene model, one or more tables at the bottom provide a summary for dbSNP reference SNPs (rs) mapped to exon sequences of a particular transcript.

  • ID - The identifier of this particular genetic variation in an external database. Frequently, this will be an NCBI dbSNP reference SNP (rs) identifier.
  • class
  • alleles
  • ambiguity
  • SNP validation status
  • chromosome name
  • physical map location
  • SNP type
  • AA change
  • AA co-ordinate

Please note that if you follow a dbSNP identifier link to Ensembl 'SNPView' pages, the alleles and ambiguity codes shown there will reflect those of the NCBI dbSNP entry. Alleles on Ensembl 'GeneSNPView' pages might be complementary to Ensembl 'SNPView' and to dbSNP pages, depending on the chromosome strand the corresponding transcript maps to.
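
The strand dependence described above, and the 'ambiguity' column of the table, can be illustrated with a short Python sketch. The function names are hypothetical and the code is not taken from Ensembl. It assumes alleles are written as a slash-separated string such as 'A/G' on the forward strand and that strand is +1 or -1. The ambiguity lookup uses the standard IUPAC nucleotide codes.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard IUPAC codes for sets of two or more bases.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def alleles_on_transcript_strand(alleles, strand):
    """Return forward-strand alleles as read on the transcript's strand."""
    if strand == 1:
        return alleles
    if strand == -1:
        # Reverse-complement each allele; for single bases this is
        # simply the complement. Characters such as '-' are unchanged.
        return "/".join(a.translate(COMPLEMENT)[::-1]
                        for a in alleles.split("/"))
    raise ValueError("strand must be 1 or -1")


def ambiguity_code(alleles):
    """Return the IUPAC code for single-base alleles, or None otherwise."""
    return IUPAC.get(frozenset(alleles.upper().split("/")))
```

For a transcript on the reverse strand, alleles_on_transcript_strand("A/G", -1) gives "T/C", and the ambiguity code changes accordingly from "R" to "Y".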

An [Export SNPs] link leads to 'BioMart' and allows dumping of genetic variation information in table form. Please be aware that SNPs exported via this link correspond to the table above and represent only those SNPs that were selected by the options from the menu bar above (e.g. Context, SNP Class, ...). All SNPs underlying a certain gene are available via the 'BioMart' data retrieval tool.

Since 'BioMart' contains only those SNPs that map to the genome sequence unambiguously, some SNPs from the 'GeneSNPView' table might be excluded from the data retrieval tool.

Note that on some of the variation display pages an 'N' is used where the allele is unknown.


The search box at the top of the page allows you to search for any identifier present in Ensembl. For detailed instructions see the Ensembl 'TextView' page.


 

© 2013 WTSI / EBI. Ensembl is available to download for public use - please see the code licence for details.