[personal profile] snousle
If you are curious about what all those darned genes you hear about are doing, you can now interpret the results of gene expression experiments yourself, just by cutting and pasting on the Web. It takes a little savvy, but if nothing goes wrong it takes only a few minutes.

1. Find journal articles that contain lists of genes singled out for one reason or another in scientific experiments. Search for terms like "highly expressed genes" or "list of genes expressed" (including the quotes). Gene names are short codes, which are sometimes whimsical, sometimes impenetrable, and wildly inconsistent from one data source to another. You'll know them when you see them. Fortunately the inconsistencies will be filtered out for you later.

Typical gene names: 9230117N10Rik, BC055107, Crxos1, Smug1, Bambi

You're looking for lists of 20 to 100 genes that all "did something", either turning on or turning off their activity in response to some stimulus. The question is, why did they do that? Read the abstract of the article, but don't worry if you don't understand it completely. All you need to find out is what species these genes belong to. Mouse or human will be the easiest to get results for.

2. Go to the following web site:

http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment

3. Paste your list of genes into the "gene products" list. Make sure it's just gene names separated by whitespace. Don't worry about the "background set"; that's more advanced.

4. Select the appropriate database. Click on the ? icon to see a list of what the supported gene databases cover. You'll want to pick a database based on a species that is at least close to the gene set you're analyzing. "Multi-Species" means you'll have to do more work to find out which species are available. There is considerable overlap between gene names in different species, but you'll get better results if the species match exactly. If you can't find the right database for your gene list, go find a new gene list from a species you see in the database list.

5. Ignore all the other settings and click "Submit Query". You'll get an interesting report that identifies statistically overrepresented themes in your gene list. There may not be any to be had; this analysis doesn't always detect anything.
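
Gene lists copied out of an article often arrive with commas, semicolons, tabs, and repeats mixed in, so the pasting step above can need some cleanup first. Here is a minimal Python sketch that does it. The function name `clean_gene_list` and the choice of separators are my own assumptions for illustration, not anything the web site requires.

```python
import re

def clean_gene_list(raw_text):
    """Return unique gene names from pasted text, in their original order."""
    # Assumption: names are separated by whitespace, commas, or semicolons.
    tokens = re.split(r"[\s,;]+", raw_text.strip())
    seen = set()
    genes = []
    for token in tokens:
        # Skip empty strings (from empty input) and repeated names.
        if token and token not in seen:
            seen.add(token)
            genes.append(token)
    return genes

# The example gene names from above, pasted with messy separators and one repeat.
raw = "9230117N10Rik, BC055107; Crxos1\nSmug1\tBambi, Smug1"
print(" ".join(clean_gene_list(raw)))
```

The printed line is plain whitespace-separated names, ready to paste into the form.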

Picking a list at random, I found a hit for the Gene Ontology category "GO:0051179 localization". Of all the genes in the mouse genome, 15% were in some way linked to this category, but in the list I provided, 53% of the genes were. The probability of getting a result like this by chance was less than 1 in 1000, which makes it a "statistically significant" result. From each result like this, you can click out into a universe of supporting information, right down to the abstracts of the scientific papers that support your result. Why are all these genes involved in localization? Who did the experiments that determined this? Where were they published, and when? It's all at your fingertips.
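
To get a feel for where a probability like that comes from, here is a rough sketch using a hypergeometric test, one common way to score this kind of overrepresentation. I am assuming, not asserting, that the site does something similar, and real tools usually also correct for testing many categories at once. The counts are invented to match the percentages above: a genome of 20,000 genes with 3,000 in the category (15%), and a list of 30 genes with 16 in the category (about 53%).

```python
from math import comb

def enrichment_p_value(genome_size, genome_hits, list_size, list_hits):
    """Chance that a random list of list_size genes contains at least
    list_hits genes from a category covering genome_hits of the genome."""
    total = comb(genome_size, list_size)  # all possible gene lists of this size
    upper = min(genome_hits, list_size)   # most category genes a list could hold
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total

# Hypothetical counts chosen only to reproduce the 15% and 53% figures.
p = enrichment_p_value(genome_size=20000, genome_hits=3000,
                       list_size=30, list_hits=16)
print(p, p < 0.001)
```

With these made-up counts the probability comes out far below 1 in 1000, which is the sense in which such a hit is called significant.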

This is what I do for a living, albeit on an industrial scale. My job is to create tools like this. Unfortunately they have all been proprietary and I can't show them to you. They're hugely diverse, but one like this would be typical. It's really very nice to have resources like this accessible to everybody, even curious amateurs. If you turn out to be unusually good at this, have a sense that you know what's going on as you navigate the results of your analysis, and want to pursue it further, let me know and I will hook you up with the right people. ;-)
