Bioinformatics At Home For Fun
Jul. 5th, 2011 11:02 am
If you're a curious/nerdy person or some combination thereof, there is a very interesting, free tool you can use to browse the world of biological knowledge. It's called Cytoscape and is available at http://cytoscape.org/ . It performs network and pathway analysis on large databases of biological information, and provides ways to reason about biological processes when you have to deal with thousands of facts at a time. It doesn't exactly spoon-feed you; you have to learn a bit first, but it's amateur-friendly and doesn't require a Ph.D. to operate. Similar commercial packages run in the tens of thousands of dollars.

This is also the kind of tool used to manage large volumes of intelligence data, like at the CIA. You can parse primary evidence into sets of facts about people, organizations, places, and so forth, and discern large-scale relationships that help you infer (say) whether someone is likely to be a member of Al-Qaeda. Cytoscape itself is fairly neutral about what kind of information it's working with - it's slanted towards biology but there is no reason it couldn't represent other domains.
Obviously, when misused it can support incorrect conclusions. A problem with this kind of tool is not just that its flashy graphics give "authority" to questionable reasoning, but that statistical significance becomes hard to calculate. The hard part of doing pattern recognition is not finding patterns - there are LOTS of them to be found in any experiment - but rather tossing out the garbage so as to recognize the small fraction of them that are actually meaningful.
It's really cool to see something like this available for free, because in the past you had to either pay through the nose for an integrated, commercial tool, or cobble together systems from an open-source scrapheap of academic tools. In so many ways, this is the dawn of a golden age of analysis and visualization. All the foundational tools are coming together: advanced graphics, web services, data standards. The time is right for the whole field to catch on fire. I get so excited about it that it's hard to concentrate on actually getting things done.
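To make the point about garbage patterns a bit more concrete, here is a minimal Python sketch. Everything in it is my own illustration, not anything Cytoscape does: the function names (`two_sided_p`, `simulate`) are hypothetical, the "patterns" are just runs of fair coin flips, and the Bonferroni correction is only one simple way of tossing out the garbage. It tests 1000 pure-noise "patterns" and counts how many look significant before and after correcting for the number of tests.

```python
import random
from math import comb


def two_sided_p(heads, flips):
    """Exact two-sided p-value for `heads` out of `flips` under a fair coin."""
    # Outcomes at least as far from flips/2 as the observed count.
    extreme = max(heads, flips - heads)
    tail = sum(comb(flips, i) for i in range(extreme, flips + 1)) / 2 ** flips
    return min(1.0, 2 * tail)


def count_hits(p_values, threshold):
    """Number of p-values below the given threshold."""
    return sum(p < threshold for p in p_values)


def simulate(n_patterns=1000, flips=100, alpha=0.05, seed=0):
    """Test many noise-only 'patterns'; return (naive hits, Bonferroni hits)."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_patterns):
        # Each "pattern" is nothing but fair coin flips, so any hit is spurious.
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        p_values.append(two_sided_p(heads, flips))
    naive = count_hits(p_values, alpha)
    # Bonferroni: divide the threshold by the number of tests performed.
    bonferroni = count_hits(p_values, alpha / n_patterns)
    return naive, bonferroni


if __name__ == "__main__":
    naive, bonferroni = simulate()
    print(f"'significant' at p < 0.05: {naive} of 1000 (all noise)")
    print(f"surviving Bonferroni correction: {bonferroni}")
```

With these settings you should expect a few dozen "discoveries" at the naive threshold even though there is nothing to discover, and few or none after the correction. The catch is that a correction this blunt can also throw out real but weak signals, which is part of why significance gets hard once you are looking at thousands of facts at a time.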

no subject
Date: 2011-07-05 11:51 pm (UTC)

no subject
Date: 2011-07-08 03:28 pm (UTC)
http://www.stephengrey.com/2011/07/mans-conquest-of-space/
"Signal Recognition in Man-Made Debris Patterns: Sniffing the Breeze in Space"