GeneAlert - A sequence search results keywords parser.
With the ever-increasing amount of sequence data being produced, and with
researchers now investigating larger sequenced regions for gene and feature
content, the results returned by similarity search programs can be overwhelming.
A UNIX-based system has been written that culls returned BLAST results to remove
excess information clutter and arranges the remaining information in a more
efficient and user-friendly way. It has been applied to genomic research.
This system, called Gene Alert, is used by the investigator to
reprocess the raw data returned from BLAST, re-ordering it from a list ranked
by similarity score to a list ranked by keyword score.
This processing allows the user to specify keywords, each with a weighting,
that are used to assign importance to the many ‘hits’ returned by a BLAST
search. The system can also exclude certain classes of information, such
as certain types of organisms, and ‘hits’ that are not of interest
to the biologist/geneticist, such as vector contamination ‘hits’.
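The keyword weighting and exclusion described above can be sketched as follows. This is a minimal illustration in Python, not the Gene Alert script itself: the `Hit` record, the function names, the substring matching rule, the additive scoring, and the example hits and weights are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One search hit (hypothetical record, not the Gene Alert data model)."""
    description: str         # definition line of the matched database entry
    similarity_score: float  # score reported by the similarity search


def keyword_score(description, weights):
    """Sum the weights of all keywords found in the description.

    Assumption: case-insensitive substring matching, weights simply added.
    """
    text = description.lower()
    return sum(w for kw, w in weights.items() if kw.lower() in text)


def rerank(hits, weights, exclude):
    """Drop hits containing an excluded term, then order by keyword score.

    Ties in keyword score fall back to the original similarity score.
    """
    kept = [
        h for h in hits
        if not any(term.lower() in h.description.lower() for term in exclude)
    ]
    return sorted(
        kept,
        key=lambda h: (keyword_score(h.description, weights), h.similarity_score),
        reverse=True,
    )


# Invented example data, ordered by similarity score as a search might return it.
hits = [
    Hit("Cloning vector pUC19, complete sequence", 900.0),
    Hit("Homo sapiens hypothetical protein", 600.0),
    Hit("Mus musculus zinc finger protein", 400.0),
    Hit("Homo sapiens tumor suppressor protein kinase mRNA", 150.0),
]
weights = {"kinase": 5, "tumor suppressor": 10, "zinc finger": 3}
exclude = ["vector"]

ranked = rerank(hits, weights, exclude)
for h in ranked:
    print(keyword_score(h.description, weights), h.description)
```

In this sketch the vector hit is removed even though it has the highest similarity score, and the low-similarity kinase hit rises to the top because two weighted keywords match it.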
The system runs automatically from two inputs: a configuration file, customized
to each researcher, that contains the keywords and other parameters, and a file
containing the user's query sequence. BLAST and Gene Alert can be re-run
periodically and automatically, with significant results e-mailed to the user.
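As a rough illustration of a per-researcher configuration driving the run, the sketch below reads keywords, exclusions, and a reporting threshold from an INI-style text and prepares an e-mail message for the hits that pass. The actual Gene Alert configuration format is not described here, so the section names, option names, threshold, and address are all hypothetical, and the message is only constructed, not sent.

```python
import configparser
from email.message import EmailMessage

# Hypothetical configuration layout; not the real Gene Alert file format.
EXAMPLE_CONFIG = """
[keywords]
kinase = 5
tumor suppressor = 10

[exclude]
terms = vector, Escherichia coli

[report]
min_keyword_score = 5
email = researcher@example.org
"""


def load_settings(text):
    """Parse keyword weights, excluded terms, threshold, and recipient."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    weights = {kw: float(w) for kw, w in parser["keywords"].items()}
    exclude = [t.strip() for t in parser["exclude"]["terms"].split(",") if t.strip()]
    threshold = parser["report"].getfloat("min_keyword_score")
    recipient = parser["report"]["email"]
    return weights, exclude, threshold, recipient


def build_report(recipient, scored_hits, threshold):
    """Return an e-mail message listing hits at or above the threshold.

    scored_hits is a list of (description, keyword_score) pairs.
    Returns None when nothing is significant, so no mail would be sent.
    """
    significant = [(d, s) for d, s in scored_hits if s >= threshold]
    if not significant:
        return None
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"{len(significant)} hit(s) at or above keyword score {threshold}"
    msg.set_content("\n".join(f"{s:g}\t{d}" for d, s in significant))
    return msg


weights, exclude, threshold, recipient = load_settings(EXAMPLE_CONFIG)
report = build_report(
    recipient,
    [("protein kinase", 5.0), ("hypothetical protein", 0.0)],
    threshold,
)
```

A scheduler such as cron could invoke a script of this kind at intervals to approximate the periodic re-runs; that wiring is left out of the sketch.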
This system is particularly useful for a researcher working on
a large number of projects, because the computer automatically looks
only for items of interest to that scientist in each of them.
It is also of value to individual research efforts, because it amplifies
the importance of certain results contained within a myriad of returned
similarity ‘hits’, thus bringing to the attention of the biologist/geneticist
user information that might ordinarily have been overlooked.
This system has been applied to several genomic regions that have
been sequenced on human chromosomes 3, 11 and 15 for scientists actively
involved in hunts for biologically or medically relevant genes.
This system is no longer supported by our group.
Flow chart of the Gene Alert PERL script: