Web Sites
eTBlast—It’s Only Words, and
Words Are All I Have …
The body of scientific literature has
grown so large that one has to use specialized search software to filter out relevant information. Chemically oriented
search programs, such as SciFinder, use
formulae for searches, whereas in other
fields searches are still text based. For work
with chemical formulae, the customary
approach has been not to search for a
strictly defined compound, but rather
to use an appropriate set of parameters—a well-defined substituent here, a
joker at another position—to include
related structures in the search. In a
text-based search (such as the widely
used PubMed search), however, an
exact match with the query term is
required; if the term is not present in
the form indicated, the document
will—annoyingly—not be retrieved.
Thus, germane articles can easily slip
through the searcher's fingers.
A new search engine, eTBlast,
promises to remedy this situation.
eTBlast is a search software package
developed in the research group of
Harold Garner at the University of
Texas Southwestern Medical Center at
Dallas. The idea behind the program is
captivating: The user no longer enters
a single search word, but instead a
whole paragraph of text, such as an article abstract (Figure 1). There is no need
to think too hard about what you are
really looking for—eTBlast will think
for you: The program will automatically
extract the relevant search items from
your text and weight them, scan the
approximately 12 million entries in the
Medline database, and sort the hits by
similarity. The text should contain 200–
500 words and can either be uploaded
as a file (in text format only) or pasted
into a mask. The first quick search
yields a long result list after a two- to
five-minute wait; by using this list
eTBlast can then perform a more
exhaustive iterative search, whereby
you have numerous options for tailoring
the search to your needs. Thus, you can
select the scoring criteria and the publication type, flag search items that all
retrieved articles have to include,
choose from several predefined stop
lists (i.e., lists containing words to be
ignored in the search) or upload your
own, and extend the search by using
medical synonyms.
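The idea of weighting the terms of a query text and sorting documents by similarity can be illustrated with a small sketch. What follows is not eTBlast's actual algorithm, which is not described here; it assumes a generic TF-IDF weighting with cosine similarity, and the stop list, function names, and sample texts are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop list (words ignored in the search); eTBlast's own
# predefined lists are not reproduced here.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "to", "for", "with", "by"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop-list entries."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def rank_by_similarity(query, documents):
    """Return (score, document index) pairs, most similar document first."""
    doc_tokens = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in doc_tokens for term in set(tokens))

    def weigh(tokens):
        # TF-IDF with smoothing, so that rare terms count more than common ones.
        counts = Counter(tokens)
        return {
            t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, c in counts.items()
        }

    def cosine(a, b):
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        dot = sum(v * b.get(t, 0.0) for t, v in a.items())
        return dot / (norm_a * norm_b)

    query_vec = weigh(tokenize(query))
    scored = [(cosine(query_vec, weigh(tokens)), i)
              for i, tokens in enumerate(doc_tokens)]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    abstracts = [
        "Histone H3 acetylation regulates chromatin structure in yeast",
        "Fumagillin inhibits angiogenesis in tumour models",
        "Chromatin remodelling and histone modification in gene regulation",
    ]
    query = "Modification of histone H3 alters chromatin structure and gene regulation"
    for score, index in rank_by_similarity(query, abstracts):
        print(f"{score:.3f}  {abstracts[index]}")
```

In this sketch, a document that shares no weighted term with the query text scores zero, while partial overlaps still produce a ranked hit even though the query as a whole never appears verbatim in any document.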
This multitude of options is almost
too much of a good thing and makes it
easy to lose perspective in the search.
Because of the time lag before the
results are received (they are e-mailed
back to the user, sometimes with considerable delay), optimization becomes
cumbersome. In a trial run, I found the
value of the medical synonyms that
eTBlast offers to be questionable, as
one cannot influence which synonyms
are used. My text was about “H3” (histone 3), which by way of an alleged synonym appeared as the “medical object”
fumagillin, a compound I was unacquainted with and which turned out to
have nothing whatsoever to do with histone 3. This is likely to happen frequently, and improvements will be necessary before this option can be used
sensibly. Another concern is that the
technical support of eTBlast is in
urgent need of improvement; several
requests to the contact address provided
“for comments, suggestions, or complaints” simply remained unanswered.
In summary, eTBlast is worth a try
if you are finding your
way into a new topic, whereas for daily
routine searches the traditional keyword
method is likely to remain the standard.
It is also worth checking whether other,
better-adapted text-mining tools for
your particular field of interest already
exist, such as Textpresso[1] for those
working with the model organism C. elegans.
Christoph Weise
Freie Universität Berlin (Germany)
For further information visit:
or contact
Figure 1. eTBlast homepage.
© 2005 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
DOI: 10.1002/anie.200462772
Angew. Chem. Int. Ed. 2005, 44, 182