relevance
<information science> A measure of how closely a given object (file, web
page, database record, etc.) matches a user's search for information.
The relevance algorithms used in most large web search engines today are based
on fairly simple word-occurrence measurement: if the word "daffodil" occurs on a
given page, then that page is considered relevant to a query on the word
"daffodil", and its relevance is quantified as a function of the number of times
the word occurs in the page and of whether "daffodil" occurs in the title of the
page, in its META keywords, in the first N words of the page, in a heading, and
so on; words that a stemmer says are based on "daffodil" are treated similarly.
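A minimal sketch of such word-occurrence scoring follows. The weights, the
value of N, the page layout (a dict with "title", "keywords", "headings" and
"body") and the one-rule stemmer are all hypothetical choices made for
illustration, not those of any particular search engine.

```python
import re

# Hypothetical weights; a real engine would tune or learn such values.
WEIGHTS = {"body": 1.0, "early": 1.5, "title": 5.0, "keywords": 3.0, "heading": 2.0}
FIRST_N = 50  # size of the "first N words" window; an arbitrary choice


def naive_stem(word):
    """Crude stand-in for a real stemmer: strip one trailing 's'."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def stems(text):
    """Lowercase the text, split it into alphabetic words, and stem each."""
    return [naive_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def relevance(query, page):
    """Score a page for a one-word query by counting weighted occurrences."""
    target = naive_stem(query.lower())
    body = stems(page.get("body", ""))
    # Each occurrence in the body contributes to the score.
    score = WEIGHTS["body"] * body.count(target)
    # Bonuses for occurring early, in the title, in META keywords, in a heading.
    if target in body[:FIRST_N]:
        score += WEIGHTS["early"]
    if target in stems(page.get("title", "")):
        score += WEIGHTS["title"]
    if target in stems(page.get("keywords", "")):
        score += WEIGHTS["keywords"]
    if any(target in stems(h) for h in page.get("headings", [])):
        score += WEIGHTS["heading"]
    return score


page = {
    "title": "Growing daffodils",
    "keywords": "bulbs, daffodil",
    "headings": ["Planting"],
    "body": "Daffodils bloom in spring. Plant each daffodil bulb in autumn.",
}
print(relevance("daffodil", page))
```

A page that never mentions the query word (or a word with the same stem)
scores zero under this scheme, however closely its subject matches.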
More elaborate (and resource-expensive) relevance algorithms may involve
thesaurus (or synonym ring) lookup; e.g. such an algorithm might rank a document
about narcissuses (even one that does not mention the word "daffodil" anywhere)
as relevant to a query on "daffodil", since narcissuses and daffodils are
basically the same thing. The same applies to queries on "jail" and "gaol", etc.
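Synonym-ring lookup can be sketched as query expansion: the query term is
replaced by the set of all terms in its ring before matching. The rings and the
function names below are hypothetical, and the matching is deliberately
simplistic (whole words only, no stemming).

```python
import re

# Hypothetical synonym rings: each set groups terms treated as equivalent.
SYNONYM_RINGS = [
    {"daffodil", "narcissus"},
    {"jail", "gaol"},
]


def expand(term):
    """Return the term plus every term that shares a synonym ring with it."""
    expanded = {term}
    for ring in SYNONYM_RINGS:
        if term in ring:
            expanded |= ring
    return expanded


def matches(query, text):
    """True if the text contains the query word or any of its ring-mates."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return bool(expand(query.lower()) & words)


print(matches("daffodil", "The narcissus is a spring-flowering bulb."))
print(matches("gaol", "He was sent to jail."))
print(matches("daffodil", "Roses are red."))
```

Without a stemmer this sketch would miss the plural "narcissuses"; a fuller
version would stem both the ring entries and the document words before
comparing them.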
More elaborate forms of thesaurus lookup may involve multilingual thesauri (e.g.
knowing that documents in Japanese which mention the Japanese word for
"narcissus" are relevant to your search on "narcissus"), or may involve thesauri
(often auto-generated) based not on equivalence of meaning, but on
word-proximity, such that "bulb" or "bloom" may be in the thesaurus entry for
"daffodil".
Word spamming essentially attempts to falsely increase a web page's relevance to
certain common searches.
See also subject index.
(1997-04-09)