Writing Simple Queries


The information below introduces writing queries using Verity search features.

The term Simple Queries refers to queries written in a syntax that can be interpreted by the Verity simple query parser, the parser used in most sample search forms and templates. Other types of queries are described in "Query-by-Example (QBE)" and "Internet-style Queries."

Simple Queries

A simple query uses words and phrases, separated by commas. To see documents about using text editors to create Web documents, start with a single-word query, such as:

editor

Your query finds all the documents that include the word "editor." However, this search would include not only documents about text editors, but also documents about people who are editors. (You don't have to specify the plural form, because a simple search includes stemmed variations, such as "editors.") Documents about the Web that did not include the word "editor" would not be retrieved.

For more specific results, enter several words or phrases, separated by commas, that describe the subject more precisely, such as:

text editor, document, web

Now your query finds documents that contain "text editor," "document," or "Web."
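The comma behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the Verity parser: the function names are hypothetical, stemming and ranking are ignored, and a term is assumed to match when its words appear consecutively in the document text.

```python
import re


def split_simple_query(query):
    """Split a comma-separated simple query into trimmed, non-empty terms."""
    return [term.strip() for term in query.split(",") if term.strip()]


def _words(text):
    """Lowercase the text and break it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def _contains_term(doc_words, term):
    """Return True if the words of the term appear consecutively in doc_words."""
    term_words = _words(term)
    if not term_words:
        return False
    n = len(term_words)
    return any(doc_words[i:i + n] == term_words
               for i in range(len(doc_words) - n + 1))


def matches_simple_query(query, text):
    """Toy model of the comma: a document matches if ANY term is present.

    Assumption: no stemming and no ranking, unlike the real search engine.
    """
    doc_words = _words(text)
    return any(_contains_term(doc_words, term)
               for term in split_simple_query(query))
```

For example, split_simple_query("text editor, document, web") yields the three terms "text editor", "document", and "web", and a document needs to contain only one of them to match.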

Case-sensitivity

When a search term is entered in mixed case, the search engine matches the exact capitalization given in the query expression. For search terms entered completely in lowercase or completely in uppercase, the search engine matches all case variations.

Search terms with mixed case automatically become case-sensitive. For example, the query of Apple behaves as if you had specified <case>Apple (which would find only the precise string Apple), while the query of apple finds all of the following: APPLE, Apple, apple.

A query all in uppercase does not turn on case-sensitive searching. The query of APPLE finds all of the following: APPLE, Apple, apple (as before).

The CASE modifier has the same effect as in previous releases. When used, the case-sensitivity of the query is preserved. For example, if you want to search for the term "OCX" and want to find instances of "OCX" in uppercase only, you could enter the following query:

<CASE> <WORD> OCX

The search engine would interpret the above query expression to mean: find all documents containing one or more instances of the word "OCX" spelled in uppercase, not mixed case.
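The case rules in this section can be summarized as a small Python sketch. It is a simplified model for illustration only: the function names and the force_case parameter (standing in for the CASE modifier) are hypothetical, and only single-word comparison is shown.

```python
def is_case_sensitive(term):
    """Mixed-case terms are case-sensitive; all-lowercase or
    all-uppercase terms are not."""
    return term != term.lower() and term != term.upper()


def term_matches(term, word, force_case=False):
    """Compare a query term with a word from a document.

    force_case is a hypothetical stand-in for the CASE modifier: when True,
    the capitalization of the term is preserved even if the term is
    entirely uppercase or entirely lowercase.
    """
    if force_case or is_case_sensitive(term):
        return term == word
    return term.lower() == word.lower()
```

With this model, term_matches("Apple", "APPLE") is False, term_matches("APPLE", "apple") is True, and term_matches("OCX", "ocx", force_case=True) is False.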

Search Tips Online Guide

The Search Tips Online Guide provides users with advice on how to write queries that return relevant information. This guide, available in HTML only, describes many search techniques and contains numerous query examples.

To access the Search Tips Online Guide, select User Guide from the main menu bar, and then click the Search Tips hyperlink.

Verity Query Language

Make your queries more specific by using operators to combine the words you used for simple queries. Operators are special words that indicate logical relationships between the descriptive terms in your query.

Frequently Used Operator Names

The following table briefly describes the most frequently used operators. See Chapter 9, "Verity Query Language," for more information on operators.

Operator     Description
AND          Finds documents containing both words it joins.
OR           Finds documents containing either of the words it joins.
NOT          Finds documents containing the word preceding it and excludes documents containing the word that follows it.
<NEAR>       Finds documents containing words that are in the same general area, but may or may not be adjacent.
<PHRASE>     Finds documents containing phrases, words that are adjacent to each other.
, (comma)    Finds documents containing at least one of the words specified, ranking them using "the more, the better" approach, so documents with the most evidence of the words searched for are given the highest rank.

NOTE: AND, OR, and NOT are treated as operators by default, and do not require angle brackets. To use them as literal words, enclose them in double quotes. All other operators (except commas and quotation marks) must be placed within angle brackets (< >).

Shorthand Notation

The following table briefly describes the shorthand notation for some additional operators.

Operator             Description
' (single quotes)    Placing a word in single quotation marks finds stemmed variations of the word. Example: the query 'edit' finds "edited," "editing," and "edition." This is the default mode. A search retrieves stemmed variations unless double quotes are used.
" (double quotes)    Placing a word in double quotation marks finds exact matches only, excluding stemmed variations of the word. Example: the query "edit" matches the word "edit" only, not the words "edited," "editing," or "edition."

Shorthand notation uses double or single quotation marks to enclose words or phrases. The Verity engine automatically assigns the MANY modifier to shorthand queries.

For example, the following shorthand query:

'dog'

is interpreted as the following query:

<Many><Stem>('dog')

And this second example of a shorthand query:

"state house"

is interpreted as the following query:

<Many><Phrase>(<Many><Word>('state'),<Many><Word>('house'))

See "Using Shorthand Notation" in Chapter 9 for more information.
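The two expansions shown above can be reproduced with a short Python sketch. It is not the Verity implementation: expand_shorthand is a hypothetical helper, and the handling of a single word in double quotation marks (expanded to a lone Word expression) is an assumption, since the text above shows only the two-word case.

```python
def expand_shorthand(query):
    """Expand a quoted shorthand term into explicit syntax.

    Covers only the two forms shown in the text: a single word in single
    quotation marks, and one or more words in double quotation marks.
    """
    q = query.strip()
    if len(q) < 3 or q[0] != q[-1] or q[0] not in "'\"":
        raise ValueError("expected a term in single or double quotation marks")
    inner = q[1:-1].split()
    if not inner:
        raise ValueError("empty quoted term")
    if q[0] == "'":
        if len(inner) != 1:
            # The source shows single quotes around one word only.
            raise ValueError("single quotation marks take one word here")
        return "<Many><Stem>('%s')" % inner[0]
    exact = ["<Many><Word>('%s')" % word for word in inner]
    if len(exact) == 1:
        # Assumption: a lone double-quoted word needs no Phrase wrapper.
        return exact[0]
    return "<Many><Phrase>(%s)" % ",".join(exact)
```

Calling expand_shorthand("'dog'") and expand_shorthand('"state house"') returns the two explicit queries shown in this section.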

Assigning Importance (Weights) to Search Terms

You can assign a weight to each search term in a query to indicate each search term's relative importance. The weight assignment is expressed as a number between 01 and 100, where 01 represents the very lowest importance rating and 100 represents the very highest importance rating.

To specify a weight with a search term, you enter the weight in square brackets just before the search term, as shown below:

[50]test, [80]help

For the above example, the search engine looks for stemmed variations of the words "test" and "help" and assigns a weight of 50 to the term "test" and a weight of 80 to the term "help." Search results with the highest density of stemmed variations of the term "help" would receive the highest possible scores.

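Using explicit syntax, you could enter a query expression with weights as follows. Note that this form uses the WORD operator, so it matches the exact words rather than stemmed variations: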

<ACCRUE> ([50]<WORD>(test), [80]<WORD>(help))
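To make the weight notation concrete, the following Python sketch reads the bracketed weights out of a comma-separated query. It is an illustration under stated assumptions, not the engine's parser: parse_weighted_terms is a hypothetical name, unweighted terms are reported with a weight of None, and weights outside the 01 to 100 range are rejected.

```python
import re

# A weight is one to three digits in square brackets just before the term.
_WEIGHTED = re.compile(r"^\[(\d{1,3})\]\s*(.+)$")


def parse_weighted_terms(query):
    """Return a list of (weight, term) pairs from a simple query.

    Terms without a bracketed weight get None as their weight.
    """
    pairs = []
    for raw in query.split(","):
        term = raw.strip()
        if not term:
            continue
        match = _WEIGHTED.match(term)
        if match is None:
            pairs.append((None, term))
            continue
        weight = int(match.group(1))
        if not 1 <= weight <= 100:
            raise ValueError("weight must be between 01 and 100: %r" % term)
        pairs.append((weight, match.group(2).strip()))
    return pairs
```

For the query [50]test, [80]help, this returns the pairs (50, "test") and (80, "help").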

Simple Query Examples

Using these examples, you can write queries that return exactly the information you want.

Finding Words

Most queries can be written by entering words and phrases separated by commas. If you were looking for information about the Web or about using laptop computers, you would enter:

web, laptop computers

This query returns documents that contain the terms "Web" (a term entered entirely in lowercase matches any capitalization), "laptop computers," or both. Your results list displays a ranked list of documents; the most relevant documents are at the top of the list.

Finding Phrases

Perhaps you want to see documents that refer to a series of words that occur in a specific order, such as "Web publishing with HTML". You would enter the entire phrase:

web publishing with html

This query returns only documents that contain all of these words in the exact sequence you specified, including stemmed variations of the search terms.

Finding a Specific Subject

The comma-separated query in "Finding Words" returned some documents about the Web, some documents about laptop computers, and some about both subjects. If your real interest is in accessing the Web using a laptop computer, use the AND operator for a more specific search. You would enter the following query:

web AND laptop computers

This query returns only documents that contain both "Web" and "laptop computers". The results list is typically shorter than the list returned by the query written using commas. (You can enter AND in lowercase and it will still be treated as an operator.)

AND is treated as an operator unless it is surrounded by double quotation marks. To use the word "and" as part of a phrase, enclose it inside double quotation marks. For example, to search for the phrase "addresses and URLs", you would enter:


addresses "and" URLs

Excluding Terms

You can also exclude certain documents from your results list. For example, you might want to see documents about most Web browsers, but you're not interested in Lynx. You could enter:

web browser NOT lynx

This query returns only documents referring to Web browsers that do not also mention Lynx. If a document includes both "Web browser" and "Lynx," it is excluded.
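The difference between the comma, AND, and NOT can be pictured as set operations over the documents that contain each term. The Python sketch below is a toy model only: the three sample documents and the docs_with helper are made up for illustration, and stemming and relevance ranking are not modeled.

```python
import re

# Hypothetical sample collection, used only for this illustration.
DOCS = {
    "a.html": "Choosing a Web browser for your laptop computers",
    "b.html": "Lynx is a text-only Web browser",
    "c.html": "Laptop computers buying guide",
}


def _words(text):
    """Lowercase the text and break it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def docs_with(term, docs=DOCS):
    """Return the names of documents containing the words of term in order."""
    term_words = _words(term)
    n = len(term_words)
    hits = set()
    for name, text in docs.items():
        doc_words = _words(text)
        if n and any(doc_words[i:i + n] == term_words
                     for i in range(len(doc_words) - n + 1)):
            hits.add(name)
    return hits


# web, laptop computers: documents containing at least one term (union)
either = docs_with("web") | docs_with("laptop computers")

# web AND laptop computers: documents containing both terms (intersection)
both = docs_with("web") & docs_with("laptop computers")

# web browser NOT lynx: documents with the first term, minus the second
excluded = docs_with("web browser") - docs_with("lynx")
```

In this sample collection, the comma query matches all three documents, while the AND query and the NOT query each match only a.html.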

Searching Hyperlinks

You can search the hyperlinks in an HTML document by using the WHEN operator. This lets you locate the links to a document before you delete it, so that you do not leave behind links that no longer work. For example, to locate the links to the document named report.html on the Verity website, enter the following query:

"report.html" <IN> A <WHEN> (HREF <CONTAINS> "verity")

See "WHEN" in Chapter 9 for more information.





Copyright © 1998, Verity, Inc. All rights reserved.