Indexing Your Web Site
After checking your system configuration, the next step in the Quick Start is to index your Web site to ensure that the indexing process works properly from your server. In the Quick Start, only ASCII and HTML documents are indexed. To index other file formats, use the full Index Manager.
To run the test index of ASCII and HTML documents on your Web site, do the following:
- 1. Click Step 2, Index Your Web Site, on the Analysis Complete screen or on the main Quick Start menu. Information Server displays the page where you specify the site to index.
- 2. In the URL to Index field, enter the full URL of your local Web site. Indexing in Quick Start is restricted to the host machine of the starting URL.
- 3. If you don't already have a collection, click New and create one on the New Collection page. If you are not running Quick Start immediately after installing the product, you may already have a collection. If so, choose it from the drop-down list under Destination.
- 4. If your site uses a proxy host, specify the proxy host and port under HTTP Proxy.
- 5. Click Index. Information Server displays the Site Indexing page.
- 6. To see the status of the indexing process, click Update. To view the Indexing Log in detail, click Indexing Log.
- 7. When the indexing process is complete, click Step 3, Search Your Web Site.
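The host restriction described in step 2 can be pictured with a short sketch. This is not how Information Server implements the restriction; it is only an illustration in Python, assuming that "same host" means the host names match. The function name same_host and the example.com URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def same_host(start_url: str, candidate: str) -> bool:
    """Return True if candidate resolves to the same host name as start_url.

    Assumption: only the host name is compared, not the scheme or port.
    """
    # Resolve relative links against the starting URL first.
    resolved = urljoin(start_url, candidate)
    return urlsplit(resolved).hostname == urlsplit(start_url).hostname


# Hypothetical starting URL and links found while indexing.
start = "http://www.example.com/index.html"
links = [
    "/docs/guide.html",                    # relative link, same host
    "http://www.example.com/faq.html",     # absolute link, same host
    "http://other.example.org/page.html",  # different host, out of scope
]

# Keep only the links that stay on the starting URL's host.
in_scope = [urljoin(start, link) for link in links if same_host(start, link)]
print(in_scope)
```

In this sketch, links that point to another host are skipped rather than followed, which matches the behavior described for Quick Start indexing.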
Managing robots.txt Files
Many Web sites use a robots.txt file to specify which parts of the site indexers from outside the site should avoid. The Index Manager always honors robots.txt files. In addition, if you reindex a site after its robots.txt file has changed, the indexer deletes from the collection any documents that robots.txt now excludes. If you wish to ignore robots.txt files, you must use the command-line spider. For information about the command-line spider, see the Verity Spider User's Guide.
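The reindexing behavior described above can be sketched as follows. This is an illustration only, not the Index Manager's implementation: it uses Python's standard urllib.robotparser module to decide which previously indexed URLs a changed robots.txt file now excludes. The function names, the user agent name ExampleSpider, and the example.com URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def urls_to_delete(indexed_urls, robots_txt, user_agent="ExampleSpider"):
    """Return the indexed URLs that the given robots.txt now disallows."""
    parser = build_parser(robots_txt)
    return [url for url in indexed_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt after the site owner added /private/ to it.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical documents indexed on an earlier run.
indexed = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
]

# Documents that a robots.txt-honoring reindex would drop.
stale = urls_to_delete(indexed, ROBOTS)
print(stale)
```

Here the document under /private/ was indexed before the exclusion was added, so a reindex that honors robots.txt would remove it.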
Copyright © 1998, Verity, Inc. All rights reserved.