Browse the existing search appliance crawlers
The storage and format of the content that needs to be crawled and indexed can vary. For this reason Integra provides three different types of crawlers, described in detail below.
Web crawlers
Web crawlers discover web content and feed it into the search indexes so that it becomes searchable. One crawler can feed content into multiple search indexes. Crawlers do not crawl web content outside the domains of their start URLs. To review the existing crawlers, follow the steps below:
- Log in to the WordFrame Integra Core Administration
- Click on the "Builder" tab in the upper left corner
- Click on the "Content components" menu in the main navigation bar
- Click on the "Crawler" link in the "Search appliance" section on the left of the screen
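The domain restriction described above can be illustrated with a short sketch. It is not Integra's implementation: it assumes that "outside of their start URL domains" means an exact, case-insensitive host match, and the function names `allowed_hosts` and `in_scope` as well as the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def allowed_hosts(start_urls):
    """Collect the host names of the crawler's start URLs."""
    hosts = set()
    for url in start_urls:
        host = urlparse(url).hostname  # hostname is returned in lower case
        if host:
            hosts.add(host)
    return hosts


def in_scope(url, hosts):
    """Return True when the URL's host is one of the start URL hosts."""
    host = urlparse(url).hostname
    return host is not None and host in hosts


# Hypothetical start URLs and discovered links, for illustration only.
hosts = allowed_hosts(["http://www.example.com/", "http://docs.example.com/start"])
links = [
    "http://www.example.com/products",
    "http://docs.example.com/guide/intro",
    "http://other.example.org/page",
]
in_scope_links = [link for link in links if in_scope(link, hosts)]
print(in_scope_links)
```

Under this assumption a link is followed only when its host matches a start URL host exactly. Whether the actual crawler also accepts subdomains is not stated on this page.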
File system crawlers
File system crawlers discover documents in the file system and feed their content into the search indexes. To index the content of the files, the appropriate iFilters must be installed on the web server. To view the File system crawlers, you need to:
- Log in to the WordFrame Integra Core Administration
- Click on the "Builder" tab in the upper left corner
- Click on the "Content components" menu in the main navigation bar
- Click on the "Crawler" link in the "Search appliance" section on the left of the screen
- Click on the "File system crawlers" tab
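As a rough illustration of what a file system crawl involves, the sketch below walks a folder tree and keeps only the files whose extension has a content filter available. It does not call any iFilter and is not Integra's code: the `INDEXABLE_EXTENSIONS` set and the `find_indexable_files` function are hypothetical stand-ins for "file types with an installed iFilter".

```python
import os

# Hypothetical: extensions for which an iFilter is assumed to be installed.
INDEXABLE_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt"}


def find_indexable_files(root, extensions=INDEXABLE_EXTENSIONS):
    """Walk the tree under root and return the sorted paths of indexable files."""
    matches = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively (REPORT.PDF matches .pdf).
            if os.path.splitext(name)[1].lower() in extensions:
                matches.append(os.path.join(folder, name))
    return sorted(matches)
```

Files with other extensions are skipped here, which mirrors the point made above: without a matching filter a file's content cannot be indexed.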
Sitemap crawlers
Sitemap crawlers index the content listed in formatted XML sitemap files. The files must be in the format specified on the sitemaps.org site. To view the sitemap crawlers, you need to:
- Log in to the WordFrame Integra Core Administration
- Click on the "Builder" tab in the upper left corner
- Click on the "Content components" menu in the main navigation bar
- Click on the "Crawler" link in the "Search appliance" section on the left of the screen
- Click on the "Sitemap crawlers" tab
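For reference, a file in the sitemaps.org format is a `urlset` element containing one `url` entry per page, each with a `loc` element holding the page address. The sketch below reads such a file with Python's standard library. It only illustrates the format and is not how Integra processes sitemaps: the sample document and the `sitemap_urls` function are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, version 0.9.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample sitemap, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2010-10-26</lastmod>
  </url>
  <url>
    <loc>http://www.example.com/about</loc>
  </url>
</urlset>
"""


def sitemap_urls(xml_text):
    """Return the page addresses listed in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    locations = root.findall("sm:url/sm:loc", SITEMAP_NS)
    return [loc.text.strip() for loc in locations if loc.text]


print(sitemap_urls(SAMPLE))
```

Each address returned this way is a page that a sitemap-driven crawler could fetch and index.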
Last edited by Boz Zashev on 26 Oct 2010 | Rev. 2 |