Adding a Search Solution to OpenAccess Documentation


OpenAccess documentation does not incorporate a search capability because of the large variety of platform and installation configurations that are possible. Most search solutions require a Web server, but OpenAccess documentation can also be installed for access from a file server or CD-ROM. To avoid delivering a solution that is broken in some configurations, the search solution described in this document is not built in, but it is easy to add if the documentation is accessed via a Web server.

The solution described in this document is Perlfect Search. A variety of solutions can be used, but this one was selected because it is available under the GNU General Public License, it is simple to install and configure, and it offers good functionality and performance. Perlfect Search is a suite of Perl scripts for indexing and searching Web pages. The latest version of Perlfect Search can be downloaded from the Perlfect Solutions Web site. Because it is based on Perl, Perlfect Search runs on Linux, UNIX, and Windows platforms where Perl is available.

Preparations for Installing Perlfect Search

Perlfect Search is available as a .tar.gz or .zip archive. Download the archive version that suits your platform or preference. In addition to the Perlfect Search archive, to install and use Perlfect Search, you need:

Because these modules are in common use, they are usually already installed, so you rarely need to install them yourself. You can indirectly check whether a required Perl module is installed by trying to read its documentation. For example, the shell command perldoc DB_File displays the documentation for that module if the module is installed.
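As an alternative to reading the documentation, you can ask Perl to load a module directly. The following is a minimal sketch, not part of Perlfect Search: the function name check_perl_modules is hypothetical, and DB_File is used only because it is the module named above. It relies on the standard perl -M switch, which loads a module before running the (empty) program given with -e.

```shell
#!/bin/sh
# Report whether each named Perl module can be loaded by the perl on PATH.
# Returns 0 only if every module loads. (Hypothetical helper for this guide.)
check_perl_modules() {
    missing=0
    for module in "$@"; do
        # "perl -MName -e 1" loads the module and runs an empty program;
        # a non-zero exit status means the module could not be loaded.
        if perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (add the other modules your installation requires):
#   check_perl_modules DB_File
```

A module reported as MISSING needs to be installed before you run the Perlfect Search setup script.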

This Perlfect installation and configuration guide assumes that you are familiar with Perl, and you understand how to install Perl modules if necessary. If you are not familiar with Perl, and the required Perl modules are missing from your system, you can refer to the Perlfect Solutions Web site for additional information about acquiring and installing the missing modules, or check with your system administrator.

Installing Perlfect Search

The numbered steps below explain how you install Perlfect Search. Before you begin the installation, you need to know the following:

The following setup instructions assume you are installing on a Linux or UNIX platform.

  1. Log on to the machine that runs your Web server.
  2. Create a distribution directory such as perlfect_dist.
  3. Download or move the Perlfect distribution archive to the directory you created in the previous step. (For access to the download site, see the Perlfect Solutions URL provided at the beginning of this document.)
  4. Change to the Perlfect distribution directory, and uncompress and/or unarchive the Perlfect Search distribution archive.
  5. Change to the directory where the Perlfect setup scripts were installed by the previous step. For example, for Perlfect 3.31b, use cd search-3.31b
  6. If you need to be root or another user in order to install files in your Web server cgi-bin directory, use su root.
  7. Run the Perlfect Search setup script using the command perl setup.pl.

    The setup script prompts you for:

    The answers you supply are saved in the Perlfect Search configuration file named conf.pl in the cgi-bin/perlfect/search directory. The configuration file is used by both the indexer and by the search engine. See the previous section for information about what you need to know before you install Perlfect Search.

    Notes:
    When you enter a URL during setup, it must include the http:// protocol. When you enter a path, it must start from the root of your Web server. Both URL and path entries must end with the slash ( / ) character.

    The base URL is the location at and below which the indexing and searches are performed. You can use the root directory of your Web site to index/search your entire site, or you can specify the URL where only the OpenAccess documentation is installed. For example, to index your entire Web site, you might enter a base URL of http://myIntranet/, but if you only want to index and search the OpenAccess documentation, you might enter a base URL of http://myIntranet/doc/oa/.

  8. After entering the required setup information you are prompted to choose whether to index the site or finish. You can index your site at this time, or do so in a later step.
  9. When the installation is complete, change to the directory where the Perlfect Search scripts are installed (for example, cgi-bin/perlfect/search).
  10. If you did not index your site as part of the setup process, you can now validate that the installation is correct and functioning by running the command perl indexer.pl.
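The unpacking part of the Linux or UNIX steps above can be sketched as a small shell function. This is an illustration under stated assumptions rather than an official installer: the function name unpack_perlfect and the archive file name search-3.31b.tar.gz are hypothetical (only the search-3.31b directory name comes from the steps above), and the .zip branch assumes the unzip utility is installed.

```shell
#!/bin/sh
# Unpack a Perlfect Search archive into a distribution directory.
# Usage: unpack_perlfect ARCHIVE DIST_DIR   (hypothetical helper)
unpack_perlfect() {
    archive=$1
    dist_dir=$2
    mkdir -p "$dist_dir" || return 1
    # Choose the extraction tool from the archive file extension.
    case $archive in
        *.tar.gz|*.tgz) tar -xzf "$archive" -C "$dist_dir" ;;
        *.zip)          unzip -q "$archive" -d "$dist_dir" ;;
        *) echo "unsupported archive type: $archive" >&2; return 1 ;;
    esac
}

# Steps 2 to 7, assuming the archive is already in the current directory:
#   unpack_perlfect search-3.31b.tar.gz perlfect_dist
#   cd perlfect_dist/search-3.31b
#   perl setup.pl      # run as a user who can write to cgi-bin
```

The setup script itself is interactive, so it is left as a manual command here.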

Running indexer.pl starts the indexer, which processes the pages on your site and creates an index database. Indexing the OpenAccess documentation takes a while because there are approximately 3800 files to process. The indexer uses the conf.pl configuration file that was created during the Perlfect Search setup step. If the indexer produces an error, or does not appear to index your site correctly, open the conf.pl file in the cgi-bin/perlfect/search directory in a text editor and check whether the URLs and paths look correct for your Web site. If you find an error, you can either correct it directly in the configuration file, or note where the error occurred and redo the setup from the beginning.
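When you review URLs and paths, the rules from the setup notes (a URL includes http://, a path starts from the root, and both end with a slash) can be checked mechanically. The sketch below is an assumption-laden illustration: the function names are hypothetical, "starts from the root" is read here as "begins with /", and the example values you pass in are your own, not values read from conf.pl.

```shell
#!/bin/sh
# Return 0 if the value begins with http:// and ends with a slash.
valid_setup_url() {
    case $1 in
        http://*/) return 0 ;;
        *)         return 1 ;;
    esac
}

# Return 0 if the value begins with a slash and ends with a slash.
valid_setup_path() {
    case $1 in
        /|/*/) return 0 ;;
        *)     return 1 ;;
    esac
}

# Example calls:
#   valid_setup_url  "http://myIntranet/doc/oa/" && echo "URL looks valid"
#   valid_setup_path "/doc/oa"                   || echo "path needs a trailing /"
```

These checks only catch formatting mistakes. They do not confirm that the URL or path actually exists on your Web server.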

The default configuration file (conf.pl) enables an option to highlight matches. When enabled, each hit returned on the search results page includes two URLs. One is a link to the actual page where the keyword is found, the other is identified as (highlight matches), which is a link to the page via a proxy that adds keyword highlighting in the results page. A problem with the result pages that include the highlighting is that the images in the navigation bar are not resolved, and the pages appear to be broken. If you do not want to include highlighted matches in your search results, set $HIGHLIGHT_MATCHES = 0; in the conf.pl file.
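If you prefer to make this change from the command line, a sketch such as the following may help. It assumes the setting sits on its own line in conf.pl in the form $HIGHLIGHT_MATCHES = <value>; (check your file first), and the function name disable_highlight is hypothetical. A backup copy is written next to the original before anything is changed.

```shell
#!/bin/sh
# Set $HIGHLIGHT_MATCHES to 0 in the given conf.pl, keeping a .bak copy.
# Assumes the setting starts a line: "$HIGHLIGHT_MATCHES = <value>;".
disable_highlight() {
    conf=$1
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the assignment; text after the semicolon is preserved.
    sed 's/^\$HIGHLIGHT_MATCHES[[:space:]]*=[^;]*;/$HIGHLIGHT_MATCHES = 0;/' \
        "$conf.bak" > "$conf"
}

# Example call:
#   disable_highlight cgi-bin/perlfect/search/conf.pl
```

If the result is not what you expect, restore the original by copying conf.pl.bak back over conf.pl.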

Adding Search to Your Web Site

Once you complete the installation and index your site, your next step is to validate that the search engine functions correctly. If you installed Perlfect Search in cgi-bin/perlfect/search, enter the following address in your browser:

    http://myIntranet/cgi-bin/perlfect/search/search.pl

Replace myIntranet with your site domain name. If your CGI directory uses a different name, or is in a different location, adjust that part of the URL as well. Your browser should return a page that includes a search text box and the Perlfect Solutions copyright notice.

If you know a term that appears in the pages you indexed, enter that term and click Search. A page with a ranked list of search hits is returned.

Note: If searching for a term produces an error, refer to the README file in the Perlfect Search distribution directory, or refer to the Perlfect Search FAQs on the Perlfect Solutions Web site.

If you want to add a search text box to your Web site, you can enter the following HTML code in a text file, and save it as searchform.html.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head><title>OpenAccess Search</title>
<base target="main">
</head>
<body>
<form method="get" action="/cgi-bin/perlfect/search/search.pl">
	<input type="hidden" name="p" value="1">
	<input type="hidden" name="lang" value="en">
	<input type="hidden" name="include" value="">
	<input type="hidden" name="exclude" value="">
	<input type="hidden" name="penalty" value="0">
	<select name="mode">
	  <option value="all">Match ALL words</option>
	  <option value="any">Match ANY word</option>
	</select>
	<input type="text" name="q"><input type="submit" value="Search">
</form>
<p>Choose <em>Match ALL words</em> to find pages containing <em>all</em> the specified words
(though not necessarily together in the same string).<br>
Prefix a word with + to find only those pages that include the word.<br>
Prefix a word with - to find only those pages that do not include the word.</p>
</body>
</html>

The form input variables affect search results and serve the following purposes:

For additional details about the form variables, refer to the README file in the Perlfect Search distribution directory.
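Because the form uses the GET method, the same variables appear as query parameters in the URL that the browser requests. The sketch below builds such a URL from the shell, which can be handy for a quick test. It is an illustration only: the function name search_query_url is hypothetical, the parameter names (p, lang, mode, q) are taken from the form listed above, and only spaces are encoded, so it suits plain words rather than terms prefixed with + or -.

```shell
#!/bin/sh
# Build a search.pl query URL from a base URL, a match mode, and search words.
# Only spaces are encoded (as +); other characters are passed through as-is.
search_query_url() {
    base=$1
    mode=$2
    shift 2
    # Join the remaining arguments with spaces, then turn spaces into +.
    terms=$(printf '%s' "$*" | tr ' ' '+')
    printf '%s?p=1&lang=en&mode=%s&q=%s\n' "$base" "$mode" "$terms"
}

# Example call:
#   search_query_url http://myIntranet/cgi-bin/perlfect/search/search.pl all OpenAccess database
```

You can paste the resulting URL into a browser, or fetch it with a command-line HTTP client if one is installed on your system.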

Enhancing Your Search Implementation

There are a variety of ways to configure your Web site to include search. One approach uses a frameset to encapsulate the OpenAccess pages, which avoids modifying each page on your site. Framesets can have disadvantages, depending on how you implement them, so treat the following example as a basic framework that demonstrates one way to incorporate search.

The following HTML defines a two-row frameset with the search text box in the top frame and the starting page and search results displayed in the bottom frame. Adjust the URLs for your site, and save this in a text file as search.html. When this file and the searchform.html file listed above are included in the same directory on your Web site, you can open search.html in your browser, enter a search word or phrase in the top frame, and view search results in the bottom frame—all from a single browser window.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
<title>Perlfect Search Frameset Example</title>
</head>
<!-- frames -->
<frameset rows="27%,73%">
	<frame name="search" src="searchform.html" marginwidth="10" marginheight="5" scrolling="auto" frameborder="1">
	<frame name="main" src="http://myIntranet/doc/oa/html/index.html" marginwidth="10" marginheight="10" scrolling="auto" frameborder="0">
	<noframes>
	<p>This frameset document contains:</p>
	<ul>
	  <li><a href="http://myIntranet/cgi-bin/perlfect/search/search.pl">Top frame: search form</a></li>
	  <li><a href="http://myIntranet/doc/oa/html/index.html">Bottom frame: site documents and search results</a></li>
	</ul>
	</noframes>
</frameset>
</html>

Perlfect Search has provisions for limiting the files and directories that are searched and for further customization of the search solution. Be sure to read the README file in your Perlfect Search distribution directory for details, and visit the Perlfect Solutions Web site for FAQs, detailed installation and configuration instructions, and troubleshooting tips.





Copyright © 2010 Cadence Design Systems, Inc.
All rights reserved.