Common Questions
Does the Blossom Search output ever contain advertisements?

Never!

Can I search with Boolean expressions? What about regular expressions?

Simplicity of usage has been the guiding principle for Blossom Search. Most people don't know Boolean from Bully Inn and would assume "Hello, how are you?" to be as regular an expression as any.

Accordingly, Blossom Search accepts neither Boolean nor pattern operators, but that is not to say that the searches it performs are simple-minded. The default phrase search would be entered on some search engines by inserting AND NEAR between the words in the phrase. Thus a phrase will match a Web page if all the words being searched for are located within about 20 words of one another on the page. The default word search would be entered as word* in some search engines. That is, a search string matches any word that begins with the string. (You can remove the NEAR operator or force whole-word matches, as described in the Search Guide Section "Search Options".)

Is there a help file for using Blossom Search?

The document https://BlossomSoftware.net/search_help.html contains tips on searching. You can add a link directly to that page or copy the text to your own help page.

How does Blossom Search treat titles, descriptions, and keywords?

When Blossom Search indexes a site, it remembers where in a Web page each word appeared. In the search results, pages are listed in order of a weighted score for the page. Each search hit increases the score of a page. Words that appear in keyword lists, descriptions, titles, and headers add more to the score than words that appear in the ordinary text of a page. (Please see the Search Guide Section "Page Order" for ways to control the order of the listings in the results lists.)

If a page has a description, it will be output after the page title and before the search hits.

How do I search by category?

Any subdirectory of a site can be turned into a category.
For example, perhaps your site has a bulletin board that you would like to index separately. If the URLs for the bulletin board have a distinctive prefix, then you can create distinct indexes. You will be given a different Blossom Search ID for each index, allowing you to control when each index is searched (see the Search Guide Section "Searching Multiple Indexes").

In addition, Blossom Search can organize the search results by category. You can create up to 100 categories and assign pages explicitly to a category using the Blossom "category" command. (Please see the Search Guide Section "Search Categories" for details.)

Will Blossom Search index Adobe PDF files? What about MS/Word and WordPerfect files?

Blossom Search will index PDF and word processing files, but only if you ask it to. See the Search Guide Section "Indexing Options" for details on how to ask.

What happens when I delete a document from my site?

The spider tests each page in the search index to see if it has changed or been removed. When the spider requests information about a page that has been removed, your Web server will report that the page isn't found (a 404 status code in HTTP-speak) and the page will be removed from the index. Note that just removing the links to a page won't remove the page from your index; you must also remove the file.

Some Web servers are set up to redirect requests for missing pages to a site index or some other page. As a result, the Web server may return success (a 200 status code) for requests of deleted pages. In this case the page may not be removed from the index, but its contents will reflect the redirection. If more than one page has been deleted on your site and all are redirected to the same page, then all but one will be deleted from the index when our indexer checks for duplicates.

Does Blossom's spider follow the Robot Exclusion commands?

Yes. The spider follows instructions in "robots.txt", if it exists, as well as commands in any "robots" meta tag.
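As a rough illustration of how robots.txt rules apply to one named crawler, the sketch below feeds a small rule set to Python's standard urllib.robotparser and asks what a given agent may fetch. The rules, paths, and example.com URLs are hypothetical, and the code shows how a standards-following parser reads the file, not how Blossom's spider is implemented; in particular, how the spider matches its agent name is an assumption here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the agent named "Blossom" out of /private/
# and leave every other crawler unrestricted (an empty Disallow allows all).
ROBOTS_TXT = """\
User-agent: Blossom
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, so no network access is needed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    urls = (
        "https://example.com/index.html",
        "https://example.com/private/notes.html",
    )
    for agent in ("Blossom", "OtherBot"):
        for url in urls:
            verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
            print(f"{agent:8s} {url} -> {verdict}")
```

Running a check like this against your own robots.txt before the next spidering pass is a quick way to confirm that a rule blocks only the agent you intended.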
To control the Blossom spider specifically, use "Blossom" as the agent name. For more information about Robot Exclusion, please see Wikipedia.

As an alternative to robot exclusion, Blossom's spider also looks for special comments to control spidering and indexing. The comments allow more flexibility since they can be turned on and off inside a document. See the Search Guide Sections on "Files and Directories" and "Headers and Footers" for more details.

How are accented letters handled?

Letters with accent marks (e.g., à, ñ, and ü) are treated as though the accent were not there. Thus, when searching, a letter without an accent will match the same letter with any accent. That is, an "a" will match an "à". Similarly, if an accented letter is entered in a search form, it will match the same letter unaccented.

How do I index a password-protected site?

If your site uses Basic Authentication, you can establish a user name and password for the spider to use when it visits the site. To set the user name and password, log on to the search configuration page for your index and follow the "Change Index Settings" link.

For other authorization schemes, please contact Blossom Support for assistance.

How do I set the date for dynamic pages?

By default, dynamic pages will always have today's date, since they were literally created today. This interferes with the order of search results, particularly when the age order is specified. To give an HTML page a different date, put an HTTP-EQUIV="Last-Modified" meta tag into the HTML output. It should look something like this:

<meta http-equiv="Last-Modified" content="Fri, 06 Feb 2008 01:00:00 GMT">

Mercifully, the "day of week" does not need to be correct.

For PDF files the technique is slightly different because there are no meta tags. First, if the "Modification Date" or the "Creation Date" properties of the document are correct, then the index will use those instead of the date from the Web server.
If those dates are absent or not correct, you can put the correct date, using the format shown above, as the first (or only) entry in the "keywords" field for the PDF file. (Please note that in order for a "keyword" to be interpreted as a date, the format must adhere strictly to the Internet standard as shown above, although the actual values for Day of Week and the time are not important. That is, the complete date must be exactly 29 characters, the Day of Week and month must each be exactly three letters, and the Day of Month and time components must contain exactly two digits.)
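If your pages are generated by a script, the date string can be produced and checked programmatically. The following is a minimal Python sketch, assuming the rules stated above (29 characters, three-letter day and month, two-digit day and time fields, GMT). The function names are hypothetical, and the pattern is only an approximation of what the indexer accepts, not its actual parser.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# Approximates the layout described above:
# "Ddd, DD Mmm YYYY HH:MM:SS GMT" (29 characters in total).
DATE_PATTERN = re.compile(
    r"[A-Za-z]{3}, \d{2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2} GMT"
)


def last_modified_value(moment: datetime) -> str:
    """Format an aware datetime as an Internet-standard GMT date string."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def looks_like_index_date(value: str) -> bool:
    """Check a string against the length and field-width rules stated above."""
    return len(value) == 29 and DATE_PATTERN.fullmatch(value) is not None


def last_modified_meta(moment: datetime) -> str:
    """Build the meta tag for an HTML page (hypothetical helper)."""
    return (
        '<meta http-equiv="Last-Modified" '
        f'content="{last_modified_value(moment)}">'
    )


if __name__ == "__main__":
    edited = datetime(2008, 2, 6, 1, 0, 0, tzinfo=timezone.utc)
    print(last_modified_meta(edited))
    # A single-digit day breaks the two-digit rule, so this prints False.
    print(looks_like_index_date("Fri, 6 Feb 2008 01:00:00 GMT"))
```

The same string returned by last_modified_value could be pasted into the "keywords" field of a PDF file, since the text above says both places use one format.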