HTML Filter Defaults - Full-Text Retrieval (FTR) - Help

The HTML filter indexes all text in an HTML file except HTML tags and comments. In addition, by default it populates the following reserved fields in FTR collections.

  • The HTML filter derives the title from the head of the document and assigns it to the FT_DNAME reserved field.

  • Formatted text is assigned to the FT_KEYWORDS reserved field. This includes text between the following HTML tags:

    • bold tags: <B> and </B>

    • italics tags: <I> and </I>

    • strong emphasis tags: <STRONG> and </STRONG>

    • emphasis tags: <EM> and </EM>

    • teletype tags: <TT> and </TT>

  • The contents of META tags with the NAME=SUBJECT attribute are assigned to the FT_SUBJECT reserved field. If no such tag exists, the <TITLE> information is written to FT_SUBJECT instead.

  • The document format ID 3023 is written to the FT_FORMAT reserved field.

  • The length of the document, in bytes, is written to the FT_ORIGINAL_SIZE reserved field.
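
The default mappings listed above can be pictured with a short Python sketch. This is not the FTR filter or any FTR API; it only imitates the documented behavior using Python's standard html.parser module. The names DefaultFieldSketch and sketch_default_fields are hypothetical, and reading the subject text from the META tag's CONTENT attribute is an assumption, as is collecting FT_KEYWORDS as a list of text runs.

```python
from html.parser import HTMLParser

# Tags whose enclosed text goes to FT_KEYWORDS (html.parser lowercases tag names).
FORMAT_TAGS = {"b", "i", "strong", "em", "tt"}
# Document format ID from the list above (shown there with a thousands separator).
HTML_FORMAT_ID = 3023


class DefaultFieldSketch(HTMLParser):
    """Hypothetical helper that collects the pieces the defaults rely on."""

    def __init__(self):
        super().__init__()
        self.title = []
        self.keywords = []
        self.subject = None
        self._in_title = False
        self._format_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in FORMAT_TAGS:
            self._format_depth += 1
        elif tag == "meta":
            attr = dict(attrs)
            # Assumption: the subject text lives in the CONTENT attribute.
            if (attr.get("name") or "").lower() == "subject":
                self.subject = attr.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in FORMAT_TAGS and self._format_depth > 0:
            self._format_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title.append(data)
        elif self._format_depth > 0 and data.strip():
            self.keywords.append(data.strip())


def sketch_default_fields(raw: bytes) -> dict:
    """Return a dict keyed by reserved field name for one HTML document."""
    parser = DefaultFieldSketch()
    parser.feed(raw.decode("utf-8", errors="replace"))
    parser.close()
    title = "".join(parser.title).strip()
    return {
        "FT_DNAME": title,
        "FT_KEYWORDS": parser.keywords,
        # Fall back to the title when no NAME=SUBJECT META tag was seen.
        "FT_SUBJECT": parser.subject if parser.subject is not None else title,
        "FT_FORMAT": HTML_FORMAT_ID,
        # Length of the original document in bytes.
        "FT_ORIGINAL_SIZE": len(raw),
    }
```

Calling sketch_default_fields on the raw bytes of a page returns one entry per reserved field, which makes the FT_SUBJECT fallback easy to see: a page without a NAME=SUBJECT META tag ends up with the same value in FT_DNAME and FT_SUBJECT.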