Token
Table of Contents
Properties
- $_endOffset : int
- End in source text.
- $_positionIncrement : int
- The position of this token relative to the previous Token.
- $_startOffset : int
- Start in source text.
- $_termText : string
- The text of the term.
Methods
- __construct() : mixed
- Object constructor
- getEndOffset() : int
- Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text.
- getPositionIncrement() : int
- Returns the position increment of this Token.
- getStartOffset() : int
- Returns this Token's starting offset, the position of the first character corresponding to this token in the source text.
- getTermText() : string
- Returns the Token's term text.
- setPositionIncrement() : void
- Sets the position increment of this Token.
Properties
$_endOffset
End in source text.
private
int
$_endOffset
$_positionIncrement
The position of this token relative to the previous Token.
private
int
$_positionIncrement
The default value is one.
Some common uses for this are:
- Set it to zero to put multiple terms in the same position. This is useful if, e.g., a word has multiple stems. Searches for phrases including either stem will match. In this case, all but the first stem's increment should be set to zero; the increment of the first instance should be one. Repeating a token with an increment of zero can also be used to boost the scores of matches on that token.
- Set it to values greater than one to inhibit exact phrase matches. If, for example, one does not want phrases to match across removed stop words, then one could build a stop word filter that removes stop words and also sets the increment to the number of stop words removed before each non-stop word. Then exact phrase queries will only match when the terms occur with no intervening stop words.
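The two uses above can be illustrated with a minimal sketch. To stay runnable without the library, it uses `SketchToken`, a hypothetical stand-in that mirrors only the documented constructor and position-increment accessors; it is not the library's source. The helper `termPositions()`, the synonym "fast", and the choice of an increment of 2 to leave a gap for one removed stop word are likewise assumptions made for illustration.

```php
<?php
// Hypothetical stand-in mirroring the documented Token API (not the library class).
class SketchToken
{
    private $_termText;
    private $_startOffset;
    private $_endOffset;
    private $_positionIncrement = 1; // documented default value is one

    public function __construct($text, $start, $end)
    {
        $this->_termText    = $text;
        $this->_startOffset = $start;
        $this->_endOffset   = $end;
    }

    public function getTermText()
    {
        return $this->_termText;
    }

    public function getPositionIncrement()
    {
        return $this->_positionIncrement;
    }

    public function setPositionIncrement($positionIncrement)
    {
        $this->_positionIncrement = $positionIncrement;
    }
}

// Hypothetical helper: turns relative increments into absolute term positions.
function termPositions(array $tokens)
{
    $positions = array();
    $position  = 0;
    foreach ($tokens as $token) {
        $position   += $token->getPositionIncrement();
        $positions[] = $token->getTermText() . '@' . $position;
    }
    return $positions;
}

// Source text: "the quick fox", with "the" removed as a stop word.
$quick = new SketchToken('quick', 4, 9);
$quick->setPositionIncrement(2); // assumption: greater than one, leaving a gap for "the"

$fast = new SketchToken('fast', 4, 9); // assumed synonym of "quick"
$fast->setPositionIncrement(0);        // zero: same position as "quick"

$fox = new SketchToken('fox', 10, 13); // keeps the default increment of one

$positions = termPositions(array($quick, $fast, $fox));
echo implode(' ', $positions), "\n"; // prints: quick@2 fast@2 fox@3
```

Here "quick" and "fast" share position 2, so a phrase query for either followed by "fox" would line up, while the gap before them keeps a phrase from matching across the removed stop word.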
$_startOffset
Start in source text.
private
int
$_startOffset
$_termText
The text of the term.
private
string
$_termText
Methods
__construct()
Object constructor
public
__construct(string $text, int $start, int $end) : mixed
Parameters
- $text : string
- $start : int
- $end : int
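As a rough sketch of where the three constructor arguments might come from, the following splits a string into words and computes a `($text, $start, $end)` triple for each. The function `tokenArguments()`, the letter-only pattern, and the lowercasing step are assumptions for illustration, not the library's tokenizer. The end value follows the convention described for `getEndOffset()`: one greater than the position of the last character.

```php
<?php
// Hypothetical helper: returns one array($text, $start, $end) per word in $source.
function tokenArguments($source)
{
    $arguments = array();
    // PREG_OFFSET_CAPTURE makes each match an array(matched text, byte offset).
    preg_match_all('/[a-zA-Z]+/', $source, $matches, PREG_OFFSET_CAPTURE);
    foreach ($matches[0] as $match) {
        list($word, $start) = $match;
        $end = $start + strlen($word); // one past the last character of the word
        $arguments[] = array(strtolower($word), $start, $end);
    }
    return $arguments;
}

$arguments = tokenArguments('Hello, World');

foreach ($arguments as $triple) {
    list($text, $start, $end) = $triple;
    // With the library loaded, each triple could be passed on as:
    // $token = new Zend_Search_Lucene_Analysis_Token($text, $start, $end);
    echo $text, ' [', $start, ', ', $end, ")\n";
}
```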
getEndOffset()
Returns this Token's ending offset, one greater than the position of the last character corresponding to this token in the source text.
public
getEndOffset() : int
Return values
int
getPositionIncrement()
Returns the position increment of this Token.
public
getPositionIncrement() : int
Return values
int
getStartOffset()
Returns this Token's starting offset, the position of the first character corresponding to this token in the source text.
public
getStartOffset() : int
Note: The difference between getEndOffset() and getStartOffset() may not be equal to strlen(Zend_Search_Lucene_Analysis_Token::getTermText()), as the term text may have been altered by a stemmer or some other filter.
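The note can be made concrete with a short sketch using plain values instead of a Token object. The source string and the stemmed form "run" are assumptions chosen for illustration; no particular stemmer is implied.

```php
<?php
$source = 'Running shoes';

// Values a token for the first word might carry after stemming (assumed).
$termText    = 'run';
$startOffset = 0; // position of "R"
$endOffset   = 7; // one greater than the position of the final "g" (index 6)

// The offsets recover the original word from the source text...
$original = substr($source, $startOffset, $endOffset - $startOffset);

// ...while the term text has a different length than the offset span.
$spanLength = $endOffset - $startOffset;
$termLength = strlen($termText);

echo $original, ' span=', $spanLength, ' term=', $termLength, "\n";
```

Code that highlights matches in the source text should therefore rely on the offsets rather than on the length of the term text.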
Return values
int
getTermText()
Returns the Token's term text.
public
getTermText() : string
Return values
string
setPositionIncrement()
Sets the position increment of this Token.
public
setPositionIncrement(int $positionIncrement) : void
Parameters
- $positionIncrement : int