HumHub Documentation (unofficial)

Utf8Num extends AbstractCommon
in package

AbstractCommon implementation of the analyzer functionality.

Tags
category : Zend
subpackage : Analysis

Table of Contents

Properties

$_encoding  : string
Input string encoding
$_input  : string
Input string
$_bytePosition  : int
Current binary (byte) position in a UTF-8 stream
$_filters  : array<string|int, mixed>
The set of Token filters applied to the Token stream.
$_position  : int
Current character position in a UTF-8 stream

Methods

__construct()  : mixed
Object constructor
addFilter()  : void
Add Token filter to the AnalyzerInterface
nextToken()  : Token|null
Tokenization stream API. Gets the next token; returns null at the end of the stream.
normalize()  : Token
Apply filters to the token. Can return null when the token was removed.
reset()  : void
Reset token stream
setInput()  : void
Tokenization stream API. Sets the input.
tokenize()  : array<string|int, mixed>
Tokenizes text into terms. Returns an array of \ZendSearch\Lucene\Analysis\Token objects.

Properties

$_bytePosition

Current binary (byte) position in a UTF-8 stream

private int $_bytePosition

$_filters

The set of Token filters applied to the Token stream.

private array<string|int, mixed> $_filters = array()

Array of \ZendSearch\Lucene\Analysis\TokenFilter\TokenFilterInterface objects.

$_position

Current character position in a UTF-8 stream

private int $_position
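
The two counters above differ as soon as the input contains multi-byte characters. The following is a minimal sketch of that difference, not the library's code: utf8Length() is a hypothetical helper introduced only for this illustration, and the sample string is made up.

```php
<?php
// Hypothetical helper (not part of the library): counts UTF-8 code points
// with PCRE only, so neither mbstring nor iconv is required.
function utf8Length(string $s): int
{
    return (int) preg_match_all('/./su', $s);
}

// "naïve café 42", written with escapes so the bytes are unambiguous.
$input = "na\u{00EF}ve caf\u{00E9} 42";

// strpos() works on bytes, so it yields the binary position of "caf".
$byteOffset = strpos($input, 'caf');

// Counting code points in the prefix yields the character position.
$charOffset = utf8Length(substr($input, 0, $byteOffset));

echo "byte offset: $byteOffset, char offset: $charOffset\n";
// prints "byte offset: 7, char offset: 6"
```

The byte offset is what a byte-oriented function such as preg_match() needs in order to resume scanning, while the character offset is the kind of value a token would report as its position in the text.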

Methods

nextToken()

Tokenization stream API. Gets the next token; returns null at the end of the stream.

public nextToken() : Token|null
Return values
Token|null

normalize()

Apply filters to the token. Can return null when the token was removed.

public normalize(Token $token) : Token
Parameters
$token : Token
Return values
Token
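
The description says a token can be removed, so callers should probably be prepared for a null result even though the declared return type is Token. Below is a rough sketch of how such a filter chain might behave. It is an assumption for illustration only: normalizeTerm() and the closures are hypothetical stand-ins for the analyzer and its token filter objects, and plain strings stand in for Token instances.

```php
<?php
// Hypothetical sketch; NOT the library's implementation.
// Assumption: each filter returns the (possibly modified) term,
// or null to drop the term from the stream.
function normalizeTerm(string $term, array $filters): ?string
{
    foreach ($filters as $filter) {
        $term = $filter($term);
        if ($term === null) {
            return null; // a filter removed the token; skip the rest
        }
    }
    return $term;
}

$filters = [
    // Lower-case the term (ASCII letters only in this sketch).
    fn (string $t): ?string => strtolower($t),
    // Drop terms found in a small, made-up stop-word list.
    fn (string $t): ?string => in_array($t, ['the', 'a', 'of'], true) ? null : $t,
];

var_dump(normalizeTerm('Lucene', $filters)); // string(6) "lucene"
var_dump(normalizeTerm('The', $filters));    // NULL
```

Filter order matters in this sketch: the stop-word check only catches "The" because the lower-casing filter runs first.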

reset()

Reset token stream

public reset() : void

setInput()

Tokenization stream API. Sets the input.

public setInput(string $data[, mixed $encoding = '' ]) : void
Parameters
$data : string
$encoding : mixed = ''

tokenize()

Tokenizes text into terms. Returns an array of \ZendSearch\Lucene\Analysis\Token objects.

public tokenize(string $data[, mixed $encoding = '' ]) : array<string|int, mixed>

Tokens are returned in UTF-8 (internal Zend_Search_Lucene encoding)

Parameters
$data : string
$encoding : mixed = ''
Return values
array<string|int, mixed>
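
The stream methods (setInput(), reset(), nextToken()) and tokenize() can be read as two views of the same scan: a pull loop, and a convenience wrapper that drains it. The sketch below illustrates that relationship with a self-contained stand-in. SketchTokenStream is hypothetical and is not the library class; it returns plain strings rather than Token objects, ignores the $encoding argument, and assumes that a token is a maximal run of Unicode letters and digits, which the class name suggests but this page does not state.

```php
<?php
// Hypothetical stand-in for illustration only; NOT the library's class.
// Assumption: a token is a maximal run of Unicode letters and digits.
final class SketchTokenStream
{
    private string $input = '';
    private int $bytePosition = 0;

    public function setInput(string $data): void
    {
        $this->input = $data;
        $this->reset();
    }

    public function reset(): void
    {
        $this->bytePosition = 0;
    }

    /** Returns the next term text, or null at the end of the stream. */
    public function nextToken(): ?string
    {
        $found = preg_match('/[\p{L}\p{N}]+/u', $this->input, $m,
                            PREG_OFFSET_CAPTURE, $this->bytePosition);
        if ($found !== 1) {
            return null; // no further match, or a PCRE error
        }
        // Resume scanning right after the matched bytes.
        $this->bytePosition = $m[0][1] + strlen($m[0][0]);
        return $m[0][0];
    }

    /** Convenience wrapper: set the input, then drain the stream. */
    public function tokenize(string $data): array
    {
        $this->setInput($data);
        $tokens = [];
        while (($token = $this->nextToken()) !== null) {
            $tokens[] = $token;
        }
        return $tokens;
    }
}

$stream = new SketchTokenStream();
$terms = $stream->tokenize("HumHub 1.15, caf\u{00E9} no. 42!");
print_r($terms); // terms: HumHub, 1, 15, café, no, 42
```

Under this assumption "1.15" splits into two terms, because the dot is neither a letter nor a digit. After the stream is drained, nextToken() keeps returning null until reset() or setInput() is called again.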

        