HumHub Documentation (unofficial)

MultiSearcher
in package
implements SearchIndexInterface

MultiSearcher allows searching across several independent indexes.

Tags
category

Zend

Table of Contents

Interfaces

SearchIndexInterface

Properties

$_documentDistributorCallBack  : callable
Callback used to choose target index for new documents
$_indices  : array<string|int, mixed>
List of indices for searching.
$_termsStream  : TermStreamsPriorityQueue
Terms stream priority queue object

Methods

__construct()  : mixed
Object constructor.
addDocument()  : void
Adds a document to this index.
addIndex()  : void
Add index for searching.
closeTermsStream()  : void
Close terms stream
commit()  : void
Commit changes resulting from delete() or undeleteAll() operations.
count()  : int
Returns the total number of documents in this index (including deleted documents).
currentTerm()  : Term|null
Returns term in current position
delete()  : void
Deletes a document from the index.
docFreq()  : int
Returns the number of documents in this index containing the $term.
find()  : array<string|int, mixed>|QueryHit
Performs a query against the index and returns an array of Zend_Search_Lucene_Search_QueryHit objects.
getActualGeneration()  : int
Get current generation number
getDirectory()  : DirectoryInterface
Returns the Zend_Search_Lucene_Storage_Directory instance for this index.
getDocument()  : Document
Returns a Zend_Search_Lucene_Document object for the document number $id in this index.
getDocumentDistributorCallback()  : callable
Get callback for choosing target index.
getFieldNames()  : array<string|int, mixed>
Returns a list of all unique field names that exist in this index.
getFormatVersion()  : int
Get index format version
getMaxBufferedDocs()  : int
Retrieve index maxBufferedDocs option
getMaxMergeDocs()  : int
Retrieve index maxMergeDocs option
getMergeFactor()  : int
Retrieve index mergeFactor option
getSegmentFileName()  : string
Get segments file name
getSimilarity()  : AbstractSimilarity
Retrieve the similarity used by the index reader
hasDeletions()  : bool
Returns true if any documents have been deleted from this index.
hasTerm()  : bool
Returns true if the index contains documents with the specified term.
isDeleted()  : bool
Checks whether a document is deleted
maxDoc()  : int
Returns one greater than the largest possible document number.
nextTerm()  : Term|null
Scans terms dictionary and returns next term
norm()  : float
Returns a normalization factor for "field, document" pair.
numDocs()  : int
Returns the total number of non-deleted documents in this index.
optimize()  : void
Optimize index.
resetTermsStream()  : void
Reset terms stream.
setDocumentDistributorCallback()  : void
Set callback for choosing target index.
setFormatVersion()  : void
Set index format version.
setMaxBufferedDocs()  : void
Set index maxBufferedDocs option
setMaxMergeDocs()  : void
Set index maxMergeDocs option
setMergeFactor()  : void
Set index mergeFactor option
skipTo()  : void
Skip the terms stream up to the specified term prefix.
termDocs()  : array<string|int, mixed>
Returns IDs of all the documents containing term.
termDocsFilter()  : DocsFilter
Returns documents filter for all documents containing term.
termFreqs()  : int
Returns an array of all term freqs.
termPositions()  : array<string|int, mixed>
Returns an array of all term positions in the documents.
terms()  : array<string|int, mixed>
Returns an array of all terms in this index.
undeleteAll()  : void
Undeletes all documents currently marked as deleted in this index.

Properties

$_documentDistributorCallBack

Callback used to choose target index for new documents

protected callable $_documentDistributorCallBack = null

Function/method signature: Zend_Search_Lucene_Interface callbackFunction(Zend_Search_Lucene_Document $document, array $indices);

null means "default documents distributing algorithm"
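
A minimal sketch of a distributor, assuming documents carry some stable key to route on. The helper name chooseIndexByKey, the sample index labels, and the "id" field are hypothetical; type hints from the signature above are left out so the sketch runs without the search library.

```php
<?php
// Hypothetical helper: picks one entry of $indices from a stable hash of $key,
// so the same key is always routed to the same index.
function chooseIndexByKey(string $key, array $indices)
{
    if (count($indices) === 0) {
        throw new InvalidArgumentException('No indices to choose from.');
    }
    $keys = array_keys($indices);
    // abs() guards against negative crc32() results on 32-bit PHP builds.
    $slot = abs(crc32($key)) % count($keys);
    return $indices[$keys[$slot]];
}

// Assumed wiring (not executed here); it presumes every document has an
// "id" field, which is an assumption of this sketch:
// $searcher->setDocumentDistributorCallback(
//     function ($document, array $indices) {
//         return chooseIndexByKey((string) $document->getFieldValue('id'), $indices);
//     }
// );

// Plain strings stand in for index objects.
$indices = ['posts' => 'index A', 'users' => 'index B'];
$target = chooseIndexByKey('doc-42', $indices);
```

Hashing keeps the routing deterministic, which matters if a document is later re-added under the same key.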

$_indices

List of indices for searching.

protected array<string|int, mixed> $_indices

Array of Zend_Search_Lucene_Interface objects

Methods

__construct()

Object constructor.

public __construct([array<string|int, mixed> $indices = array() ]) : mixed
Parameters
$indices : array<string|int, mixed> = array()

Array of indices to search

Tags
throws
InvalidArgumentException

closeTermsStream()

Close terms stream

public closeTermsStream() : void

Use this to release resources if the stream is not read to the end.

commit()

Commit changes resulting from delete() or undeleteAll() operations.

public commit() : void

count()

Returns the total number of documents in this index (including deleted documents).

public count() : int
Return values
int

currentTerm()

Returns term in current position

public currentTerm() : Term|null
Return values
Term|null

docFreq()

Returns the number of documents in this index containing the $term.

public docFreq(Term $term) : int
Parameters
$term : Term
Return values
int

find()

Performs a query against the index and returns an array of Zend_Search_Lucene_Search_QueryHit objects.

public find(mixed $query) : array<string|int, mixed>|QueryHit

Input is a string or Zend_Search_Lucene_Search_Query.

Parameters
$query : mixed
Return values
array<string|int, mixed>|QueryHit
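
One plausible way to combine results from several indexes is to run the query against each index and shift the returned document ids by the number of documents in the preceding indexes. The sketch below illustrates that idea only; it is an assumption about the approach, not a statement of what this class does. The array keys 'count' and 'hits' are hypothetical stand-ins for count() and find().

```php
<?php
// Each stand-in index is an array: 'count' mimics count(), 'hits' mimics the
// local document ids a per-index find() would return.
function mergeHits(array $indices): array
{
    $merged = [];
    $shift = 0; // number of documents in all preceding indices
    foreach ($indices as $index) {
        foreach ($index['hits'] as $localId) {
            $merged[] = $localId + $shift;
        }
        $shift += $index['count'];
    }
    return $merged;
}

$globalIds = mergeHits([
    ['count' => 3, 'hits' => [0, 2]],
    ['count' => 5, 'hits' => [1]],
]);
// $globalIds is [0, 2, 4]
```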

getDocumentDistributorCallback()

Get callback for choosing target index.

public getDocumentDistributorCallback() : callable
Return values
callable

getFieldNames()

Returns a list of all unique field names that exist in this index.

public getFieldNames([bool $indexed = false ]) : array<string|int, mixed>
Parameters
$indexed : bool = false
Return values
array<string|int, mixed>

getMaxBufferedDocs()

Retrieve index maxBufferedDocs option

public getMaxBufferedDocs() : int

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new segment.

Default value is 10

Tags
throws
RuntimeException
Return values
int

getMaxMergeDocs()

Retrieve index maxMergeDocs option

public getMaxMergeDocs() : int

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Tags
throws
RuntimeException
Return values
int

getMergeFactor()

Retrieve index mergeFactor option

public getMergeFactor() : int

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Tags
throws
RuntimeException
Return values
int

getSegmentFileName()

Get segments file name

public static getSegmentFileName(int $generation) : string
Parameters
$generation : int
Return values
string

hasDeletions()

Returns true if any documents have been deleted from this index.

public hasDeletions() : bool
Return values
bool

hasTerm()

Returns true if the index contains documents with the specified term.

public hasTerm(Term $term) : bool

Is used for query optimization.

Parameters
$term : Term
Return values
bool

isDeleted()

Checks whether a document is deleted

public isDeleted(int $id) : bool
Parameters
$id : int
Tags
throws
OutOfRangeException

Thrown if $id is out of range.

Return values
bool

maxDoc()

Returns one greater than the largest possible document number.

public maxDoc() : int

This may be used to, e.g., determine how big to allocate a structure which will have an element for every document number in an index.

Return values
int
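
As an illustration of the sizing use case, the sketch below allocates one flag per document number. The value 6 and the sample ids are made up; in real code $maxDoc would come from maxDoc().

```php
<?php
// $maxDoc stands in for the value returned by maxDoc().
$maxDoc = 6;

// One slot per possible document number: 0 .. $maxDoc - 1.
$seen = array_fill(0, $maxDoc, false);

// Mark some hypothetical document ids as visited.
foreach ([1, 4] as $docId) {
    $seen[$docId] = true;
}
```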

nextTerm()

Scans terms dictionary and returns next term

public nextTerm() : Term|null
Return values
Term|null

norm()

Returns a normalization factor for "field, document" pair.

public norm(int $id, string $fieldName) : float
Parameters
$id : int
$fieldName : string
Return values
float

numDocs()

Returns the total number of non-deleted documents in this index.

public numDocs() : int
Return values
int

optimize()

Optimize index.

public optimize() : void

Merges all segments into one

resetTermsStream()

Reset terms stream.

public resetTermsStream() : void

setDocumentDistributorCallback()

Set callback for choosing target index.

public setDocumentDistributorCallback(callable $callback) : void
Parameters
$callback : callable
Tags
throws
InvalidArgumentException

setFormatVersion()

Set index format version.

public setFormatVersion(int $formatVersion) : void

The index is converted to this format at the next update.

Parameters
$formatVersion : int

setMaxBufferedDocs()

Set index maxBufferedDocs option

public setMaxBufferedDocs(int $maxBufferedDocs) : void

maxBufferedDocs is the minimum number of documents required before the buffered in-memory documents are written into a new segment.

Default value is 10

Parameters
$maxBufferedDocs : int
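
The threshold described above can be pictured with a toy in-memory writer. The class BufferingWriterSketch and its members are hypothetical and exist only for this illustration; it models the buffering rule, not the library's segment format or merge behaviour.

```php
<?php
// Toy writer: documents accumulate in a buffer and are moved into a new
// "segment" once the buffer holds maxBufferedDocs documents.
final class BufferingWriterSketch
{
    private $maxBufferedDocs;
    private $buffer = [];
    public $segments = [];

    public function __construct(int $maxBufferedDocs = 10)
    {
        $this->maxBufferedDocs = $maxBufferedDocs;
    }

    public function addDocument(string $doc): void
    {
        $this->buffer[] = $doc;
        if (count($this->buffer) >= $this->maxBufferedDocs) {
            $this->flush();
        }
    }

    // Writes the buffered documents out as one segment, if there are any.
    public function flush(): void
    {
        if ($this->buffer !== []) {
            $this->segments[] = $this->buffer;
            $this->buffer = [];
        }
    }

    public function bufferedCount(): int
    {
        return count($this->buffer);
    }
}

$writer = new BufferingWriterSketch(3);
foreach (['a', 'b', 'c', 'd'] as $doc) {
    $writer->addDocument($doc);
}
// One segment holding three documents exists; 'd' is still buffered.
```

A smaller threshold means more frequent, smaller segments; a larger one keeps more documents in memory between writes.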

setMaxMergeDocs()

Set index maxMergeDocs option

public setMaxMergeDocs(int $maxMergeDocs) : void

maxMergeDocs is the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

Default value is PHP_INT_MAX

Parameters
$maxMergeDocs : int

setMergeFactor()

Set index mergeFactor option

public setMergeFactor(mixed $mergeFactor) : void

mergeFactor determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing, and searches on unoptimized indices are faster, but indexing speed is slower. With larger values, more RAM is used during indexing, and while searches on unoptimized indices are slower, indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.

Default value is 10

Parameters
$mergeFactor : mixed

skipTo()

Skip the terms stream up to the specified term prefix.

public skipTo(Term $prefix) : void

Prefix contains fully specified field info and portion of searched term

Parameters
$prefix : Term
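
The terms stream methods (resetTermsStream(), skipTo(), currentTerm(), nextTerm(), closeTermsStream()) are typically used together in a loop. The sketch below shows one such loop against a hypothetical in-memory stand-in, ArrayTermsStreamSketch, that borrows the method names; terms are modelled as [field, text] pairs rather than Term objects, and the ordering by field then text is an assumption of the sketch.

```php
<?php
// Hypothetical stand-in that exposes the stream method names over an array.
final class ArrayTermsStreamSketch
{
    private $terms;
    private $pos = 0;

    public function __construct(array $terms)
    {
        usort($terms, [self::class, 'compare']);
        $this->terms = $terms;
    }

    // Orders [field, text] pairs by field first, then by text.
    public static function compare(array $a, array $b): int
    {
        return strcmp($a[0], $b[0]) ?: strcmp($a[1], $b[1]);
    }

    public function resetTermsStream(): void
    {
        $this->pos = 0;
    }

    // Advances to the first term that is not ordered before $prefix.
    public function skipTo(array $prefix): void
    {
        while ($this->pos < count($this->terms)
            && self::compare($this->terms[$this->pos], $prefix) < 0) {
            $this->pos++;
        }
    }

    public function currentTerm(): ?array
    {
        return $this->terms[$this->pos] ?? null;
    }

    public function nextTerm(): ?array
    {
        $this->pos++;
        return $this->currentTerm();
    }

    public function closeTermsStream(): void
    {
        $this->pos = count($this->terms);
    }
}

$stream = new ArrayTermsStreamSketch([
    ['title', 'help'], ['body', 'hello'], ['title', 'hello'], ['title', 'world'],
]);
$prefix = ['title', 'hel'];
$matches = [];

$stream->resetTermsStream();
$stream->skipTo($prefix);
for ($term = $stream->currentTerm(); $term !== null; $term = $stream->nextTerm()) {
    if ($term[0] !== $prefix[0] || strpos($term[1], $prefix[1]) !== 0) {
        break; // left the range of terms sharing the prefix
    }
    $matches[] = $term[1];
}
$stream->closeTermsStream(); // the stream was not read to the end
// $matches is ['hello', 'help']
```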

termPositions()

Returns an array of all term positions in the documents.

public termPositions(Term $term[, DocsFilter|null $docsFilter = null ]) : array<string|int, mixed>

Return array structure: array( docId => array( pos1, pos2, ...), ...)

Parameters
$term : Term
$docsFilter : DocsFilter|null = null
Tags
throws
InvalidArgumentException
Return values
array<string|int, mixed>
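
To make the documented return structure concrete, the sketch below walks a hand-written array in the shape array( docId => array( pos1, pos2, ...), ...). The document ids and positions are made up for illustration.

```php
<?php
// Sample data in the documented shape: docId => list of term positions.
$positions = [
    3 => [0, 7, 12],
    8 => [4],
];

// Document ids that contain the term.
$docIds = array_keys($positions);

// Occurrences per document; array_map() keeps the docId keys when it is
// given a single array.
$frequencies = array_map('count', $positions);
// $docIds is [3, 8]; $frequencies is [3 => 3, 8 => 1]
```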

terms()

Returns an array of all terms in this index.

public terms() : array<string|int, mixed>
Return values
array<string|int, mixed>

undeleteAll()

Undeletes all documents currently marked as deleted in this index.

public undeleteAll() : void

        