|
xapian-core
1.5.1
|
Xapian::Weight subclass implementing Coordinate Matching.
#include <weight.h>
Public Member Functions | |
| CoordWeight * | clone () const |
| Clone this object. | |
| void | init (double factor_) |
| Allow the subclass to perform any initialisation it needs to. | |
| CoordWeight () | |
| Construct a CoordWeight. | |
| std::string | name () const |
| Return the name of this weighting scheme, e.g. "bm25+". | |
| std::string | serialise () const |
| Return this object's parameters serialised as a single string. | |
| CoordWeight * | unserialise (const std::string &serialised) const |
| Unserialise parameters. | |
| double | get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const |
| Calculate the weight contribution for this object's term to a document. | |
| double | get_maxpart () const |
| Return an upper bound on what get_sumpart() can return for any document. | |
| CoordWeight * | create_from_parameters (const char *params) const |
| Create from a human-readable parameter string. | |
| Public Member Functions inherited from Xapian::Weight | |
| Weight () | |
| Default constructor, needed by subclass constructors. | |
| virtual | ~Weight () |
| Virtual destructor, because we have virtual methods. | |
| virtual double | get_sumextra (Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const |
| Calculate the term-independent weight component for a document. | |
| virtual double | get_maxextra () const |
| Return an upper bound on what get_sumextra() can return for any document. | |
Additional Inherited Members | |
| Static Public Member Functions inherited from Xapian::Weight | |
| static const Weight * | create (const std::string &scheme, const Registry &reg=Registry()) |
| Return the appropriate weighting scheme object. | |
| Protected Types inherited from Xapian::Weight | |
| enum | stat_flags { COLLECTION_SIZE = 0 , RSET_SIZE = 0 , AVERAGE_LENGTH = 4 , TERMFREQ = 1 , RELTERMFREQ = 1 , QUERY_LENGTH = 0 , WQF = 0 , WDF = 2 , DOC_LENGTH = 8 , DOC_LENGTH_MIN = 16 , DOC_LENGTH_MAX = 32 , WDF_MAX = 64 , COLLECTION_FREQ = 1 , UNIQUE_TERMS = 128 , TOTAL_LENGTH = 256 , WDF_DOC_MAX = 512 , UNIQUE_TERMS_MIN = 1024 , UNIQUE_TERMS_MAX = 2048 , DB_DOC_LENGTH_MIN = 4096 , DB_DOC_LENGTH_MAX = 8192 , DB_UNIQUE_TERMS_MIN = 16384 , DB_UNIQUE_TERMS_MAX = 32768 , DB_WDF_MAX = 65536 , IS_BOOLWEIGHT_ = static_cast<int>(0x80000000) } |
| Stats which the weighting scheme can use (see need_stat()). | |
| Protected Member Functions inherited from Xapian::Weight | |
| void | need_stat (stat_flags flag) |
| Tell Xapian that your subclass will want a particular statistic. | |
| Weight (const Weight &) | |
| Don't allow copying. | |
| Xapian::doccount | get_collection_size () const |
| The number of documents in the collection. | |
| Xapian::doccount | get_rset_size () const |
| The number of documents marked as relevant. | |
| Xapian::doclength | get_average_length () const |
| The average length of a document in the collection. | |
| Xapian::doccount | get_termfreq () const |
| The number of documents which this term indexes. | |
| Xapian::doccount | get_reltermfreq () const |
| The number of relevant documents which this term indexes. | |
| Xapian::termcount | get_collection_freq () const |
| The collection frequency of the term. | |
| Xapian::termcount | get_query_length () const |
| The length of the query. | |
| Xapian::termcount | get_wqf () const |
| The within-query-frequency of this term. | |
| Xapian::termcount | get_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the shard. | |
| Xapian::termcount | get_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the shard. | |
| Xapian::termcount | get_wdf_upper_bound () const |
| An upper bound on the wdf of this term in the shard. | |
| Xapian::totallength | get_total_length () const |
| Total length of all documents in the collection. | |
| Xapian::termcount | get_unique_terms_upper_bound () const |
| An upper bound on the number of unique terms in any document in the shard. | |
| Xapian::termcount | get_unique_terms_lower_bound () const |
| A lower bound on the number of unique terms in any document in the shard. | |
| Xapian::termcount | get_db_doclength_upper_bound () const |
| An upper bound on the maximum length of any document in the database. | |
| Xapian::termcount | get_db_doclength_lower_bound () const |
| A lower bound on the minimum length of any document in the database. | |
| Xapian::termcount | get_db_unique_terms_upper_bound () const |
| An upper bound on the number of unique terms in any document in the database. | |
| Xapian::termcount | get_db_unique_terms_lower_bound () const |
| A lower bound on the number of unique terms in any document in the database. | |
| Xapian::termcount | get_db_wdf_upper_bound () const |
| An upper bound on the wdf of this term in the database. | |
Xapian::Weight subclass implementing Coordinate Matching.
Each matching term scores one point. See Managing Gigabytes, Second Edition p181.
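The scoring rule can be illustrated with a minimal standalone sketch. It does not use the Xapian API: the function name `coord_score` and the choice to count a repeated query term only once are assumptions made for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Coordinate matching: a document's score is the number of distinct
// query terms it contains. Standalone illustration, not Xapian code.
std::size_t coord_score(const std::vector<std::string>& query_terms,
                        const std::set<std::string>& doc_terms) {
    std::set<std::string> seen;  // count each distinct query term at most once
    std::size_t score = 0;
    for (const std::string& term : query_terms) {
        if (doc_terms.count(term) != 0 && seen.insert(term).second) {
            ++score;
        }
    }
    return score;
}
```

For a query of "red", "green", "blue" against a document containing "blue", "red" and "pink", this returns 2, since two of the three query terms match.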
| CoordWeight * Xapian::CoordWeight::clone () const | virtual |
Clone this object.
This method allocates and returns a copy of the object it is called on.
If your subclass is called FooWeight and has parameters a and b, then you would implement FooWeight::clone() like so:
FooWeight * FooWeight::clone() const { return new FooWeight(a, b); }
Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: https://trac.xapian.org/ticket/554#comment:1
Implements Xapian::Weight.
References CoordWeight().
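The ownership rule described above (the clone is heap-allocated and released with `delete` by whoever receives it) can be sketched without Xapian. `SchemeBase`, `FooScheme` and `sum_of_cloned_parameters` are hypothetical names; the sketch only mirrors the shape of the FooWeight example and makes no claim about Xapian's internals.

```cpp
#include <memory>

// Hypothetical stand-in for a weighting-scheme base class; not Xapian::Weight.
class SchemeBase {
  public:
    virtual ~SchemeBase() = default;
    // Returns a heap-allocated copy which the caller must release with delete.
    virtual SchemeBase* clone() const = 0;
};

// Hypothetical subclass with two parameters, mirroring the FooWeight example.
class FooScheme : public SchemeBase {
  public:
    FooScheme(double a_, double b_) : a(a_), b(b_) {}
    // Covariant return type: callers holding a FooScheme get a FooScheme*.
    FooScheme* clone() const override { return new FooScheme(a, b); }
    double get_a() const { return a; }
    double get_b() const { return b; }
  private:
    double a, b;
};

// Mimics a caller that takes ownership of the clone and deletes it after use.
double sum_of_cloned_parameters(const FooScheme& original) {
    std::unique_ptr<FooScheme> copy(original.clone());
    return copy->get_a() + copy->get_b();
}
```

Wrapping the returned pointer in `std::unique_ptr` here is just one way for the caller to guarantee the `delete` happens.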
| CoordWeight * Xapian::CoordWeight::create_from_parameters (const char *params) const | virtual |
Create from a human-readable parameter string.
| params | string containing weighting scheme parameter values. |
Reimplemented from Xapian::Weight.
References CoordWeight().
| double Xapian::CoordWeight::get_maxpart () const | virtual |
Return an upper bound on what get_sumpart() can return for any document.
This information is used by the matcher to perform various optimisations, so strive to make the bound as tight as possible.
Implements Xapian::Weight.
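To see why a tight bound matters, consider one kind of optimisation a matcher might perform: skipping a candidate whose best possible total cannot reach the weight needed to enter the result set. The function below is a hypothetical sketch of that idea, not Xapian's matcher.

```cpp
#include <vector>

// Returns true if a candidate document can be skipped: even if every
// remaining term matched with its maximum possible contribution, the
// total could not reach the weight needed to enter the result set.
bool can_skip(double score_so_far,
              const std::vector<double>& remaining_maxparts,
              double min_weight_needed) {
    double best_possible = score_so_far;
    for (double bound : remaining_maxparts) {
        best_possible += bound;
    }
    return best_possible < min_weight_needed;
}
```

With bounds of 1.0 for each of two remaining terms, a candidate currently at 1.0 is skipped when 4.0 is needed. If the bounds were loosely reported as 5.0 each, the same candidate could not be ruled out.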
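| double Xapian::CoordWeight::get_sumpart (Xapian::termcount wdf, Xapian::termcount doclen, Xapian::termcount uniqterms, Xapian::termcount wdfdocmax) const | virtual |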
Calculate the weight contribution for this object's term to a document.
The parameters give information about the document which may be used in the calculations:
| wdf | The within document frequency of the term in the document. You need to call need_stat(WDF) if you use this value. |
| doclen | The document's length (unnormalised). You need to call need_stat(DOC_LENGTH) if you use this value. |
| uniqterms | Number of unique terms in the document. You need to call need_stat(UNIQUE_TERMS) if you use this value. |
| wdfdocmax | Maximum wdf value in the document. You need to call need_stat(WDF_DOC_MAX) if you use this value. |
You can rely on wdf <= doclen if you call both need_stat(WDF) and need_stat(DOC_LENGTH). This is trivially true for terms, but Xapian also ensures it is true for OP_SYNONYM, where the wdf is approximated.
Implements Xapian::Weight.
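The stat_flags values listed on this page are distinct powers of two, which suggests that requests combine as a bitmask. The sketch below assumes that reading. `StatRequests` is a hypothetical recorder, not a Xapian class; only the flag values are taken from the enum above.

```cpp
// Flag values copied from the stat_flags enum listed on this page.
enum StatFlag : unsigned {
    WDF = 2,
    DOC_LENGTH = 8,
    UNIQUE_TERMS = 128,
    WDF_DOC_MAX = 512
};

// Hypothetical recorder for requested statistics; not a Xapian class.
class StatRequests {
  public:
    // Record that the scheme wants this statistic.
    void need_stat(StatFlag flag) { mask |= flag; }
    // Report whether a statistic has been requested.
    bool wants(StatFlag flag) const { return (mask & flag) != 0; }
    // The combined bitmask of everything requested so far.
    unsigned value() const { return mask; }
  private:
    unsigned mask = 0;
};
```

A scheme that reads both wdf and doclen would call `need_stat(WDF)` and `need_stat(DOC_LENGTH)`, giving a combined mask of 10 in this sketch.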
| void Xapian::CoordWeight::init (double factor_) | virtual |
Allow the subclass to perform any initialisation it needs to.
| factor | Any scaling factor (e.g. from OP_SCALE_WEIGHT). If the Weight object is for the term-independent weight supplied by get_sumextra()/get_maxextra(), then init(0.0) is called (starting from Xapian 1.2.11 and 1.3.1 - earlier versions failed to call init() for such Weight objects). |
Implements Xapian::Weight.
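How a scaling factor might flow through a coordinate-style scheme can be sketched as follows. `CoordSketch` is a hypothetical class, and the assumption that each matching term contributes exactly the factor is an illustration rather than a statement about Xapian's source.

```cpp
// Hypothetical coordinate-style scheme; an assumption, not Xapian's source.
class CoordSketch {
  public:
    // Remember the scaling factor supplied by the matcher.
    void init(double factor_) { factor = factor_; }
    // Every matching term contributes the same amount, so no document
    // statistics are consulted.
    double get_sumpart() const { return factor; }
    // The contribution is constant, so the bound is exact.
    double get_maxpart() const { return factor; }
  private:
    double factor = 1.0;
};
```

Under this assumption a factor of 2.5 makes each matching term worth 2.5 points, and `init(0.0)` yields an object that contributes nothing per term.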
| std::string Xapian::CoordWeight::name () const | virtual |
Return the name of this weighting scheme, e.g. "bm25+".
This is the name that the weighting scheme gets registered under when passed to Xapian::Registry::register_weighting_scheme().
As a result:
For 1.4.x and earlier we recommended returning the full namespace-qualified name of your class here, but we now recommend returning just the name in lower case, e.g. "foo" instead of "FooWeight", or "bm25+" instead of "Xapian::BM25PlusWeight".
If you don't want to support creation via Weight::create() or the remote backend, you can use the default implementation which simply returns an empty string.
Reimplemented from Xapian::Weight.
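The role of the name as a lookup key can be sketched with a small standalone registry. `NamedScheme` and `SchemeRegistry` are hypothetical and are not Xapian::Registry. Treating an empty name as "cannot be registered" is an assumption of this sketch.

```cpp
#include <map>
#include <string>

// Hypothetical scheme type: name() supplies the key it is registered under.
struct NamedScheme {
    std::string scheme_name;
    std::string name() const { return scheme_name; }
};

// Hypothetical registry; not Xapian::Registry.
class SchemeRegistry {
  public:
    // Returns false for an empty name, which here opts out of lookup by name.
    bool register_scheme(const NamedScheme& scheme) {
        const std::string key = scheme.name();
        if (key.empty()) return false;
        schemes[key] = scheme;
        return true;
    }
    // Returns the registered scheme, or nullptr if the name is unknown.
    const NamedScheme* find(const std::string& name) const {
        auto it = schemes.find(name);
        return it == schemes.end() ? nullptr : &it->second;
    }
  private:
    std::map<std::string, NamedScheme> schemes;
};
```

A scheme registered as "foo" is found only under "foo", so a lookup for "FooWeight" fails. This is why the returned name needs to stay stable once it is in use.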
| std::string Xapian::CoordWeight::serialise () const | virtual |
Return this object's parameters serialised as a single string.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Reimplemented from Xapian::Weight.
| CoordWeight * Xapian::CoordWeight::unserialise (const std::string &serialised) const | virtual |
Unserialise parameters.
This method unserialises parameters serialised by the serialise() method and allocates and returns a new object initialised with them.
If you don't want to support the remote backend, you can use the default implementation which simply throws Xapian::UnimplementedError.
Note that the returned object will be deallocated by Xapian after use with "delete". If you want to handle the deletion in a special way (for example when wrapping the Xapian API for use from another language) then you can define a static operator delete method in your subclass as shown here: https://trac.xapian.org/ticket/554#comment:1
| serialised | A string containing the serialised parameters. |
Reimplemented from Xapian::Weight.
References CoordWeight().
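The round trip between serialise() and unserialise() can be sketched for a hypothetical scheme with two parameters, a and b, like the FooWeight example. `FooParams` and both free functions are illustrative names, and the raw-byte encoding is an assumption chosen for brevity. It is not portable between machines, so code meant for a remote backend would need a portable encoding instead.

```cpp
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical parameter pair for a FooWeight-like scheme.
struct FooParams {
    double a;
    double b;
};

// Pack both parameters into one string as raw bytes.
std::string serialise(const FooParams& p) {
    std::string out(sizeof(double) * 2, '\0');
    std::memcpy(&out[0], &p.a, sizeof(double));
    std::memcpy(&out[sizeof(double)], &p.b, sizeof(double));
    return out;
}

// Rebuild the parameters, rejecting strings of the wrong length.
FooParams unserialise(const std::string& s) {
    if (s.size() != sizeof(double) * 2) {
        throw std::invalid_argument("bad serialised FooParams");
    }
    FooParams p;
    std::memcpy(&p.a, s.data(), sizeof(double));
    std::memcpy(&p.b, s.data() + sizeof(double), sizeof(double));
    return p;
}
```

A scheme with no parameters would simply serialise to an empty string and ignore nothing on the way back, which makes the round trip trivial.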