Search (searchv2/)
Module located in zds/searchv2/.
Models (models.py)
- class zds.searchv2.models.AbstractESDjangoIndexable(*args, **kwargs)
Version of AbstractESIndexable for a Django object, with some improvements:
Already includes pk in the mapping;
Matches the ES _id field with pk;
Overrides es_already_indexed as a database field;
Defines an es_flagged field to restrict the number of objects to be indexed;
Overrides save() to manage that field;
Defines a get_es_django_indexable() method that can be overridden to change the queryset used to fetch objects.
- classmethod get_es_django_indexable(force_reindexing=False)
Method that can be overridden to filter Django objects from the database based on any criterion.
- Parameters:
force_reindexing (bool) – force the return of all objects, even if they may already be indexed.
- Returns:
query
- Return type:
django.db.models.query.QuerySet
- classmethod get_es_indexable(force_reindexing=False)
Override get_es_indexable() in order to use Django querysets and batch objects.
- Returns:
a queryset
- Return type:
django.db.models.query.QuerySet
- classmethod get_es_mapping()
Overridden to add pk to the mapping.
- Returns:
mapping object
- Return type:
elasticsearch_dsl.Mapping
- save(*args, **kwargs)
Override the save() method to flag the object when it is saved (a save is assumed to modify the object, hence the need to reindex).
Note
Flagging can be prevented using save(es_flagged=False).
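The flagging behavior described above can be illustrated without Django. This is a minimal sketch, not the project's implementation: the class name FlaggedOnSave and the _persist() helper are hypothetical, and _persist() merely stands in for the real database write.

```python
class FlaggedOnSave:
    """Framework-free stand-in for a model whose save() flags it for reindexing."""

    def __init__(self):
        self.es_flagged = False
        self.save_count = 0

    def save(self, *args, **kwargs):
        # Pop the keyword so it is not forwarded to the underlying write.
        # By default a save flags the object; es_flagged=False prevents it.
        self.es_flagged = kwargs.pop("es_flagged", True)
        self._persist(*args, **kwargs)

    def _persist(self, *args, **kwargs):
        # Hypothetical placeholder for the actual database write.
        self.save_count += 1


obj = FlaggedOnSave()
obj.save()                  # flags the object for reindexing
obj.save(es_flagged=False)  # saves without flagging
```

Popping the keyword before delegating matters because the underlying save would otherwise receive an argument it does not know.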
- class zds.searchv2.models.AbstractESIndexable
Mixin for indexable objects.
Defines a number of functions that can be overridden to tune how objects are indexed into Elasticsearch.
You (may) need to override:
get_indexable();
get_mapping() (not mandatory, but otherwise ES will choose the mapping by itself);
get_document() (not mandatory, but may be useful if the data differ from the mapping or if extra work needs to be done).
You also need to maintain es_id and es_already_indexed for bulk indexing/updating (if any).
- get_es_document_as_bulk_action(index, action='index')
Create a document formatted for a _bulk operation. The formatting depends on action.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html.
- Parameters:
index (str) – index in which the document will be inserted
action (str) – action, either "index", "update" or "delete"
- Returns:
the document
- Return type:
dict
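As an illustration of action-dependent formatting, here is a minimal sketch that builds action dictionaries in the shape accepted by the bulk helpers of elasticsearch-py (_op_type, _index, _id, then _source or doc). The function name as_bulk_action and its doc_id and source parameters are assumptions made for this example. The actual method may differ, for instance by also setting _type.

```python
def as_bulk_action(index, doc_id, source, action="index"):
    """Format one document as an action dict for a bulk helper."""
    if action not in ("index", "update", "delete"):
        raise ValueError("unsupported action: {}".format(action))

    document = {"_op_type": action, "_index": index, "_id": doc_id}
    if action == "index":
        # A full document is sent under "_source".
        document["_source"] = source
    elif action == "update":
        # A partial update sends only the changed fields under "doc".
        document["doc"] = source
    # A delete action needs nothing more than the metadata.
    return document


action = as_bulk_action("my_index", 42, {"title": "Hello"}, action="update")
```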
- get_es_document_source(excluded_fields=None)
Create a document from the variables of the class, based on the mapping.
Attention
You may need to override this method if the data differ from the mapping for some reason.
- Parameters:
excluded_fields (list) – fields to exclude from the default method
- Returns:
document
- Return type:
dict
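The idea of building a document from the mapping can be sketched as follows. This is an assumption-laden illustration: the function document_source, the explicit mapping_fields argument and the Tutorial class are hypothetical, whereas the real method reads the field names from the mapping object.

```python
def document_source(obj, mapping_fields, excluded_fields=None):
    """Collect the attributes named in the mapping, minus the excluded ones."""
    excluded = set(excluded_fields or [])
    return {
        field: getattr(obj, field, None)
        for field in mapping_fields
        if field not in excluded
    }


class Tutorial:
    # Hypothetical indexable object used only for this example.
    def __init__(self, pk, title, text):
        self.pk = pk
        self.title = title
        self.text = text


doc = document_source(Tutorial(1, "Intro", "..."), ["pk", "title", "text"],
                      excluded_fields=["text"])
```

Excluding a field this way is useful when an overriding method wants to compute that field itself before adding it to the document.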
- classmethod get_es_document_type()
Value of the _type field in the index.
- classmethod get_es_indexable(force_reindexing=False)
Return objects to index.
Attention
You need to override this method (otherwise nothing will be indexed).
- Parameters:
force_reindexing (bool) – force the return of all objects, even if they may already be indexed.
- Return type:
list
- classmethod get_es_mapping()
Set up the mapping (data schema).
Note
You will probably want to change the analyzer and boost value. Also consider the index='not_analyzed' option to improve performance.
See https://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#mappings
Attention
You may want to override this method (otherwise ES chooses the mapping by itself).
- Returns:
mapping object
- Return type:
elasticsearch_dsl.Mapping
- class zds.searchv2.models.ESIndexManager(name, shards=5, replicas=0, connection_alias='default')
Manage a given index with different tailor-made functions.
- analyze_sentence(request)
Use the analyzer on a given sentence. Get back the list of tokens.
See http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html.
This is useful to perform "terms" queries instead of full-text queries.
- Parameters:
request (str) – a sentence from user input
- Returns:
the tokens
- Return type:
list
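To show what a list of tokens is good for, the sketch below turns tokens into the body of a "terms" query expressed as a plain dictionary. The function terms_query and the field name "title" are assumptions for the example, and the tokens are written by hand rather than obtained from the analyzer.

```python
def terms_query(field, tokens):
    """Build the body of a "terms" query matching any of the given tokens."""
    unique_tokens = []
    for token in tokens:
        # Drop duplicates while keeping the original token order.
        if token not in unique_tokens:
            unique_tokens.append(token)
    return {"terms": {field: unique_tokens}}


query = terms_query("title", ["python", "django", "python"])
```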
- clear_es_index()
Clear the index.
- clear_indexing_of_model(model)
Nullify the indexing of a given model by setting es_already_indexed=False on all objects.
For AbstractESDjangoIndexable models, a full update is used instead of saving every object.
- Parameters:
model (class) – the model
- delete_by_query(doc_type='', query=MatchAll())
Perform a deletion through the _delete_by_query API.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
Attention
Calls to this function must be made with great care!
- Parameters:
doc_type (str) – the document type
query (elasticsearch_dsl.query.Query) – the query matching all documents to be deleted
- delete_document(document)
Delete a given document, based on its es_id.
- Parameters:
document (AbstractESIndexable) – the document
- es_bulk_indexing_of_model(model, force_reindexing=False)
Perform a bulk action on the documents of a given model. Use the objects_per_batch property to index.
See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.bulk and http://elasticsearch-py.readthedocs.io/en/master/helpers.html#elasticsearch.helpers.parallel_bulk
Attention
Currently only implemented with "index" and "update"!
Currently only works with AbstractESDjangoIndexable.
- Parameters:
model (class) – a model
force_reindexing (bool) – force all documents to be returned
- Returns:
the number of documents indexed
- Return type:
int
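Batching by objects_per_batch can be sketched with the standard library alone. This is an illustration under assumptions: the helper iter_batches is hypothetical, and the step that would send each batch to Elasticsearch is replaced by a simple count.

```python
from itertools import islice


def iter_batches(objects, objects_per_batch):
    """Yield successive lists of at most objects_per_batch items."""
    if objects_per_batch < 1:
        raise ValueError("objects_per_batch must be at least 1")
    iterator = iter(objects)
    while True:
        batch = list(islice(iterator, objects_per_batch))
        if not batch:
            return
        yield batch


indexed = 0
for batch in iter_batches(range(5), 2):
    # A real implementation would send the batch to a bulk helper here.
    indexed += len(batch)
```

Working from an iterator keeps memory use bounded by the batch size, which is the point of batching a large queryset.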
- refresh_index()
Force a refresh of the index. The task is normally done periodically, but may be forced with this method.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html.
Note
The use of this function is mandatory if you want to use the search right after an indexing.
- reset_es_index(models)
Delete the old index and create a new one (with the same name). Set up the number of shards and replicas. Then set the mappings for the different models.
- Parameters:
models (list) – list of models
number_shards (int) – number of shards
number_replicas (int) – number of replicas
- setup_custom_analyzer()
Override the default analyzer.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis.html.
Our custom analyzer is based on the "french" analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html#french-analyzer) but with some differences:
"custom_tokenizer", to deal with punctuation and all kinds of (non-breaking) spaces, while keeping dashes and other characters intact (in order to keep "c++" or "c#", for example).
"protect_c_language", a pattern replace filter to prevent "c" from being wiped out by the stopper.
"french_keywords", a keyword stopper that prevents some programming languages from being stemmed.
Warning
You need to run manage.py es_manager index_all if you modify this!
- setup_search(request)
Set up the search on the right index.
- Parameters:
request (elasticsearch_dsl.Search) – the search request
- Returns:
formatted search
- Return type:
elasticsearch_dsl.Search
- update_single_document(document, doc)
Update given fields of a single document.
See https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html.
- Parameters:
document (AbstractESIndexable) – the document
doc (dict) – fields to update
- exception zds.searchv2.models.NeedIndex
Raised when an action requires an index, but it is not created (yet).
- zds.searchv2.models.delete_document_in_elasticsearch(instance)
Delete an ESDjangoIndexable from the ES database. Must be implemented by all classes that derive from AbstractESDjangoIndexable.
- Parameters:
instance (AbstractESIndexable) – the document to delete
- zds.searchv2.models.get_django_indexable_objects()
Return all indexable objects registered in Django.
Views (views.py)
- class zds.searchv2.views.SearchView(**kwargs)
Search view.
- get(request, *args, **kwargs)
Overridden to catch the request and fill the form.
- get_context_data(**kwargs)
Get the context for this view. This method is overridden to modify the paginator and the information given to the template.
- get_queryset()
Return the list of items for this view.
The return value must be an iterable and may be an instance of QuerySet in which case QuerySet specific behavior will be enabled.
- get_queryset_chapters()
Search in content chapters.
- get_queryset_posts()
Search in posts, and remove a result if the forum is not allowed for the user or if the message is invisible.
Score is modified if:
post is the first one in a topic;
post is marked as « useful »;
post has a like/dislike ratio above (has more likes than dislikes) or below (the other way around) 1.0.
- get_queryset_publishedcontents()
Search in PublishedContent objects.
- get_queryset_topics()
Search in topics, and remove the result if the forum is not allowed for the user.
Score is modified if:
topic is solved;
topic is sticky;
topic is locked.
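The text does not say how the score is modified. One common way to express such adjustments in Elasticsearch is a function_score query, sketched below as a plain dictionary. This is not necessarily what the view does: the function weighted_topic_query, the field names is_solved, is_sticky and is_locked, and all weights are hypothetical.

```python
def weighted_topic_query(base_query, weights):
    """Wrap a query in a function_score that reweights matching documents."""
    functions = [
        # Each function applies its weight only to documents where the
        # boolean field is true.
        {"filter": {"term": {field: True}}, "weight": weight}
        for field, weight in weights.items()
    ]
    return {
        "function_score": {
            "query": base_query,
            "functions": functions,
            "score_mode": "multiply",
        }
    }


query = weighted_topic_query(
    {"match": {"title": "django"}},
    {"is_solved": 1.1, "is_sticky": 1.2, "is_locked": 0.9},
)
```

A weight above 1 promotes matching topics and a weight below 1 demotes them, so a single mechanism covers both directions.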
- search_form_class
alias of SearchForm
- class zds.searchv2.views.SimilarTopicsView(**kwargs)
- get(request, *args, **kwargs)
Handle GET requests: instantiate a blank version of the form.
- class zds.searchv2.views.SuggestionContentView(**kwargs)
- get(request, *args, **kwargs)
Handle GET requests: instantiate a blank version of the form.
- zds.searchv2.views.opensearch(request)
Generate OpenSearch Description file.