.oO SearXNG Developer Documentation Oo.
Functions | |
get_sc_code (searxng_locale, params) | |
request (query, params) | |
_request_cat_web (query, params) | |
response (resp) | |
_response_cat_web (dom) | |
fetch_traits (EngineTraits engine_traits) | |
Variables | |
logging | logger .Logger |
dict | about |
str | startpage_categ = 'web' |
bool | send_accept_language_header = True |
list | categories = ['general', 'web'] |
bool | paging = True |
int | max_page = 18 |
bool | time_range_support = True |
bool | safesearch = True |
dict | time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'} |
dict | safesearch_dict = {0: '0', 1: '1', 2: '1'} |
str | base_url = 'https://www.startpage.com' |
str | search_url = base_url + '/sp/search' |
str | search_form_xpath = '//form[@id="search"]' |
int | sc_code_ts = 0 |
str | sc_code = '' |
int | sc_code_cache_sec = 30 |
Startpage's language & region selectors are a mess ..

.. _startpage regions:

Startpage regions
=================

In the list of regions there are tags we need to map to common region tags::

    pt-BR_BR --> pt_BR
    zh-CN_CN --> zh_Hans_CN
    zh-TW_TW --> zh_Hant_TW
    zh-TW_HK --> zh_Hant_HK
    en-GB_GB --> en_GB

and there is at least one tag with a three letter language tag (ISO 639-2)::

    fil_PH --> fil_PH

The locale code ``no_NO`` from Startpage does not exist and is mapped to
``nb-NO``::

    babel.core.UnknownLocaleError: unknown locale 'no_NO'

For reference see languages-subtag at iana; ``no`` is the macrolanguage [1]_ and
W3C recommends subtag over macrolanguage [2]_.

.. [1] `iana: language-subtag-registry
   <https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry>`_ ::

       type: language
       Subtag: nb
       Description: Norwegian Bokmål
       Added: 2005-10-16
       Suppress-Script: Latn
       Macrolanguage: no

.. [2]
   Use macrolanguages with care. Some language subtags have a Scope field set to
   macrolanguage, i.e. this primary language subtag encompasses a number of more
   specific primary language subtags in the registry. ... As we recommended for
   the collection subtags mentioned above, in most cases you should try to use
   the more specific subtags ... `W3: The primary language subtag
   <https://www.w3.org/International/questions/qa-choosing-language-tags#langsubtag>`_

.. _startpage languages:

Startpage languages
===================

:py:obj:`send_accept_language_header`:
  The displayed name in Startpage's settings page depends on the location of
  the IP when the ``Accept-Language`` HTTP header is unset. In
  :py:obj:`fetch_traits` we use::

    'Accept-Language': "en-US,en;q=0.5",
    ..

  to get uniform names independent of the IP.

.. _startpage categories:

Startpage categories
====================

Startpage's category (for Web-search, News, Videos, ..) is set by
:py:obj:`startpage_categ` in settings.yml::

    - name: startpage
      engine: startpage
      startpage_categ: web
      ...

.. hint::

   The default category is ``web`` .. and other categories than ``web`` are not
   yet implemented.
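The region tag mapping listed under "Startpage regions" can be illustrated with a small lookup table. This is only a minimal sketch: ``REGION_TAG_MAP`` and ``normalize_region_tag`` are hypothetical names that do not appear in startpage.py, and the real :py:obj:`fetch_traits` may derive the tags differently (for example by parsing locales rather than using a fixed table).

```python
# Hypothetical lookup table (not taken from startpage.py) holding only the
# mappings that the module docstring lists explicitly.
REGION_TAG_MAP = {
    "pt-BR_BR": "pt_BR",
    "zh-CN_CN": "zh_Hans_CN",
    "zh-TW_TW": "zh_Hant_TW",
    "zh-TW_HK": "zh_Hant_HK",
    "en-GB_GB": "en_GB",
    # Documented as mapped to ``nb-NO``; written with an underscore here only
    # to match the other targets in this table (an assumption of the sketch).
    "no_NO": "nb_NO",
}


def normalize_region_tag(sp_tag: str) -> str:
    """Return the common region tag for a Startpage region tag.

    Tags without a documented mapping (e.g. ``fil_PH``) are returned unchanged.
    """
    return REGION_TAG_MAP.get(sp_tag, sp_tag)


if __name__ == "__main__":
    for tag in ("pt-BR_BR", "zh-TW_HK", "fil_PH", "no_NO"):
        print(tag, "-->", normalize_region_tag(tag))
```

A fixed table keeps the sketch easy to check against the list above; it would not cover region tags that Startpage adds later.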
searx.engines.startpage._request_cat_web | ( | query, | |
params ) |
protected |
Definition at line 260 of file startpage.py.
References searx.engines.startpage.get_sc_code().
Referenced by searx.engines.startpage.request().
searx.engines.startpage._response_cat_web | ( | dom | ) |
protected |
Definition at line 331 of file startpage.py.
Referenced by searx.engines.startpage.response().
searx.engines.startpage.fetch_traits | ( | EngineTraits | engine_traits | ) |
Fetch :ref:`languages <startpage languages>` and :ref:`regions <startpage regions>` from Startpage.
Definition at line 390 of file startpage.py.
searx.engines.startpage.get_sc_code | ( | searxng_locale, | |
params ) |
Get an actual ``sc`` argument from Startpage's search form (HTML page).

Startpage puts a ``sc`` argument on every HTML :py:obj:`search form <search_form_xpath>`. Without this argument Startpage considers the request to be from a bot. We do not know what is encoded in the value of the ``sc`` argument, but it seems to be a kind of *time-stamp*.

Startpage's search form generates a new sc-code on each request. This function scrapes a new sc-code from Startpage's home page every :py:obj:`sc_code_cache_sec` seconds.
Definition at line 168 of file startpage.py.
Referenced by searx.engines.startpage._request_cat_web().
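The caching behaviour described for ``get_sc_code`` can be sketched as a simple time-based cache. The sketch below reuses the documented variable names :py:obj:`sc_code`, :py:obj:`sc_code_ts` and :py:obj:`sc_code_cache_sec`, but ``get_cached_sc_code`` and its ``fetch_sc_code`` / ``now`` parameters are hypothetical; the actual scraping of the search form is left to the caller and is not shown.

```python
import time

# Module-level cache state, named after the documented variables.
sc_code_ts = 0          # time stamp of the last successful fetch
sc_code = ''            # last sc-code scraped from the search form
sc_code_cache_sec = 30  # maximum age of the cached sc-code in seconds


def get_cached_sc_code(fetch_sc_code, now=None):
    """Return the cached sc-code, refreshing it when it is too old.

    ``fetch_sc_code`` is a hypothetical callable that scrapes a fresh sc-code
    (e.g. from the form matched by ``search_form_xpath``).  ``now`` allows a
    caller to inject the current time; it defaults to ``time.time()``.
    """
    global sc_code_ts, sc_code
    now = time.time() if now is None else now

    # Reuse the cached value while it is younger than sc_code_cache_sec.
    if sc_code and now < sc_code_ts + sc_code_cache_sec:
        return sc_code

    # Cache is empty or expired: fetch a new code and remember when.
    sc_code = fetch_sc_code()
    sc_code_ts = now
    return sc_code
```

Passing the fetch function and the clock as arguments is a choice made for this sketch so that the cache logic can be exercised without network access.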
searx.engines.startpage.request | ( | query, | |
params ) |
Assemble a Startpage request.

To avoid CAPTCHA we need to send a well formed HTTP POST request with a cookie. We need to form a request that is identical to the request built by Startpage's search form:

- in the cookie the **region** is selected
- in the HTTP POST data the **language** is selected

Additionally the arguments from Startpage's search form need to be set in the HTTP POST data / compare ``<input>`` elements: :py:obj:`search_form_xpath`.
Definition at line 240 of file startpage.py.
References searx.engines.startpage._request_cat_web().
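The split between cookie and POST data described for ``request`` can be sketched as follows. This is an illustration under stated assumptions, not the engine's implementation: ``build_request_sketch`` is a hypothetical helper, and every form field and cookie name except ``sc`` is a placeholder, since this page does not list the actual names used by Startpage's search form. Only :py:obj:`search_url`, :py:obj:`time_range_dict` and :py:obj:`safesearch_dict` are taken from the documented variables.

```python
# Values copied from the documented module variables.
base_url = 'https://www.startpage.com'
search_url = base_url + '/sp/search'
time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'}
safesearch_dict = {0: '0', 1: '1', 2: '1'}


def build_request_sketch(query, sc, language, region, safesearch=0, time_range=None):
    """Return a dict describing an HTTP POST request (hypothetical helper).

    All field names below except 'sc' are placeholders chosen for this sketch.
    """
    # The language is selected in the POST data, together with the sc-code.
    data = {
        'query_placeholder': query,
        'sc': sc,
        'language_placeholder': language,
    }
    if time_range in time_range_dict:
        data['time_range_placeholder'] = time_range_dict[time_range]

    # The region is selected in the cookie; how Startpage encodes the cookie
    # value is not covered here.
    cookies = {
        'region_placeholder': region,
        'safesearch_placeholder': safesearch_dict[safesearch],
    }
    return {'method': 'POST', 'url': search_url, 'data': data, 'cookies': cookies}


if __name__ == "__main__":
    print(build_request_sketch('searxng', 'abc123', 'english', 'en-US_US',
                               safesearch=2, time_range='week'))
```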
searx.engines.startpage.response | ( | resp | ) |
Definition at line 321 of file startpage.py.
References searx.engines.startpage._response_cat_web().
dict searx.engines.startpage.about |
Definition at line 108 of file startpage.py.
str searx.engines.startpage.base_url = 'https://www.startpage.com' |
Definition at line 140 of file startpage.py.
list searx.engines.startpage.categories = ['general', 'web'] |
Definition at line 128 of file startpage.py.
logging searx.engines.startpage.logger .Logger |
Definition at line 103 of file startpage.py.
int searx.engines.startpage.max_page = 18 |
Definition at line 130 of file startpage.py.
bool searx.engines.startpage.paging = True |
Definition at line 129 of file startpage.py.
bool searx.engines.startpage.safesearch = True |
Definition at line 134 of file startpage.py.
dict searx.engines.startpage.safesearch_dict = {0: '0', 1: '1', 2: '1'} |
Definition at line 137 of file startpage.py.
str searx.engines.startpage.sc_code = '' |
Definition at line 163 of file startpage.py.
int searx.engines.startpage.sc_code_cache_sec = 30 |
Definition at line 164 of file startpage.py.
int searx.engines.startpage.sc_code_ts = 0 |
Definition at line 162 of file startpage.py.
str searx.engines.startpage.search_form_xpath = '//form[@id="search"]' |
Definition at line 146 of file startpage.py.
str searx.engines.startpage.search_url = base_url + '/sp/search' |
Definition at line 141 of file startpage.py.
bool searx.engines.startpage.send_accept_language_header = True |
Definition at line 121 of file startpage.py.
str searx.engines.startpage.startpage_categ = 'web' |
Definition at line 117 of file startpage.py.
dict searx.engines.startpage.time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'} |
Definition at line 136 of file startpage.py.
bool searx.engines.startpage.time_range_support = True |
Definition at line 133 of file startpage.py.