.oO SearXNG Developer Documentation Oo.
Functions
 | get_sc_code (searxng_locale, params) |
 | request (query, params) |
tuple[str, datetime|None] | _parse_published_date (str content) |
 | _get_web_result (result) |
 | _get_news_result (result) |
dict[str, Any]|None | _get_image_result (result) |
 | response (resp) |
 | fetch_traits (EngineTraits engine_traits) |
Variables
logging.Logger | logger |
dict | about |
str | startpage_categ = 'web' |
bool | send_accept_language_header = True |
list | categories = ['general', 'web'] |
bool | paging = True |
int | max_page = 18 |
bool | time_range_support = True |
bool | safesearch = True |
dict | time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'} |
dict | safesearch_dict = {0: '0', 1: '1', 2: '1'} |
str | base_url = 'https://www.startpage.com' |
str | search_url = base_url + '/sp/search' |
str | search_form_xpath = '//form[@id="search"]' |
int | sc_code_ts = 0 |
str | sc_code = '' |
int | sc_code_cache_sec = 30 |
Startpage's language & region selectors are a mess ..

.. _startpage regions:

Startpage regions
=================

In the list of regions there are tags we need to map to common region tags::

  pt-BR_BR --> pt_BR
  zh-CN_CN --> zh_Hans_CN
  zh-TW_TW --> zh_Hant_TW
  zh-TW_HK --> zh_Hant_HK
  en-GB_GB --> en_GB

and there is at least one tag with a three-letter language tag (ISO 639-2)::

  fil_PH --> fil_PH

The locale code ``no_NO`` from Startpage does not exist and is mapped to
``nb-NO``::

    babel.core.UnknownLocaleError: unknown locale 'no_NO'

For reference see languages-subtag at iana; ``no`` is the macrolanguage [1]_ and
W3C recommends subtag over macrolanguage [2]_.

.. [1] `iana: language-subtag-registry
   <https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry>`_ ::

      type: language
      Subtag: nb
      Description: Norwegian Bokmål
      Added: 2005-10-16
      Suppress-Script: Latn
      Macrolanguage: no

.. [2]
   Use macrolanguages with care.  Some language subtags have a Scope field set to
   macrolanguage, i.e. this primary language subtag encompasses a number of more
   specific primary language subtags in the registry.  ...  As we recommended for
   the collection subtags mentioned above, in most cases you should try to use
   the more specific subtags ... `W3: The primary language subtag
   <https://www.w3.org/International/questions/qa-choosing-language-tags#langsubtag>`_

.. _startpage languages:

Startpage languages
===================

:py:obj:`send_accept_language_header`:
  The displayed name in Startpage's settings page depends on the location of the
  IP when the ``Accept-Language`` HTTP header is unset.  In :py:obj:`fetch_traits`
  we use::

    'Accept-Language': "en-US,en;q=0.5",
    ..

  to get uniform names independent of the IP.

.. _startpage categories:

Startpage categories
====================

Startpage's category (for Web-search, News, Videos, ..) is set by
:py:obj:`startpage_categ` in settings.yml::

  - name: startpage
    engine: startpage
    startpage_categ: web
    ...

.. hint::

   Supported categories are ``web``, ``news`` and ``images``.
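The region tag mappings listed under *Startpage regions* can be sketched in plain Python. This is only an illustration of the rules stated above, not the engine's code (which may rely on babel); the helper name ``normalize_region_tag`` and both lookup tables are hypothetical.

```python
# Hypothetical lookup tables, derived only from the mappings listed above.
ZH_SCRIPT = {'zh-CN': 'Hans', 'zh-TW': 'Hant'}  # Chinese tags need a script subtag
LANGUAGE_FIXES = {'no': 'nb'}                   # macrolanguage 'no' -> specific 'nb'


def normalize_region_tag(sp_tag: str) -> str:
    """Map a Startpage region tag such as 'pt-BR_BR' to a common tag such as 'pt_BR'."""
    lang_part, _, territory = sp_tag.partition('_')
    script = ZH_SCRIPT.get(lang_part)
    # Drop the redundant territory inside the language part ('pt-BR' -> 'pt').
    language = lang_part.split('-')[0]
    language = LANGUAGE_FIXES.get(language, language)
    parts = [language]
    if script:
        parts.append(script)
    if territory:
        parts.append(territory)
    return '_'.join(parts)
```

For example, ``normalize_region_tag('zh-TW_HK')`` yields ``zh_Hant_HK``, and three-letter tags such as ``fil_PH`` pass through unchanged.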
dict[str, Any]|None searx.engines.startpage._get_image_result | ( | result | ) |
protected
Definition at line 375 of file startpage.py.
Referenced by response().
searx.engines.startpage._get_news_result | ( | result | ) |
protected
Definition at line 353 of file startpage.py.
Referenced by response().
searx.engines.startpage._get_web_result | ( | result | ) |
protected
Definition at line 341 of file startpage.py.
References _parse_published_date().
Referenced by response().
tuple[str, datetime|None] searx.engines.startpage._parse_published_date | ( | str | content | ) |
protected
Definition at line 312 of file startpage.py.
Referenced by _get_web_result().
searx.engines.startpage.fetch_traits | ( | EngineTraits | engine_traits | ) |
Fetch :ref:`languages <startpage languages>` and :ref:`regions <startpage regions>` from Startpage.
Definition at line 427 of file startpage.py.
searx.engines.startpage.get_sc_code | ( | searxng_locale, | |
params ) |
Get a current ``sc`` argument from Startpage's search form (HTML page).

Startpage puts an ``sc`` argument on every HTML :py:obj:`search form <search_form_xpath>`. Without this argument Startpage assumes the request comes from a bot. We do not know what is encoded in the value of the ``sc`` argument, but it seems to be a kind of *time-stamp*.

Startpage's search form generates a new sc-code on each request. This function scrapes a new sc-code from Startpage's home page every :py:obj:`sc_code_cache_sec` seconds.
Definition at line 169 of file startpage.py.
Referenced by request().
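The time-based caching described for ``get_sc_code`` can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: ``get_cached_sc_code`` and its ``fetch`` / ``now`` parameters are hypothetical, and the actual scraping of the search form is left to the caller-supplied ``fetch`` callable.

```python
import time
from typing import Callable

# Module-level cache state, named after the variables documented on this page.
sc_code = ''
sc_code_ts = 0.0
sc_code_cache_sec = 30


def get_cached_sc_code(fetch: Callable[[], str],
                       now: Callable[[], float] = time.time) -> str:
    """Return the cached sc-code, refreshing it once the cache has expired.

    ``fetch`` is a hypothetical callable that scrapes a fresh sc-code;
    ``now`` is injectable so the expiry logic can be tested without waiting.
    """
    global sc_code, sc_code_ts
    current = now()
    if sc_code and current < sc_code_ts + sc_code_cache_sec:
        return sc_code  # still fresh, no HTTP round trip needed
    sc_code = fetch()
    sc_code_ts = current
    return sc_code
```

Injecting the clock keeps the sketch testable; whether the real function is structured this way is not stated in the documentation above.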
searx.engines.startpage.request | ( | query, | |
params ) |
Assemble a Startpage request.

To avoid CAPTCHAs we need to send a well-formed HTTP POST request with a cookie. The request must be identical to the one built by Startpage's search form:

- in the cookie the **region** is selected
- in the HTTP POST data the **language** is selected

Additionally, the arguments from Startpage's search form need to be set in the HTTP POST data; compare the ``<input>`` elements of :py:obj:`search_form_xpath`.
Definition at line 241 of file startpage.py.
References get_sc_code().
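The split between cookie and POST data described for ``request`` might look roughly like the sketch below. All field and cookie names (``query``, ``cat``, ``sc``, ``language``, ``with_date``, ``region``, ``family_filter``) are illustrative placeholders, and where the time range and safe-search values end up is an assumption; the real names must be read from the ``<input>`` elements of Startpage's search form. The helper ``build_post_request`` is hypothetical.

```python
from typing import Any, Optional

search_url = 'https://www.startpage.com/sp/search'
time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'}
safesearch_dict = {0: '0', 1: '1', 2: '1'}


def build_post_request(query: str, sc: str, language: str, region: str,
                       safesearch: int = 0,
                       time_range: Optional[str] = None,
                       categ: str = 'web') -> dict[str, Any]:
    """Assemble a description of the POST request (placeholder field names)."""
    # The language travels in the POST data, next to the form arguments.
    data = {'query': query, 'cat': categ, 'sc': sc, 'language': language}
    if time_range in time_range_dict:
        data['with_date'] = time_range_dict[time_range]  # assumed field name
    # The region travels in the cookie; the safe-search placement is assumed.
    cookies = {'region': region, 'family_filter': safesearch_dict[safesearch]}
    return {'method': 'POST', 'url': search_url, 'data': data, 'cookies': cookies}
```

The ``sc`` value would come from ``get_sc_code``, which ``request`` references.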
searx.engines.startpage.response | ( | resp | ) |
Definition at line 406 of file startpage.py.
References _get_image_result(), _get_news_result(), and _get_web_result().
dict searx.engines.startpage.about |
Definition at line 109 of file startpage.py.
str searx.engines.startpage.base_url = 'https://www.startpage.com' |
Definition at line 141 of file startpage.py.
list searx.engines.startpage.categories = ['general', 'web'] |
Definition at line 129 of file startpage.py.
logging.Logger searx.engines.startpage.logger |
Definition at line 104 of file startpage.py.
int searx.engines.startpage.max_page = 18 |
Definition at line 131 of file startpage.py.
bool searx.engines.startpage.paging = True |
Definition at line 130 of file startpage.py.
bool searx.engines.startpage.safesearch = True |
Definition at line 135 of file startpage.py.
dict searx.engines.startpage.safesearch_dict = {0: '0', 1: '1', 2: '1'} |
Definition at line 138 of file startpage.py.
str searx.engines.startpage.sc_code = '' |
Definition at line 164 of file startpage.py.
int searx.engines.startpage.sc_code_cache_sec = 30 |
Definition at line 165 of file startpage.py.
int searx.engines.startpage.sc_code_ts = 0 |
Definition at line 163 of file startpage.py.
str searx.engines.startpage.search_form_xpath = '//form[@id="search"]' |
Definition at line 147 of file startpage.py.
str searx.engines.startpage.search_url = base_url + '/sp/search' |
Definition at line 142 of file startpage.py.
bool searx.engines.startpage.send_accept_language_header = True |
Definition at line 122 of file startpage.py.
str searx.engines.startpage.startpage_categ = 'web' |
Definition at line 118 of file startpage.py.
dict searx.engines.startpage.time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'} |
Definition at line 137 of file startpage.py.
bool searx.engines.startpage.time_range_support = True |
Definition at line 134 of file startpage.py.