.oO SearXNG Developer Documentation Oo.
searx.engines.duckduckgo Namespace Reference

Functions

 _cache_key (str query, str region)
 
 cache_vqd (str query, str region, str value)
 
 get_vqd (str query, str region, bool force_request=False)
 
 get_ddg_lang (EngineTraits eng_traits, sxng_locale, default='en_US')
 

Variables

logging.Logger logger
 
dict about
 
bool send_accept_language_header = True
 
list categories = ['general', 'web']
 
bool paging = True
 
bool time_range_support = True
 
bool safesearch = True
 
str url = "https://html.duckduckgo.com/html"
 
dict time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'}
 
dict form_data = {'v': 'l', 'api': 'd.js', 'o': 'json'}
 
list __CACHE = []
 
dict ddg_reg_map
 

Detailed Description

DuckDuckGo WEB

Function Documentation

◆ _cache_key()

searx.engines.duckduckgo._cache_key ( str query,
str region )
protected

Definition at line 67 of file duckduckgo.py.

def _cache_key(query: str, region: str):
    return 'SearXNG_ddg_web_vqd' + redislib.secret_hash(f"{query}//{region}")

Referenced by cache_vqd(), and get_vqd().
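
The key is a fixed prefix followed by a hash of the string "query//region". The implementation of redislib.secret_hash is not shown on this page, so the sketch below substitutes a keyed SHA-256 digest from the standard library, with a made-up secret. It only illustrates the shape of the key; the names secret_hash_sketch, cache_key_sketch, and SECRET are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; the engine itself delegates to redislib.secret_hash().
SECRET = b"example-secret"


def secret_hash_sketch(name: str) -> str:
    """Stand-in for redislib.secret_hash: a keyed SHA-256 hex digest."""
    return hmac.new(SECRET, name.encode("utf-8"), hashlib.sha256).hexdigest()


def cache_key_sketch(query: str, region: str) -> str:
    """Build a key shaped like _cache_key(): prefix + hash of 'query//region'."""
    return "SearXNG_ddg_web_vqd" + secret_hash_sketch(f"{query}//{region}")


key_a = cache_key_sketch("paris", "ar-es")
key_b = cache_key_sketch("paris", "us-en")
```

Because the region is part of the hashed string, the same query issued for two regions gets two separate cache entries.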


◆ cache_vqd()

searx.engines.duckduckgo.cache_vqd ( str query,
str region,
str value )
Caches a ``vqd`` value from a query.

Definition at line 71 of file duckduckgo.py.

def cache_vqd(query: str, region: str, value: str):
    """Caches a ``vqd`` value from a query."""
    c = redisdb.client()
    if c:
        logger.debug("VALKEY cache vqd value: %s (%s)", value, region)
        c.set(_cache_key(query, region), value, ex=600)

    else:
        logger.debug("MEM cache vqd value: %s (%s)", value, region)
        if len(__CACHE) > 100:  # cache vqd from last 100 queries
            __CACHE.pop(0)
        __CACHE.append((_cache_key(query, region), value))

References _cache_key().

Referenced by get_vqd().
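
When no database client is available, the function falls back to a plain list used as a first-in, first-out cache. The sketch below mirrors that fallback in isolation, with a small limit so the eviction is easy to follow. The helper names cache_put and cache_get and the limit parameter are hypothetical and not part of the engine.

```python
def cache_put(cache: list, key: str, value: str, limit: int = 100) -> None:
    """Append (key, value); drop the oldest entry once the list exceeds `limit`."""
    if len(cache) > limit:
        cache.pop(0)
    cache.append((key, value))


def cache_get(cache: list, key: str):
    """Linear scan over the list; returns the first match or None."""
    for k, value in cache:
        if k == key:
            return value
    return None


cache: list = []
for i in range(5):
    cache_put(cache, f"key-{i}", f"vqd-{i}", limit=3)
```

With the `>` comparison the list settles at limit + 1 entries, so after the loop above it holds key-1 through key-4 and key-0 has been evicted.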


◆ get_ddg_lang()

searx.engines.duckduckgo.get_ddg_lang ( EngineTraits eng_traits,
sxng_locale,
default = 'en_US' )
Get DuckDuckGo's language identifier from SearXNG's locale.

DuckDuckGo defines its languages by region codes (see :py:obj:`fetch_traits`).

To get the region and language of a DDG service use:

.. code:: python

    eng_region = traits.get_region(params['searxng_locale'], traits.all_locale)
    eng_lang = get_ddg_lang(traits, params['searxng_locale'])

It may be confusing, but the ``l`` value of the cookie is what SearXNG calls the *region*:

.. code:: python

    # !ddi paris :es-AR --> {'ad': 'es_AR', 'ah': 'ar-es', 'l': 'ar-es'}
    params['cookies']['ad'] = eng_lang
    params['cookies']['ah'] = eng_region
    params['cookies']['l'] = eng_region

.. hint::

    `DDG-lite <https://lite.duckduckgo.com/lite>`__ and the *no Javascript* page https://html.duckduckgo.com/html do not offer a language selection; the user can only select a region (``eng_region`` from the example above). DDG-lite and *no Javascript* store the selected region in a cookie::

        params['cookies']['kl'] = eng_region  # 'ar-es'

Definition at line 144 of file duckduckgo.py.

def get_ddg_lang(eng_traits: EngineTraits, sxng_locale, default='en_US'):
    """Get DuckDuckGo's language identifier from SearXNG's locale.

    DuckDuckGo defines its languages by region codes (see
    :py:obj:`fetch_traits`).

    To get the region and language of a DDG service use:

    .. code:: python

       eng_region = traits.get_region(params['searxng_locale'], traits.all_locale)
       eng_lang = get_ddg_lang(traits, params['searxng_locale'])

    It may be confusing, but the ``l`` value of the cookie is what SearXNG
    calls the *region*:

    .. code:: python

        # !ddi paris :es-AR --> {'ad': 'es_AR', 'ah': 'ar-es', 'l': 'ar-es'}
        params['cookies']['ad'] = eng_lang
        params['cookies']['ah'] = eng_region
        params['cookies']['l'] = eng_region

    .. hint::

       `DDG-lite <https://lite.duckduckgo.com/lite>`__ and the *no Javascript*
       page https://html.duckduckgo.com/html do not offer a language selection;
       the user can only select a region (``eng_region`` from the example
       above).  DDG-lite and *no Javascript* store the selected region in a
       cookie::

         params['cookies']['kl'] = eng_region  # 'ar-es'

    """
    return eng_traits.custom['lang_region'].get(  # type: ignore
        sxng_locale, eng_traits.get_language(sxng_locale, default)
    )
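
The return statement is a two-step lookup: a region-specific entry in custom['lang_region'] wins, and otherwise the result of get_language() is used, which itself falls back to the default. The sketch below reproduces that order with a hypothetical stand-in class, FakeTraits, because EngineTraits is not defined on this page. Its get_language() is a plain dictionary lookup, which is an assumption, and the table contents are illustrative.

```python
class FakeTraits:
    """Hypothetical stand-in for EngineTraits, just enough for the lookup."""

    def __init__(self, lang_region: dict, languages: dict):
        self.custom = {"lang_region": lang_region}
        self.languages = languages

    def get_language(self, sxng_locale: str, default=None):
        # Assumption: a plain dict lookup stands in for the real locale matching.
        return self.languages.get(sxng_locale, default)


def get_ddg_lang_sketch(eng_traits, sxng_locale, default="en_US"):
    """Prefer the region-specific entry, else fall back to the language table."""
    return eng_traits.custom["lang_region"].get(
        sxng_locale, eng_traits.get_language(sxng_locale, default)
    )


# Illustrative tables, not the data the engine ships with.
traits = FakeTraits(lang_region={"es-AR": "es_AR"}, languages={"de": "de_DE"})

lang_region_hit = get_ddg_lang_sketch(traits, "es-AR")
lang_fallback = get_ddg_lang_sketch(traits, "de")
lang_default = get_ddg_lang_sketch(traits, "xx")
```

Note that the fallback expression is evaluated before .get() runs, so get_language() is called even when the region-specific entry exists.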

◆ get_vqd()

searx.engines.duckduckgo.get_vqd ( str query,
str region,
bool force_request = False )
Returns the ``vqd`` that fits the *query*.

:param query: The query term
:param region: DDG's region code
:param force_request: force a request to get a vqd value from DDG

TL;DR: the ``vqd`` value is needed to pass DDG's bot protection and is used by all requests to DDG:

- DuckDuckGo Lite: ``https://lite.duckduckgo.com/lite`` (POST form data)
- DuckDuckGo Web: ``https://links.duckduckgo.com/d.js?q=...&vqd=...``
- DuckDuckGo Images: ``https://duckduckgo.com/i.js?q=...&vqd=...``
- DuckDuckGo Videos: ``https://duckduckgo.com/v.js?q=...&vqd=...``
- DuckDuckGo News: ``https://duckduckgo.com/news.js?q=...&vqd=...``

DDG's bot detection is sensitive to the ``vqd`` value. For some search terms (such as extremely long search terms that are often sent by bots), no ``vqd`` value can be determined.

If SearXNG cannot determine a ``vqd`` value, then no request should go out to DDG.

.. attention::

    A request with a wrong ``vqd`` value leads to DDG temporarily putting SearXNG's IP on a block list.

    Requests from IPs on this block list run into timeouts. Not sure, but it seems the block list is a sliding window: to get my IP off the block list I had to let it cool down for 1h (send no requests from that IP to DDG).

Definition at line 85 of file duckduckgo.py.

def get_vqd(query: str, region: str, force_request: bool = False):
    """Returns the ``vqd`` that fits the *query*.

    :param query: The query term
    :param region: DDG's region code
    :param force_request: force a request to get a vqd value from DDG

    TL;DR: the ``vqd`` value is needed to pass DDG's bot protection and is used
    by all requests to DDG:

    - DuckDuckGo Lite: ``https://lite.duckduckgo.com/lite`` (POST form data)
    - DuckDuckGo Web: ``https://links.duckduckgo.com/d.js?q=...&vqd=...``
    - DuckDuckGo Images: ``https://duckduckgo.com/i.js?q=...&vqd=...``
    - DuckDuckGo Videos: ``https://duckduckgo.com/v.js?q=...&vqd=...``
    - DuckDuckGo News: ``https://duckduckgo.com/news.js?q=...&vqd=...``

    DDG's bot detection is sensitive to the ``vqd`` value.  For some search terms
    (such as extremely long search terms that are often sent by bots), no ``vqd``
    value can be determined.

    If SearXNG cannot determine a ``vqd`` value, then no request should go out
    to DDG.

    .. attention::

       A request with a wrong ``vqd`` value leads to DDG temporarily putting
       SearXNG's IP on a block list.

       Requests from IPs on this block list run into timeouts.  Not sure, but it
       seems the block list is a sliding window: to get my IP off the block list
       I had to let it cool down for 1h (send no requests from that IP to DDG).
    """
    key = _cache_key(query, region)

    c = redisdb.client()
    if c:
        value = c.get(key)
        if value or value == b'':
            value = value.decode('utf-8')  # type: ignore
            logger.debug("re-use CACHED vqd value: %s", value)
            return value

    for k, value in __CACHE:
        if k == key:
            logger.debug("MEM re-use CACHED vqd value: %s", value)
            return value

    if force_request:
        resp = get(f'https://duckduckgo.com/?q={quote_plus(query)}')
        if resp.status_code == 200:  # type: ignore
            value = extr(resp.text, 'vqd="', '"')  # type: ignore
            if value:
                logger.debug("vqd value from DDG request: %s", value)
                cache_vqd(query, region, value)
                return value

    return None

References _cache_key(), and cache_vqd().
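
When neither cache holds a value and force_request is set, the function pulls the token out of the response body by looking for the text between vqd=" and the next double quote. The sketch below shows that extraction step on made-up response bodies. extract_between is a hypothetical stand-in for the extr helper called in the listing, whose implementation is not shown on this page, and the token value is invented.

```python
def extract_between(text: str, begin: str, end: str, default: str = "") -> str:
    """Return the substring between the first `begin` and the following `end`."""
    start = text.find(begin)
    if start == -1:
        return default
    start += len(begin)
    stop = text.find(end, start)
    if stop == -1:
        return default
    return text[start:stop]


# Made-up response bodies; the token is invented for illustration.
page_with_token = '<script>foo();vqd="4-1234567890";bar();</script>'
page_without_token = "<html><body>no token here</body></html>"

vqd_found = extract_between(page_with_token, 'vqd="', '"')
vqd_missing = extract_between(page_without_token, 'vqd="', '"')

# Mirror get_vqd(): an empty result becomes None, meaning "send nothing to DDG".
result_found = vqd_found or None
result_missing = vqd_missing or None
```

A caller would treat a None result as a signal to skip the search request entirely, in line with the warning in the docstring about wrong vqd values.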


Variable Documentation

◆ __CACHE

list searx.engines.duckduckgo.__CACHE = []
private

Definition at line 64 of file duckduckgo.py.

◆ about

dict searx.engines.duckduckgo.about
Initial value:
= {
    "website": 'https://lite.duckduckgo.com/lite/',
    "wikidata_id": 'Q12805',
    "use_official_api": False,
    "require_api_key": False,
    "results": 'HTML',
}

Definition at line 40 of file duckduckgo.py.

◆ categories

list searx.engines.duckduckgo.categories = ['general', 'web']

Definition at line 55 of file duckduckgo.py.

◆ ddg_reg_map

dict searx.engines.duckduckgo.ddg_reg_map
Initial value:
= {
    'tw-tzh': 'zh_TW',
    'hk-tzh': 'zh_HK',
    'ct-ca': 'skip',  # ct-ca and es-ca both map to ca_ES
    'es-ca': 'ca_ES',
    'id-en': 'id_ID',
    'no-no': 'nb_NO',
    'jp-jp': 'ja_JP',
    'kr-kr': 'ko_KR',
    'xa-ar': 'ar_SA',
    'sl-sl': 'sl_SI',
    'th-en': 'th_TH',
    'vn-en': 'vi_VN',
}

Definition at line 183 of file duckduckgo.py.
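
The map lists DDG region codes whose locale cannot be derived mechanically, plus a 'skip' marker for a code that would duplicate another entry. The code that consumes the map is not shown on this page, so the sketch below is only one plausible reading. The function region_to_locale_tag is hypothetical, and the default rule (a 'territory-language' code becomes 'language_TERRITORY') is an assumption inferred from the 'ar-es' / 'es_AR' pair in the get_ddg_lang docstring.

```python
# A subset of ddg_reg_map, copied from the initial value above.
DDG_REG_MAP = {
    "tw-tzh": "zh_TW",
    "ct-ca": "skip",  # ct-ca and es-ca both map to ca_ES
    "es-ca": "ca_ES",
    "no-no": "nb_NO",
}


def region_to_locale_tag(ddg_region: str):
    """Map a DDG region code to a locale tag, or None if it should be skipped."""
    override = DDG_REG_MAP.get(ddg_region)
    if override == "skip":
        return None
    if override is not None:
        return override
    # Assumed default rule: 'territory-language' becomes 'language_TERRITORY'.
    territory, language = ddg_region.split("-")
    return f"{language}_{territory.upper()}"


tags = {
    region: region_to_locale_tag(region)
    for region in ("tw-tzh", "ct-ca", "es-ca", "ar-es", "us-en")
}
```

Under these assumptions only the exceptional codes need an entry in the map; every regular code falls through to the default rule.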

◆ form_data

dict searx.engines.duckduckgo.form_data = {'v': 'l', 'api': 'd.js', 'o': 'json'}

Definition at line 63 of file duckduckgo.py.

◆ logger

logging.Logger searx.engines.duckduckgo.logger

Definition at line 36 of file duckduckgo.py.

◆ paging

bool searx.engines.duckduckgo.paging = True

Definition at line 56 of file duckduckgo.py.

◆ safesearch

bool searx.engines.duckduckgo.safesearch = True

Definition at line 58 of file duckduckgo.py.

◆ send_accept_language_header

bool searx.engines.duckduckgo.send_accept_language_header = True

Definition at line 48 of file duckduckgo.py.

◆ time_range_dict

dict searx.engines.duckduckgo.time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm', 'year': 'y'}

Definition at line 62 of file duckduckgo.py.

◆ time_range_support

bool searx.engines.duckduckgo.time_range_support = True

Definition at line 57 of file duckduckgo.py.

◆ url

str searx.engines.duckduckgo.url = "https://html.duckduckgo.com/html"

Definition at line 60 of file duckduckgo.py.