.oO SearXNG Developer Documentation Oo.
searx.engines.duckduckgo_definitions Namespace Reference

Functions

 is_broken_text (text)
 
 result_to_text (text, htmlResult)
 
 request (query, params)
 
 response (resp)
 
 unit_to_str (unit)
 
 area_to_str (area)
 

Variables

logging.Logger logger
 
dict about
 
bool send_accept_language_header = True
 
str URL = 'https://api.duckduckgo.com/' + '?{query}&format=json&pretty=0&no_redirect=1&d=1'
 
list WIKIDATA_PREFIX = ['http://www.wikidata.org/entity/', 'https://www.wikidata.org/entity/']
 
 replace_http_by_https = get_string_replaces_function({'http:': 'https:'})
 

Detailed Description

DuckDuckGo Instant Answer API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `DDG-API <https://duckduckgo.com/api>`__ is no longer documented, but
reverse engineering shows that some services (e.g. instant answers) are still
in use by the DDG search engine.

As far as we can tell, the *instant answers* API does not support languages, or
at least we could not find out how language support should work.  It seems that
most of the features are based on English terms.

Function Documentation

◆ area_to_str()

searx.engines.duckduckgo_definitions.area_to_str (area)
Parse an area value such as ``{'unit': 'https://www.wikidata.org/entity/Q712226', 'amount': '+20.99'}`` and return it as a string.

Definition at line 248 of file duckduckgo_definitions.py.

def area_to_str(area):
    """parse ``{'unit': 'https://www.wikidata.org/entity/Q712226', 'amount': '+20.99'}``"""
    unit = unit_to_str(area.get('unit'))
    if unit is not None:
        try:
            amount = float(area.get('amount'))
            return '{} {}'.format(amount, unit)
        except ValueError:
            pass
    return '{} {}'.format(area.get('amount', ''), area.get('unit', ''))

References searx.engines.duckduckgo_definitions.unit_to_str().

Referenced by searx.engines.duckduckgo_definitions.response().
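The interplay between area_to_str() and unit_to_str() can be tried in isolation. The following is a minimal sketch, not the engine module itself: the two functions are copied from the listings on this page, while the one-entry WIKIDATA_UNITS table is a hypothetical stand-in for the unit data that the engine imports (Q712226 is assumed here to be the square kilometre).

```python
WIKIDATA_PREFIX = ['http://www.wikidata.org/entity/', 'https://www.wikidata.org/entity/']

# Hypothetical stand-in for the engine's unit table (assumption: Q712226 is km²).
WIKIDATA_UNITS = {'Q712226': {'symbol': 'km²'}}


def unit_to_str(unit):
    # Strip a known Wikidata prefix and look the entity ID up in the unit table.
    for prefix in WIKIDATA_PREFIX:
        if unit.startswith(prefix):
            wikidata_entity = unit[len(prefix):]
            real_unit = WIKIDATA_UNITS.get(wikidata_entity)
            if real_unit is None:
                return unit
            return real_unit['symbol']
    return unit


def area_to_str(area):
    unit = unit_to_str(area.get('unit'))
    if unit is not None:
        try:
            # float() drops the leading '+' sign that Wikidata amounts carry.
            amount = float(area.get('amount'))
            return '{} {}'.format(amount, unit)
        except ValueError:
            pass
    # Fallback: the amount is not numeric, so return the raw values.
    return '{} {}'.format(area.get('amount', ''), area.get('unit', ''))


entity = 'https://www.wikidata.org/entity/Q712226'
print(area_to_str({'unit': entity, 'amount': '+20.99'}))   # 20.99 km²
print(area_to_str({'unit': entity, 'amount': 'unknown'}))  # raw amount and raw unit URL
print(unit_to_str('http://www.wikidata.org/entity/Q0'))    # unknown entity: returned unchanged
```

Note that the fallback path prints the raw unit URL rather than the resolved symbol, because it reads ``area.get('unit')`` again instead of reusing ``unit``.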


◆ is_broken_text()

searx.engines.duckduckgo_definitions.is_broken_text (text)
DuckDuckGo may return something like ``<a href="xxxx">http://somewhere Related website<a/>``

The href URL is broken, and the "Related website" part may contain HTML.

The best solution seems to be to ignore these results.

Definition at line 49 of file duckduckgo_definitions.py.

def is_broken_text(text):
    """DuckDuckGo may return something like ``<a href="xxxx">http://somewhere Related website<a/>``

    The href URL is broken, and the "Related website" part may contain HTML.

    The best solution seems to be to ignore these results.
    """
    return text.startswith('http') and ' ' in text

Referenced by searx.engines.duckduckgo_definitions.response(), and searx.engines.duckduckgo_definitions.result_to_text().
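The check is a plain string heuristic, so it helps to see which inputs it accepts and rejects. A small sketch follows; the function body is copied from the listing above, and the sample strings are made up for illustration.

```python
def is_broken_text(text):
    # "Broken" means: looks like a URL at the start, but contains a space.
    return text.startswith('http') and ' ' in text


samples = [
    'http://somewhere Related website',  # URL fused with a label: broken
    'https://example.org/page',          # plain URL without a space: kept
    'Python (programming language)',     # ordinary text: kept
    'http status codes',                 # ordinary text that starts with "http": also flagged
]

for sample in samples:
    print(repr(sample), is_broken_text(sample))
```

The last sample shows the cost of the heuristic: any text that starts with "http" and contains a space is dropped, even if it is not a broken link.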


◆ request()

searx.engines.duckduckgo_definitions.request (query, params)

Definition at line 73 of file duckduckgo_definitions.py.

def request(query, params):
    params['url'] = URL.format(query=urlencode({'q': query}))
    return params
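To see the URL that request() produces, the function can be run on its own with the standard library's urlencode. This is a minimal sketch: URL and request() are copied from this page, while the empty params dict and the sample queries are assumptions made for illustration (the real params dict is supplied by SearXNG and carries more keys).

```python
from urllib.parse import urlencode

URL = 'https://api.duckduckgo.com/' + '?{query}&format=json&pretty=0&no_redirect=1&d=1'


def request(query, params):
    # urlencode() produces 'q=<escaped query>', which fills the {query} slot.
    params['url'] = URL.format(query=urlencode({'q': query}))
    return params


print(request('alan turing', {})['url'])
print(request('C++ & Rust', {})['url'])
```

Spaces become ``+`` and reserved characters such as ``+`` and ``&`` are percent-encoded, so the query cannot break out of the ``q`` parameter.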

◆ response()

searx.engines.duckduckgo_definitions.response (resp)

Definition at line 78 of file duckduckgo_definitions.py.

def response(resp):
    # pylint: disable=too-many-locals, too-many-branches, too-many-statements
    results = []

    search_res = resp.json()

    # search_res.get('Entity') possible values (not exhaustive) :
    # * continent / country / department / location / waterfall
    # * actor / musician / artist
    # * book / performing art / film / television / media franchise / concert tour / playwright
    # * prepared food
    # * website / software / os / programming language / file format / software engineer
    # * company

    content = ''
    heading = search_res.get('Heading', '')
    attributes = []
    urls = []
    infobox_id = None
    relatedTopics = []

    # add answer if there is one
    answer = search_res.get('Answer', '')
    if answer:
        logger.debug('AnswerType="%s" Answer="%s"', search_res.get('AnswerType'), answer)
        if search_res.get('AnswerType') not in ['calc', 'ip']:
            results.append({'answer': html_to_text(answer), 'url': search_res.get('AbstractURL', '')})

    # add infobox
    if 'Definition' in search_res:
        content = content + search_res.get('Definition', '')

    if 'Abstract' in search_res:
        content = content + search_res.get('Abstract', '')

    # image
    image = search_res.get('Image')
    image = None if image == '' else image
    if image is not None and urlparse(image).netloc == '':
        image = urljoin('https://duckduckgo.com', image)

    # urls
    # Official website, Wikipedia page
    for ddg_result in search_res.get('Results', []):
        firstURL = ddg_result.get('FirstURL')
        text = ddg_result.get('Text')
        if firstURL is not None and text is not None:
            urls.append({'title': text, 'url': firstURL})
            results.append({'title': heading, 'url': firstURL})

    # related topics
    for ddg_result in search_res.get('RelatedTopics', []):
        if 'FirstURL' in ddg_result:
            firstURL = ddg_result.get('FirstURL')
            text = ddg_result.get('Text')
            if not is_broken_text(text):
                suggestion = result_to_text(text, ddg_result.get('Result'))
                if suggestion != heading and suggestion is not None:
                    results.append({'suggestion': suggestion})
        elif 'Topics' in ddg_result:
            suggestions = []
            relatedTopics.append({'name': ddg_result.get('Name', ''), 'suggestions': suggestions})
            for topic_result in ddg_result.get('Topics', []):
                suggestion = result_to_text(topic_result.get('Text'), topic_result.get('Result'))
                if suggestion != heading and suggestion is not None:
                    suggestions.append(suggestion)

    # abstract
    abstractURL = search_res.get('AbstractURL', '')
    if abstractURL != '':
        # add as result ? problem always in english
        infobox_id = abstractURL
        urls.append({'title': search_res.get('AbstractSource'), 'url': abstractURL, 'official': True})
        results.append({'url': abstractURL, 'title': heading})

    # definition
    definitionURL = search_res.get('DefinitionURL', '')
    if definitionURL != '':
        # add as result ? as answer ? problem always in english
        infobox_id = definitionURL
        urls.append({'title': search_res.get('DefinitionSource'), 'url': definitionURL})

    # to merge with wikidata's infobox
    if infobox_id:
        infobox_id = replace_http_by_https(infobox_id)

    # attributes
    # some will be converted to urls
    if 'Infobox' in search_res:
        infobox = search_res.get('Infobox')
        if 'content' in infobox:
            osm_zoom = 17
            coordinates = None
            for info in infobox.get('content'):
                data_type = info.get('data_type')
                data_label = info.get('label')
                data_value = info.get('value')

                # Workaround: ddg may return a double quote
                if data_value == '""':
                    continue

                # Is it an external URL ?
                # * imdb_id / facebook_profile / youtube_channel / youtube_video / twitter_profile
                # * instagram_profile / rotten_tomatoes / spotify_artist_id / itunes_artist_id / soundcloud_id
                # * netflix_id
                external_url = get_external_url(data_type, data_value)
                if external_url is not None:
                    urls.append({'title': data_label, 'url': external_url})
                elif data_type in ['instance', 'wiki_maps_trigger', 'google_play_artist_id']:
                    # ignore instance: Wikidata value from "Instance Of" (Qxxxx)
                    # ignore wiki_maps_trigger: reference to a javascript
                    # ignore google_play_artist_id: service shutdown
                    pass
                elif data_type == 'string' and data_label == 'Website':
                    # There is already an URL for the website
                    pass
                elif data_type == 'area':
                    attributes.append({'label': data_label, 'value': area_to_str(data_value), 'entity': 'P2046'})
                    osm_zoom = area_to_osm_zoom(data_value.get('amount'))
                elif data_type == 'coordinates':
                    if data_value.get('globe') == 'http://www.wikidata.org/entity/Q2':
                        # coordinate on Earth
                        # get the zoom information from the area
                        coordinates = info
                    else:
                        # coordinate NOT on Earth
                        attributes.append({'label': data_label, 'value': data_value, 'entity': 'P625'})
                elif data_type == 'string':
                    attributes.append({'label': data_label, 'value': data_value})

            if coordinates:
                data_label = coordinates.get('label')
                data_value = coordinates.get('value')
                latitude = data_value.get('latitude')
                longitude = data_value.get('longitude')
                url = get_earth_coordinates_url(latitude, longitude, osm_zoom)
                urls.append({'title': 'OpenStreetMap', 'url': url, 'entity': 'P625'})

    if len(heading) > 0:
        # TODO get infobox.meta.value where .label='article_title'  # pylint: disable=fixme
        if image is None and len(attributes) == 0 and len(urls) == 1 and len(relatedTopics) == 0 and len(content) == 0:
            results.append({'url': urls[0]['url'], 'title': heading, 'content': content})
        else:
            results.append(
                {
                    'infobox': heading,
                    'id': infobox_id,
                    'content': content,
                    'img_src': image,
                    'attributes': attributes,
                    'urls': urls,
                    'relatedTopics': relatedTopics,
                }
            )

    return results

References searx.engines.duckduckgo_definitions.area_to_str(), searx.engines.duckduckgo_definitions.is_broken_text(), searx.engines.duckduckgo_definitions.replace_http_by_https, and searx.engines.duckduckgo_definitions.result_to_text().
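One step of response() that is easy to overlook is the image handling: an empty string counts as "no image", and a URL without a host is resolved against https://duckduckgo.com. The sketch below isolates just that step. The helper name normalize_image is hypothetical (the engine does this inline), and the sample paths are invented for illustration.

```python
from urllib.parse import urljoin, urlparse


def normalize_image(image):
    # Hypothetical helper wrapping the three image lines from response().
    image = None if image == '' else image
    if image is not None and urlparse(image).netloc == '':
        # No host part: treat the value as a path on duckduckgo.com.
        image = urljoin('https://duckduckgo.com', image)
    return image


print(normalize_image(''))                              # None
print(normalize_image('/i/abc123.png'))                 # resolved against duckduckgo.com
print(normalize_image('https://example.org/logo.png'))  # absolute URL: unchanged
```

Because only ``netloc`` is inspected, absolute URLs on any host pass through untouched, which matters for the later check ``image is None`` that decides between a plain result and an infobox.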


◆ result_to_text()

searx.engines.duckduckgo_definitions.result_to_text (text, htmlResult)

Definition at line 59 of file duckduckgo_definitions.py.

def result_to_text(text, htmlResult):
    # TODO : remove result ending with "Meaning" or "Category"  # pylint: disable=fixme
    result = None
    dom = html.fromstring(htmlResult)
    a = dom.xpath('//a')
    if len(a) >= 1:
        result = extract_text(a[0])
    else:
        result = text
    if not is_broken_text(result):
        return result
    return None

References searx.engines.duckduckgo_definitions.is_broken_text().

Referenced by searx.engines.duckduckgo_definitions.response().
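The logic of result_to_text() is: prefer the text of the first ``<a>`` element in the HTML fragment, fall back to the plain text, and discard the value if it looks broken. The sketch below mimics that flow with the standard library only. It is not the engine's implementation: the engine uses ``html.fromstring`` and ``extract_text``, whereas this version swaps in ``html.parser.HTMLParser``, so behaviour on malformed markup may differ. The names FirstAnchorText and result_to_text_sketch are hypothetical.

```python
from html.parser import HTMLParser


def is_broken_text(text):
    return text.startswith('http') and ' ' in text


class FirstAnchorText(HTMLParser):
    """Collect the text inside the first <a> element of a fragment."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside the first anchor
        self.found = False   # True once any anchor has been seen
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a' and not self.found:
            self.inside = True
            self.found = True

    def handle_endtag(self, tag):
        if tag == 'a':
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.parts.append(data)


def result_to_text_sketch(text, html_result):
    parser = FirstAnchorText()
    parser.feed(html_result)
    parser.close()
    # Use the anchor text if there is an anchor, otherwise the plain text.
    result = ' '.join(''.join(parser.parts).split()) if parser.found else text
    if not is_broken_text(result):
        return result
    return None


print(result_to_text_sketch('Python A language', '<a href="https://duckduckgo.com/Python">Python</a> A language'))
print(result_to_text_sketch('ignored', '<a href="xxxx">http://somewhere Related website</a>'))
```

The first call yields the anchor text only; the second yields ``None`` because the anchor text fails the is_broken_text() check.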


◆ unit_to_str()

searx.engines.duckduckgo_definitions.unit_to_str (unit)

Definition at line 237 of file duckduckgo_definitions.py.

def unit_to_str(unit):
    for prefix in WIKIDATA_PREFIX:
        if unit.startswith(prefix):
            wikidata_entity = unit[len(prefix) :]
            real_unit = WIKIDATA_UNITS.get(wikidata_entity)
            if real_unit is None:
                return unit
            return real_unit['symbol']
    return unit

Referenced by searx.engines.duckduckgo_definitions.area_to_str().


Variable Documentation

◆ about

dict searx.engines.duckduckgo_definitions.about
Initial value:
= {
    "website": 'https://duckduckgo.com/',
    "wikidata_id": 'Q12805',
    "official_api_documentation": 'https://duckduckgo.com/api',
    "use_official_api": True,
    "require_api_key": False,
    "results": 'JSON',
}

Definition at line 31 of file duckduckgo_definitions.py.

◆ logger

logging.Logger searx.engines.duckduckgo_definitions.logger

Definition at line 28 of file duckduckgo_definitions.py.

◆ replace_http_by_https

searx.engines.duckduckgo_definitions.replace_http_by_https = get_string_replaces_function({'http:': 'https:'})
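replace_http_by_https is built by a factory that takes a mapping of substrings to replacements. This page does not show the factory's implementation, so the version below is only an assumed stand-in that has the same call shape: it compiles the keys into one regular expression and substitutes every match. It is used in response() to normalise the infobox ID before merging with Wikidata's infobox.

```python
import re


def get_string_replaces_function(replaces):
    # Assumed stand-in, not the searx.utils implementation.
    # Build one pattern that matches any of the keys literally.
    pattern = re.compile('|'.join(re.escape(key) for key in replaces))

    def func(text):
        # Replace each match with the value registered for that key.
        return pattern.sub(lambda match: replaces[match.group(0)], text)

    return func


replace_http_by_https = get_string_replaces_function({'http:': 'https:'})

print(replace_http_by_https('http://www.wikidata.org/entity/Q2'))
print(replace_http_by_https('https://en.wikipedia.org/wiki/Earth'))  # already https: unchanged
```

With this stand-in every occurrence of ``http:`` is rewritten, including one embedded later in the string, for example inside a query parameter.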

◆ send_accept_language_header

bool searx.engines.duckduckgo_definitions.send_accept_language_header = True

Definition at line 40 of file duckduckgo_definitions.py.

◆ URL

str searx.engines.duckduckgo_definitions.URL = 'https://api.duckduckgo.com/' + '?{query}&format=json&pretty=0&no_redirect=1&d=1'

Definition at line 42 of file duckduckgo_definitions.py.

◆ WIKIDATA_PREFIX

list searx.engines.duckduckgo_definitions.WIKIDATA_PREFIX = ['http://www.wikidata.org/entity/', 'https://www.wikidata.org/entity/']

Definition at line 44 of file duckduckgo_definitions.py.