.oO SearXNG Developer Documentation Oo.
searx.engines.yahoo Namespace Reference

Functions

 build_sb_cookie (cookie_params)
 request (query, params)
 parse_url (url_string)
 response (resp)

Variables

logging.Logger logger
dict about
list categories = ['general', 'web']
bool paging = True
bool time_range_support = True
dict time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm'}
dict safesearch_dict = {0: 'p', 1: 'i', 2: 'r'}
dict region2domain
dict lang2domain
dict yahoo_languages

Detailed Description

Yahoo Search (Web)

Languages are supported by mapping the language to a domain.  If the language
has no entry in :py:obj:`lang2domain`, the URL ``<lang>.search.yahoo.com`` is used.
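As a minimal sketch of the fallback described above, the following reproduces the domain lookup with abbreviated excerpts of region2domain and lang2domain. The helper name select_domain is hypothetical (the engine performs this lookup inline in request()), and the tables are shortened for illustration.

```python
# Abbreviated excerpts of the engine's mapping tables (illustration only).
region2domain = {"CA": "ca.search.yahoo.com", "GB": "uk.search.yahoo.com"}
lang2domain = {
    "any": "search.yahoo.com",
    "ja": "search.yahoo.com",
    "zh_cht": "tw.search.yahoo.com",
}


def select_domain(lang, region=None):
    """Hypothetical helper: region first, then language, then <lang>.search.yahoo.com."""
    domain = region2domain.get(region)
    if not domain:
        domain = lang2domain.get(lang, f"{lang}.search.yahoo.com")
    return domain


print(select_domain("fr", "CA"))  # ca.search.yahoo.com
print(select_domain("ja"))        # search.yahoo.com
print(select_domain("fr"))        # fr.search.yahoo.com
```

A region match takes priority over the language, so the ``<lang>.search.yahoo.com`` fallback only applies when neither table has an entry.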

Function Documentation

◆ build_sb_cookie()

searx.engines.yahoo.build_sb_cookie ( cookie_params)
Build sB cookie parameter from provided parameters.

:param cookie_params: Dictionary of cookie parameters
:type cookie_params: dict
:returns: Formatted cookie string
:rtype: str

Example:
    >>> cookie_params = {'v': '1', 'vm': 'p', 'fl': '1', 'vl': 'lang_fr'}
    >>> build_sb_cookie(cookie_params)
    'v=1&vm=p&fl=1&vl=lang_fr'

Definition at line 133 of file yahoo.py.

def build_sb_cookie(cookie_params):
    """Build sB cookie parameter from provided parameters.

    :param cookie_params: Dictionary of cookie parameters
    :type cookie_params: dict
    :returns: Formatted cookie string
    :rtype: str

    Example:
        >>> cookie_params = {'v': '1', 'vm': 'p', 'fl': '1', 'vl': 'lang_fr'}
        >>> build_sb_cookie(cookie_params)
        'v=1&vm=p&fl=1&vl=lang_fr'
    """

    cookie_parts = []
    for key, value in cookie_params.items():
        cookie_parts.append(f"{key}={value}")

    return "&".join(cookie_parts)

Referenced by request().
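To illustrate how request() uses this helper, the sketch below builds the sB cookie for moderate SafeSearch and French results. The function body and the safesearch_dict values are copied from this page; the chosen inputs (safesearch level 1, language 'fr') are assumptions for the example.

```python
safesearch_dict = {0: 'p', 1: 'i', 2: 'r'}


def build_sb_cookie(cookie_params):
    """Join key=value pairs with '&' (copy of the engine function, for a runnable example)."""
    return "&".join(f"{key}={value}" for key, value in cookie_params.items())


# Same keys that request() puts into the cookie; values are example inputs.
sbcookie_params = {
    'v': 1,
    'vm': safesearch_dict[1],  # 1 = moderate SafeSearch
    'fl': 1,
    'vl': 'lang_fr',
    'pn': 10,
    'rw': 'new',
    'userset': 1,
}
cookie = build_sb_cookie(sbcookie_params)
print(cookie)  # v=1&vm=i&fl=1&vl=lang_fr&pn=10&rw=new&userset=1
```

Values are formatted with an f-string and joined as-is, so integers are accepted and nothing is percent-encoded.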

◆ parse_url()

searx.engines.yahoo.parse_url ( url_string)
Remove the Yahoo-specific tracking wrapper from a URL and return the target URL.

Definition at line 206 of file yahoo.py.

def parse_url(url_string):
    """remove yahoo-specific tracking-url"""

    endings = ['/RS', '/RK']
    endpositions = []
    start = url_string.find('http', url_string.find('/RU=') + 1)

    for ending in endings:
        endpos = url_string.rfind(ending)
        if endpos > -1:
            endpositions.append(endpos)

    if start == 0 or len(endpositions) == 0:
        return url_string

    end = min(endpositions)
    return unquote(url_string[start:end])

Referenced by response().
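A worked example may make the slicing logic easier to follow. The function below is a copy of the listing on this page; the wrapped URL is a made-up string that only imitates the ``/RU=<target>/RK=.../RS=...`` shape the code looks for, not a real Yahoo redirect.

```python
from urllib.parse import unquote


def parse_url(url_string):
    """Copy of the engine function: extract the target between '/RU=' and '/RK' or '/RS'."""
    endings = ['/RS', '/RK']
    endpositions = []
    # Position of the first 'http' after the '/RU=' marker (0 if there is no marker
    # and the string itself starts with 'http').
    start = url_string.find('http', url_string.find('/RU=') + 1)

    for ending in endings:
        endpos = url_string.rfind(ending)
        if endpos > -1:
            endpositions.append(endpos)

    if start == 0 or len(endpositions) == 0:
        return url_string

    end = min(endpositions)
    return unquote(url_string[start:end])


# Hypothetical wrapped URL, for illustration only.
wrapped = (
    "https://r.search.yahoo.com/_ylt=abc/RV=2/RE=1700000000/RO=10"
    "/RU=https%3a%2f%2fexample.org%2fpage/RK=2/RS=xyz-"
)
print(parse_url(wrapped))                     # https://example.org/page
print(parse_url("https://example.org/page"))  # unchanged: no '/RU=' marker
```

A URL without the ``/RU=`` marker is returned untouched because the search for ``http`` then starts at index 0 and matches the URL's own scheme.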

◆ request()

searx.engines.yahoo.request ( query, params)
Build Yahoo search request.

Definition at line 154 of file yahoo.py.

def request(query, params):
    """Build Yahoo search request."""

    lang, region = (params["language"].split("-") + [None])[:2]
    lang = yahoo_languages.get(lang, "any")

    # Build URL parameters
    # - p (str): Search query string
    # - btf (str): Time filter, maps to values like 'd' (day), 'w' (week), 'm' (month)
    # - iscqry (str): Empty string, necessary for results to appear properly on first page
    # - b (int): Search offset for pagination
    # - pz (str): Amount of results expected for the page
    url_params = {'p': query}

    btf = time_range_dict.get(params['time_range'])
    if btf:
        url_params['btf'] = btf

    if params['pageno'] == 1:
        url_params['iscqry'] = ''
    elif params['pageno'] >= 2:
        url_params['b'] = params['pageno'] * 7 + 1  # pageno 2 -> 15, 3 -> 22, 4 -> 29, etc.
        url_params['pz'] = 7
        url_params['bct'] = 0
        url_params['xargs'] = 0

    # Build sB cookie (for filters)
    # - vm (str): SafeSearch filter, maps to values like 'p' (None), 'i' (Moderate), 'r' (Strict)
    # - fl (bool): Indicates if a search language is used or not
    # - vl (str): The search language to use (e.g. lang_fr)
    sbcookie_params = {
        'v': 1,
        'vm': safesearch_dict[params['safesearch']],
        'fl': 1,
        'vl': f'lang_{lang}',
        'pn': 10,
        'rw': 'new',
        'userset': 1,
    }
    params['cookies']['sB'] = build_sb_cookie(sbcookie_params)

    # Search region/language
    domain = region2domain.get(region)
    if not domain:
        domain = lang2domain.get(lang, f'{lang}.search.yahoo.com')
    logger.debug(f'domain selected: {domain}')
    logger.debug(f'cookies: {params["cookies"]}')

    params['url'] = f'https://{domain}/search?{urlencode(url_params)}'
    params['domain'] = domain

References build_sb_cookie().
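The query-string part of request() can be tried in isolation. The sketch below extracts that logic into a hypothetical helper, build_url_params (not part of the engine), and shows the resulting query strings for the first page and for a later page.

```python
from urllib.parse import urlencode

time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm'}


def build_url_params(query, pageno=1, time_range=None):
    """Hypothetical helper mirroring the URL-parameter logic of request()."""
    url_params = {'p': query}

    # Unknown or missing time ranges yield None, so 'btf' is simply omitted.
    btf = time_range_dict.get(time_range)
    if btf:
        url_params['btf'] = btf

    if pageno == 1:
        url_params['iscqry'] = ''
    elif pageno >= 2:
        url_params['b'] = pageno * 7 + 1  # offset as computed by the engine
        url_params['pz'] = 7
        url_params['bct'] = 0
        url_params['xargs'] = 0
    return url_params


print(urlencode(build_url_params('searxng', time_range='week')))
# p=searxng&btf=w&iscqry=
print(urlencode(build_url_params('searxng', pageno=3)))
# p=searxng&b=22&pz=7&bct=0&xargs=0
```

With the formula as written, the offset ``b`` is 15 for page 2 and grows by 7 per page.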

◆ response()

searx.engines.yahoo.response ( resp)
Parse the HTML response and return a list of results and suggestions.

Definition at line 225 of file yahoo.py.

def response(resp):
    """parse response"""

    results = []
    dom = html.fromstring(resp.text)

    url_xpath = './/div[contains(@class,"compTitle")]/h3/a/@href'
    title_xpath = './/h3//a/@aria-label'

    domain = resp.search_params['domain']
    if domain == "search.yahoo.com":
        url_xpath = './/div[contains(@class,"compTitle")]/a/@href'
        title_xpath = './/div[contains(@class,"compTitle")]/a/h3/span'

    # parse results
    for result in eval_xpath_list(dom, '//div[contains(@class,"algo-sr")]'):
        url = eval_xpath_getindex(result, url_xpath, 0, default=None)
        if url is None:
            continue
        url = parse_url(url)

        title = eval_xpath_getindex(result, title_xpath, 0, default='')
        title: str = extract_text(title)
        content = eval_xpath_getindex(result, './/div[contains(@class, "compText")]', 0, default='')
        content: str = extract_text(content, allow_none=True)

        # append result
        results.append(
            {
                'url': url,
                # title sometimes contains HTML tags / see
                # https://github.com/searxng/searxng/issues/3790
                'title': " ".join(html_to_text(title).strip().split()),
                'content': " ".join(html_to_text(content).strip().split()),
            }
        )

    for suggestion in eval_xpath_list(dom, '//div[contains(@class, "AlsoTry")]//table//a'):
        # append suggestion
        results.append({'suggestion': extract_text(suggestion)})

    return results

References parse_url().
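The title and content cleanup in response() ends with a whitespace-collapsing idiom that can be shown on its own. The sketch below covers only that last step: html_to_text is a searx utility that is not reproduced here, and the helper name collapse_whitespace is hypothetical.

```python
def collapse_whitespace(text):
    """Hypothetical helper: strip the ends and reduce inner whitespace runs to one space."""
    return " ".join(text.strip().split())


print(collapse_whitespace("  SearXNG \n\t metasearch   engine "))
# SearXNG metasearch engine
```

Because ``str.split()`` without arguments splits on any run of whitespace, newlines and tabs left over from the HTML layout disappear along with repeated spaces.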

Variable Documentation

◆ about

dict searx.engines.yahoo.about
Initial value:
= {
    "website": 'https://search.yahoo.com/',
    "wikidata_id": None,
    "official_api_documentation": 'https://developer.yahoo.com/api/',
    "use_official_api": False,
    "require_api_key": False,
    "results": 'HTML',
}

Definition at line 32 of file yahoo.py.

◆ categories

list searx.engines.yahoo.categories = ['general', 'web']

Definition at line 42 of file yahoo.py.

◆ lang2domain

dict searx.engines.yahoo.lang2domain
Initial value:
= {
    'zh_chs': 'hk.search.yahoo.com',
    'zh_cht': 'tw.search.yahoo.com',
    'any': 'search.yahoo.com',
    'en': 'search.yahoo.com',
    'bg': 'search.yahoo.com',
    'cs': 'search.yahoo.com',
    'da': 'search.yahoo.com',
    'el': 'search.yahoo.com',
    'et': 'search.yahoo.com',
    'he': 'search.yahoo.com',
    'hr': 'search.yahoo.com',
    'ja': 'search.yahoo.com',
    'ko': 'search.yahoo.com',
    'sk': 'search.yahoo.com',
    'sl': 'search.yahoo.com',
}

Definition at line 73 of file yahoo.py.

◆ logger

logging.Logger searx.engines.yahoo.logger

Definition at line 29 of file yahoo.py.

◆ paging

bool searx.engines.yahoo.paging = True

Definition at line 43 of file yahoo.py.

◆ region2domain

dict searx.engines.yahoo.region2domain
Initial value:
= {
    "CO": "co.search.yahoo.com",  # Colombia
    "TH": "th.search.yahoo.com",  # Thailand
    "VE": "ve.search.yahoo.com",  # Venezuela
    "CL": "cl.search.yahoo.com",  # Chile
    "HK": "hk.search.yahoo.com",  # Hong Kong
    "PE": "pe.search.yahoo.com",  # Peru
    "CA": "ca.search.yahoo.com",  # Canada
    "DE": "de.search.yahoo.com",  # Germany
    "FR": "fr.search.yahoo.com",  # France
    "TW": "tw.search.yahoo.com",  # Taiwan
    "GB": "uk.search.yahoo.com",  # United Kingdom
    "UK": "uk.search.yahoo.com",
    "BR": "br.search.yahoo.com",  # Brazil
    "IN": "in.search.yahoo.com",  # India
    "ES": "espanol.search.yahoo.com",  # Espanol
    "PH": "ph.search.yahoo.com",  # Philippines
    "AR": "ar.search.yahoo.com",  # Argentina
    "MX": "mx.search.yahoo.com",  # Mexico
    "SG": "sg.search.yahoo.com",  # Singapore
}

Definition at line 50 of file yahoo.py.

◆ safesearch_dict

dict searx.engines.yahoo.safesearch_dict = {0: 'p', 1: 'i', 2: 'r'}

Definition at line 48 of file yahoo.py.

◆ time_range_dict

dict searx.engines.yahoo.time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm'}

Definition at line 47 of file yahoo.py.

◆ time_range_support

bool searx.engines.yahoo.time_range_support = True

Definition at line 44 of file yahoo.py.

◆ yahoo_languages

dict searx.engines.yahoo.yahoo_languages

Definition at line 92 of file yahoo.py.