.oO SearXNG Developer Documentation Oo.
searx.engines.yahoo Namespace Reference

Functions

 build_sb_cookie (cookie_params)
 request (query, params)
 parse_url (url_string)
 response (resp)

Variables

dict about
list categories = ['general', 'web']
bool paging = True
bool time_range_support = True
dict time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm'}
dict safesearch_dict = {0: 'p', 1: 'i', 2: 'r'}
dict region2domain
dict lang2domain
dict yahoo_languages

Detailed Description

Yahoo Search (Web)

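Languages are supported by mapping the language to a domain.  A region, if
given, is looked up first in :py:obj:`region2domain`.  Otherwise the language is
looked up in :py:obj:`lang2domain`, and if it is not found there the URL
``<lang>.search.yahoo.com`` is used.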
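
The lookup order can be sketched as below. This is a minimal illustration, not the engine code itself: ``select_domain`` is a hypothetical helper, the two domain tables are short excerpts of :py:obj:`region2domain` and :py:obj:`lang2domain`, and the contents of ``YAHOO_LANGUAGES`` are assumed, since this page does not list the initial value of :py:obj:`yahoo_languages`.

```python
# Excerpts of the module-level tables documented on this page.
REGION2DOMAIN = {"CA": "ca.search.yahoo.com", "GB": "uk.search.yahoo.com"}
LANG2DOMAIN = {"any": "search.yahoo.com", "en": "search.yahoo.com"}

# ASSUMPTION: illustrative stand-in for yahoo_languages, whose real
# contents are not shown in this reference.
YAHOO_LANGUAGES = {"all": "any", "en": "en", "fr": "fr", "pt": "pt"}


def select_domain(language):
    """Hypothetical helper mirroring the domain selection in request()."""
    # "fr-CA" -> ("fr", "CA"); "fr" -> ("fr", None)
    lang, region = (language.split("-") + [None])[:2]
    lang = YAHOO_LANGUAGES.get(lang, "any")

    # The region wins when it is known; otherwise fall back to the language.
    domain = REGION2DOMAIN.get(region)
    if not domain:
        domain = LANG2DOMAIN.get(lang, f"{lang}.search.yahoo.com")
    return domain


print(select_domain("fr-CA"))  # ca.search.yahoo.com
print(select_domain("pt"))     # pt.search.yahoo.com
print(select_domain("all"))    # search.yahoo.com
```

Under these assumptions, a language without an entry in either table resolves to a ``<lang>.search.yahoo.com`` host, while an unknown language code falls back to ``search.yahoo.com`` through the ``"any"`` entry.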

Function Documentation

◆ build_sb_cookie()

searx.engines.yahoo.build_sb_cookie (cookie_params)
Build sB cookie parameter from provided parameters.

:param cookie_params: Dictionary of cookie parameters
:type cookie_params: dict
:returns: Formatted cookie string
:rtype: str

Example:
    >>> cookie_params = {'v': '1', 'vm': 'p', 'fl': '1', 'vl': 'lang_fr'}
    >>> build_sb_cookie(cookie_params)
    'v=1&vm=p&fl=1&vl=lang_fr'

Definition at line 124 of file yahoo.py.

def build_sb_cookie(cookie_params):
    """Build sB cookie parameter from provided parameters.

    :param cookie_params: Dictionary of cookie parameters
    :type cookie_params: dict
    :returns: Formatted cookie string
    :rtype: str

    Example:
        >>> cookie_params = {'v': '1', 'vm': 'p', 'fl': '1', 'vl': 'lang_fr'}
        >>> build_sb_cookie(cookie_params)
        'v=1&vm=p&fl=1&vl=lang_fr'
    """

    cookie_parts = []
    for key, value in cookie_params.items():
        cookie_parts.append(f"{key}={value}")

    return "&".join(cookie_parts)

Referenced by request().
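
As a rough illustration of how request() feeds this function, the sketch below assembles the same cookie fields for a given SafeSearch level and language. ``sb_cookie_for`` is a hypothetical helper introduced only for this example, and ``build_sb_cookie`` is restated in compact form so the block runs on its own.

```python
SAFESEARCH_DICT = {0: 'p', 1: 'i', 2: 'r'}  # same values as safesearch_dict


def build_sb_cookie(cookie_params):
    """Compact restatement of the function documented above."""
    return "&".join(f"{key}={value}" for key, value in cookie_params.items())


def sb_cookie_for(safesearch, lang):
    """Hypothetical helper: build the sB cookie the way request() does."""
    return build_sb_cookie(
        {
            'v': 1,
            'vm': SAFESEARCH_DICT[safesearch],
            'fl': 1,
            'vl': f'lang_{lang}',
            'pn': 10,
            'rw': 'new',
            'userset': 1,
        }
    )


print(sb_cookie_for(1, 'fr'))
# v=1&vm=i&fl=1&vl=lang_fr&pn=10&rw=new&userset=1
```

Note that values are formatted with ``str()`` semantics and joined as-is, so no percent-encoding is applied. The field order follows the insertion order of the dictionary.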

◆ parse_url()

searx.engines.yahoo.parse_url (url_string)
Remove the Yahoo-specific tracking wrapper from a URL.

Definition at line 197 of file yahoo.py.

def parse_url(url_string):
    """Remove the Yahoo-specific tracking wrapper from a URL."""

    endings = ['/RS', '/RK']
    endpositions = []
    # the target URL starts at the first 'http' after the '/RU=' marker
    start = url_string.find('http', url_string.find('/RU=') + 1)

    # the target URL ends at the earliest '/RS' or '/RK' segment
    for ending in endings:
        endpos = url_string.rfind(ending)
        if endpos > -1:
            endpositions.append(endpos)

    # no tracking wrapper detected: return the URL unchanged
    if start == 0 or len(endpositions) == 0:
        return url_string

    end = min(endpositions)
    return unquote(url_string[start:end])

Referenced by response().
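
A short worked example may help to see what the function extracts. The redirect URL below is made up for illustration (the ``_ylt``, ``RV``, ``RE``, ``RO``, ``RK`` and ``RS`` values are placeholders, not real Yahoo tokens), and ``parse_url`` is restated so the block is self-contained.

```python
from urllib.parse import unquote


def parse_url(url_string):
    """Restatement of the function documented above."""
    endings = ['/RS', '/RK']
    endpositions = []
    start = url_string.find('http', url_string.find('/RU=') + 1)

    for ending in endings:
        endpos = url_string.rfind(ending)
        if endpos > -1:
            endpositions.append(endpos)

    if start == 0 or len(endpositions) == 0:
        return url_string

    end = min(endpositions)
    return unquote(url_string[start:end])


# ASSUMPTION: hypothetical tracking URL in the '/RU=<target>/RK=.../RS=...' shape.
tracked = (
    'https://r.search.yahoo.com/_ylt=abc;_ylu=def/RV=2/RE=1700000000/RO=10'
    '/RU=https%3a%2f%2fexample.org%2fpage%3fq%3d1/RK=2/RS=xyz-'
)

print(parse_url(tracked))                 # https://example.org/page?q=1
print(parse_url('https://example.org/'))  # https://example.org/
```

The second call shows the pass-through case: without a ``/RU=`` marker the first ``http`` is found at position 0, so the input is returned unchanged.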

◆ request()

searx.engines.yahoo.request (query, params)
Build Yahoo search request.

Definition at line 145 of file yahoo.py.

def request(query, params):
    """Build Yahoo search request."""

    lang, region = (params["language"].split("-") + [None])[:2]
    lang = yahoo_languages.get(lang, "any")

    # Build URL parameters
    # - p (str): Search query string
    # - btf (str): Time filter, maps to values like 'd' (day), 'w' (week), 'm' (month)
    # - iscqry (str): Empty string, necessary for results to appear properly on first page
    # - b (int): Search offset for pagination
    # - pz (str): Amount of results expected for the page
    url_params = {'p': query}

    btf = time_range_dict.get(params['time_range'])
    if btf:
        url_params['btf'] = btf

    if params['pageno'] == 1:
        url_params['iscqry'] = ''
    elif params['pageno'] >= 2:
        url_params['b'] = params['pageno'] * 7 + 1  # 15, 22, 29, etc.
        url_params['pz'] = 7
        url_params['bct'] = 0
        url_params['xargs'] = 0

    # Build sB cookie (for filters)
    # - vm (str): SafeSearch filter, maps to values like 'p' (None), 'i' (Moderate), 'r' (Strict)
    # - fl (bool): Indicates if a search language is used or not
    # - vl (str): The search language to use (e.g. lang_fr)
    sbcookie_params = {
        'v': 1,
        'vm': safesearch_dict[params['safesearch']],
        'fl': 1,
        'vl': f'lang_{lang}',
        'pn': 10,
        'rw': 'new',
        'userset': 1,
    }
    params['cookies']['sB'] = build_sb_cookie(sbcookie_params)

    # Search region/language
    domain = region2domain.get(region)
    if not domain:
        domain = lang2domain.get(lang, f'{lang}.search.yahoo.com')
    logger.debug(f'domain selected: {domain}')
    logger.debug(f'cookies: {params["cookies"]}')

    params['url'] = f'https://{domain}/search?{urlencode(url_params)}'
    params['domain'] = domain

References build_sb_cookie().
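
To make the paging arithmetic concrete, the sketch below reproduces only the URL-parameter part of request(). ``paging_params`` and ``build_url`` are hypothetical helpers written for this example; the time filter, the cookie, and the domain selection are left out.

```python
from urllib.parse import urlencode


def paging_params(pageno):
    """Hypothetical helper: paging parameters as set by request()."""
    if pageno == 1:
        # the first page only needs the empty 'iscqry' marker
        return {'iscqry': ''}
    if pageno >= 2:
        # offset as computed by request(): pageno * 7 + 1 -> 15, 22, 29, ...
        return {'b': pageno * 7 + 1, 'pz': 7, 'bct': 0, 'xargs': 0}
    return {}


def build_url(domain, query, pageno):
    """Hypothetical helper: assemble the search URL for one page."""
    url_params = {'p': query, **paging_params(pageno)}
    return f'https://{domain}/search?{urlencode(url_params)}'


print(build_url('search.yahoo.com', 'searxng engine', 1))
# https://search.yahoo.com/search?p=searxng+engine&iscqry=
print(build_url('search.yahoo.com', 'searxng engine', 2))
# https://search.yahoo.com/search?p=searxng+engine&b=15&pz=7&bct=0&xargs=0
```

With the formula as written, page 2 requests offset 15 and page 3 requests offset 22.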

◆ response()

searx.engines.yahoo.response (resp)
Parse the HTML response into a list of results and suggestions.

Definition at line 216 of file yahoo.py.

def response(resp):
    """Parse the HTML response into a list of results and suggestions."""

    results = []
    dom = html.fromstring(resp.text)

    url_xpath = './/div[contains(@class,"compTitle")]/h3/a/@href'
    title_xpath = './/h3//a/@aria-label'

    # search.yahoo.com uses a different markup for the result title
    domain = resp.search_params['domain']
    if domain == "search.yahoo.com":
        url_xpath = './/div[contains(@class,"compTitle")]/a/@href'
        title_xpath = './/div[contains(@class,"compTitle")]/a/h3/span'

    # parse results
    for result in eval_xpath_list(dom, '//div[contains(@class,"algo-sr")]'):
        url = eval_xpath_getindex(result, url_xpath, 0, default=None)
        if url is None:
            continue
        url = parse_url(url)

        title = eval_xpath_getindex(result, title_xpath, 0, default='')
        title: str = extract_text(title)
        content = eval_xpath_getindex(result, './/div[contains(@class, "compText")]', 0, default='')
        content: str = extract_text(content, allow_none=True)

        # append result
        results.append(
            {
                'url': url,
                # title sometimes contains HTML tags / see
                # https://github.com/searxng/searxng/issues/3790
                'title': " ".join(html_to_text(title).strip().split()),
                'content': " ".join(html_to_text(content).strip().split()),
            }
        )

    for suggestion in eval_xpath_list(dom, '//div[contains(@class, "AlsoTry")]//table//a'):
        # append suggestion
        results.append({'suggestion': extract_text(suggestion)})

    return results

References parse_url().

Variable Documentation

◆ about

dict searx.engines.yahoo.about
Initial value:
= {
    "website": 'https://search.yahoo.com/',
    "wikidata_id": None,
    "official_api_documentation": 'https://developer.yahoo.com/api/',
    "use_official_api": False,
    "require_api_key": False,
    "results": 'HTML',
}

Definition at line 23 of file yahoo.py.

◆ categories

list searx.engines.yahoo.categories = ['general', 'web']

Definition at line 33 of file yahoo.py.

◆ lang2domain

dict searx.engines.yahoo.lang2domain
Initial value:
= {
    'zh_chs': 'hk.search.yahoo.com',
    'zh_cht': 'tw.search.yahoo.com',
    'any': 'search.yahoo.com',
    'en': 'search.yahoo.com',
    'bg': 'search.yahoo.com',
    'cs': 'search.yahoo.com',
    'da': 'search.yahoo.com',
    'el': 'search.yahoo.com',
    'et': 'search.yahoo.com',
    'he': 'search.yahoo.com',
    'hr': 'search.yahoo.com',
    'ja': 'search.yahoo.com',
    'ko': 'search.yahoo.com',
    'sk': 'search.yahoo.com',
    'sl': 'search.yahoo.com',
}

Definition at line 64 of file yahoo.py.

◆ paging

bool searx.engines.yahoo.paging = True

Definition at line 34 of file yahoo.py.

◆ region2domain

dict searx.engines.yahoo.region2domain
Initial value:
= {
    "CO": "co.search.yahoo.com",  # Colombia
    "TH": "th.search.yahoo.com",  # Thailand
    "VE": "ve.search.yahoo.com",  # Venezuela
    "CL": "cl.search.yahoo.com",  # Chile
    "HK": "hk.search.yahoo.com",  # Hong Kong
    "PE": "pe.search.yahoo.com",  # Peru
    "CA": "ca.search.yahoo.com",  # Canada
    "DE": "de.search.yahoo.com",  # Germany
    "FR": "fr.search.yahoo.com",  # France
    "TW": "tw.search.yahoo.com",  # Taiwan
    "GB": "uk.search.yahoo.com",  # United Kingdom
    "UK": "uk.search.yahoo.com",
    "BR": "br.search.yahoo.com",  # Brazil
    "IN": "in.search.yahoo.com",  # India
    "ES": "espanol.search.yahoo.com",  # Espanol
    "PH": "ph.search.yahoo.com",  # Philippines
    "AR": "ar.search.yahoo.com",  # Argentina
    "MX": "mx.search.yahoo.com",  # Mexico
    "SG": "sg.search.yahoo.com",  # Singapore
}

Definition at line 41 of file yahoo.py.

◆ safesearch_dict

dict searx.engines.yahoo.safesearch_dict = {0: 'p', 1: 'i', 2: 'r'}

Definition at line 39 of file yahoo.py.

◆ time_range_dict

dict searx.engines.yahoo.time_range_dict = {'day': 'd', 'week': 'w', 'month': 'm'}

Definition at line 38 of file yahoo.py.

◆ time_range_support

bool searx.engines.yahoo.time_range_support = True

Definition at line 35 of file yahoo.py.

◆ yahoo_languages

dict searx.engines.yahoo.yahoo_languages

Definition at line 83 of file yahoo.py.