.oO SearXNG Developer Documentation Oo.
searx.engines.zlibrary Namespace Reference

Functions

None init (engine_settings=None)
 
Dict[str, Any] request (str query, Dict[str, Any] params)
 
 domain_is_seized (dom)
 
List[Dict[str, Any]] response (httpx.Response resp)
 
str|None _text (item, str selector)
 
Dict[str, Any] _parse_result (item)
 
None fetch_traits (EngineTraits engine_traits)
 

Variables

logging.Logger logger
 
dict about
 
list categories = ["files"]
 
bool paging = True
 
str base_url = "https://zlibrary-global.se"
 
str zlib_year_from = ""
 
str zlib_year_to = ""
 
str zlib_ext = ""
 
 i18n_language = gettext("Language")
 
 i18n_book_rating = gettext("Book rating")
 
 i18n_file_quality = gettext("File quality")
 

Detailed Description

`Z-Library`_ (abbreviated as z-lib, formerly BookFinder) is a shadow library
project for file-sharing access to scholarly journal articles, academic texts
and general-interest books.  It began as a mirror of Library Genesis, from which
most of its books originate.

.. _Z-Library: https://zlibrary-global.se/

Configuration
=============

The engine has the following additional settings:

- :py:obj:`zlib_year_from`
- :py:obj:`zlib_year_to`
- :py:obj:`zlib_ext`

With these options a SearXNG maintainer can configure **additional**
engines for specific searches in Z-Library.  For example, an engine that
searches only for EPUB files published from 2010 to 2020:

.. code:: yaml

   - name: z-library 2010s epub
     engine: zlibrary
     shortcut: zlib2010s
     zlib_year_from: '2010'
     zlib_year_to: '2020'
     zlib_ext: 'EPUB'

Implementations
===============

Function Documentation

◆ _parse_result()

Dict[str, Any] searx.engines.zlibrary._parse_result ( item)
protected

Definition at line 142 of file zlibrary.py.

def _parse_result(item) -> Dict[str, Any]:

    author_elements = eval_xpath_list(item, './/div[@class="authors"]//a[@itemprop="author"]')

    result = {
        "template": "paper.html",
        "url": base_url + item.xpath('(.//a[starts-with(@href, "/book/")])[1]/@href')[0],
        "title": _text(item, './/*[@itemprop="name"]'),
        "authors": [extract_text(author) for author in author_elements],
        "publisher": _text(item, './/a[@title="Publisher"]'),
        "type": _text(item, './/div[contains(@class, "property__file")]//div[contains(@class, "property_value")]'),
    }

    thumbnail = _text(item, './/img[contains(@class, "cover")]/@data-src')
    # _text() may return None or an empty string; keep only thumbnails that
    # are present and are not site-relative paths
    if thumbnail and not thumbnail.startswith('/'):
        result["thumbnail"] = thumbnail

    year = _text(item, './/div[contains(@class, "property_year")]//div[contains(@class, "property_value")]')
    if year:
        result["publishedDate"] = datetime.strptime(year, '%Y')

    content = []
    language = _text(item, './/div[contains(@class, "property_language")]//div[contains(@class, "property_value")]')
    if language:
        content.append(f"{i18n_language}: {language.capitalize()}")
    book_rating = _text(item, './/span[contains(@class, "book-rating-interest-score")]')
    if book_rating and float(book_rating):
        content.append(f"{i18n_book_rating}: {book_rating}")
    file_quality = _text(item, './/span[contains(@class, "book-rating-quality-score")]')
    if file_quality and float(file_quality):
        content.append(f"{i18n_file_quality}: {file_quality}")
    result["content"] = " | ".join(content)

    return result

References searx.engines.zlibrary._text().

Referenced by searx.engines.zlibrary.response().
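
The way `_parse_result()` assembles the `content` field can be tried out in isolation. The sketch below is only an illustration: `build_content` is a hypothetical helper that does not exist in the engine, it takes already-extracted strings instead of an lxml element, and it uses plain English labels in place of the `gettext` values.

```python
from typing import List, Optional


def build_content(
    language: Optional[str],
    book_rating: Optional[str],
    file_quality: Optional[str],
) -> str:
    """Join the optional result properties into one ' | ' separated string."""
    content: List[str] = []
    if language:
        content.append(f"Language: {language.capitalize()}")
    # float("0.0") is falsy, so a rating of zero is left out of the summary.
    if book_rating and float(book_rating):
        content.append(f"Book rating: {book_rating}")
    if file_quality and float(file_quality):
        content.append(f"File quality: {file_quality}")
    return " | ".join(content)
```

Missing values and ratings of zero contribute nothing to the joined string, so a result without any of these properties ends up with an empty `content`.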


◆ _text()

str | None searx.engines.zlibrary._text ( item, str selector )
protected

Definition at line 133 of file zlibrary.py.

def _text(item, selector: str) -> str | None:
    return extract_text(eval_xpath(item, selector))

Referenced by searx.engines.zlibrary._parse_result().


◆ domain_is_seized()

searx.engines.zlibrary.domain_is_seized ( dom)

Definition at line 116 of file zlibrary.py.

def domain_is_seized(dom):
    return bool(dom.xpath('//title') and "seized" in dom.xpath('//title')[0].text.lower())

Referenced by searx.engines.zlibrary.fetch_traits(), and searx.engines.zlibrary.response().
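
The check amounts to "does the page title contain the word *seized*". The sketch below reproduces that idea with the standard library's `html.parser` instead of lxml, so it runs without extra dependencies. `page_looks_seized` and `_TitleParser` are hypothetical names chosen for this illustration and are not part of the engine.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collect the text of the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_looks_seized(html_text: str) -> bool:
    """Return True when the first <title> mentions 'seized' (case-insensitive)."""
    parser = _TitleParser()
    parser.feed(html_text)
    parser.close()
    return "seized" in parser.title.lower()
```

Only the title is inspected, so a page that merely mentions the word in its body is not reported as seized.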


◆ fetch_traits()

None searx.engines.zlibrary.fetch_traits ( EngineTraits engine_traits)
Fetch languages and other search arguments from zlibrary's search form.

Definition at line 178 of file zlibrary.py.

def fetch_traits(engine_traits: EngineTraits) -> None:
    """Fetch languages and other search arguments from zlibrary's search form."""
    # pylint: disable=import-outside-toplevel, too-many-branches

    import babel
    from searx.network import get  # see https://github.com/searxng/searxng/issues/762
    from searx.locales import language_tag

    def _use_old_values():
        # don't change anything, re-use the existing values
        engine_traits.all_locale = ENGINE_TRAITS["z-library"]["all_locale"]
        engine_traits.custom = ENGINE_TRAITS["z-library"]["custom"]
        engine_traits.languages = ENGINE_TRAITS["z-library"]["languages"]

    try:
        resp = get(base_url, verify=False)
    except SearxException as exc:
        print(f"ERROR: zlibrary domain '{base_url}' is seized?")
        print(f"  --> {exc}")
        _use_old_values()
        return

    if not resp.ok:  # type: ignore
        raise RuntimeError("Response from zlibrary's search page is not OK.")
    dom = html.fromstring(resp.text)  # type: ignore

    if domain_is_seized(dom):
        print(f"ERROR: zlibrary domain is seized: {base_url}")
        _use_old_values()
        return

    engine_traits.all_locale = ""
    engine_traits.custom["ext"] = []
    engine_traits.custom["year_from"] = []
    engine_traits.custom["year_to"] = []

    for year in eval_xpath_list(dom, "//div[@id='advSearch-noJS']//select[@id='sf_yearFrom']/option"):
        engine_traits.custom["year_from"].append(year.get("value"))

    for year in eval_xpath_list(dom, "//div[@id='advSearch-noJS']//select[@id='sf_yearTo']/option"):
        engine_traits.custom["year_to"].append(year.get("value"))

    for ext in eval_xpath_list(dom, "//div[@id='advSearch-noJS']//select[@id='sf_extensions']/option"):
        value: Optional[str] = ext.get("value")
        if value is None:
            value = ""
        engine_traits.custom["ext"].append(value)

    # Handle languages
    # Z-library uses English names for languages, so we need to map them to their respective locales
    language_name_locale_map: Dict[str, babel.Locale] = {}
    for locale in babel.core.localedata.locale_identifiers():  # type: ignore
        # Create a Locale object for the current locale
        loc = babel.Locale.parse(locale)
        if loc.english_name is None:
            continue
        language_name_locale_map[loc.english_name.lower()] = loc  # type: ignore

    for x in eval_xpath_list(dom, "//div[@id='advSearch-noJS']//select[@id='sf_languages']/option"):
        eng_lang = x.get("value")
        if eng_lang is None:
            continue
        try:
            locale = language_name_locale_map[eng_lang.lower()]
        except KeyError:
            # silently ignore unknown languages
            # print("ERROR: %s is unknown by babel" % (eng_lang))
            continue
        sxng_lang = language_tag(locale)
        conflict = engine_traits.languages.get(sxng_lang)
        if conflict:
            if conflict != eng_lang:
                print("CONFLICT: babel %s --> %s, %s" % (sxng_lang, conflict, eng_lang))
            continue
        engine_traits.languages[sxng_lang] = eng_lang

References searx.engines.zlibrary.domain_is_seized().
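
The language handling in `fetch_traits()` maps the English language names offered by the search form to SearXNG language tags, keeps the first name seen for each tag, and skips names it cannot resolve. The sketch below shows that logic under simplifying assumptions: a small hand-written table (`NAME_TO_TAG`, hypothetical) stands in for the lookup that the engine builds from babel, and `map_languages` is a made-up helper name.

```python
from typing import Dict, Iterable

# Hypothetical stand-in for the babel-derived table: English name -> language tag.
NAME_TO_TAG: Dict[str, str] = {
    "english": "en",
    "german": "de",
    "french": "fr",
}


def map_languages(option_values: Iterable[str]) -> Dict[str, str]:
    """Map English language names to language tags; the first name per tag wins."""
    languages: Dict[str, str] = {}
    for eng_lang in option_values:
        tag = NAME_TO_TAG.get(eng_lang.lower())
        if tag is None:
            # unknown names are skipped silently
            continue
        conflict = languages.get(tag)
        if conflict:
            if conflict != eng_lang:
                # a different spelling already claimed this tag: report and keep the first
                print(f"CONFLICT: {tag} --> {conflict}, {eng_lang}")
            continue
        languages[tag] = eng_lang
    return languages
```

The stored value is the site's own spelling of the language name, which is what the engine later sends back in the `languages[]` query argument.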


◆ init()

None searx.engines.zlibrary.init ( engine_settings = None)
Check of engine's settings.

Definition at line 82 of file zlibrary.py.

def init(engine_settings=None) -> None:  # pylint: disable=unused-argument
    """Check of engine's settings."""
    traits: EngineTraits = EngineTraits(**ENGINE_TRAITS["z-library"])

    if zlib_ext and zlib_ext not in traits.custom["ext"]:
        raise ValueError(f"invalid setting ext: {zlib_ext}")
    if zlib_year_from and zlib_year_from not in traits.custom["year_from"]:
        raise ValueError(f"invalid setting year_from: {zlib_year_from}")
    if zlib_year_to and zlib_year_to not in traits.custom["year_to"]:
        raise ValueError(f"invalid setting year_to: {zlib_year_to}")
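
The three checks in `init()` follow one pattern: an empty setting means "no filter" and is always accepted, while a non-empty setting must be one of the values stored in the engine traits. A minimal sketch of that pattern follows. `check_setting` is a hypothetical helper, and the `custom` dictionary is made-up sample data, not the real content of `ENGINE_TRAITS`.

```python
from typing import Dict, List


def check_setting(name: str, value: str, allowed: List[str]) -> None:
    """Raise ValueError when a non-empty setting is not in the allowed list."""
    if value and value not in allowed:
        raise ValueError(f"invalid setting {name}: {value}")


# Hypothetical excerpt of the stored traits (sample values only).
custom: Dict[str, List[str]] = {
    "ext": ["", "EPUB", "PDF"],
    "year_from": ["", "2010", "2020"],
    "year_to": ["", "2010", "2020"],
}

check_setting("ext", "EPUB", custom["ext"])          # accepted
check_setting("year_from", "", custom["year_from"])  # empty: no filter, accepted
```

The comparison is case-sensitive, so with this sample data a value such as `epub` would be rejected even though `EPUB` is accepted.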

◆ request()

Dict[str, Any] searx.engines.zlibrary.request ( str query, Dict[str, Any] params )

Definition at line 94 of file zlibrary.py.

def request(query: str, params: Dict[str, Any]) -> Dict[str, Any]:
    lang: str = traits.get_language(params["language"], traits.all_locale)  # type: ignore
    search_url: str = (
        base_url
        + "/s/{search_query}/?page={pageno}"
        + "&yearFrom={zlib_year_from}"
        + "&yearTo={zlib_year_to}"
        + "&languages[]={lang}"
        + "&extensions[]={zlib_ext}"
    )
    params["url"] = search_url.format(
        search_query=quote(query),
        pageno=params["pageno"],
        lang=lang,
        zlib_year_from=zlib_year_from,
        zlib_year_to=zlib_year_to,
        zlib_ext=zlib_ext,
    )
    params["verify"] = False
    return params
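
To see what URL `request()` produces for a given configuration, the formatting step can be reproduced on its own. The sketch below is an illustration only: `build_search_url` is a hypothetical function, the language value is passed in directly instead of being resolved through the engine traits, and `quote` is assumed to be `urllib.parse.quote`.

```python
from urllib.parse import quote

BASE_URL = "https://zlibrary-global.se"  # value of base_url in the listing


def build_search_url(
    query: str,
    pageno: int,
    lang: str = "",
    year_from: str = "",
    year_to: str = "",
    ext: str = "",
) -> str:
    """Format the search URL the same way the engine's template does."""
    return (
        f"{BASE_URL}/s/{quote(query)}/?page={pageno}"
        f"&yearFrom={year_from}"
        f"&yearTo={year_to}"
        f"&languages[]={lang}"
        f"&extensions[]={ext}"
    )


print(build_search_url("moby dick", 1, lang="english", year_from="2010", year_to="2020", ext="EPUB"))
```

With the settings from the YAML example in the configuration section, this prints `https://zlibrary-global.se/s/moby%20dick/?page=1&yearFrom=2010&yearTo=2020&languages[]=english&extensions[]=EPUB`. Unset filters stay in the URL as empty arguments.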

◆ response()

List[Dict[str, Any]] searx.engines.zlibrary.response ( httpx.Response resp)

Definition at line 120 of file zlibrary.py.

def response(resp: httpx.Response) -> List[Dict[str, Any]]:
    results: List[Dict[str, Any]] = []
    dom = html.fromstring(resp.text)

    if domain_is_seized(dom):
        raise SearxException(f"zlibrary domain is seized: {base_url}")

    for item in dom.xpath('//div[@id="searchResultBox"]//div[contains(@class, "resItemBox")]'):
        results.append(_parse_result(item))

    return results

References searx.engines.zlibrary._parse_result(), and searx.engines.zlibrary.domain_is_seized().


Variable Documentation

◆ about

dict searx.engines.zlibrary.about
Initial value:
= {
    "website": "https://zlibrary-global.se",
    "wikidata_id": "Q104863992",
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "HTML",
}

Definition at line 55 of file zlibrary.py.

◆ base_url

str searx.engines.zlibrary.base_url = "https://zlibrary-global.se"

Definition at line 66 of file zlibrary.py.

◆ categories

list searx.engines.zlibrary.categories = ["files"]

Definition at line 64 of file zlibrary.py.

◆ i18n_book_rating

searx.engines.zlibrary.i18n_book_rating = gettext("Book rating")

Definition at line 138 of file zlibrary.py.

◆ i18n_file_quality

searx.engines.zlibrary.i18n_file_quality = gettext("File quality")

Definition at line 139 of file zlibrary.py.

◆ i18n_language

searx.engines.zlibrary.i18n_language = gettext("Language")

Definition at line 137 of file zlibrary.py.

◆ logger

logging.Logger searx.engines.zlibrary.logger

Definition at line 52 of file zlibrary.py.

◆ paging

bool searx.engines.zlibrary.paging = True

Definition at line 65 of file zlibrary.py.

◆ zlib_ext

str searx.engines.zlibrary.zlib_ext = ""

Definition at line 76 of file zlibrary.py.

◆ zlib_year_from

str searx.engines.zlibrary.zlib_year_from = ""

Definition at line 68 of file zlibrary.py.

◆ zlib_year_to

str searx.engines.zlibrary.zlib_year_to = ""

Definition at line 72 of file zlibrary.py.