.oO SearXNG Developer Documentation Oo.
searx.engines.semantic_scholar Namespace Reference

Functions

bool setup (dict[str, t.Any] engine_settings)
str get_ui_version ()
None request (str query, "OnlineParams" params)
EngineResults response ("SXNG_Response" resp)

Variables

dict about
list categories = ["science", "scientific publications"]
bool paging = True
str search_url = "https://www.semanticscholar.org/api/1/search"
str base_url = "https://www.semanticscholar.org"

Detailed Description

`Semantic Scholar`_ provides free, AI-driven search and discovery tools, and
open resources for the global research community.  `Semantic Scholar`_ indexes
over 200 million academic papers sourced from publisher partnerships, data
providers, and web crawls.

.. _Semantic Scholar: https://www.semanticscholar.org/about

Configuration
=============

To use this engine, add the following entry to your engines list in
``settings.yml``:

.. code:: yaml

   - name: semantic scholar
     engine: semantic_scholar
     shortcut: se

Implementations
===============

Function Documentation

◆ get_ui_version()

str searx.engines.semantic_scholar.get_ui_version ( )

Definition at line 66 of file semantic_scholar.py.

def get_ui_version() -> str:
    ret_val: str = CACHE.get("X-S2-UI-Version")
    if not ret_val:
        resp = get(base_url)
        if not resp.ok:
            raise RuntimeError("Can't determine Semantic Scholar UI version")

        doc = html.fromstring(resp.text)
        ret_val = eval_xpath_getindex(doc, "//meta[@name='s2-ui-version']/@content", 0)
        if not ret_val:
            raise RuntimeError("Can't determine Semantic Scholar UI version")
        # hold the cached value for 5min
        CACHE.set("X-S2-UI-Version", value=ret_val, expire=300)
    logger.debug("X-S2-UI-Version: %s", ret_val)
    return ret_val

Referenced by request().
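
The lookup above follows a cache-then-fetch pattern: return the cached value if present, otherwise scrape the ``s2-ui-version`` meta tag and keep it for 300 seconds. The following is a minimal standalone sketch of that pattern, not the engine's implementation. ``TTLCache``, ``MetaContentParser``, ``extract_ui_version`` and ``cached_ui_version`` are hypothetical names introduced for illustration; the sketch uses the standard library's ``html.parser`` instead of lxml and an in-memory dictionary instead of SearXNG's ``EngineCache``.

```python
from __future__ import annotations

import time
from html.parser import HTMLParser
from typing import Callable


class MetaContentParser(HTMLParser):
    """Remember the content attribute of the first matching <meta name=...> tag."""

    def __init__(self, meta_name: str) -> None:
        super().__init__()
        self.meta_name = meta_name
        self.content: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.content is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == self.meta_name:
            self.content = attr_map.get("content")


class TTLCache:
    """Hypothetical in-memory stand-in for the engine cache (assumption)."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, key: str) -> str | None:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # entry is stale: drop it and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, expire: int) -> None:
        self._store[key] = (time.monotonic() + expire, value)


def extract_ui_version(page_html: str) -> str:
    """Read the s2-ui-version meta tag from an HTML page, or raise."""
    parser = MetaContentParser("s2-ui-version")
    parser.feed(page_html)
    if not parser.content:
        raise RuntimeError("Can't determine Semantic Scholar UI version")
    return parser.content


def cached_ui_version(cache: TTLCache, fetch_page: Callable[[], str]) -> str:
    """Return the cached version, fetching and caching it for 5 min on a miss."""
    version = cache.get("X-S2-UI-Version")
    if not version:
        version = extract_ui_version(fetch_page())
        cache.set("X-S2-UI-Version", value=version, expire=300)
    return version
```

Passing the page fetcher in as a callable keeps the sketch free of network access, so the caching behaviour can be exercised with a canned HTML string.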

◆ request()

None searx.engines.semantic_scholar.request ( str query,
"OnlineParams" params )

Definition at line 83 of file semantic_scholar.py.

def request(query: str, params: "OnlineParams") -> None:
    params["url"] = search_url
    params["method"] = "POST"
    params["headers"].update(
        {
            "Content-Type": "application/json",
            "X-S2-UI-Version": get_ui_version(),
            "X-S2-Client": "webapp-browser",
        }
    )
    params["json"] = {
        "queryString": query,
        "page": params["pageno"],
        "pageSize": 10,
        "sort": "relevance",
        "getQuerySuggestions": False,
        "authors": [],
        "coAuthors": [],
        "venues": [],
        "performTitleMatch": True,
    }

References get_ui_version().
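
To make the shape of the outgoing request easier to inspect, the sketch below assembles the same URL, headers, and JSON body as a plain dictionary, outside of SearXNG. ``build_search_request`` is a hypothetical helper, and the UI version is passed in as an argument instead of being looked up; the field names and values are copied from the listing above.

```python
import json

SEARCH_URL = "https://www.semanticscholar.org/api/1/search"


def build_search_request(query: str, pageno: int, ui_version: str) -> dict:
    """Assemble method, headers, and JSON body for one result page."""
    return {
        "url": SEARCH_URL,
        "method": "POST",
        "headers": {
            "Content-Type": "application/json",
            "X-S2-UI-Version": ui_version,
            "X-S2-Client": "webapp-browser",
        },
        "json": {
            "queryString": query,
            "page": pageno,
            "pageSize": 10,
            "sort": "relevance",
            "getQuerySuggestions": False,
            "authors": [],
            "coAuthors": [],
            "venues": [],
            "performTitleMatch": True,
        },
    }


# "example-version" is a placeholder, not a real UI version string.
example = build_search_request("graph neural networks", 2, "example-version")
print(json.dumps(example["json"], indent=2))
```

Because ``paging`` is enabled, the ``page`` field carries the requested page number, while ``pageSize`` stays fixed at 10.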

◆ response()

EngineResults searx.engines.semantic_scholar.response ( "SXNG_Response" resp)

Definition at line 106 of file semantic_scholar.py.

def response(resp: "SXNG_Response") -> EngineResults:
    res = EngineResults()
    json_data = resp.json()

    for result in json_data["results"]:
        url: str = result.get("primaryPaperLink", {}).get("url")
        if not url and result.get("links"):
            url = result.get("links")[0]
        if not url:
            alternatePaperLinks = result.get("alternatePaperLinks")
            if alternatePaperLinks:
                url = alternatePaperLinks[0].get("url")
        if not url:
            url = base_url + "/paper/%s" % result["id"]

        publishedDate: datetime | None
        if "pubDate" in result:
            publishedDate = datetime.strptime(result["pubDate"], "%Y-%m-%d")
        else:
            publishedDate = None

        # authors
        authors: list[str] = [author[0]["name"] for author in result.get("authors", [])]

        # pick the first alternate link that is neither a crawler nor a DOI link
        pdf_url: str = ""
        for doc in result.get("alternatePaperLinks", []):
            if doc["linkType"] not in ("crawler", "doi"):
                pdf_url = doc["url"]
                break

        # comments
        comments: str = ""
        if "citationStats" in result:
            comments = gettext(
                "{numCitations} citations from the year {firstCitationVelocityYear} to {lastCitationVelocityYear}"
            ).format(
                numCitations=result["citationStats"]["numCitations"],
                firstCitationVelocityYear=result["citationStats"]["firstCitationVelocityYear"],
                lastCitationVelocityYear=result["citationStats"]["lastCitationVelocityYear"],
            )

        res.add(
            res.types.Paper(
                title=result["title"]["text"],
                url=url,
                content=html_to_text(result["paperAbstract"]["text"]),
                journal=result.get("venue", {}).get("text") or result.get("journal", {}).get("name"),
                doi=result.get("doiInfo", {}).get("doi"),
                tags=result.get("fieldsOfStudy"),
                authors=authors,
                pdf_url=pdf_url,
                publishedDate=publishedDate,
                comments=comments,
            )
        )

    return res
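
The link handling in the listing above is a fallback chain: primary link, then first entry of ``links``, then first alternate link, then a URL built from the paper ID. The sketch below isolates that chain, the PDF link selection, and the date parsing into small functions so they can be tried on a single result dictionary. The helper names and the sample record are hypothetical; in particular the ``linkType`` value ``"arxiv"`` and the ``example.org`` URLs are made up for illustration and are not taken from real API output.

```python
from __future__ import annotations

from datetime import datetime

BASE_URL = "https://www.semanticscholar.org"


def pick_result_url(result: dict) -> str:
    """Walk the fallback chain until a usable URL is found."""
    url = result.get("primaryPaperLink", {}).get("url")
    if not url and result.get("links"):
        url = result["links"][0]
    if not url:
        alternates = result.get("alternatePaperLinks")
        if alternates:
            url = alternates[0].get("url")
    if not url:
        # last resort: the paper's own page on Semantic Scholar
        url = f"{BASE_URL}/paper/{result['id']}"
    return url


def pick_pdf_url(result: dict) -> str:
    """Return the first alternate link that is neither a crawler nor a DOI link."""
    for link in result.get("alternatePaperLinks", []):
        if link["linkType"] not in ("crawler", "doi"):
            return link["url"]
    return ""


def parse_pub_date(result: dict) -> datetime | None:
    """Parse the optional pubDate field (YYYY-MM-DD)."""
    if "pubDate" in result:
        return datetime.strptime(result["pubDate"], "%Y-%m-%d")
    return None


# Hypothetical sample record, shaped like the fields the listing reads.
sample = {
    "id": "abc123",
    "alternatePaperLinks": [
        {"linkType": "crawler", "url": "https://example.org/crawled"},
        {"linkType": "arxiv", "url": "https://example.org/paper.pdf"},
    ],
    "pubDate": "2020-05-17",
}

print(pick_result_url(sample))
print(pick_pdf_url(sample))
print(parse_pub_date(sample))
```

Note that the result URL and the PDF URL are chosen independently: the result URL may come from a crawler link, while the PDF URL skips crawler and DOI links.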

◆ setup()

bool searx.engines.semantic_scholar.setup ( dict[str, t.Any] engine_settings)

Definition at line 60 of file semantic_scholar.py.

def setup(engine_settings: dict[str, t.Any]) -> bool:
    global CACHE  # pylint: disable=global-statement
    CACHE = EngineCache(engine_settings["name"])
    return True

Variable Documentation

◆ about

dict searx.engines.semantic_scholar.about
Initial value:
= {
    "website": "https://www.semanticscholar.org/",
    "wikidata_id": "Q22908627",
    "official_api_documentation": "https://api.semanticscholar.org/",
    "use_official_api": True,
    "require_api_key": False,
    "results": "JSON",
}

Definition at line 41 of file semantic_scholar.py.

◆ base_url

str searx.engines.semantic_scholar.base_url = "https://www.semanticscholar.org"

Definition at line 53 of file semantic_scholar.py.

◆ categories

list searx.engines.semantic_scholar.categories = ["science", "scientific publications"]

Definition at line 50 of file semantic_scholar.py.

◆ paging

bool searx.engines.semantic_scholar.paging = True

Definition at line 51 of file semantic_scholar.py.

◆ search_url

str searx.engines.semantic_scholar.search_url = "https://www.semanticscholar.org/api/1/search"

Definition at line 52 of file semantic_scholar.py.