.oO SearXNG Developer Documentation Oo.
searx.favicons.cache.FaviconCacheSQLite Class Reference

Public Member Functions

 __init__ (self, FaviconCacheConfig cfg)
 
None|tuple[None|bytes, None|str] __call__ (self, str resolver, str authority)
 
bool set (self, str resolver, str authority, str|None mime, bytes|None data)
 
int next_maintenance_time (self)
 
 maintenance (self, force=False)
 
FaviconCacheStats state (self)
 
- Public Member Functions inherited from searx.sqlitedb.SQLiteAppl
 __init__ (self, db_url)
 
sqlite3.Connection connect (self)
 
 register_functions (self, conn)
 
sqlite3.Connection DB (self)
 
bool init (self, sqlite3.Connection conn)
 
 create_schema (self, sqlite3.Connection conn)
 
- Public Member Functions inherited from searx.favicons.cache.FaviconCache
 __init__ (self, FaviconCacheConfig cfg)
 
None|tuple[None|bytes, None|str] __call__ (self, str resolver, str authority)
 

Public Attributes

 cfg = cfg
 
 next_maintenance_time
 
- Public Attributes inherited from searx.sqlitedb.SQLiteAppl
 db_url = db_url
 
 properties = SQLiteProperties(db_url)
 

Static Public Attributes

str DDL_BLOBS
 
str DDL_BLOB_MAP
 
tuple SQL_DROP_LEFTOVER_BLOBS
 
tuple SQL_ITER_BLOBS_SHA256_BYTES_C
 
tuple SQL_INSERT_BLOBS
 
tuple SQL_INSERT_BLOB_MAP
 
- Static Public Attributes inherited from searx.sqlitedb.SQLiteAppl
dict DDL_CREATE_TABLES = {}
 
int DB_SCHEMA = 1
 
dict SQLITE_THREADING_MODE
 
str SQLITE_JOURNAL_MODE = "WAL"
 
dict SQLITE_CONNECT_ARGS
 

Protected Member Functions

 _query_val (self, sql, default=None)
 
- Protected Member Functions inherited from searx.sqlitedb.SQLiteAppl
 _compatibility (self)
 
sqlite3.Connection _connect (self)
 

Additional Inherited Members

- Protected Attributes inherited from searx.sqlitedb.SQLiteAppl
bool _init_done = False
 
 _DB = None
 

Detailed Description

Favicon cache that manages the favicon BLOBs in a SQLite DB.  The DB
model in the SQLite DB is implemented using the abstract class
:py:obj:`sqlitedb.SQLiteAppl`.

To inspect the DB, enter the developer environment and run the following
command to show the cache state::

    $ ./manage pyenv.cmd bash --norc --noprofile
    (py3) python -m searx.favicons cache state

The following configurations are required / supported:

- :py:obj:`FaviconCacheConfig.db_url`
- :py:obj:`FaviconCacheConfig.HOLD_TIME`
- :py:obj:`FaviconCacheConfig.LIMIT_TOTAL_BYTES`
- :py:obj:`FaviconCacheConfig.BLOB_MAX_BYTES`
- :py:obj:`FaviconCacheConfig.MAINTENANCE_PERIOD`
- :py:obj:`FaviconCacheConfig.MAINTENANCE_MODE`

Definition at line 234 of file cache.py.

Constructor & Destructor Documentation

◆ __init__()

searx.favicons.cache.FaviconCacheSQLite.__init__ ( self,
FaviconCacheConfig cfg )
An instance of the favicon cache is built from the configuration.

Definition at line 310 of file cache.py.

310     def __init__(self, cfg: FaviconCacheConfig):
311         """An instance of the favicon cache is built from the configuration."""  #
312
313         if cfg.db_url == ":memory:":
314             logger.critical("don't use SQLite DB in :memory: in production!!")
315         super().__init__(cfg.db_url)
316         self.cfg = cfg
317

References __init__().

Referenced by __init__().

Member Function Documentation

◆ __call__()

None | tuple[None | bytes, None | str] searx.favicons.cache.FaviconCacheSQLite.__call__ ( self,
str resolver,
str authority )

Definition at line 318 of file cache.py.

318     def __call__(self, resolver: str, authority: str) -> None | tuple[None | bytes, None | str]:
319
320         sql = "SELECT sha256 FROM blob_map WHERE resolver = ? AND authority = ?"
321         res = self.DB.execute(sql, (resolver, authority)).fetchone()
322         if res is None:
323             return None
324
325         data, mime = (None, None)
326         sha256 = res[0]
327         if sha256 == FALLBACK_ICON:
328             return data, mime
329
330         sql = "SELECT data, mime FROM blobs WHERE sha256 = ?"
331         res = self.DB.execute(sql, (sha256,)).fetchone()
332         if res is not None:
333             data, mime = res
334         return data, mime
335

References searx.cache.ExpireCacheSQLite.DB, searx.sqlitedb.SQLiteAppl.DB(), and searx.sqlitedb.SQLiteProperties.DB.
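
The listing above distinguishes three outcomes: `None` when the pair (resolver, authority) was never cached, `(None, None)` when the fallback icon was registered, and `(data, mime)` otherwise. The following is a minimal sketch of that two-step lookup against an in-memory database. The function name `lookup`, the sample rows, and the value of `FALLBACK_ICON` are assumptions made for illustration; the real sentinel is defined in cache.py.

```python
import sqlite3

# Assumption: placeholder sentinel; the real value is defined in cache.py.
FALLBACK_ICON = "None"


def lookup(conn, resolver, authority):
    """Return None (not cached), (None, None) (fallback) or (data, mime)."""
    row = conn.execute(
        "SELECT sha256 FROM blob_map WHERE resolver = ? AND authority = ?",
        (resolver, authority),
    ).fetchone()
    if row is None:
        return None
    if row[0] == FALLBACK_ICON:
        return None, None
    blob = conn.execute(
        "SELECT data, mime FROM blobs WHERE sha256 = ?", (row[0],)
    ).fetchone()
    return (None, None) if blob is None else (blob[0], blob[1])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER,"
    " mime TEXT NOT NULL, data BLOB NOT NULL)"
)
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER, sha256 TEXT, resolver TEXT,"
    " authority TEXT, PRIMARY KEY (resolver, authority))"
)
# Hypothetical sample data: one real icon and one fallback registration.
conn.execute("INSERT INTO blobs VALUES ('abc', 3, 'image/png', ?)", (b"png",))
conn.execute("INSERT INTO blob_map VALUES (0, 'abc', 'demo', 'example.org')")
conn.execute(
    "INSERT INTO blob_map VALUES (0, ?, 'demo', 'no-icon.example')",
    (FALLBACK_ICON,),
)

cached = lookup(conn, "demo", "example.org")
fallback = lookup(conn, "demo", "no-icon.example")
unknown = lookup(conn, "demo", "never-seen.example")
```

A caller can use the `None` result to decide that a resolver has to be asked, while `(None, None)` means the resolver was already asked and found nothing.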

◆ _query_val()

searx.favicons.cache.FaviconCacheSQLite._query_val ( self,
sql,
default = None )
protected

Definition at line 427 of file cache.py.

427     def _query_val(self, sql, default=None):
428         val = self.DB.execute(sql).fetchone()
429         if val is not None:
430             val = val[0]
431         if val is None:
432             val = default
433         return val
434

References searx.cache.ExpireCacheSQLite.DB, searx.sqlitedb.SQLiteAppl.DB(), and searx.sqlitedb.SQLiteProperties.DB.

Referenced by state().
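
The second `None` check matters because an aggregate such as `SUM()` yields a row containing NULL when the table is empty. A minimal sketch of the same pattern, assuming a standalone helper named `query_val` (hypothetical) and a reduced `blobs` table:

```python
import sqlite3


def query_val(conn, sql, default=None):
    """Return the first column of the first row, or default for no row / NULL."""
    row = conn.execute(sql).fetchone()
    val = row[0] if row is not None else None
    return default if val is None else val


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")

# On an empty table SUM() returns NULL, count(*) returns 0.
empty_sum = query_val(conn, "SELECT SUM(bytes_c) FROM blobs", 0)
empty_count = query_val(conn, "SELECT count(*) FROM blobs", 0)

conn.execute("INSERT INTO blobs VALUES ('a', 10)")
conn.execute("INSERT INTO blobs VALUES ('b', 32)")
filled_sum = query_val(conn, "SELECT SUM(bytes_c) FROM blobs", 0)
```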

◆ maintenance()

searx.favicons.cache.FaviconCacheSQLite.maintenance ( self,
force = False )
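Performs maintenance on the cache: drops blob_map items older than HOLD_TIME
together with BLOBs that are no longer referenced, drops the oldest items
when the total size exceeds LIMIT_TOTAL_BYTES, and truncates the WAL.  Unless
force is set, the call returns early while the next maintenance time has not
been reached.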

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 379 of file cache.py.

379     def maintenance(self, force=False):
380
381         # Prevent parallel DB maintenance cycles from other DB connections
382         # (e.g. in multi thread or process environments).
383
384         if not force and int(time.time()) < self.next_maintenance_time:
385             logger.debug("no maintenance required yet, next maintenance interval is in the future")
386             return
387         self.properties.set("LAST_MAINTENANCE", "")  # hint: this (also) sets the m_time of the property!
388
389         # Do maintenance tasks.  This can take a little more time; to avoid
390         # DB locks, establish a new DB connection.
391
392         with self.connect() as conn:
393
394             # drop items not in HOLD time
395             res = conn.execute(
396                 f"DELETE FROM blob_map"
397                 f" WHERE cast(m_time as integer) < cast(strftime('%s', 'now') as integer) - {self.cfg.HOLD_TIME}"
398             )
399             logger.debug("dropped %s obsolete blob_map items from db", res.rowcount)
400             res = conn.execute(self.SQL_DROP_LEFTOVER_BLOBS)
401             logger.debug("dropped %s obsolete BLOBS from db", res.rowcount)
402
403             # drop old items to be in LIMIT_TOTAL_BYTES
404             total_bytes = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0] or 0
405             if total_bytes > self.cfg.LIMIT_TOTAL_BYTES:
406
407                 x = total_bytes - self.cfg.LIMIT_TOTAL_BYTES
408                 c = 0
409                 sha_list = []
410                 for row in conn.execute(self.SQL_ITER_BLOBS_SHA256_BYTES_C):
411                     sha256, bytes_c = row
412                     sha_list.append(sha256)
413                     c += bytes_c
414                     if c > x:
415                         break
416                 if sha_list:
417                     conn.execute("DELETE FROM blobs WHERE sha256 IN ('%s')" % "','".join(sha_list))
418                     conn.execute("DELETE FROM blob_map WHERE sha256 IN ('%s')" % "','".join(sha_list))
419                     logger.debug("dropped %s blobs with total size of %s bytes", len(sha_list), c)
420
421         # Vacuuming the WALs
422         # https://www.theunterminatedstring.com/sqlite-vacuuming/
423
424         conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
425         conn.close()
426
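
The size limit step in the listing above walks the BLOBs from oldest to newest and collects hashes until the freed bytes exceed the excess over LIMIT_TOTAL_BYTES. A minimal sketch of that selection follows, assuming a helper named `pick_evictions`, reduced table definitions, and sample sizes that are all hypothetical. Unlike the listing, the sketch passes the hashes as bound parameters instead of joining them into the SQL string.

```python
import sqlite3


def pick_evictions(rows, excess):
    """rows: (sha256, bytes_c) pairs, oldest first.

    Collect hashes until the freed byte count is greater than excess.
    """
    sha_list, freed = [], 0
    for sha256, bytes_c in rows:
        sha_list.append(sha256)
        freed += bytes_c
        if freed > excess:
            break
    return sha_list, freed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER, sha256 TEXT, resolver TEXT,"
    " authority TEXT, PRIMARY KEY (resolver, authority))"
)
conn.executemany(
    "INSERT INTO blobs VALUES (?, ?)", [("a", 100), ("b", 200), ("c", 300)]
)
conn.executemany(
    "INSERT INTO blob_map VALUES (?, ?, 'demo', ?)",
    [(1, "a", "a.example"), (2, "b", "b.example"), (3, "c", "c.example")],
)

LIMIT_TOTAL_BYTES = 450  # hypothetical limit for this sketch
total = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0] or 0
rows = conn.execute(
    "SELECT b.sha256, b.bytes_c FROM blobs b"
    " JOIN blob_map bm ON b.sha256 = bm.sha256"
    " ORDER BY bm.m_time ASC"
).fetchall()
sha_list, freed = pick_evictions(rows, total - LIMIT_TOTAL_BYTES)

# One "?" placeholder per hash keeps the values out of the SQL text.
marks = ",".join("?" * len(sha_list))
conn.execute(f"DELETE FROM blobs WHERE sha256 IN ({marks})", sha_list)
conn.execute(f"DELETE FROM blob_map WHERE sha256 IN ({marks})", sha_list)
remaining = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0]
```

With 600 bytes stored and a limit of 450, the excess is 150 bytes. The oldest BLOB (100 bytes) is not enough, so the second one is dropped as well and 300 bytes remain.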

◆ next_maintenance_time()

int searx.favicons.cache.FaviconCacheSQLite.next_maintenance_time ( self)
Returns (unix epoch) time of the next maintenance.

Definition at line 374 of file cache.py.

374     def next_maintenance_time(self) -> int:
375         """Returns (unix epoch) time of the next maintenance."""
376
377         return self.cfg.MAINTENANCE_PERIOD + self.properties.m_time("LAST_MAINTENANCE")
378

References searx.botdetection.config.Config.cfg, searx.cache.ExpireCacheSQLite.cfg, cfg, and searx.sqlitedb.SQLiteAppl.properties.
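
The arithmetic is a plain sum of the last maintenance timestamp and the configured period. A small worked sketch, assuming both values are in seconds and using hypothetical function names and sample numbers:

```python
def next_maintenance_time(last_maintenance: int, period: int) -> int:
    """Unix epoch time (seconds) at which the next maintenance becomes due."""
    return last_maintenance + period


def maintenance_due(now: int, last_maintenance: int, period: int, force: bool = False) -> bool:
    """Mirror of the guard in maintenance(): skip while now is before the due time."""
    return force or now >= next_maintenance_time(last_maintenance, period)


last_run = 1_700_000_000  # hypothetical m_time of the LAST_MAINTENANCE property
period = 3600             # hypothetical MAINTENANCE_PERIOD in seconds
due_at = next_maintenance_time(last_run, period)

too_early = maintenance_due(due_at - 1, last_run, period)
on_time = maintenance_due(due_at, last_run, period)
forced = maintenance_due(due_at - 1, last_run, period, force=True)
```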

◆ set()

bool searx.favicons.cache.FaviconCacheSQLite.set ( self,
str resolver,
str authority,
str | None mime,
bytes | None data )
Set data and mime-type in the cache.  If data is None, the
:py:obj:`FALLBACK_ICON` is registered in the cache.

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 336 of file cache.py.

336     def set(self, resolver: str, authority: str, mime: str | None, data: bytes | None) -> bool:
337
338         if self.cfg.MAINTENANCE_MODE == "auto" and int(time.time()) > self.next_maintenance_time:
339             # Should automatic maintenance be moved to a new thread?
340             self.maintenance()
341
342         if data is not None and mime is None:
343             logger.error(
344                 "favicon resolver %s tries to cache mime-type None for authority %s",
345                 resolver,
346                 authority,
347             )
348             return False
349
350         bytes_c = len(data or b"")
351         if bytes_c > self.cfg.BLOB_MAX_BYTES:
352             logger.info(
353                 "favicon of resolver: %s / authority: %s too big to cache (bytes: %s) " % (resolver, authority, bytes_c)
354             )
355             return False
356
357         if data is None:
358             sha256 = FALLBACK_ICON
359         else:
360             sha256 = hashlib.sha256(data).hexdigest()
361
362         with self.connect() as conn:
363             if sha256 != FALLBACK_ICON:
364                 conn.execute(self.SQL_INSERT_BLOBS, (sha256, bytes_c, mime, data))
365             conn.execute(self.SQL_INSERT_BLOB_MAP, (sha256, resolver, authority))
366         # hint: the with context of the connection object closes the transaction
367         # but not the DB connection.  The connection has to be closed by the
368         # caller of self.connect()!
369         conn.close()
370
371         return True
372

References searx.botdetection.config.Config.cfg, searx.cache.ExpireCacheSQLite.cfg, cfg, searx.sqlitedb.SQLiteAppl.connect(), searx.cache.ExpireCache.maintenance(), searx.cache.ExpireCacheSQLite.maintenance(), searx.favicons.cache.FaviconCache.maintenance(), searx.cache.ExpireCacheSQLite.next_maintenance_time, next_maintenance_time, SQL_INSERT_BLOB_MAP, and SQL_INSERT_BLOBS.
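
Because BLOBs are keyed by their SHA-256 hash, identical favicon bytes are stored once and shared by every (resolver, authority) pair that maps to them. The following minimal sketch shows the effect with an in-memory database. The function name `store` and the sample data are hypothetical, and the upsert names its conflict target explicitly, which differs slightly from SQL_INSERT_BLOB_MAP and needs an SQLite version with upsert support.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER,"
    " mime TEXT NOT NULL, data BLOB NOT NULL)"
)
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER DEFAULT (strftime('%s', 'now')),"
    " sha256 TEXT, resolver TEXT, authority TEXT,"
    " PRIMARY KEY (resolver, authority))"
)


def store(resolver, authority, mime, data):
    """Insert the BLOB once per hash and (re)point the mapping at it."""
    sha256 = hashlib.sha256(data).hexdigest()
    with conn:  # commits the transaction, does not close the connection
        conn.execute(
            "INSERT INTO blobs (sha256, bytes_c, mime, data) VALUES (?, ?, ?, ?)"
            " ON CONFLICT (sha256) DO NOTHING",
            (sha256, len(data), mime, data),
        )
        conn.execute(
            "INSERT INTO blob_map (sha256, resolver, authority) VALUES (?, ?, ?)"
            " ON CONFLICT (resolver, authority) DO UPDATE"
            " SET sha256=excluded.sha256, m_time=strftime('%s', 'now')",
            (sha256, resolver, authority),
        )
    return sha256


icon = b"\x89PNG-demo-bytes"  # hypothetical favicon payload
first = store("demo", "a.example", "image/png", icon)
second = store("demo", "b.example", "image/png", icon)      # same bytes, new authority
third = store("demo", "a.example", "image/png", b"other")   # a.example gets a new icon

blob_count = conn.execute("SELECT count(*) FROM blobs").fetchone()[0]
map_count = conn.execute("SELECT count(*) FROM blob_map").fetchone()[0]
```

The mapping for `a.example` is updated in place, so the table keeps one row per (resolver, authority) pair while the first icon stays available for `b.example`.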

◆ state()

FaviconCacheStats searx.favicons.cache.FaviconCacheSQLite.state ( self)
Returns a :py:obj:`FaviconCacheStats` (key/values) with information
on the state of the cache.

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 435 of file cache.py.

435     def state(self) -> FaviconCacheStats:
436         return FaviconCacheStats(
437             favicons=self._query_val("SELECT count(*) FROM blobs", 0),
438             bytes=self._query_val("SELECT SUM(bytes_c) FROM blobs", 0),
439             domains=self._query_val("SELECT count(*) FROM (SELECT authority FROM blob_map GROUP BY authority)", 0),
440             resolvers=self._query_val("SELECT count(*) FROM (SELECT resolver FROM blob_map GROUP BY resolver)", 0),
441         )
442
443

References _query_val().

Member Data Documentation

◆ cfg

searx.favicons.cache.FaviconCacheSQLite.cfg = cfg

Definition at line 316 of file cache.py.

Referenced by next_maintenance_time(), searx.cache.ExpireCache.secret_hash(), and set().

◆ DDL_BLOB_MAP

str searx.favicons.cache.FaviconCacheSQLite.DDL_BLOB_MAP
static
Initial value:
= """\
CREATE TABLE IF NOT EXISTS blob_map (
    m_time INTEGER DEFAULT (strftime('%s', 'now')),  -- last modified (unix epoch) time in sec.
    sha256 TEXT,
    resolver TEXT,
    authority TEXT,
    PRIMARY KEY (resolver, authority))"""

Definition at line 267 of file cache.py.

◆ DDL_BLOBS

str searx.favicons.cache.FaviconCacheSQLite.DDL_BLOBS
static
Initial value:
= """\
CREATE TABLE IF NOT EXISTS blobs (
    sha256 TEXT,
    bytes_c INTEGER,
    mime TEXT NOT NULL,
    data BLOB NOT NULL,
    PRIMARY KEY (sha256))"""

Definition at line 257 of file cache.py.

◆ next_maintenance_time

searx.favicons.cache.FaviconCacheSQLite.next_maintenance_time

Definition at line 384 of file cache.py.

Referenced by set().

◆ SQL_DROP_LEFTOVER_BLOBS

tuple searx.favicons.cache.FaviconCacheSQLite.SQL_DROP_LEFTOVER_BLOBS
static
Initial value:
= (
    "DELETE FROM blobs WHERE sha256 IN ("
    " SELECT b.sha256"
    " FROM blobs b"
    " LEFT JOIN blob_map bm"
    " ON b.sha256 = bm.sha256"
    " WHERE bm.sha256 IS NULL)"
)

Definition at line 282 of file cache.py.
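
The statement is an anti-join: the LEFT JOIN keeps every BLOB, and the `IS NULL` filter selects those without a matching blob_map row. A minimal sketch with reduced table definitions and hypothetical sample rows:

```python
import sqlite3

SQL_DROP_LEFTOVER_BLOBS = (
    "DELETE FROM blobs WHERE sha256 IN ("
    " SELECT b.sha256"
    " FROM blobs b"
    " LEFT JOIN blob_map bm"
    " ON b.sha256 = bm.sha256"
    " WHERE bm.sha256 IS NULL)"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER, sha256 TEXT, resolver TEXT,"
    " authority TEXT, PRIMARY KEY (resolver, authority))"
)
# 'kept' is still referenced by a mapping, 'orphan' is not.
conn.execute("INSERT INTO blobs VALUES ('kept', 10)")
conn.execute("INSERT INTO blobs VALUES ('orphan', 20)")
conn.execute("INSERT INTO blob_map VALUES (0, 'kept', 'demo', 'example.org')")

dropped = conn.execute(SQL_DROP_LEFTOVER_BLOBS).rowcount
left = [row[0] for row in conn.execute("SELECT sha256 FROM blobs")]
```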

◆ SQL_INSERT_BLOB_MAP

searx.favicons.cache.FaviconCacheSQLite.SQL_INSERT_BLOB_MAP
static
Initial value:
= (
    "INSERT INTO blob_map (sha256, resolver, authority) VALUES (?, ?, ?)"
    " ON CONFLICT DO UPDATE "
    " SET sha256=excluded.sha256, m_time=strftime('%s', 'now')"
)

Definition at line 304 of file cache.py.

Referenced by set().

◆ SQL_INSERT_BLOBS

searx.favicons.cache.FaviconCacheSQLite.SQL_INSERT_BLOBS
static
Initial value:
= (
    "INSERT INTO blobs (sha256, bytes_c, mime, data) VALUES (?, ?, ?, ?)"
    " ON CONFLICT (sha256) DO NOTHING"
)

Definition at line 299 of file cache.py.

Referenced by set().

◆ SQL_ITER_BLOBS_SHA256_BYTES_C

searx.favicons.cache.FaviconCacheSQLite.SQL_ITER_BLOBS_SHA256_BYTES_C
static
Initial value:
= (
    "SELECT b.sha256, b.bytes_c FROM blobs b"
    " JOIN blob_map bm "
    " ON b.sha256 = bm.sha256"
    " ORDER BY bm.m_time ASC"
)

Definition at line 292 of file cache.py.


The documentation for this class was generated from the following file: