.oO SearXNG Developer Documentation Oo.
searx.favicons.cache.FaviconCacheSQLite Class Reference

Public Member Functions

 __init__ (self, FaviconCacheConfig cfg)
None|tuple[None|bytes, None|str] __call__ (self, str resolver, str authority)
bool set (self, str resolver, str authority, str|None mime, bytes|None data)
int next_maintenance_time (self)
 maintenance (self, bool force=False)
FaviconCacheStats state (self)
Public Member Functions inherited from searx.sqlitedb.SQLiteAppl
 __init__ (self, str db_url)
sqlite3.Connection connect (self)
 register_functions (self, sqlite3.Connection conn)
sqlite3.Connection DB (self)
bool init (self, sqlite3.Connection conn)
 create_schema (self, sqlite3.Connection conn)
Public Member Functions inherited from searx.favicons.cache.FaviconCache
 __init__ (self, FaviconCacheConfig cfg)
None|tuple[None|bytes, None|str] __call__ (self, str resolver, str authority)

Public Attributes

 cfg = cfg
 next_maintenance_time
Public Attributes inherited from searx.sqlitedb.SQLiteAppl
str db_url = db_url
SQLiteProperties properties = SQLiteProperties(db_url)

Static Public Attributes

str DDL_BLOBS
str DDL_BLOB_MAP
str SQL_DROP_LEFTOVER_BLOBS
str SQL_ITER_BLOBS_SHA256_BYTES_C
str SQL_INSERT_BLOBS
str SQL_INSERT_BLOB_MAP
Static Public Attributes inherited from searx.sqlitedb.SQLiteAppl
dict DDL_CREATE_TABLES = {}
int DB_SCHEMA = 1
dict SQLITE_THREADING_MODE
str SQLITE_JOURNAL_MODE = "WAL"
dict SQLITE_CONNECT_ARGS

Protected Member Functions

 _query_val (self, str sql, t.Any default=None)
Protected Member Functions inherited from searx.sqlitedb.SQLiteAppl
 _compatibility (self)
sqlite3.Connection _connect (self)

Additional Inherited Members

Protected Attributes inherited from searx.sqlitedb.SQLiteAppl
bool _init_done = False
sqlite3.Connection|None _DB = None

Detailed Description

Favicon cache that manages the favicon BLOBs in a SQLite DB.  The DB
model in the SQLite DB is implemented using the abstract class
:py:obj:`sqlitedb.SQLiteAppl`.

To inspect the DB, enter the developer environment and run the command that
shows the cache state::

    $ ./manage pyenv.cmd bash --norc --noprofile
    (py3) python -m searx.favicons cache state

The following configurations are required / supported:

- :py:obj:`FaviconCacheConfig.db_url`
- :py:obj:`FaviconCacheConfig.HOLD_TIME`
- :py:obj:`FaviconCacheConfig.LIMIT_TOTAL_BYTES`
- :py:obj:`FaviconCacheConfig.BLOB_MAX_BYTES`
- :py:obj:`FaviconCacheConfig.MAINTENANCE_PERIOD`
- :py:obj:`FaviconCacheConfig.MAINTENANCE_MODE`

Definition at line 236 of file cache.py.

Constructor & Destructor Documentation

◆ __init__()

searx.favicons.cache.FaviconCacheSQLite.__init__ ( self,
FaviconCacheConfig cfg )
An instance of the favicon cache is built from the configuration.

Definition at line 312 of file cache.py.

312 def __init__(self, cfg: FaviconCacheConfig):
313     """An instance of the favicon cache is built from the configuration."""
314
315     if cfg.db_url == ":memory:":
316         logger.critical("don't use SQLite DB in :memory: in production!!")
317     super().__init__(cfg.db_url)
318     self.cfg = cfg
319

References __init__().

Referenced by __init__().


Member Function Documentation

◆ __call__()

None | tuple[None | bytes, None | str] searx.favicons.cache.FaviconCacheSQLite.__call__ ( self,
str resolver,
str authority )

Definition at line 320 of file cache.py.

320 def __call__(self, resolver: str, authority: str) -> None | tuple[None | bytes, None | str]:
321
322     sql = "SELECT sha256 FROM blob_map WHERE resolver = ? AND authority = ?"
323     res = self.DB.execute(sql, (resolver, authority)).fetchone()
324     if res is None:
325         return None
326
327     data, mime = (None, None)
328     sha256 = res[0]
329     if sha256 == FALLBACK_ICON:
330         return data, mime
331
332     sql = "SELECT data, mime FROM blobs WHERE sha256 = ?"
333     res = self.DB.execute(sql, (sha256,)).fetchone()
334     if res is not None:
335         data, mime = res
336     return data, mime
337

References searx.cache.ExpireCacheSQLite.DB, searx.sqlitedb.SQLiteAppl.DB(), and searx.sqlitedb.SQLiteProperties.DB.
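
The listing above distinguishes three outcomes: no entry at all (`None`), an entry that records a missing icon (`(None, None)`), and a cached icon (`(data, mime)`). The following is a minimal sketch of that two-step lookup on a plain in-memory `sqlite3` connection. The function name `lookup`, the sample rows, and the value of `FALLBACK_ICON` are assumptions made for illustration; they are not taken from cache.py.

```python
import sqlite3

# Assumption: placeholder marker; the real FALLBACK_ICON value is not shown on this page.
FALLBACK_ICON = "FALLBACK_ICON"


def lookup(conn, resolver, authority):
    """Two-step lookup: blob_map gives the hash, blobs gives the payload."""
    row = conn.execute(
        "SELECT sha256 FROM blob_map WHERE resolver = ? AND authority = ?",
        (resolver, authority),
    ).fetchone()
    if row is None:
        return None  # nothing cached for this resolver / authority pair
    if row[0] == FALLBACK_ICON:
        return None, None  # cached knowledge: this authority has no icon
    blob = conn.execute(
        "SELECT data, mime FROM blobs WHERE sha256 = ?", (row[0],)
    ).fetchone()
    return (None, None) if blob is None else (blob[0], blob[1])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER,"
    " mime TEXT NOT NULL, data BLOB NOT NULL)"
)
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER DEFAULT (strftime('%s', 'now')),"
    " sha256 TEXT, resolver TEXT, authority TEXT,"
    " PRIMARY KEY (resolver, authority))"
)
conn.execute("INSERT INTO blobs VALUES ('abc', 3, 'image/png', ?)", (b"png",))
conn.execute(
    "INSERT INTO blob_map (sha256, resolver, authority)"
    " VALUES ('abc', 'demo', 'example.org')"
)
conn.execute(
    "INSERT INTO blob_map (sha256, resolver, authority)"
    " VALUES (?, 'demo', 'no-icon.example')",
    (FALLBACK_ICON,),
)

miss = lookup(conn, "demo", "unknown.example")
fallback = lookup(conn, "demo", "no-icon.example")
hit = lookup(conn, "demo", "example.org")
```

A caller can therefore tell "never resolved" apart from "resolved, but no icon exists" without a second query.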


◆ _query_val()

searx.favicons.cache.FaviconCacheSQLite._query_val ( self,
str sql,
t.Any default = None )
protected

Definition at line 429 of file cache.py.

429 def _query_val(self, sql: str, default: t.Any = None):
430     val = self.DB.execute(sql).fetchone()
431     if val is not None:
432         val = val[0]
433     if val is None:
434         val = default
435     return val
436

References searx.cache.ExpireCacheSQLite.DB, searx.sqlitedb.SQLiteAppl.DB(), and searx.sqlitedb.SQLiteProperties.DB.

Referenced by state().
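
The second `None` check in the listing matters because an aggregate such as `SUM()` over an empty table yields a row that contains SQL `NULL`, not an empty result. A minimal sketch of that behavior with the standard `sqlite3` module follows; the free function `query_val` and the reduced `blobs` table are illustrative assumptions, not the class method itself.

```python
import sqlite3


def query_val(conn, sql, default=None):
    """Return the first column of the first row, or default for no row / NULL."""
    row = conn.execute(sql).fetchone()
    val = row[0] if row is not None else None
    return default if val is None else val


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")

# SUM() over zero rows returns one row holding NULL, so the default applies.
empty_sum = query_val(conn, "SELECT SUM(bytes_c) FROM blobs", 0)
# count(*) over zero rows returns 0, so the default is not needed.
empty_count = query_val(conn, "SELECT count(*) FROM blobs", 0)
# A query that matches nothing returns no row at all.
no_row = query_val(conn, "SELECT bytes_c FROM blobs WHERE sha256 = 'zzz'", -1)

conn.execute("INSERT INTO blobs VALUES ('a', 10), ('b', 32)")
filled_sum = query_val(conn, "SELECT SUM(bytes_c) FROM blobs", 0)
```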


◆ maintenance()

searx.favicons.cache.FaviconCacheSQLite.maintenance ( self,
bool force = False )
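Performs maintenance on the cache: drops blob_map entries older than HOLD_TIME, removes BLOBs that are no longer referenced, evicts the oldest entries while the total size exceeds LIMIT_TOTAL_BYTES, and truncates the WAL.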

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 381 of file cache.py.

381 def maintenance(self, force: bool = False):
382
383     # Prevent parallel DB maintenance cycles from other DB connections
384     # (e.g. in multi thread or process environments).
385
386     if not force and int(time.time()) < self.next_maintenance_time:
387         logger.debug("no maintenance required yet, next maintenance interval is in the future")
388         return
389     self.properties.set("LAST_MAINTENANCE", "")  # hint: this (also) sets the m_time of the property!
390
391     # Do maintenance tasks. This can take a little more time; to avoid
392     # DB locks, establish a new DB connection.
393
394     with self.connect() as conn:
395
396         # drop items not in HOLD time
397         res = conn.execute(
398             f"DELETE FROM blob_map"
399             f" WHERE cast(m_time as integer) < cast(strftime('%s', 'now') as integer) - {self.cfg.HOLD_TIME}"
400         )
401         logger.debug("dropped %s obsolete blob_map items from db", res.rowcount)
402         res = conn.execute(self.SQL_DROP_LEFTOVER_BLOBS)
403         logger.debug("dropped %s obsolete BLOBS from db", res.rowcount)
404
405         # drop old items to be in LIMIT_TOTAL_BYTES
406         total_bytes = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0] or 0
407         if total_bytes > self.cfg.LIMIT_TOTAL_BYTES:
408
409             x = total_bytes - self.cfg.LIMIT_TOTAL_BYTES
410             c = 0
411             sha_list: list[str] = []
412             for row in conn.execute(self.SQL_ITER_BLOBS_SHA256_BYTES_C):
413                 sha256, bytes_c = row
414                 sha_list.append(sha256)
415                 c += bytes_c
416                 if c > x:
417                     break
418             if sha_list:
419                 conn.execute("DELETE FROM blobs WHERE sha256 IN ('%s')" % "','".join(sha_list))
420                 conn.execute("DELETE FROM blob_map WHERE sha256 IN ('%s')" % "','".join(sha_list))
421                 logger.debug("dropped %s blobs with total size of %s bytes", len(sha_list), c)
422
423         # Vacuuming the WALs
424         # https://www.theunterminatedstring.com/sqlite-vacuuming/
425
426         conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
427     conn.close()
428
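
The size-limiting step of the listing can be sketched in isolation. The example below is a simplified illustration under stated assumptions, not the method itself: `evict_oldest` is a hypothetical helper, the tables are reduced to the columns the step needs, and the `DELETE` statements use `?` placeholders where the listing formats the hash list into the SQL string.

```python
import sqlite3


def evict_oldest(conn, limit_total_bytes):
    """Delete the oldest blobs until more than the excess has been freed."""
    total = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0] or 0
    if total <= limit_total_bytes:
        return []
    excess = total - limit_total_bytes
    freed = 0
    victims = []
    rows = conn.execute(
        "SELECT b.sha256, b.bytes_c FROM blobs b"
        " JOIN blob_map bm ON b.sha256 = bm.sha256"
        " ORDER BY bm.m_time ASC"
    ).fetchall()
    for sha256, bytes_c in rows:
        victims.append(sha256)
        freed += bytes_c
        if freed > excess:  # same strict comparison as in the listing
            break
    if victims:
        marks = ",".join("?" * len(victims))
        conn.execute(f"DELETE FROM blobs WHERE sha256 IN ({marks})", victims)
        conn.execute(f"DELETE FROM blob_map WHERE sha256 IN ({marks})", victims)
    return victims


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER, sha256 TEXT, resolver TEXT,"
    " authority TEXT, PRIMARY KEY (resolver, authority))"
)
# Three blobs of 100 bytes each; a smaller m_time means an older entry.
for age, name in enumerate(["old", "mid", "new"], start=1):
    conn.execute("INSERT INTO blobs VALUES (?, 100)", (name,))
    conn.execute(
        "INSERT INTO blob_map VALUES (?, ?, 'demo', ?)",
        (age, name, name + ".example"),
    )

dropped = evict_oldest(conn, limit_total_bytes=250)
remaining = conn.execute("SELECT SUM(bytes_c) FROM blobs").fetchone()[0]
```

With 300 bytes stored and a limit of 250, the excess is 50 bytes, so removing the single oldest blob is enough.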

◆ next_maintenance_time()

int searx.favicons.cache.FaviconCacheSQLite.next_maintenance_time ( self)
Returns (unix epoch) time of the next maintenance.

Definition at line 376 of file cache.py.

376 def next_maintenance_time(self) -> int:
377     """Returns (unix epoch) time of the next maintenance."""
378
379     return self.cfg.MAINTENANCE_PERIOD + self.properties.m_time("LAST_MAINTENANCE")
380

References searx.botdetection.config.Config.cfg, searx.cache.ExpireCacheSQLite.cfg, cfg, and searx.sqlitedb.SQLiteAppl.properties.
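
As a worked example of the arithmetic above, and of how `maintenance()` compares the result with the current time, consider the sketch below. The helper names and the timestamps are illustrative assumptions; in the class, the last maintenance time comes from `self.properties.m_time("LAST_MAINTENANCE")`.

```python
def next_maintenance_time(last_maintenance: int, maintenance_period: int) -> int:
    """Unix epoch time at which the next maintenance becomes due."""
    return maintenance_period + last_maintenance


def maintenance_due(now: int, last_maintenance: int, maintenance_period: int,
                    force: bool = False) -> bool:
    """Mirror of the guard in maintenance(): skip only if not forced and too early."""
    return force or now >= next_maintenance_time(last_maintenance, maintenance_period)


# Illustrative values: last run at 1_700_000_000, period of one hour.
LAST = 1_700_000_000
PERIOD = 3600

due_at = next_maintenance_time(LAST, PERIOD)
too_early = maintenance_due(due_at - 1, LAST, PERIOD)
on_time = maintenance_due(due_at, LAST, PERIOD)
forced = maintenance_due(due_at - 1, LAST, PERIOD, force=True)
```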

◆ set()

bool searx.favicons.cache.FaviconCacheSQLite.set ( self,
str resolver,
str authority,
str | None mime,
bytes | None data )
Set data and mime-type in the cache.  If data is None, the
:py:obj:`FALLBACK_ICON` is registered in the cache.

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 338 of file cache.py.

338 def set(self, resolver: str, authority: str, mime: str | None, data: bytes | None) -> bool:
339
340     if self.cfg.MAINTENANCE_MODE == "auto" and int(time.time()) > self.next_maintenance_time:
341         # Should automatic maintenance be moved to a new thread?
342         self.maintenance()
343
344     if data is not None and mime is None:
345         logger.error(
346             "favicon resolver %s tries to cache mime-type None for authority %s",
347             resolver,
348             authority,
349         )
350         return False
351
352     bytes_c = len(data or b"")
353     if bytes_c > self.cfg.BLOB_MAX_BYTES:
354         logger.info(
355             "favicon of resolver: %s / authority: %s too big to cache (bytes: %s) " % (resolver, authority, bytes_c)
356         )
357         return False
358
359     if data is None:
360         sha256 = FALLBACK_ICON
361     else:
362         sha256 = hashlib.sha256(data).hexdigest()
363
364     with self.connect() as conn:
365         if sha256 != FALLBACK_ICON:
366             conn.execute(self.SQL_INSERT_BLOBS, (sha256, bytes_c, mime, data))
367         conn.execute(self.SQL_INSERT_BLOB_MAP, (sha256, resolver, authority))
368     # hint: the with context of the connection object closes the transaction
369     # but not the DB connection. The connection has to be closed by the
370     # caller of self.connect()!
371     conn.close()
372
373     return True
374

References searx.botdetection.config.Config.cfg, searx.cache.ExpireCacheSQLite.cfg, cfg, searx.sqlitedb.SQLiteAppl.connect(), searx.cache.ExpireCache.maintenance(), searx.cache.ExpireCacheSQLite.maintenance(), searx.favicons.cache.FaviconCache.maintenance(), searx.cache.ExpireCacheSQLite.next_maintenance_time, next_maintenance_time, SQL_INSERT_BLOB_MAP, and SQL_INSERT_BLOBS.
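
Because BLOBs are keyed by the SHA-256 of their content, identical icons served for different authorities are stored once and referenced several times. The sketch below illustrates that effect with the standard library only. The helper `store`, the sample data, and the `FALLBACK_ICON` value are assumptions; the `blob_map` statement also names its conflict target explicitly, which differs from the statement documented on this page and is done here only so the example runs on older SQLite releases that support upsert.

```python
import hashlib
import sqlite3

# Assumption: placeholder marker; the real FALLBACK_ICON value is not shown on this page.
FALLBACK_ICON = "FALLBACK_ICON"

SQL_INSERT_BLOBS = (
    "INSERT INTO blobs (sha256, bytes_c, mime, data) VALUES (?, ?, ?, ?)"
    " ON CONFLICT (sha256) DO NOTHING"
)
# Assumption: explicit conflict target (resolver, authority) added for portability.
SQL_INSERT_BLOB_MAP = (
    "INSERT INTO blob_map (sha256, resolver, authority) VALUES (?, ?, ?)"
    " ON CONFLICT (resolver, authority) DO UPDATE"
    " SET sha256=excluded.sha256, m_time=strftime('%s', 'now')"
)


def store(conn, resolver, authority, mime, data):
    """Store one icon (or the fallback marker) and return the hash that was mapped."""
    sha256 = FALLBACK_ICON if data is None else hashlib.sha256(data).hexdigest()
    with conn:  # commits the transaction, but leaves the connection open
        if data is not None:
            conn.execute(SQL_INSERT_BLOBS, (sha256, len(data), mime, data))
        conn.execute(SQL_INSERT_BLOB_MAP, (sha256, resolver, authority))
    return sha256


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blobs (sha256 TEXT, bytes_c INTEGER, mime TEXT NOT NULL,"
    " data BLOB NOT NULL, PRIMARY KEY (sha256))"
)
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER DEFAULT (strftime('%s', 'now')),"
    " sha256 TEXT, resolver TEXT, authority TEXT,"
    " PRIMARY KEY (resolver, authority))"
)

icon = b"\x89PNG demo bytes"
first = store(conn, "demo", "a.example", "image/png", icon)
second = store(conn, "demo", "b.example", "image/png", icon)
missing = store(conn, "demo", "c.example", None, None)

blob_rows = conn.execute("SELECT count(*) FROM blobs").fetchone()[0]
map_rows = conn.execute("SELECT count(*) FROM blob_map").fetchone()[0]
```

Two authorities share one BLOB row, and the authority without an icon gets a `blob_map` row only.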


◆ state()

FaviconCacheStats searx.favicons.cache.FaviconCacheSQLite.state ( self)
Returns a :py:obj:`FaviconCacheStats` (key/values) with information
on the state of the cache.

Reimplemented from searx.favicons.cache.FaviconCache.

Definition at line 437 of file cache.py.

437 def state(self) -> FaviconCacheStats:
438     return FaviconCacheStats(
439         favicons=self._query_val("SELECT count(*) FROM blobs", 0),
440         bytes=self._query_val("SELECT SUM(bytes_c) FROM blobs", 0),
441         domains=self._query_val("SELECT count(*) FROM (SELECT authority FROM blob_map GROUP BY authority)", 0),
442         resolvers=self._query_val("SELECT count(*) FROM (SELECT resolver FROM blob_map GROUP BY resolver)", 0),
443     )
444

References _query_val().


Member Data Documentation

◆ cfg

searx.favicons.cache.FaviconCacheSQLite.cfg = cfg

Definition at line 318 of file cache.py.

Referenced by next_maintenance_time(), searx.cache.ExpireCache.secret_hash(), and set().

◆ DDL_BLOB_MAP

str searx.favicons.cache.FaviconCacheSQLite.DDL_BLOB_MAP
static
Initial value:
= """\
CREATE TABLE IF NOT EXISTS blob_map (
    m_time      INTEGER DEFAULT (strftime('%s', 'now')),  -- last modified (unix epoch) time in sec.
    sha256      TEXT,
    resolver    TEXT,
    authority   TEXT,
    PRIMARY KEY (resolver, authority))"""

Definition at line 269 of file cache.py.

◆ DDL_BLOBS

str searx.favicons.cache.FaviconCacheSQLite.DDL_BLOBS
static
Initial value:
= """\
CREATE TABLE IF NOT EXISTS blobs (
    sha256      TEXT,
    bytes_c     INTEGER,
    mime        TEXT NOT NULL,
    data        BLOB NOT NULL,
    PRIMARY KEY (sha256))"""

Definition at line 259 of file cache.py.

◆ next_maintenance_time

searx.favicons.cache.FaviconCacheSQLite.next_maintenance_time

Definition at line 386 of file cache.py.

Referenced by set().

◆ SQL_DROP_LEFTOVER_BLOBS

str searx.favicons.cache.FaviconCacheSQLite.SQL_DROP_LEFTOVER_BLOBS
static
Initial value:
= (
    "DELETE FROM blobs WHERE sha256 IN ("
    " SELECT b.sha256"
    " FROM blobs b"
    " LEFT JOIN blob_map bm"
    " ON b.sha256 = bm.sha256"
    " WHERE bm.sha256 IS NULL)"
)

Definition at line 284 of file cache.py.
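
The statement removes BLOBs that no `blob_map` row references any more: the `LEFT JOIN` keeps every blob, and `bm.sha256 IS NULL` selects those without a mapping. A small sketch with made-up rows and tables reduced to the relevant columns (both assumptions for illustration):

```python
import sqlite3

SQL_DROP_LEFTOVER_BLOBS = (
    "DELETE FROM blobs WHERE sha256 IN ("
    " SELECT b.sha256"
    " FROM blobs b"
    " LEFT JOIN blob_map bm"
    " ON b.sha256 = bm.sha256"
    " WHERE bm.sha256 IS NULL)"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blobs (sha256 TEXT PRIMARY KEY, bytes_c INTEGER)")
conn.execute(
    "CREATE TABLE blob_map (sha256 TEXT, resolver TEXT, authority TEXT,"
    " PRIMARY KEY (resolver, authority))"
)
conn.execute("INSERT INTO blobs VALUES ('kept', 10), ('orphan', 20)")
conn.execute("INSERT INTO blob_map VALUES ('kept', 'demo', 'example.org')")

res = conn.execute(SQL_DROP_LEFTOVER_BLOBS)
dropped = res.rowcount  # number of rows removed by the DELETE
remaining = [row[0] for row in conn.execute("SELECT sha256 FROM blobs")]
```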

◆ SQL_INSERT_BLOB_MAP

searx.favicons.cache.FaviconCacheSQLite.SQL_INSERT_BLOB_MAP
static
Initial value:
= (
    "INSERT INTO blob_map (sha256, resolver, authority) VALUES (?, ?, ?)"
    " ON CONFLICT DO UPDATE "
    " SET sha256=excluded.sha256, m_time=strftime('%s', 'now')"
)

Definition at line 306 of file cache.py.

Referenced by set().
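
The upsert means that storing an icon for an already known (resolver, authority) pair replaces the hash and refreshes `m_time`, which restarts the HOLD_TIME window for that entry. The sketch below shows this on an in-memory database. It is an illustration under assumptions: the sample rows are made up, and the conflict target `(resolver, authority)` is written out explicitly, whereas the statement above omits it (a form that, as far as I know, only newer SQLite releases accept).

```python
import sqlite3

# Assumption: explicit conflict target so the sketch also runs on older SQLite.
UPSERT = (
    "INSERT INTO blob_map (sha256, resolver, authority) VALUES (?, ?, ?)"
    " ON CONFLICT (resolver, authority) DO UPDATE"
    " SET sha256=excluded.sha256, m_time=strftime('%s', 'now')"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blob_map (m_time INTEGER DEFAULT (strftime('%s', 'now')),"
    " sha256 TEXT, resolver TEXT, authority TEXT,"
    " PRIMARY KEY (resolver, authority))"
)

# Seed a stale row with an artificially old modification time.
conn.execute(
    "INSERT INTO blob_map (m_time, sha256, resolver, authority)"
    " VALUES (0, 'old-hash', 'demo', 'example.org')"
)

# Same primary key: the row is updated in place, not duplicated.
conn.execute(UPSERT, ("new-hash", "demo", "example.org"))
rows = conn.execute("SELECT sha256, m_time FROM blob_map").fetchall()
```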

◆ SQL_INSERT_BLOBS

searx.favicons.cache.FaviconCacheSQLite.SQL_INSERT_BLOBS
static
Initial value:
= (
    "INSERT INTO blobs (sha256, bytes_c, mime, data) VALUES (?, ?, ?, ?)"
    " ON CONFLICT (sha256) DO NOTHING"
)

Definition at line 301 of file cache.py.

Referenced by set().

◆ SQL_ITER_BLOBS_SHA256_BYTES_C

searx.favicons.cache.FaviconCacheSQLite.SQL_ITER_BLOBS_SHA256_BYTES_C
static
Initial value:
= (
    "SELECT b.sha256, b.bytes_c FROM blobs b"
    " JOIN blob_map bm "
    " ON b.sha256 = bm.sha256"
    " ORDER BY bm.m_time ASC"
)

Definition at line 294 of file cache.py.


The documentation for this class was generated from the following file:
  • /home/andrew/Documents/code/public/searxng/searx/favicons/cache.py