.oO SearXNG Developer Documentation Oo.
searx.engines.command Namespace Reference

Functions

 init (engine_settings)
 
 search (query, params)
 
 _get_command_to_run (query)
 
 _get_results_from_process (results, cmd, pageno)
 
 __get_results_limits (pageno)
 
 __check_query_params (params)
 
 check_parsing_options (engine_settings)
 
 __parse_single_result (raw_result)
 

Variables

str engine_type = 'offline'
 
bool paging = True
 
list command = []
 
dict delimiter = {}
 
dict parse_regex = {}
 
str query_type = ''
 
list query_enum = []
 
dict environment_variables = {}
 
 working_dir = realpath('.')
 
str result_separator = '\n'
 
str result_template = 'key-value.html'
 
float timeout = 4.0
 
 _command_logger = logger.getChild('command')
 
dict _compiled_parse_regex = {}
 

Detailed Description

*Command engines* let administrators integrate arbitrary shell commands as
search engines.

.. attention::

   When creating and enabling a ``command`` engine on a public instance, you
   must be careful to avoid leaking private data.

The easiest solution is to limit access by setting ``tokens`` as described in
section :ref:`private engines`.  The engine base is flexible: what it can do is
limited only by your imagination (and by security concerns).

Configuration
=============

The following options are available:

``command``:
  A comma-separated list of the elements of the command.  The special token
  ``{{QUERY}}`` marks where the user's search terms are inserted. Example:

  .. code:: yaml

     ['ls', '-l', '-h', '{{QUERY}}']

``delimiter``:
  A mapping containing the delimiter characters in ``chars`` and the *titles*
  of each element in ``keys``.

``parse_regex``:
  A dict containing the regular expressions for each result key.

``query_type``:

  The expected type of user search terms.  Possible values: ``path`` and
  ``enum``.

  ``path``:
    Checks if the user provided path is inside the working directory.  If not,
    the query is not executed.

  ``enum``:
    Restricts the search terms to the values listed in ``query_enum``.  If the
    user submits a term that is not in the list, the query returns an error.

``query_enum``:
  A list containing allowed search terms if ``query_type`` is set to ``enum``.

``working_dir``:
  The directory where the command has to be executed.  Default: ``./``.

``result_separator``:
  The character that separates results. Default: ``\n``.
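
To make the role of ``{{QUERY}}`` concrete, here is a minimal sketch modelled on ``_get_command_to_run()`` from this module.  The function name ``build_command`` and the sample query are illustrative assumptions, not part of the engine.

```python
from shlex import split as shlex_split


def build_command(command, query):
    """Replace the '{{QUERY}}' token with the shell-split search terms.

    `build_command` is a hypothetical stand-in for the engine's
    `_get_command_to_run`, without the query_type checks.
    """
    params = shlex_split(query)
    cmd = []
    for element in command:
        if element == '{{QUERY}}':
            cmd.extend(params)  # one argv entry per search term
        else:
            cmd.append(element)
    return cmd


print(build_command(['ls', '-l', '-h', '{{QUERY}}'], '"my docs" notes.txt'))
```

Because the query is split with ``shlex``, a quoted phrase stays a single argument, and every term becomes its own list element rather than being passed through a shell.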

Example
=======

The example engine below can be used to find files with a specific name in the
configured working directory:

.. code:: yaml

  - name: find
    engine: command
    command: ['find', '.', '-name', '{{QUERY}}']
    query_type: path
    shortcut: fnd
    delimiter:
        chars: ' '
        keys: ['line']
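
The ``delimiter`` setting of this example may look odd, since ``find`` output can contain spaces.  The sketch below, which assumes the splitting logic shown in ``__parse_single_result()``, illustrates why a single key keeps the whole line intact.  The helper name ``parse_line``, the sample lines and the two-key mapping are hypothetical.

```python
def parse_line(raw_result, delimiter):
    """Split one output line into a result dict (delimiter-based parsing)."""
    keys = delimiter['keys']
    # maxsplit is len(keys) - 1, so the last key receives the rest of the line
    elements = raw_result.split(delimiter['chars'], maxsplit=len(keys) - 1)
    if len(elements) != len(keys):
        return {}
    return dict(zip(keys, elements))


# Settings from the `find` example: one key, so no split happens at all.
find_delimiter = {'chars': ' ', 'keys': ['line']}
print(parse_line('./docs/my notes.txt', find_delimiter))

# Hypothetical two-key mapping for comparison.
size_name = {'chars': ' ', 'keys': ['size', 'name']}
print(parse_line('4096 my notes.txt', size_name))
```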

Implementations
===============

Function Documentation

◆ __check_query_params()

searx.engines.command.__check_query_params ( params)
private

Definition at line 197 of file command.py.

def __check_query_params(params):
    if not query_type:
        return

    if query_type == 'path':
        query_path = params[-1]
        query_path = expanduser(query_path)
        if commonprefix([realpath(query_path), working_dir]) != working_dir:
            raise ValueError('requested path is outside of configured working directory')
    elif query_type == 'enum' and len(query_enum) > 0:
        for param in params:
            if param not in query_enum:
                raise ValueError('submitted query params is not allowed', param, 'allowed params:', query_enum)

Referenced by searx.engines.command._get_command_to_run().
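
The path check relies on commonprefix(), which compares strings character by character rather than path component by path component.  The sketch below shows what that means for a sibling directory whose name merely starts with the working directory's name.  The directory names are made up, and the commonpath() variant is shown only for comparison; it is not what the engine uses.

```python
from posixpath import commonpath, commonprefix

working_dir = '/srv/data'  # hypothetical working directory


def inside_by_prefix(path):
    # character-wise comparison, as in __check_query_params
    return commonprefix([path, working_dir]) == working_dir


def inside_by_path(path):
    # component-wise comparison (an alternative, for illustration only)
    return commonpath([path, working_dir]) == working_dir


for candidate in ('/srv/data/reports', '/srv/database', '/etc/passwd'):
    print(candidate, inside_by_prefix(candidate), inside_by_path(candidate))
```

With these inputs, `/srv/database` passes the prefix-based test but not the component-based one, which is worth keeping in mind when choosing a working directory.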

◆ __get_results_limits()

searx.engines.command.__get_results_limits ( pageno)
private

Definition at line 191 of file command.py.

def __get_results_limits(pageno):
    start = (pageno - 1) * 10
    end = start + 9
    return start, end

Referenced by searx.engines.command._get_results_from_process().
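
As a quick worked example of the paging arithmetic, the sketch below reproduces the calculation with the page size pulled out into a parameter.  The name `get_results_limits` and the `page_size` parameter are assumptions added for illustration; the engine hard-codes ten results per page.

```python
def get_results_limits(pageno, page_size=10):
    """Return the inclusive (start, end) result indices for a 1-based page."""
    start = (pageno - 1) * page_size
    end = start + page_size - 1
    return start, end


for pageno in (1, 2, 3):
    print(pageno, get_results_limits(pageno))
```

Both bounds are inclusive, so page 1 covers results 0 through 9 and page 2 starts at result 10.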

◆ __parse_single_result()

searx.engines.command.__parse_single_result ( raw_result)
private
Parses command line output based on configuration

Definition at line 225 of file command.py.

def __parse_single_result(raw_result):
    """Parses command line output based on configuration"""

    result = {}

    if delimiter:
        elements = raw_result.split(delimiter['chars'], maxsplit=len(delimiter['keys']) - 1)
        if len(elements) != len(delimiter['keys']):
            return {}
        for i in range(len(elements)):  # pylint: disable=consider-using-enumerate
            result[delimiter['keys'][i]] = elements[i]

    if parse_regex:
        for result_key, regex in _compiled_parse_regex.items():
            found = regex.search(raw_result)
            if not found:
                return {}
            result[result_key] = raw_result[found.start() : found.end()]

    return result

Referenced by searx.engines.command._get_results_from_process().
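
The regex branch can be tried in isolation with the sketch below.  The `parse_regex` mapping, the sample lines and the function name `parse_with_regex` are hypothetical; only the search-and-slice logic follows the listing above.

```python
import re

# Hypothetical configuration: one regular expression per result key.
parse_regex = {'name': r'^\S+', 'size': r'\d+$'}
compiled = {key: re.compile(regex, flags=re.MULTILINE) for key, regex in parse_regex.items()}


def parse_with_regex(raw_result):
    """Return one value per key, or an empty dict if any regex does not match."""
    result = {}
    for result_key, regex in compiled.items():
        found = regex.search(raw_result)
        if not found:
            return {}
        result[result_key] = raw_result[found.start():found.end()]
    return result


print(parse_with_regex('notes.txt 4096'))
print(parse_with_regex('no size here'))
```

A line is only turned into a result when every configured expression matches; a single miss yields an empty dict.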

◆ _get_command_to_run()

searx.engines.command._get_command_to_run ( query)
protected

Definition at line 142 of file command.py.

def _get_command_to_run(query):
    params = shlex_split(query)
    __check_query_params(params)

    cmd = []
    for c in command:
        if c == '{{QUERY}}':
            cmd.extend(params)
        else:
            cmd.append(c)

    return cmd

References searx.engines.command.__check_query_params().

Referenced by searx.engines.command.search().

◆ _get_results_from_process()

searx.engines.command._get_results_from_process ( results, cmd, pageno)
protected

Definition at line 156 of file command.py.

def _get_results_from_process(results, cmd, pageno):
    leftover = ''
    count = 0
    start, end = __get_results_limits(pageno)
    with Popen(cmd, stdout=PIPE, stderr=PIPE, env=environment_variables) as process:
        line = process.stdout.readline()
        while line:
            buf = leftover + line.decode('utf-8')
            raw_results = buf.split(result_separator)
            if raw_results[-1]:
                leftover = raw_results[-1]
            raw_results = raw_results[:-1]

            for raw_result in raw_results:
                result = __parse_single_result(raw_result)
                if result is None:
                    _command_logger.debug('skipped result:', raw_result)
                    continue

                if start <= count and count <= end:  # pylint: disable=chained-comparison
                    result['template'] = result_template
                    results.append(result)

                count += 1
                if end < count:
                    return results

            line = process.stdout.readline()

        return_code = process.wait(timeout=timeout)
        if return_code != 0:
            raise RuntimeError('non-zero return code when running command', cmd, return_code)
        return None

References searx.engines.command.__get_results_limits(), and searx.engines.command.__parse_single_result().
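
The buffering idea in this function (carry an incomplete trailing piece over to the next read) can be sketched on its own.  The generator below is a simplified illustration, not the engine's code: the name `split_results` is hypothetical, it resets the leftover on every chunk, and it also yields a final unterminated piece, which the listing above does not do.

```python
def split_results(chunks, separator='\n'):
    """Yield complete results from an iterable of text chunks."""
    leftover = ''
    for chunk in chunks:
        buf = leftover + chunk
        # everything before the last separator is complete;
        # the last piece may continue in the next chunk
        *complete, leftover = buf.split(separator)
        yield from complete
    if leftover:
        yield leftover  # assumption: emit a trailing unterminated result


print(list(split_results(['a;b', ';c;', 'd'], separator=';')))
```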

◆ check_parsing_options()

searx.engines.command.check_parsing_options ( engine_settings)
Checks if delimiter based parsing or regex parsing is configured correctly

Definition at line 212 of file command.py.

def check_parsing_options(engine_settings):
    """Checks if delimiter based parsing or regex parsing is configured correctly"""

    if 'delimiter' not in engine_settings and 'parse_regex' not in engine_settings:
        raise ValueError('failed to init settings for parsing lines: missing delimiter or parse_regex')
    if 'delimiter' in engine_settings and 'parse_regex' in engine_settings:
        raise ValueError('failed to init settings for parsing lines: too many settings')

    if 'delimiter' in engine_settings:
        if 'chars' not in engine_settings['delimiter'] or 'keys' not in engine_settings['delimiter']:
            raise ValueError

Referenced by searx.engines.command.init().
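
In short, exactly one of `delimiter` and `parse_regex` must be configured, and a `delimiter` mapping needs both `chars` and `keys`.  The following sketch expresses the same rules in a compact form so they can be tried against sample settings.  The function name `validate_parsing_options` and the error messages are illustrative assumptions.

```python
def validate_parsing_options(engine_settings):
    """Raise ValueError unless exactly one parsing method is configured."""
    has_delimiter = 'delimiter' in engine_settings
    has_regex = 'parse_regex' in engine_settings
    if has_delimiter == has_regex:  # both present or both missing
        raise ValueError('configure exactly one of delimiter or parse_regex')
    if has_delimiter:
        missing = {'chars', 'keys'} - set(engine_settings['delimiter'])
        if missing:
            raise ValueError(f'delimiter is missing: {sorted(missing)}')


def is_valid(engine_settings):
    try:
        validate_parsing_options(engine_settings)
    except ValueError:
        return False
    return True


print(is_valid({'delimiter': {'chars': ' ', 'keys': ['line']}}))
print(is_valid({'delimiter': {'chars': ' '}}))
print(is_valid({}))
```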

◆ init()

searx.engines.command.init ( engine_settings)

Definition at line 103 of file command.py.

def init(engine_settings):
    check_parsing_options(engine_settings)

    if 'command' not in engine_settings:
        raise ValueError('engine command : missing configuration key: command')

    global command, working_dir, delimiter, parse_regex, environment_variables  # pylint: disable=global-statement

    command = engine_settings['command']

    if 'working_dir' in engine_settings:
        working_dir = engine_settings['working_dir']
        if not isabs(engine_settings['working_dir']):
            working_dir = realpath(working_dir)

    if 'parse_regex' in engine_settings:
        parse_regex = engine_settings['parse_regex']
        for result_key, regex in parse_regex.items():
            _compiled_parse_regex[result_key] = re.compile(regex, flags=re.MULTILINE)
    if 'delimiter' in engine_settings:
        delimiter = engine_settings['delimiter']

    if 'environment_variables' in engine_settings:
        environment_variables = engine_settings['environment_variables']

References searx.engines.command.check_parsing_options().

◆ search()

searx.engines.command.search ( query, params)

Definition at line 129 of file command.py.

def search(query, params):
    cmd = _get_command_to_run(query)
    if not cmd:
        return []

    results = []
    reader_thread = Thread(target=_get_results_from_process, args=(results, cmd, params['pageno']))
    reader_thread.start()
    reader_thread.join(timeout=timeout)

    return results

References searx.engines.command._get_command_to_run().
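
The pattern used here (fill a shared list from a worker thread and stop waiting after a timeout) means the caller gets whatever has been collected so far.  The sketch below demonstrates this with a stand-in producer instead of a subprocess; `slow_producer`, `collect` and the delays are hypothetical.

```python
import time
from threading import Thread


def slow_producer(results, delay):
    """Stand-in for the subprocess reader: append three items, pausing before each."""
    for i in range(3):
        time.sleep(delay)
        results.append(i)


def collect(delay, timeout):
    results = []
    worker = Thread(target=slow_producer, args=(results, delay), daemon=True)
    worker.start()
    worker.join(timeout=timeout)  # stop waiting after `timeout` seconds
    return list(results)  # snapshot of what has arrived so far


print(collect(delay=0.0, timeout=1.0))
```

Note that join() with a timeout only stops waiting; it does not stop the worker, which may keep running after the caller has returned.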

Variable Documentation

◆ _command_logger

searx.engines.command._command_logger = logger.getChild('command')
protected

Definition at line 99 of file command.py.

◆ _compiled_parse_regex

dict searx.engines.command._compiled_parse_regex = {}
protected

Definition at line 100 of file command.py.

◆ command

list searx.engines.command.command = []

Definition at line 88 of file command.py.

◆ delimiter

dict searx.engines.command.delimiter = {}

Definition at line 89 of file command.py.

◆ engine_type

str searx.engines.command.engine_type = 'offline'

Definition at line 86 of file command.py.

◆ environment_variables

dict searx.engines.command.environment_variables = {}

Definition at line 93 of file command.py.

◆ paging

bool searx.engines.command.paging = True

Definition at line 87 of file command.py.

◆ parse_regex

dict searx.engines.command.parse_regex = {}

Definition at line 90 of file command.py.

◆ query_enum

list searx.engines.command.query_enum = []

Definition at line 92 of file command.py.

◆ query_type

str searx.engines.command.query_type = ''

Definition at line 91 of file command.py.

◆ result_separator

str searx.engines.command.result_separator = '\n'

Definition at line 95 of file command.py.

◆ result_template

str searx.engines.command.result_template = 'key-value.html'

Definition at line 96 of file command.py.

◆ timeout

float searx.engines.command.timeout = 4.0

Definition at line 97 of file command.py.

◆ working_dir

searx.engines.command.working_dir = realpath('.')

Definition at line 94 of file command.py.