Python Packages

The following Python libraries are available as packages that can be installed with pip.

| Package | Definition |
| --- | --- |
| py-runner | Main library (rierino_runner) for executing main Python processes, including ProcessorRunner and Py4JGateway as main entry points |
| py-util | Core library (rierino_util) for common file, object, jmespath and input/output processing functions, as well as basic Runner executions |
| py-media | Library (rierino_media) for image & video processing functions, including MediaEventProcess and MediaStatsEventProcess |
| py-dq | Library (rierino_dq) for data quality assessment functions, including DQEventHandler |
| py-tensor | Library (rierino_tensor) for training Tensorflow models, including TFModelProcess |
| py-spark | Library (rierino_spark) for training Spark jobs, including SparkModelProcess |
| py-custom (Deprecated) | Library for executing custom jobs (Deprecated in favor of more generic py-runner functions) |

These packages can be used stand-alone or integrated into the overall architecture using one of the following main approaches, depending on the use case:

Integration into Sagas

Py4JEventHandler can be used to execute Python-based EventHandler actions (such as rierino_dq.DQEventHandler or rierino_runner.ProcessEventHandler) from Java-based runners, using extra container configurations in Helm charts.

Sample values to pass to runner Helm charts to include a Python container are as follows:

  • useExtraContainer: true

  • contents: [ {section: 'extra.sh', content: 'pip install git+https://github.com/rierino-open/[email protected]'} ]

With the default chart values, this configuration adds an extra container with the py-dynamic:dynamic image, installs the py-dq library, and runs Py4JGateway to communicate with the Java container over a local gateway using the ProcessPython action.
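The two sample values can also be assembled programmatically, for example when generating a values file per environment. The following is a minimal sketch, not part of the Rierino libraries: the helper name `extra_container_values` is hypothetical, the repository name `rierino-open/py-dq` is an assumption based on the py-dq library mentioned above, and `<tag>` is a placeholder for whichever Git reference should be installed.

```python
import json


def extra_container_values(repo: str, ref: str) -> dict:
    """Build the sample values that add a Python container to a runner chart.

    `repo` and `ref` are illustrative parameters (hypothetical helper);
    the key names mirror the sample values listed above.
    """
    pip_target = f"git+https://github.com/{repo}.git@{ref}"
    return {
        "useExtraContainer": True,
        "contents": [
            {"section": "extra.sh", "content": f"pip install {pip_target}"}
        ],
    }


# Assumed repository name; "<tag>" is a placeholder, not a real version.
values = extra_container_values("rierino-open/py-dq", "<tag>")
print(json.dumps(values, indent=2))
```

Printing the structure as JSON makes it easy to inspect before copying the values into the chart configuration used for the runner.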

More details on Python-based EventHandlers can be found here.

Batch Job Executions

Stand-alone Python jobs can also be executed using these libraries. They are typically deployed as Jobs or CronJobs using Helm charts, although they can also be run from the command line. These jobs are executed using Python-based Runners (such as rierino_util.Runner or rierino_runner.ProcessorRunner).

Sample values to pass to job Helm charts to trigger Python processes are as follows:

  • pyRepo: rierino-open/py-runner

  • pyMainModule: rierino_runner.ProcessorRunner --package=rierino_runner.processor --module=RestProcessor --internal=true --url=https://admin-api.xxx.rierinoclient.com/api/request/rpc/Ping --token={API_KEY}

With the default chart values, this configuration triggers ProcessorRunner, which runs RestProcessor to call an internal API.
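The pyMainModule value is a runner module path followed by `--key=value` options, so it can be composed from a dictionary when several jobs share the same pattern. The sketch below is illustrative only: `py_main_module` is a hypothetical helper, the runner path follows the rierino_runner.ProcessorRunner naming used in the description of Python-based Runners above, and `{API_KEY}` is kept as a literal placeholder.

```python
def py_main_module(runner: str, options: dict) -> str:
    """Join a runner module path and its options into one pyMainModule string.

    Hypothetical helper: it only formats text and does not validate that
    the runner or the options exist in the installed libraries.
    """
    flags = [f"--{key}={value}" for key, value in options.items()]
    return " ".join([runner, *flags])


# Options copied from the sample values above; dict order is preserved.
ping_job = py_main_module(
    "rierino_runner.ProcessorRunner",
    {
        "package": "rierino_runner.processor",
        "module": "RestProcessor",
        "internal": "true",
        "url": "https://admin-api.xxx.rierinoclient.com/api/request/rpc/Ping",
        "token": "{API_KEY}",  # placeholder, to be replaced with a real key
    },
)
print(ping_job)
```

Keeping the options in a dictionary makes it straightforward to swap the processor module or target URL per job without retyping the full command string.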
