Plugins and Configuration

SMQTK provides plugin and configuration utilities to support the creation of interface classes that have a convenient means of accessing implementing types, paired with the ability to dynamically instantiate interface implementations based on a configuration derived by constructor introspection.

While these two primary mixin classes function independently and can be utilized on their own, their combination is symbiotic and allows for users of derivative interfaces to create tools in terms of the interfaces and leave the specific selection of implementations for configuration time.

Later, we will introduce the two categories of configurable and (usually) pluggable classes found within SMQTK.

The Pluggable Mixin

Motivation: We want to be able to define interfaces to generic concepts and structures that higher level tools can be defined around without strictly catering themselves to any particular implementation, while additionally allowing freedom in implementation variety without overly restricting implementations.

In SMQTK, this is addressed via the Pluggable abstract mixin class:

import abc
from smqtk.utils.plugin import Pluggable

class MyInterface(Pluggable):

    @abc.abstractmethod
    def my_behavior(self, x: str) -> int:
        """My fancy behavior."""

if __name__ == "__main__":
    # Discover currently available implementations and print out their names
    impl_types = MyInterface.get_impls()
    print("MyInterface implementations:")
    for t in impl_types:
        print(f"- {t.__name__}")

Interfaces and Implementations

Classes that inherit from the Pluggable mixin are considered either pluggable interfaces or plugin implementations depending on whether they fully implement abstract methods.

Interface implementations bundled within SMQTK are generally defined alongside their parent interfaces. However, other sources, e.g. other python packages, may expose their own plugin implementations via setting a system environment variable or via python extensions.
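The interface/implementation distinction can be illustrated with the standard library alone. The following is a minimal sketch, not SMQTK code: GreeterInterface and EnglishGreeter are hypothetical names, and abc.ABC stands in for the Pluggable mixin.

```python
import abc
import inspect


class GreeterInterface(abc.ABC):
    """Hypothetical interface: at least one abstract method remains."""

    @abc.abstractmethod
    def greet(self, name: str) -> str:
        """Return a greeting for ``name``."""


class EnglishGreeter(GreeterInterface):
    """Hypothetical implementation: all abstract methods are defined."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}"


# A class with lingering abstract methods is treated as an interface ...
interface_is_abstract = inspect.isabstract(GreeterInterface)
# ... while a class that implements all of them is an implementation.
impl_is_abstract = inspect.isabstract(EnglishGreeter)
greeting = EnglishGreeter().greet("SMQTK")
```

Only the second class can be instantiated, which is the property that separates an implementation from an interface.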

The Configurable Mixin

Motivation: We want generic helpers to enable serializable configuration for classes while minimally impacting standard class development.

SMQTK provides the Configurable mixin class as well as other helper utility functions in smqtk.utils.configuration for generating class instances from configurations. These use python’s inspect module to determine constructor parameterization and default configurations.
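To make the idea of constructor introspection concrete, here is a minimal sketch of how a default configuration could be derived with the inspect module. This is an illustration under assumptions, not SMQTK's implementation: sketch_default_config and Example are hypothetical names, and parameters without a default are mapped to None.

```python
import inspect
from typing import Any, Dict


def sketch_default_config(cls: type) -> Dict[str, Any]:
    """Map constructor parameter names to their defaults (None if absent).

    Illustration only; this is not SMQTK's implementation.
    """
    config: Dict[str, Any] = {}
    for name, param in inspect.signature(cls.__init__).parameters.items():
        if name == "self":
            continue
        # Skip *args / **kwargs, which have no single configurable value.
        if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD):
            continue
        has_default = param.default is not inspect.Parameter.empty
        config[name] = param.default if has_default else None
    return config


class Example:
    """Hypothetical class used only for this sketch."""

    def __init__(self, a, b=2, c="three"):
        self.a, self.b, self.c = a, b, c


default_config = sketch_default_config(Example)
```

Because only the constructor signature is inspected, no instance of Example is created when deriving the dictionary.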

Currently this module uses the JSON-serializable format as the basis for input and output configuration dictionaries as a means of defining a relatively simple playing field for communication. Serialization and deserialization is detached from these configuration utilities so tools may make their own decisions there. Python dictionaries are used as a medium in between serialization and configuration input/output.

Classes that inherit from Configurable must, at a minimum, implement the get_config() instance method.
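The separation between configuration dictionaries and serialization can be sketched as a simple round trip. This is a minimal, standalone illustration: Threshold is a hypothetical class that mimics the get_config() contract without inheriting from Configurable, and JSON text is only one possible serialization choice.

```python
import json


class Threshold:
    """Hypothetical configurable-style class (stand-in for a Configurable)."""

    def __init__(self, cutoff=0.5, label="positive"):
        self.cutoff = cutoff
        self.label = label

    def get_config(self):
        # Same non-self keys as the constructor, JSON-compliant values only.
        return {"cutoff": self.cutoff, "label": self.label}


original = Threshold(cutoff=0.8)
# Serialization is the caller's choice; JSON text is used here.
serialized = json.dumps(original.get_config())
# Deserialize back to a dictionary and expand it into the constructor.
restored = Threshold(**json.loads(serialized))
```

The dictionary is the only thing the class needs to know about; the tool that owns the file or network format decides how that dictionary is stored.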

Algorithms and Representations - The Combination

Interfaces found in SMQTK are generally binned into two categories: representations and algorithms.

Algorithms are interfaces to some function or operation, specifically parameterized through their constructor and generally parameterized via the algorithm’s interface. The SmqtkAlgorithm base class inherits from both Pluggable and Configurable mixins so that all descendants gain access to the synergy they provide. These are located under the smqtk.algorithms sub-module.

Representations are interfaces to structures that are intended to specifically store some sort of data structure. Currently, the SmqtkRepresentation only inherits directly from Configurable, as there are some representational structures which desire configurability but for which variable implementations do not make sense (like DescriptorElementFactory). However, most sub-classes do additionally inherit from Pluggable (like DescriptorElement). These are located under the smqtk.representation sub-module.

Implementing a Pluggable Interface

The following are examples of how to add and expose new plugin implementations for existing algorithm and representation interfaces.

SMQTK’s plugin discovery via the get_impls() method currently allows for finding plugin implementations in three ways:

  • sub-classes of an interface type defined in the current runtime.

  • within python modules listed in the environment variable specified by YourInterface.PLUGIN_ENV_VAR. (default SMQTK environment variable name is SMQTK_PLUGIN_PATH, which is defined in Pluggable.PLUGIN_ENV_VAR).

  • within python modules specified under the entry point extensions namespace defined by YourInterface.PLUGIN_NAMESPACE (default SMQTK extension namespace is smqtk_plugins, which is defined in Pluggable.PLUGIN_NAMESPACE).

Within SMQTK

A new interface implementation within the SMQTK source-tree is generally implemented or exposed parallel to where the parent interface is defined. This has been purely for organizational purposes. Once we define our implementation, we will then expose that type in an existing module that is already referenced in SMQTK’s list of entry point extensions.

In this example, we will show how to create a new implementation for the Classifier algorithm interface. Relative to the root of the source tree, this interface is defined in python/smqtk/algorithms/classifier/_interface_classifier.py.

We’ll define our new class, let’s call it SomeImpl, in a new file, some_impl.py, placed in the same directory:

python/
└── smqtk/
    └── algorithms/
        └── classifier/
            ├── _interface_classifier.py
            ├── some_impl.py     # new
            └── ...

In this file we will need to define the SomeImpl class and all parent class abstract methods in order for the class to satisfy the definition of an “implementation”:

from smqtk.algorithms import Classifier

class SomeImpl (Classifier):
    """
    Some documentation for this specific implementation.
    """

    # Our implementation-specific constructor.
    def __init__(self, paramA=1, paramB=2):
        ...

    # Abstract methods from Configurable.
    # (Classifier -> SmqtkAlgorithm -> Configurable)
    def get_config(self):
        # As per Configurable documentation, this returns the same non-self
        # keys as the constructor.
        return {
            "paramA": ...,
            "paramB": ...,
        }

    # Classifier's abstract methods.
    def get_labels(self):
        ...

    def _classify_arrays(self, array_iter):
        ...

The final step to making this implementation discoverable is to add an import of this class to the existing hub of classifier plugins in python/smqtk/algorithms/classifier/_plugins.py:

...
from .some_impl import SomeImpl

With all abstract methods defined, this implementation will now be included in the returned set of implementation types for the parent Classifier interface:

>>> from smqtk.algorithms import Classifier
>>> Classifier.get_impls()
set([..., smqtk.algorithms.classifier.some_impl.SomeImpl, ...])

SomeImpl above should also be all set for configuration because it defines the one required abstract method get_config() and because its constructor only anticipates JSON-compliant types. If the constructor requires more complicated types, additional methods would need to be overridden/extended as defined in the smqtk.utils.configuration module.

Within another python package

When implementing a pluggable interface in another python package, the proper method of export is via a package’s entry point specifications using the namespace key defined by the parent interface (by default the smqtk_plugins value is defined by smqtk.utils.plugin.Pluggable.PLUGIN_NAMESPACE).

For example, let’s assume that a separate python package, OtherPackage we’ll call it, defines a Classifier-implementing sub-class OtherClassifier in the module OtherPackage.other_classifier. This module location can be exposed via the package’s setup.py entry points metadata, using the smqtk_plugins key, like the following:

from setuptools import setup

...

setup(
    ...
    entry_points={
        'smqtk_plugins': 'my_plugins = OtherPackage.other_classifier'
    }
)

If this other package had multiple sub-modules in which SMQTK plugins were defined, the smqtk_plugins entry value may instead be a list:

setup(
    ...
    entry_points={
        'smqtk_plugins': [
            'classifier_plugins = OtherPackage.other_classifier',
            'other_plugins = OtherPackage.other_plugins',
        ]
    }
)
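After such a package is installed, it can be useful to check which entry points are actually registered under the namespace. The sketch below uses the standard library's importlib.metadata for that purpose; it is an inspection aid under assumptions, not SMQTK's own discovery code, and list_entry_points is a hypothetical helper name.

```python
from importlib import metadata


def list_entry_points(namespace: str) -> list:
    """Return (name, value) pairs of entry points registered under a group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ selection API.
        group = eps.select(group=namespace)
    else:
        # Python 3.8 / 3.9: mapping of group name to entry points.
        group = eps.get(namespace, [])
    return [(ep.name, ep.value) for ep in group]


# Lists whatever is registered in the current environment (possibly nothing).
registered = list_entry_points("smqtk_plugins")
# A namespace nobody registers under yields an empty list.
nothing = list_entry_points("hypothetical.unused.namespace")
```

With the second setup() example installed, the first call would be expected to include pairs for classifier_plugins and other_plugins.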

Reference

smqtk.utils.plugin

Helper functions and mixin interface for implementing class type discovery, filtering and a convenience mixin class.

This package provides a number of discover_via_… functions that return sets of type instances as found by the method described by that function.

These methods may be composed to create a pool of types that may be then filtered via the filter_plugin_types function to those types that are specifically “plugin types” for the given interface class. See the is_valid_plugin function documentation for what it means to be a “plugin” of an interface type.

While the above are defined in fairly general terms, the Pluggable class type defined last here is a mixin class that utilizes all of the above in a specific manner for the purposes of SMQTK. This mixin class defines the class-method get_impls() that will return currently discoverable plugins underneath the type it was called on. This discovery will follow the values of the PLUGIN_ENV_VAR and PLUGIN_NAMESPACE class variables defined in the interface class you are calling get_impls() from, using inherited values if not immediately specified.

Because these plugin semantics are pretty low level and commonly utilized, logging can be extremely verbose. Logging in this module, while it still exists, is set to emit only at log level 1 or lower (“trace”).

NOTE: The type annotations for discover_via_subclasses and filter_plugin_types are currently set to the broad Type annotation. Ideally these should use Type[T] instead, but there is currently a known issue with mypy where it aggressively assumes that an annotated type must be constructable, so it emits an error when the functions are called with an abstract interface_type. When this is resolved in mypy these annotations should be updated.

exception smqtk.utils.plugin.NotAModuleError

Exception for when the discover_via_entrypoint_extensions function found an entrypoint that was not a module specification.

exception smqtk.utils.plugin.NotUsableError

Exception thrown when a pluggable class is constructed but does not report as usable.

class smqtk.utils.plugin.Pluggable

Interface for classes that have plugin implementations

classmethod get_impls() -> Set[Type[P]]

Discover and return a set of classes that implement the calling class.

See the get_plugins function for more details on the logic of how implementing classes (aka “plugins”) are discovered.

The class-level variables PLUGIN_ENV_VAR and PLUGIN_NAMESPACE may be overridden to change which environment variable and entry-point extension namespace are searched, respectively.

Returns

Set of discovered class types that are considered “valid” plugins of this type. See is_valid_plugin() for what we define a “valid” type to be relative to this class.

classmethod is_usable() -> bool

Check whether this class is available for use.

Since certain plugin implementations may require additional dependencies that may not yet be available on the system, or other runtime conditions, this method may be overridden to check for those and return a boolean saying whether the implementation is available for use. When this method returns True, the class is declaring that it should be constructable and usable in the current environment.

By default, this method will return True unless a sub-class overrides this class-method with their specific logic.

NOTES:
  • This should be a class method.

  • When an implementation is deemed not usable, this should emit a (user) warning, or some other kind of logging, detailing why the implementation is not available for use.

Returns

Boolean determination of whether this implementation is usable in the current environment.

Return type

bool
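
An override of is_usable() typically checks for an optional dependency and warns when it is missing. The following is a minimal standalone sketch of that pattern: SketchPluggable stands in for Pluggable (reduced to the default behavior described above), and NeedsOptionalDep and the module name hypothetical_optional_dependency are made up for illustration.

```python
import importlib.util
import warnings


class SketchPluggable:
    """Stand-in for Pluggable, reduced to the default is_usable behavior."""

    @classmethod
    def is_usable(cls) -> bool:
        return True


class NeedsOptionalDep(SketchPluggable):
    """Hypothetical implementation that depends on an optional module."""

    # Hypothetical module name that is not expected to be installed.
    REQUIRED_MODULE = "hypothetical_optional_dependency"

    @classmethod
    def is_usable(cls) -> bool:
        if importlib.util.find_spec(cls.REQUIRED_MODULE) is None:
            # Default warning category is UserWarning.
            warnings.warn(
                f"{cls.__name__} is not usable: module "
                f"'{cls.REQUIRED_MODULE}' is not importable."
            )
            return False
        return True


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    usable = NeedsOptionalDep.is_usable()
```

Using find_spec avoids importing the dependency just to test for its presence, which keeps the check cheap during discovery.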

smqtk.utils.plugin.discover_via_entrypoint_extensions(entrypoint_ns: str) -> Set[Type]

Discover and return types defined in modules exposed through the entry-point extensions defined for the given namespace by installed python packages.

Other installed python packages may define one or more extensions for a namespace, as specified by entrypoint_ns, in their “setup.py”. This should be a single extension or a list of extensions that specify modules within the installed package where plugins for export are implemented.

Currently, this method only accepts extensions that export a module as opposed to specifications of a specific attribute in a module. This is due to other methods of type discovery not necessarily honoring the selectivity that specific attribute specification provides (Looking at you __subclasses__…).

For example, as a single specification string:

...
entry_points={
    "smqtk_plugins": "my_package = my_package.plugins"
}
...

Or in list form of multiple specification strings:

...
entry_points = {
    "smqtk_plugins": [
        "my_package_mode_1 = my_package.mode_1.plugins",
        "my_package_mode_2 = my_package.mode_2.plugins",
    ]
}
...

Parameters

entrypoint_ns – The name of the entry-point mapping to look for extensions under.

Returns

Set of discovered types from the modules and class types specified in the extensions under the specified entry-point.

smqtk.utils.plugin.discover_via_env_var(env_var: str) -> Set[Type]

Discover and return types specified in python-importable modules specified in the given environment variable.

We expect the given environment variable to define zero or more python module paths from which to yield all contained type definitions (i.e. things that descend from type). If there is an empty path element, it is skipped (e.g. “foo::bar:baz” will only attempt importing foo, bar and baz modules).

These python module paths should be separated with the same separator as would be used in the PYTHONPATH environment variable specification.

If a module defines no class types, then no types are included from that source for return.

An expected use-case for this discovery method is for modules that are not installed but otherwise accessible via the python search path. E.g. local modules, modules accessible through PYTHONPATH search path modification, modules accessible through sys.path modification.

Any errors raised from attempting to import a module are propagated upward.

Parameters

env_var – The name of the environment variable to read from.

Raises

ModuleNotFoundError – When one or more module paths specified in the given environment variable are not importable.

Returns

Set of discovered types from the modules specified in the environment variable’s contents.
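
The parsing rules described for this function (path-style separator, empty elements skipped, import errors propagated) can be sketched with the standard library. This is an illustration under assumptions rather than SMQTK's code: split_module_paths, import_listed_modules and the variable name SKETCH_PLUGIN_PATH are hypothetical, and standard-library modules stand in for plugin modules.

```python
import importlib
import os
from typing import List


def split_module_paths(value: str, sep: str = os.pathsep) -> List[str]:
    """Split an environment variable value, skipping empty elements."""
    return [p for p in value.split(sep) if p]


def import_listed_modules(env_var: str) -> list:
    """Import each module named in ``env_var``; import errors propagate."""
    names = split_module_paths(os.environ.get(env_var, ""))
    return [importlib.import_module(n) for n in names]


# The documented example, with the separator given explicitly.
parts = split_module_paths("foo::bar:baz", sep=":")

# Standard-library modules standing in for plugin modules; the empty
# element in the middle is skipped.
os.environ["SKETCH_PLUGIN_PATH"] = os.pathsep.join(["json", "", "csv"])
modules = import_listed_modules("SKETCH_PLUGIN_PATH")
```

Collecting the type definitions from each imported module would be a further step on top of this sketch.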

smqtk.utils.plugin.discover_via_subclasses(interface_type: Type) -> Set[Type]

Utilize the __subclasses__ method to discover nested subclasses for a given interface type.

This approach will be able to observe any implementations that have been defined, anywhere at all, at the point of invocation, which can circumvent efforts towards specificity that other discovery methods may provide. For example, discover_via_entrypoint_extensions may return a single type that was specifically exported from a module whereas this method will, called afterwards, yield all the other types defined in that entry-point-imported module.

The use of this discovery method may also result in different returns depending on the import state at the time of invocation. E.g. further imports may increase the quantity of returns from this function.

This function uses depth-first-search when traversing sub-class tree.

Reference:

https://docs.python.org/3/library/stdtypes.html#class.__subclasses__

NOTE: Subclasses are retained via weak references. If this function is exposing types from something that otherwise raised an exception, or if a local definition is leaking, running import gc; gc.collect() apparently removes those types from the return, as long as nothing else retains a reference to them.

Parameters

interface_type – The interface type to recursively find sub-classes under.

Returns

Set of recursive subclass types under interface_type.
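
A depth-first traversal over __subclasses__ can be written in a few lines. The sketch below illustrates the technique only and is not SMQTK's implementation; sketch_discover_subclasses and the Base/Child/GrandChild/Sibling classes are hypothetical.

```python
from typing import Set


def sketch_discover_subclasses(interface_type: type) -> Set[type]:
    """Collect all nested subclasses of ``interface_type`` depth-first."""
    found: Set[type] = set()
    stack = list(interface_type.__subclasses__())
    while stack:
        cls = stack.pop()
        if cls not in found:
            found.add(cls)
            # Push this class's direct subclasses so that they are
            # visited before the remaining siblings on the stack.
            stack.extend(cls.__subclasses__())
    return found


class Base:
    pass


class Child(Base):
    pass


class GrandChild(Child):
    pass


class Sibling(Base):
    pass


subclasses = sketch_discover_subclasses(Base)
```

Any class defined after this call would only appear in a later invocation, which is the import-state dependence described above.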

smqtk.utils.plugin.filter_plugin_types(interface_type: Type, candidate_pool: Collection[Type]) -> Set[Type]

Filter the given set of types to those that are “plugins” of the given interface type.

See the documentation for is_valid_plugin() for what we define a “plugin type” to be relative to the given interface_type.

We consider that there may be duplicate type instances in the given candidate pool. Due to this we will consider an instance of a type only once and return a set type to contain the validated types.

Parameters
  • interface_type – The parent type to filter on.

  • candidate_pool – Some iterable of types from which to collect interface type plugins.

Returns

Set of types that are considered “plugins” of the interface types following the above listed rules.

smqtk.utils.plugin.is_valid_plugin(cls: Type, interface_type: Type) -> bool

Determine if a class type is a valid candidate for plugin discovery.

In particular, the class type cls must satisfy several conditions:

  1. It must not literally be the given interface type.

  2. It must be a strict subtype of interface_type.

  3. It must not be an abstract class. (i.e. no lingering abstract methods or properties if the abc.ABCMeta metaclass has been used).

  4. If the cls is a subclass of Pluggable, it must report as usable via its is_usable() class method.

Logging for this function, when enabled, can be very verbose, and is only active with a logging level of 1 or lower.

Parameters
  • cls – The class type whose validity is being tested

  • interface_type – The base class under consideration

Returns

True if the class is a valid candidate for discovery, and False otherwise.

Return type

bool
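
The four conditions listed for this function can be expressed as a short predicate. This is a sketch under assumptions, not SMQTK's implementation: sketch_is_valid_plugin and the example classes are hypothetical, and the presence of an is_usable() class method stands in for the "subclass of Pluggable" check.

```python
import abc
import inspect


def sketch_is_valid_plugin(cls: type, interface_type: type) -> bool:
    """Apply the four documented conditions; illustration only."""
    # 1. Must not literally be the interface type.
    if cls is interface_type:
        return False
    # 2. Must be a subtype of the interface type.
    if not issubclass(cls, interface_type):
        return False
    # 3. Must not be abstract.
    if inspect.isabstract(cls):
        return False
    # 4. If it offers is_usable(), it must report as usable.
    #    (Assumption: stands in for the Pluggable subclass check.)
    is_usable = getattr(cls, "is_usable", None)
    if callable(is_usable) and not is_usable():
        return False
    return True


class Interface(abc.ABC):
    @classmethod
    def is_usable(cls) -> bool:
        return True

    @abc.abstractmethod
    def run(self) -> int:
        """Do the work."""


class Partial(Interface):
    """Still abstract: run() is not implemented."""


class Good(Interface):
    def run(self) -> int:
        return 1


class Unusable(Good):
    @classmethod
    def is_usable(cls) -> bool:
        return False


results = {
    c.__name__: sketch_is_valid_plugin(c, Interface)
    for c in (Interface, Partial, Good, Unusable, int)
}
```

Only Good passes all four checks; each of the other candidates fails a different condition.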

smqtk.utils.configuration

Helper interface and functions for generalized object configuration, to and from JSON-compliant dictionaries.

While this interface and utility methods should be general enough to add JSON-compliant dictionary-based configuration to any object, this was created with the SMQTK plugin module in mind.

Standard configuration dictionaries should be JSON compliant and take the following general format:

{
    "type": "one-of-the-keys-below",
    "ClassName1": {
        "param1": "val1",
        "param2": "val2"
    },
    "ClassName2": {
        "p1": 4.5,
        "p2": null
    }
}

The “type” key is considered a special key that should always be present and it specifies one of the other keys within the same dictionary. Each other key in the dictionary should be the name of a Configurable inheriting class type. Usually, the classes named within a block inherit from a common interface and the “type” value denotes a selection of a specific sub-class for use, though this is not a required property of these constructs.

class smqtk.utils.configuration.Configurable

Interface for objects that should be configurable via a configuration dictionary consisting of JSON types.

classmethod from_config(config_dict: Dict, merge_default: bool = True) -> C

Instantiate a new instance of this class given the configuration JSON-compliant dictionary encapsulating initialization arguments.

This base method is adequate without modification when a class’s constructor argument types are JSON-compliant. If one or more are not, however, this method then needs to be overridden in order to convert from a JSON-compliant stand-in into the more complex object the constructor requires. It is recommended that when complex types are used they also inherit from Configurable, in order to ease the conversion to and from JSON-compliant stand-ins.

When this method does need to be overridden, this usually looks like the following pattern:

class MyClass (Configurable):

    @classmethod
    def from_config(cls, config_dict, merge_default=True):
        # Optionally guarantee default values are present in the
        # configuration dictionary.  This statement pairs with the
        # ``merge_default=False`` parameter in the super call.
        # This also in effect shallow copies the given non-dictionary
        # entries of ``config_dict`` due to the merger with the
        # default config.
        if merge_default:
            config_dict = merge_dict(cls.get_default_config(),
                                     config_dict)

        #
        # Perform any overriding here.
        #

        # Create and return an instance using the super method.
        return super(MyClass, cls).from_config(config_dict,
                                               merge_default=False)

This method should not be called via super unless an instance of the class is desired.

Parameters
  • config_dict (dict) – JSON compliant dictionary encapsulating a configuration.

  • merge_default (bool) – Merge the given configuration on top of the default provided by get_default_config.

Returns

Constructed instance from the provided config.

abstract get_config()

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters as configuration values (i.e. they must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns

JSON type compliant configuration dictionary.

Return type

dict

classmethod get_default_config() -> Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns

Default configuration dictionary for the class.

Return type

dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

smqtk.utils.configuration.cls_conf_from_config_dict(config: Dict, type_iter: Iterable[Type[T]]) -> Tuple[Type[T], Dict]

Helper function for getting the appropriate type and configuration sub-dictionary based on the provided “standard” SMQTK configuration dictionary format (see above module documentation).

Parameters
  • config – Configuration dictionary to draw from.

  • type_iter – An iterable of class types to select from.

Raises

ValueError

This may be raised if:
  • type field not present in config.

  • type field set to None

  • type field did not match any available configuration in the given config.

  • type field did not specify any implementation key.

Returns

Appropriate class type from type_iter that matches the configured type as well as the sub-dictionary from the configuration. From this return, type.from_config(config) should be callable.
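
The selection logic described for this helper can be sketched as follows. This is an illustration under assumptions, not SMQTK's implementation: the function and class names are hypothetical, and keys are assumed to take the "<module>.<class name>" form used by the examples in this module's documentation.

```python
from typing import Dict, Iterable, Tuple


def sketch_cls_conf_from_config_dict(
    config: Dict, type_iter: Iterable[type]
) -> Tuple[type, Dict]:
    """Select a class and its sub-configuration from a standard config."""
    if "type" not in config:
        raise ValueError("Config has no 'type' field.")
    type_key = config["type"]
    if type_key is None:
        raise ValueError("Config 'type' field is None.")
    if type_key not in config:
        raise ValueError(f"No configuration block for type '{type_key}'.")
    # Assumption: keys are "<module>.<class name>".
    by_key = {f"{t.__module__}.{t.__name__}": t for t in type_iter}
    if type_key not in by_key:
        raise ValueError(f"Type '{type_key}' is not among the given types.")
    return by_key[type_key], config[type_key]


class Alpha:
    def __init__(self, x=1):
        self.x = x


class Beta:
    def __init__(self, y=2):
        self.y = y


alpha_key = f"{Alpha.__module__}.{Alpha.__name__}"
beta_key = f"{Beta.__module__}.{Beta.__name__}"
config = {"type": beta_key, alpha_key: {"x": 10}, beta_key: {"y": 20}}

selected_type, selected_conf = sketch_cls_conf_from_config_dict(
    config, [Alpha, Beta]
)
instance = selected_type(**selected_conf)
```

Switching the selected implementation only requires changing the "type" value; the unused blocks can stay in the dictionary.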

smqtk.utils.configuration.cls_conf_to_config_dict(cls: Type, conf: Dict) -> Dict

Helper function for creating the appropriate “standard” smqtk configuration dictionary given a Configurable-implementing class and a configuration for that class.

This very simple function simply arranges a semantic class key and an associated dictionary into a normal pattern used for configuration in SMQTK:

>>> class SomeClass (object):
...     pass
>>> cls_conf_to_config_dict(SomeClass, {0: 0, 'a': 'b'}) == {
...     'type': 'smqtk.utils.configuration.SomeClass',
...     'smqtk.utils.configuration.SomeClass': {0: 0, 'a': 'b'}
... }
True

Parameters
  • cls (type[Configurable]) – A class type implementing the Configurable interface.

  • conf (dict) – SMQTK standard type-optioned configuration dictionary for the given class and dictionary pair.

Returns

“Standard” SMQTK JSON-compliant configuration dictionary

Return type

dict

smqtk.utils.configuration.configuration_test_helper(inst: C, config_ignored_params: Union[Set, FrozenSet] = frozenset({}), from_config_args: Sequence = ()) -> Tuple[C, C, C]

Helper function for testing the get_default_config/from_config/get_config methods for class types that in part implement the Configurable mixin class. This function also tests that inst’s parent class type’s get_default_config returns a dictionary whose keys match the constructor’s inspected parameters (except “self” of course).

This constructs 3 additional instances based on the given instance following the pattern:

inst-1  ->  inst-2  ->  inst-3
        ->  inst-4

This refers to inst-2 and inst-4 being constructed from the config from inst, and inst-3 being constructed from the config of inst-2. The equivalence of each instance’s config is cross-checked with the other instances. This is intended to check that a configuration yields the same class configurations and that the config does not get mutated by nested instance construction.

This function uses assert calls to check for consistency.

We return all instances constructed in case the caller wants to make additional instance integrity checks.

Parameters
  • inst (Configurable) – Configurable-mixin inheriting class to test.

  • config_ignored_params (set[str]) – Set of parameter names in the instance type’s constructor that are ignored by get_default_config and from_config. This is empty by default.

  • from_config_args (tuple) – Optional additional positional arguments to the input inst.from_config method after the configuration dictionary.

Returns

Instance 2, 3, and 4 as described above.

Return type

(Configurable,Configurable,Configurable)
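
The instance-chaining pattern described for this helper can be reproduced by hand. The following is a minimal standalone sketch, not the helper itself: Scaler is a hypothetical class that mimics get_config/from_config without inheriting from Configurable.

```python
class Scaler:
    """Hypothetical configurable-style class used for illustration."""

    def __init__(self, factor=2.0, name="scaler"):
        self.factor = factor
        self.name = name

    def get_config(self):
        return {"factor": self.factor, "name": self.name}

    @classmethod
    def from_config(cls, config_dict):
        return cls(**config_dict)


inst1 = Scaler(factor=3.5)
config1 = inst1.get_config()
inst2 = Scaler.from_config(config1)             # inst-1 -> inst-2
inst3 = Scaler.from_config(inst2.get_config())  # inst-2 -> inst-3
inst4 = Scaler.from_config(config1)             # inst-1 -> inst-4

# All instances should report the same configuration, and building
# inst2 / inst3 must not have mutated the config inst4 was built from.
configs = [i.get_config() for i in (inst1, inst2, inst3, inst4)]
all_equal = all(c == configs[0] for c in configs)
```

Building inst-4 from the same dictionary after inst-2 and inst-3 exist is what would reveal a constructor that mutates its input configuration.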

smqtk.utils.configuration.from_config_dict(config: Dict, type_iter: Iterable[Type[C]], *args: Any) -> C

Helper function for instantiating an instance of a class given the configuration dictionary config from available types provided by type_iter via the Configurable interface’s from_config class-method.

args are additional positional arguments to be passed to the type’s from_config method on return.

Example:

>>> from smqtk.representation import DescriptorElement
>>> example_config = {
...     'type': 'smqtk.representation.descriptor_element.local_elements.DescriptorMemoryElement',
...     'smqtk.representation.descriptor_element.local_elements.DescriptorMemoryElement': {},
... }
>>> inst = from_config_dict(example_config, DescriptorElement.get_impls(),
...                         'type-str', 'some-uuid')
>>> from smqtk.representation.descriptor_element.local_elements import DescriptorMemoryElement
>>> isinstance(inst, DescriptorMemoryElement)
True

Raises
  • ValueError

    This may be raised if:
    • type field not present in config.

    • type field set to None

    • type field did not match any available configuration in the given config.

    • type field did not specify any implementation key.

  • AssertionError – This may be raised if the class specified as the configuration type is present in the given type_iter but is not a subclass of the Configurable interface.

  • TypeError – Insufficient/incorrect initialization parameters were specified for the specified type’s constructor.

Parameters
  • config – Configuration dictionary to draw from.

  • type_iter – An iterable of class types to select from.

  • args (object) – Other positional arguments to pass to the configured class’ from_config class method.

Returns

Instance of the configured class type as specified in config and as available in type_iter.

smqtk.utils.configuration.make_default_config(configurable_iter: Iterable[Type[C]]) -> Dict[str, Union[None, str, Dict]]

Generate a default configuration dictionary for the given iterable of Configurable-inheriting types.

For example, assuming the following simple class that descends from Configurable, we would expect the following behavior:

>>> # noinspection PyAbstractClass
>>> class ExampleConfigurableType (Configurable):
...     def __init__(self, a, b):
...        ''' Dummy constructor '''
>>> make_default_config([ExampleConfigurableType]) == {
...     'type': None,
...     'smqtk.utils.configuration.ExampleConfigurableType': {
...         'a': None,
...         'b': None,
...     }
... }
True

Note that technically ExampleConfigurableType is still abstract as it does not implement get_config. The above call to make_default_config still functions because we only use the get_default_config class method and do not instantiate any types given to this function. While functionally acceptable, it is generally not recommended to draw configurations from abstract classes.

Parameters

configurable_iter – An iterable of class types that sub-class Configurable.

Returns

Base configuration dictionary with an empty type field, and containing the types and initialization parameter specification for all implementation types in the provided iterable.

smqtk.utils.configuration.to_config_dict(c_inst: smqtk.utils.configuration.Configurable) -> Dict

Helper function that transforms the configuration dictionary retrieved from c_inst into the “standard” SMQTK configuration dictionary format (see above module documentation).

For example, with a simple DataFileElement:

>>> from smqtk.representation.data_element.file_element import DataFileElement
>>> e = DataFileElement(filepath='/path/to/file.txt', readonly=True)
>>> to_config_dict(e) == {
...     "type": "smqtk.representation.data_element.file_element.DataFileElement",
...     "smqtk.representation.data_element.file_element.DataFileElement": {
...         "filepath": "/path/to/file.txt",
...         "readonly": True,
...         "explicit_mimetype": None,
...     }
... }
True

Parameters

c_inst (Configurable) – Instance of a class type that subclasses the Configurable interface.

Returns

Standard format configuration dictionary.

Return type

dict

Reload Use Warning

While the smqtk.utils.plugin.get_plugins() function allows for reloading discovered modules for potentially new content, this is not recommended under normal conditions. When reloading a plugin module after pickle serializing an instance of an implementation, deserialization causes an error because the original class type that was pickled is no longer valid as the reloaded module overwrote the previous plugin class type.