SMQTK v0.2.2 Release Nodes

This minor release primarily adds classifier algorithm and classification representation support, a new service web application for nearest-neighbors algorithms, as well as additional documentation.

Also, this release adds a few more command line tools, especially of note is iqrTrainClassifier.py that can train a classifier based on the saved state of the IQR demonstration application (also a new feature).

Updates / New Features since v0.2.1

Classifiers

  • Added generic Classifier algorithm interface.
  • Added SupervisedClassifier intermediate interface.

Classification Elements

  • Added classification result encapsulation interface.
  • Added in-memory implementation
  • Added PostgreSQL implementation
  • Added file-based implementation
  • Added ClassificationElementFactory implementation.

Data Elements

  • Added DataFileElement implementation the optional use of the tika module for file content type extraction. Falls back to previous method when tika module not found or fails.

Descriptor Elements

  • Moved additional implementation specific documentation into docs/ directory.
  • Moved additional implementation specific configuration and example files into etc/smqtk/.
  • Moved PostgresDescriptorElement implementation out of nested sub-module into a single module in implementations directory.

Descriptor Generators

  • Removed PARALLEL class variable (parameterized in pertinent implementation constructors).
  • Added CaffeDescriptorGenerator implementation, which is more generalized and model agnostic, using the Caffe python interface.

Documentation

  • Added web-service documentation directory and moved applicable documentation files there.
  • Added more/better documentation on IQR demonstration application.
  • Added documentation on saving IQR state and training/using a supervised classifier based on it.

Tools / Scripts

  • Added descriptor compute script that reads from a file-list text file specifying input data file paths, and asynchronously computes descriptors. Uses JSON configuration file for algorithm and element backend specification.
  • Added tool for training a supervised classifier based on an IQR session state.
  • Added tool for classifying a sequence of input file paths, outputting paths that classified as the input label (highest confidence).
  • Converted iqr_app_model_generation.py to run as a command line tool with arguments, rather than an example script.

Web / Services

  • Added NearestNeighborServiceServer, which provides web-service that returns the nearest N neighbors to the given descriptor element.
  • Added ability to save IQR state via a new button in web interface. This file is used with the IQR classifier training script.

Fixes since v0.2.1

Custom LibSVM

  • Fixed an issue where svm_save_model would crash when saving a 2-class SVM model.
  • Fixed an issue where svm_save_model would save an extra, unexpected file when saving a 2-class SVM model.

Descriptor Elements

  • Fix threading joining in elements_to_matrix (when using non-multiprocessing mode).
  • Fixed configuration use in DescriptorElementFactory.from_config.

Data Sets

  • Removed is_usable abstract method. Redundant with Pluggable base class.

Docs

  • Made sphinx_server.py executable.
  • Fixed whitespacing issue with docs/algorithms.rst that prevented display of ToC sections.
  • Updated/Fixed various class/function doc-strings.

Utils

  • Fixed smqtk.utils.plugin.get_plugins to handle skipping intermediate interfaces between the base class and implementation classes, as well as to skip implementation classes that do not fully implement abstract methods.