SMQTK v0.7.0 Release Notes

This minor release incorporates various fixes and enhancements to representation and algorithms interfaces and implementations.

A new docker image has been added to wrap the IQR web interface and headless services. This image can be used either as a push-button image ingestion and IQR interface container, or as a fully featured environment for experimenting with SMQTK, Caffe deep-learning-based content description, and IQR.

A major departure has happened for some representation structures, like DataElements: they are no longer considered hashable and now have interfaces reflecting their mutability. Representation structures, by their nature of having arbitrary backends, may be modified by external agents interacting separately with the backend being used. This has also opened up the ability to provide algorithm implementations with DataElement instances instead of file paths for desired byte content, and many implementations have transitioned to this pattern. There is nothing fundamentally wrong with requesting file-path input; however, it restricts where configuration files or data models may come from.

Updates / New Features since v0.6.2

Algorithms

Descriptor Generators

  • Added KWCNN DescriptorGenerator plugin.

Build System

  • Added setup.py script in support of installation by pip. Updated CMake code to install python components via this script.

  • Added SMQTK_BUILD_FLANN and SMQTK_BUILD_LIBSVM to CMake for optionally building libSVM and Flann (both default ON).

Classifier Interface

  • Added default ClassificationElementFactory that uses the in-memory back-end.

Compute Functions

  • Added minibatch kmeans based descriptor clustering function with CLI interface.

Descriptor Elements

  • Revised implementation of in-memory representation, doing away with global cache.

  • Added optimization to Postgres backend for a slightly faster has_vector implementation.

Descriptor Generator

  • Removed lingering assumption of pyflann module presence in colordescriptor.py.

Devops::Ansible

  • Added initial Ansible roles for SMQTK and Caffe dependency.

Devops::Docker

  • Revised default IQR service configuration file to take into account recently added session expiration support. Defaults were used before, but expiration is not enabled by default and must now be enabled explicitly.

  • Added IQR / playground docker container setup. Includes:

      - CPU + NVIDIA GPU capable docker file.
      - Optional input image tiling.
      - Optional startup of RESTful NN and IQR services.

Documentation

  • Updated build and installation documentation.

  • Added missing utility script documentation hooks.

  • Standardized utility script definition of argument parser generation function for documentation use.

Girder

  • Added initial simple Girder plugin to link to an external IQR webapp instance.

Misc.

  • Added algo/rep/iqr imports to top level __init__.py to make basic functionality available without special imports.

Representation

Data Elements

  • Added plugin for Girder-hosted data elements.

  • Added from_uri member function as well as global function to handle instance construction or selection via URI string specification.

  • Postgres data element will now automatically create its configured table if it doesn’t exist, given valid authentication and sufficient privileges.
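The URI-based construction described in the from_uri entry above can be pictured with a small sketch. This is not SMQTK's implementation: the class names FileElementSketch and UrlElementSketch, the InvalidUriError exception, and the try-each-implementation dispatch are all assumptions made for illustration. The idea shown is that each implementation decides whether it can interpret a URI string, and a module-level function selects the first one that accepts it.

```python
class InvalidUriError(ValueError):
    """Hypothetical error raised when a class cannot interpret a URI."""


class FileElementSketch(object):
    """Hypothetical stand-in for a file-backed data element."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def from_uri(cls, uri):
        # Member-level constructor: accept only "file://" URIs.
        prefix = "file://"
        if uri.startswith(prefix):
            return cls(uri[len(prefix):])
        raise InvalidUriError("Not a file URI: %r" % uri)


class UrlElementSketch(object):
    """Hypothetical stand-in for a URL-backed data element."""

    def __init__(self, url):
        self.url = url

    @classmethod
    def from_uri(cls, uri):
        # Member-level constructor: accept only HTTP(S) URIs.
        if uri.startswith(("http://", "https://")):
            return cls(uri)
        raise InvalidUriError("Not an HTTP(S) URI: %r" % uri)


def from_uri(uri, impl_classes=(FileElementSketch, UrlElementSketch)):
    """Global-level selection: return an instance of the first class
    whose from_uri accepts the given URI string."""
    for impl in impl_classes:
        try:
            return impl.from_uri(uri)
        except InvalidUriError:
            continue
    raise InvalidUriError("No implementation accepted URI: %r" % uri)
```

With this sketch, from_uri("file:///tmp/a.png") yields a FileElementSketch whose path is "/tmp/a.png", while a URI that no class recognizes raises InvalidUriError.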

Descriptor Element

  • Postgres descriptor element will now automatically create its configured table if it doesn’t exist, given valid authentication and sufficient privileges.

Descriptor Index

  • Postgres descriptor index will now automatically create its configured table if it doesn’t exist, given valid authentication and sufficient privileges.

Scripts

  • Added script to conveniently make a Ball-tree hash index model given an existing hash2uuids.pickle model file required for the LSHNearestNeighborsIndex implementation.

  • compute_many_descriptors.py batch size parameter now defaults to 0 instead of 256.

  • Add script to cluster an index of descriptors via mini-batch kmeans (scikit-learn).

  • Added script wrapping the use of the mini-batch kmeans descriptor clustering function.

  • Added scripts and notebooks for retrieving MEMEX-specific data from ElasticSearch.

  • Moved command-line scripts to the smqtk.bin sub-module in order to use setuptools support for cross-platform executable generation.

  • classifier_kfold_validation utility now only uses MemoryClassificationElement instead of letting it be configurable.

  • Added script for finding nearest neighbors of a set of UUIDs given a nearest neighbors index.

  • Added script to add GirderDataElements to a data set.

Utilities

  • Started a module containing URL-based utility functions, initially adding a url-join function similar in capability to os.path.join.

  • Added fixed tile cropping to image transform tool.

  • Added utility functions to detect mimetypes of files via file-magic or tika optional dependencies.
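As an illustration of the url-join utility mentioned above, here is a minimal sketch of joining URL segments with single slashes. The function name url_join_sketch and its exact behavior are assumptions for illustration, not the function shipped in SMQTK. In particular, unlike os.path.join, this sketch does not reset the result when a later segment begins with a slash.

```python
def url_join_sketch(base, *parts):
    """Join URL segments so exactly one '/' separates each of them.

    Hypothetical helper: leading and trailing slashes on each segment
    are dropped, and empty segments are skipped.
    """
    segments = [base.rstrip("/")]
    for part in parts:
        stripped = part.strip("/")
        if stripped:
            segments.append(stripped)
    return "/".join(segments)


joined = url_join_sketch("http://example.com/", "api", "/v1/", "items")
print(joined)
```

One consequence of this design is that a trailing slash on the final segment is not preserved; a caller that needs one would have to append it explicitly.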

Web

  • Updated/Rearchitected IqrSearchApp (now IqrSearchDispatcher) to be able to spawn multiple IQR configurations during runtime in addition to any configured in the input configuration JSON file. This allows external applications to manage configuration storage and generation.

  • Added directory for Girder plugins and added an initial one that, given a folder with the correct metadata attached, can initialize an IQR instance based on that configuration and then link to the IQR web interface (uses existing/updated IqrSearch web app).

  • Added ability to automatically login via a valid Girder token and parent Girder URL for token/user verification. This primarily allows restricted external IQR instance creation and automatic login from Girder redirects.

  • Mongo session information block at the bottom of the IQR app page now only shows up when running the server in debug mode.

  • Added document showing complete use case with IQR RESTful webservice using the IQR docker image with LEEDS Butterfly data. Includes expected results users should be able to replicate.

Fixes since v0.6.2

Documentation

  • Fixed issues caused by moving scripts out of ./bin/ to ./python/smqtk/bin.

Scripts

  • Fix logging bug in compute_many_descriptors.py when file path has unicode in it.

  • Removed final loop progress report from compute_many_descriptors.py as it did not report valid statistics.

  • Fixed deprecated import of flask-basicauth module.

  • Fixed DescriptorFileElement cache-file save location directory when configured to use subdirectories. Now no longer creates directories to store only a single file. Previous file-element roots are not compatible with this change and need to be re-ingested.

  • Fixed IQR web app url prefix check.

Metrics

  • Fixed cosine distance function to return angular distance.
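For readers unfamiliar with the distinction, the following sketch shows one common way to turn cosine similarity into an angular distance: take the arc-cosine of the similarity and, optionally, divide by pi so the result falls in [0, 1]. The function names and the pi_normalize flag are assumptions for illustration and may not match SMQTK's signature. Unlike 1 minus cosine similarity, the angle between two vectors satisfies the triangle inequality.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def angular_distance(a, b, pi_normalize=True):
    """Angle between two vectors, optionally scaled into [0, 1].

    Hypothetical sketch: the similarity is clamped to [-1, 1] to guard
    against floating-point drift before calling acos.
    """
    sim = max(-1.0, min(1.0, cosine_similarity(a, b)))
    angle = math.acos(sim)
    return angle / math.pi if pi_normalize else angle


# Orthogonal vectors are a quarter turn apart: half of the maximum angle.
orthogonal = angular_distance([1.0, 0.0], [0.0, 1.0])
```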

Utilities

  • SmqtkObject logger class accessor name changed to not conflict with flask.Flask logger instance attribute.

Web

  • Fixed Flow upload browse button so that it no longer allows only directory selection in Chrome.