Frederico Valente; Augusto Silva; Carlos Manuel Azevedo Costa; José Miguel Franco Valiente; César Suárez-Ortega
International Journal of Image Mining (IJIM), Vol. 1, No. 2/3, 2015
Machine learning and imaging analytics are major algorithmic components of the software used by medical practitioners in the diagnosis and treatment of diseases. Whether these components are employed in computer-aided diagnosis (CADx) or content-based image retrieval (CBIR) tools, the accuracy and relevance of the results to the practitioner are paramount to the success of the application. To improve on existing results, researchers often need to explore various approaches and methodologies, frequently using very large datasets and multiple sources of information. Each of these trials can, by itself, be very time-consuming. One well-established strategy to speed up such work is to use a distributed computing platform, which spreads the computational load across a number of machines. This raises a set of problems that are often orthogonal to a researcher's interests, such as which algorithmic implementations scale or how to distribute data and tasks on the grid. In this article, we present a framework that enables researchers to quickly design sets of tests, schedule them, and have them automatically allocated to a grid environment for execution. We describe the design and implementation of the solution and present, as an example, an experiment concerning the classification of mammography segmentations.