Abstract
Purpose:
Experiments that measure intrinsic optical signals with optical coherence tomography generate large volumes of data that require many hours of post-processing. Developing post-processing algorithms requires a flexible, comfortable programming interface. We retain our existing Matlab programming environment and port the computation to a cluster of under-utilized office computers.
Methods:
To measure slow changes during dark adaptation, 40 high-density (512 × 256 points) volume tomograms were recorded before a bleaching stimulus, and 70 similar tomograms were recorded during the following 30-minute recovery period. Each test generates ~30 GB of spectral data that must be translated to structural information in a post-processing step. Segmentation of intraretinal layers, to isolate stimulus-induced changes to a particular tissue type, takes a similar amount of time. Processing each volume requires about 30 minutes on a typical high-end desktop computer, and serially processing the entire dataset requires nearly 24 hours. Two methods of exploiting existing desktop computing power to process the multiple volumes in parallel were examined. The first system used only Microsoft file sharing to distribute and collect workloads on a small office network; compiled Matlab code was distributed to the workers so that each processor on each computer individually processed a file generated by the acquisition computer. A total of 10 processors on one quad-core and three dual-core machines were used. The second system used Matlab code to interface with and monitor a Condor cluster job-management system, which connected 45 worker computers to job profiles created by the local Matlab script.
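To illustrate the file-sharing approach, a minimal Matlab sketch of the worker loop is shown below. The share paths and the processVolume function are hypothetical placeholders, not the lab's actual code; the claim-by-rename step is one simple way for independent workers to avoid processing the same volume twice.

    % Minimal worker-loop sketch (assumed paths and function names).
    shareDir = '\\acq-pc\oct\pending';    % spectral volumes dropped here by the acquisition PC
    doneDir  = '\\acq-pc\oct\processed';  % structural volumes collected here
    while true
        files = dir(fullfile(shareDir, '*.bin'));
        if isempty(files)
            pause(30);                    % wait for new acquisitions
            continue;
        end
        src   = fullfile(shareDir, files(1).name);
        claim = [src '.lock'];            % rename acts as an atomic claim between workers
        [ok, ~] = movefile(src, claim);
        if ~ok
            continue;                     % another worker claimed this file first
        end
        vol = processVolume(claim);       % hypothetical: spectra -> structural volume
        save(fullfile(doneDir, [files(1).name '.mat']), 'vol', '-v7.3');
        delete(claim);                    % remove the claimed raw file once results are written
    end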
Results:
Calculation time with the small network system was reduced by a factor of ~10, while the Condor system reduced computation time by a factor of ~25. The small, homemade management system had the advantage of lower overhead on the workers but was less robust.
Conclusions:
Cluster computing provides a low-cost computing option with dramatic impact. Ideally, the time to compute the entire dataset is reduced to nearly the time required to compute a single volume. In our experience, the improvement reduced compute times from days to hours, rendering previously daunting tasks manageable. Matlab may be slower than some lower-level development environments, but it provides a platform of understandable code to which all members of the lab can contribute. Compiling that code for distribution to worker computers in a cluster topology provides a very low-cost method of implementing parallel computing.
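A minimal sketch of the compile-and-submit step is given below; the executable name, input-file pattern, and submit-file fields are illustrative assumptions rather than the exact job profiles used in the study.

    % Compile the processing routine and queue one Condor job per volume (assumed names).
    mcc('-m', 'processVolume.m');          % build a standalone executable for the workers

    nVolumes = 110;                        % 40 pre-bleach + 70 recovery volumes
    fid = fopen('oct_jobs.sub', 'w');
    fprintf(fid, 'executable = processVolume.exe\n');
    fprintf(fid, 'universe   = vanilla\n');
    fprintf(fid, 'arguments  = volume_$(Process).bin\n');
    fprintf(fid, 'log        = oct_$(Process).log\n');
    fprintf(fid, 'queue %d\n', nVolumes);
    fclose(fid);

    system('condor_submit oct_jobs.sub');  % hand the job descriptions to the Condor scheduler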
Keywords: image processing • imaging methods (CT, FA, ICG, MRI, OCT, RTA, SLO, ultrasound) • retina