Image filters (Algorithm Phase 1)
The aim is to bring the image into a black-and-white format that contains no filled white or black regions. The number of filters applied depends on whether the original image is grayscale or black and white. A threshold combined with the fill removal filter is applied regardless of the input. Additionally, if the input is grayscale, a blur followed by a Sobel or Canny filter is used. Ideally, the resulting set of connected curves indicates the boundaries of objects.
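The threshold and fill removal steps can be illustrated with a minimal Python sketch. The text does not define the fill removal filter, so the rule used here (keep only foreground pixels that touch the background in the 4-neighbourhood) is an assumption, and the function names and the threshold level of 128 are hypothetical. It is not the program's actual implementation.

```python
def threshold(gray, level=128):
    """Binarize a grayscale image given as a list of rows of 0..255 ints.

    The level of 128 is an arbitrary choice made for this sketch.
    """
    return [[1 if px >= level else 0 for px in row] for row in gray]


def remove_fills(binary):
    """Keep only foreground pixels that touch the background.

    Assumed rule: a foreground pixel survives if at least one of its four
    direct neighbours is background. Pixels outside the image count as
    background, so shapes touching the border keep their outline there.
    """
    h, w = len(binary), len(binary[0])

    def is_background(y, x):
        return not (0 <= y < h and 0 <= x < w) or binary[y][x] == 0

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x] and any(
                is_background(y + dy, x + dx)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
            ):
                out[y][x] = 1
    return out


def filter_phase(gray, level=128):
    """Threshold, then strip filled interiors so only outlines remain."""
    return remove_fills(threshold(gray, level))
```

Applied to a bright 3x3 square on a dark 5x5 background, this leaves the eight outline pixels of the square and clears its centre.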
Tracing (Algorithm Phase 2)
Contour tracing algorithms consist of simple rules for traversing the image, in which points are selected into groups that constitute paths. Two tracing algorithms have been designed: one slower but more precise, and one faster but less precise. The former is a modified version of the Moore-Neighbor algorithm.
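For reference, the textbook (unmodified) Moore-Neighbor tracing can be sketched in Python as follows. The modifications made in this work are not described here, so the sketch shows only the standard technique; the function name, the clockwise scan order, and the stopping rule (stop when a pixel and backtrack direction repeat) are choices made for this illustration.

```python
# Moore neighbourhood in clockwise order (y grows downward), starting west.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1),
           (0, 1), (1, 1), (1, 0), (1, -1)]


def trace_contour(binary):
    """Trace the outer contour of the first shape found by a raster scan.

    binary is a list of rows of 0/1 values. Returns a list of (y, x)
    points; thin shapes may list the same pixel more than once.
    """
    h, w = len(binary), len(binary[0])

    def is_fg(y, x):
        return 0 <= y < h and 0 <= x < w and binary[y][x] == 1

    start = next(((y, x) for y in range(h) for x in range(w)
                  if binary[y][x]), None)
    if start is None:
        return []

    contour = []
    # The raster scan guarantees the west neighbour of start is background,
    # so it serves as the initial backtrack position (index 0 in OFFSETS).
    current, back_dir = start, 0
    seen = set()
    while (current, back_dir) not in seen:
        seen.add((current, back_dir))
        contour.append(current)
        # Walk clockwise around current, starting just after the backtrack.
        for step in range(1, 8):
            d = (back_dir + step) % 8
            ny, nx = current[0] + OFFSETS[d][0], current[1] + OFFSETS[d][1]
            if is_fg(ny, nx):
                break
        else:
            break  # isolated pixel: no foreground neighbour
        # The last background position examined becomes the new backtrack.
        pd = (d - 1) % 8
        by, bx = current[0] + OFFSETS[pd][0], current[1] + OFFSETS[pd][1]
        back_dir = OFFSETS.index((by - ny, bx - nx))
        current = (ny, nx)
    return contour
```

On a filled 3x3 square the sketch visits the eight border pixels clockwise from the top-left corner and never enters the centre pixel.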
General considerations
Image vectorization, as introduced in this work, depends largely on the size of the image and the set of paths that can be extracted from it. Based on this, realtime processing of small images of up to HD quality (e.g. 2MP at 15-30fps) can be achieved on modest architectures such as Intel Atom. On the other hand, processing images of 1MP to 10GP can be optimized on high-throughput architectures such as Intel Xeon or AMD Opteron.
It is usually quite difficult to evaluate a new hardware architecture, mainly due to the large set of features that it provides. The proposed application can act as a benchmark when using images exceeding 16MP (e.g. over 4096x4096), or HD video streams (e.g. greater than 1366x768). Running the application in realtime video mode outputs a score representing the average fps. When processing images, execution time can be measured either as a whole or per processing step (i.e. filtering, extraction, or reduction). Both methods can be used to differentiate between processors, to evaluate different load conditions (CPU, GPU, or RAM), and even to compare entire systems.
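The two measurements described above (average fps for a stream, and time per processing step) can be sketched as a small Python timing harness. The function name, the result keys, and the stand-in stage functions in the usage example are hypothetical; the program's own measurement code is not shown in this text.

```python
import time


def benchmark(frames, stages):
    """Run each frame through the stages and collect timing figures.

    frames is a list of inputs; stages is a list of (name, function)
    pairs applied in order. Returns the total time in seconds, the
    accumulated time per stage, and the average frames per second.
    """
    per_stage = {name: 0.0 for name, _ in stages}
    t_start = time.perf_counter()
    for frame in frames:
        data = frame
        for name, fn in stages:
            t0 = time.perf_counter()
            data = fn(data)
            per_stage[name] += time.perf_counter() - t0
    total = time.perf_counter() - t_start
    avg_fps = len(frames) / total if total > 0 else float("inf")
    return {"total_s": total, "per_stage_s": per_stage, "avg_fps": avg_fps}


if __name__ == "__main__":
    # Stand-in stages that only mimic the three steps named in the text.
    stages = [
        ("filter", lambda img: [[1 if p >= 128 else 0 for p in r] for r in img]),
        ("extraction", lambda img: [(y, x) for y, r in enumerate(img)
                                    for x, p in enumerate(r) if p]),
        ("reduction", lambda pts: pts[::2]),
    ]
    frames = [[[(x * y) % 256 for x in range(64)] for y in range(64)]] * 10
    print(benchmark(frames, stages))
```

Timing the steps separately shows which one dominates on a given machine, while the single fps figure is easier to compare across systems.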
Image vectorization can be performed on either SISD or SIMD units, making it a good fit for heterogeneous and high-performance homogeneous architectures alike. The problem can be solved using homogeneous manycore CPUs, but low-power heterogeneous processing units are still best suited to the task. Additional specialized units, such as GPUs, can complement the CPU, achieving high throughput with low CPU utilization. Architecture-dependent instruction sets, such as SSE or AVX, can further improve the overall performance of the algorithms without increasing the CPU load.
glupescu@clusterLG:~/Desktop/mvec/mvec$ ./mev
OPTION               INFO
-------------------------------------------
-help,-h             list commands
-mode 1              threshold + contour
-mode 2              blur + sobel + contour
-mode 3              gauss + sobel + contour
-mode 4              gauss + sobel + contourSSE
-mode 5              blurSSE + sobel + contourSSE
-mode 6              canny + contourSSE
-mode 7              blurCL + sobelCL + contourCL
-webcam              realtime processing, webcam[0] as input CV
-video [filename]    realtime processing, vidfile[0] as inputCV
-image [filename]    process each image file
-limit [num]         limit total stream frame count
-buf [num]           buffer limit in MB
-aprox [num]         specify level of aproximation
-fast                force fast contour extraction alg
-gui                 display image processing steps
-ps                  output postscript file (out.ps)
The source code for the program can be found at https://gitorious.org/mvec/mvec
It supports any combination of OpenMP, OpenCL, SSE, and OpenCV/CImg, and can act as a benchmark. It can be compiled on both Linux and Windows (it uses CMake) and has been tested on Ubuntu 11/12 and Windows 7.


