Team:Aachen/Notebook/Software/Measurarty

From 2014.igem.org

Revision as of 23:50, 16 October 2014 by Mjoppich

Measurarty

Measurarty is the evil player in the game of Cellock Holmes and WatsOn, and the pathogen detection logic behind our project. Using our Measurarty algorithm, we want to detect pathogens automatically from the chip photos delivered by WatsOn, without human interaction. Besides reducing the risk of human error, this makes our device usable by almost everyone.


Measurarty - An Introduction

Our device control software can take images of the incubated chips in the device. Yet that alone does not bring us any closer to answering the question

Is there a pathogen detected?

In fact, answering this question seems trivial for a human: just check whether a colony has grown on the chip and you're done. It is even easier with our chip system, because the chips show fluorescence wherever a pathogen has been detected.

But is this an equally easy task for a computer? Actually not. Automatic detection is a problem tackled by several disciplines in computer science, ranging from pattern recognition and machine learning to medical imaging.

Here we present a pipeline for this task that makes use of simple segmentation and classification algorithms. First, we segment the target image using Statistical Region Merging (SRM) to find regions with similar properties. Next, we apply histogram thresholding in HSV color space to find candidate regions for pathogens. Finally, a classification algorithm detects the pathogens on our chips.


Statistical Region Merging (SRM)

Before briefly introducing Statistical Region Merging (SRM), we would like to explain why we need this step and why this algorithm is an ideal choice.

Compared to other clustering algorithms, SRM is quite lightweight, yet delivers deterministic results and does not depend on a particular seed (unlike k-means, for example).

At the same time, it can create as many refinement levels as desired and is therefore flexible enough for the task at hand. Finally, our group already had experience with this algorithm.

Statistical Region Merging (SRM) [1] is a clustering technique that can be used directly for image segmentation. A region $R$ is a set of pixels, and its cardinality $\lvert R \rvert$ is the number of pixels in the region. Starting with a set of connected regions sorted with respect to some distance function $f$, two regions $R$ and $R'$ are merged if the qualification criterion $\lvert \overline{R'}-\overline{R} \rvert \leq \sqrt{b^2(R)+b^2(R')}$ with $b(R) = g \cdot \sqrt{\frac{\ln \frac{\lvert \mathcal{R}_{\lvert R \rvert} \rvert}{\delta}}{2Q\lvert R \rvert}}$ is fulfilled. Here, $\overline{R}$ is the mean intensity of region $R$, $g$ is the number of intensity levels per channel, $\mathcal{R}_{\lvert R \rvert}$ is the set of regions with $\lvert R \rvert$ pixels, and $\lvert I \rvert$ is the number of pixels in the image $I$. Typically $Q$ is chosen as $Q \in \lbrack 1, 256\rbrack$ and $\delta = \frac{1}{\lvert I \rvert^2}$.

The $Q$ parameter mainly controls the merging process; see Figure SRM Regions for an example. Lower values of $Q$ yield coarser regions. Using a union-find structure, the segmentation does not need to be recalculated from scratch for each $Q$ level: for the step from $q$ to $\frac{q}{2}$, the qualification criterion only needs to be applied to the regions from the $q$ result. A MATLAB implementation can be found in [2].
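The merging test above can be sketched in a few lines of MATLAB for a single intensity channel. This is only an illustration under stated assumptions, not the implementation from [2]: the names `srmBound` and `shouldMerge`, the image size, and the count of regions of a given size (`cntA`, `cntB`, standing in for the size of $\mathcal{R}_{\lvert R \rvert}$) are all hypothetical.

```matlab
% Sketch of the SRM merging test for two regions (single channel).
% All names and numbers below are illustrative assumptions.
g     = 256;            % number of intensity levels
Q     = 32;             % coarseness parameter
nPix  = 100 * 100;      % |I|, pixels of a hypothetical 100x100 image
delta = 1 / nPix^2;     % delta as chosen in the text

% b(R) for a region with n pixels; nRegionsOfSize stands in for the
% number of regions that have n pixels
srmBound = @(n, nRegionsOfSize) ...
    g * sqrt(log(nRegionsOfSize / delta) / (2 * Q * n));

% Merge if the difference of the mean intensities is within the joint bound
shouldMerge = @(meanA, nA, cntA, meanB, nB, cntB) ...
    abs(meanA - meanB) <= sqrt(srmBound(nA, cntA)^2 + srmBound(nB, cntB)^2);

% Two regions of 50 pixels each; assume 10 regions of that size exist
similar   = shouldMerge(120, 50, 10, 125, 50, 10);
different = shouldMerge( 20, 50, 10, 230, 50, 10);
fprintf('similar regions merge: %d, different regions merge: %d\n', ...
    similar, different);
```

Because $Q$ appears in the denominator of $b(R)$, halving $Q$ enlarges the bound, so more pairs pass the test and the regions become coarser.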

Figure SRM Regions (random color): Regions from an SRM run, starting at $Q=256$ (top left) and going to $Q=1$ (bottom right). Each region is assigned a random color.
Figure SRM Regions (average color): Regions from an SRM run, starting at $Q=256$ (top left) and going to $Q=1$ (bottom right). Each region is assigned the average color of that region.

[1] Nock R, Nielsen F. Statistical region merging. IEEE Transactions on PAMI. 2004;26:1452–8.

[2] Boltz S. Statistical region merging matlab implementation; 2014. Available from: http://www.mathworks.com/matlabcentral/fileexchange/25619-image-segmentation-using-statistical-region-merging. Accessed 12 Dec 2013.


Segmentation

In the segmentation stage, all background regions are removed. This step is crucial: if too few regions are removed, the final stage of finding pathogens might be misled. On the other hand, if too many regions are removed, positive hits might be discarded before detection, which must also be avoided.

We opted for a simple thresholding step because it proved to be both easy and effective against the uniform background. In fact, the good image quality we aimed for with our device now allows less sophisticated methods. Moreover, the less computationally intensive the steps are, the more likely they can run directly on the Raspberry Pi in our device.

The HSV thresholding is performed on each component separately (for more information on the HSV color space, see Wikipedia: http://en.wikipedia.org/wiki/HSL_and_HSV). The first component is the hue, which we require to lie between $0.462$ and $0.520$ in order to select any blue-greenish (turquoise) color. We will not see bright green due to the filter selection in our device. The saturation must be high, between $0.99$ and $1.0$. Finally, the value must be between $0.25$ and $0.32$, which corresponds to a relatively dark color.

These values are not specific to the problem but to each setup, and therefore must be determined experimentally.
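One simple way to obtain such setup-specific thresholds is to read off the HSV range inside a hand-selected region that is known to fluoresce. The following is a minimal sketch under that assumption; the calibration image, the region of interest `roi` and the patch color are synthetic placeholders, and our own values were chosen with the colorThresholder app rather than with this script.

```matlab
% Hypothetical calibration image: grey background with a turquoise patch
img = 0.5 * ones(20, 20, 3);
patchColor = hsv2rgb([0.49 1.0 0.30]);           % assumed signal color
img(8:12, 8:12, :) = repmat(reshape(patchColor, 1, 1, 3), 5, 5);

% Hand-selected region of interest covering known fluorescence (assumed)
roi = false(20, 20);
roi(8:12, 8:12) = true;

hsvImg = rgb2hsv(img);
names  = {'hue', 'saturation', 'value'};
limits = zeros(3, 2);
for c = 1:3
    channel = hsvImg(:, :, c);
    vals = channel(roi);
    limits(c, :) = [min(vals), max(vals)];       % observed range in the ROI
    fprintf('%-10s: [%.3f, %.3f]\n', names{c}, limits(c, 1), limits(c, 2));
end
```

On real images the observed ranges would usually be widened by a small margin before being used as thresholds.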

The remainder of this stage creates a mask of pixels that fulfill the conditions.


% Auto-generated by colorThresholder app on 15-Oct-2014
%-------------------------------------------------------
function [maskedRGBImage] = createMask(srmimg)
RGB = srmimg;

% Convert RGB image to chosen color space
I = rgb2hsv(RGB);

% Define thresholds for channel 1 based on histogram settings
channel1Min = 0.462;
channel1Max = 0.520;

% Define thresholds for channel 2 based on histogram settings
channel2Min = 0.99;
channel2Max = 1.000;

% Define thresholds for channel 3 based on histogram settings
channel3Min = 0.25;
channel3Max = 0.32;

% Create mask based on chosen histogram thresholds
BW = (I(:,:,1) >= channel1Min ) & (I(:,:,1) <= channel1Max) & ...
    (I(:,:,2) >= channel2Min ) & (I(:,:,2) <= channel2Max) & ...
    (I(:,:,3) >= channel3Min ) & (I(:,:,3) <= channel3Max);

% Initialize output masked image based on input image.
maskedRGBImage = RGB;

% Set background pixels where BW is false to zero.
maskedRGBImage(repmat(~BW,[1 1 3])) = 0;

end
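As a quick check of the function above, it can be applied to a small synthetic image. This sketch assumes that `createMask` is saved as `createMask.m` on the MATLAB path; the 4x4 test image and the pixel position are made up for illustration.

```matlab
% Hypothetical 4x4 test image: grey background with one dark turquoise pixel
img = 0.5 * ones(4, 4, 3);
img(2, 3, :) = reshape(hsv2rgb([0.49 1.0 0.30]), 1, 1, 3);

masked = createMask(img);

% Pixels that survive the thresholds are those with any non-zero channel
kept = any(masked > 0, 3);
fprintf('%d of %d pixels kept\n', nnz(kept), numel(kept));
```

Only the turquoise pixel falls inside all three HSV ranges, so the grey background is set to zero.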


Classification

Automatic Classification


function [mask seg] = automaticseeds(maskedImg)
end

Smoothness Index

For position prediction in virtual environments, jitter or noise in the output signal is undesirable but often present. Since discovering smooth areas is a problem similar to jitter detection, a simple method for determining jitter can be used to measure its absence, i.e. smoothness [1]. It is assumed that jitter-free areas of a position signal do not differ in velocity.

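Likewise, smooth areas of an image do not differ in intensity, so only small changes in velocity (here: intensity change) are recorded there. To reduce noise, this operation is performed on the smoothed input image. The smoothness $s$ of a pixel $p$ can then be determined as: \begin{equation} s(p) = \frac{\sum\limits_{p' \in \mathcal{N}_k(p)} \nabla(p')}{\max\limits_{q} \sum\limits_{q' \in \mathcal{N}_k(q)} \nabla(q')} \end{equation} where $\mathcal{N}_k(p)$ is a neighborhood of $p$ and $\nabla(p')$ is the intensity change at $p'$. The denominator normalizes the sum by its maximum over all pixels.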

By thresholding both the smoothness and the intensity $I$, $TS_l \leq s(p) \leq TS_u \wedge TI_l \leq I \leq TI_u$, different areas, such as background or pathogen, can be selected.
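The following MATLAB sketch shows one possible reading of the smoothness index and the thresholding step; it is not our actual classification code. It assumes that the neighborhood is a square window of radius `k`, that the intensity change is the gradient magnitude of a box-filtered image, and that the sum is normalized by its maximum. The image and all threshold values are placeholders chosen for this synthetic example.

```matlab
% Hypothetical grayscale image: dark noisy background, bright flat spot
rng(0);
I = 40 + 10 * rand(64, 64);
I(20:40, 20:40) = 240;

k   = 2;                                  % neighborhood radius (assumed)
box = ones(2*k + 1) / (2*k + 1)^2;        % box filter for pre-smoothing
Ismooth = conv2(I, box, 'same');

[gx, gy] = gradient(Ismooth);             % intensity change per pixel
gradMag  = hypot(gx, gy);

sRaw = conv2(gradMag, ones(2*k + 1), 'same');   % sum over the neighborhood
s    = sRaw / max(sRaw(:));                     % normalize to [0, 1]

% Placeholder thresholds for this synthetic image only
TSl = 0;   TSu = 0.1;
TIl = 200; TIu = Inf;
mask = (s >= TSl) & (s <= TSu) & (I >= TIl) & (I <= TIu);
fprintf('%d candidate pixels selected\n', nnz(mask));
```

With this reading, low values of `s` mark flat areas, so the interior of the bright spot is selected while the noisy background and the spot's edges are not.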

It can be argued that empirically chosen thresholds are hand-tailored to the specific case. While this is surely true to a certain extent, the method presented here has been successfully tested on images from a completely different domain without any changes to the thresholds. A proper theoretical evaluation would be desirable, but is probably beyond the aim of the iGEM competition.

Finally, selecting the red region delivers the location of possible pathogens. Since the size of the agar chips is fixed for a given chip design, a quantitative analysis can be performed, for instance by counting pixels.
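Counting pixels can be turned into an area estimate once the physical size of the chip is known. The sketch below assumes a binary detection mask that covers exactly one chip; the mask contents and the chip dimensions of 25 mm are hypothetical values, not measurements from our device.

```matlab
% Hypothetical binary detection mask over a chip image (true = candidate)
mask = false(100, 100);
mask(30:39, 40:59) = true;          % 10 x 20 = 200 pixels

% Assumed physical chip size covered by the image, in millimeters
chipWidthMm  = 25;
chipHeightMm = 25;

pixelAreaMm2 = (chipWidthMm / size(mask, 2)) * (chipHeightMm / size(mask, 1));
nDetected    = nnz(mask);
areaMm2      = nDetected * pixelAreaMm2;
coverage     = nDetected / numel(mask);

fprintf('%d pixels, %.2f mm^2, %.1f%% of the chip\n', ...
    nDetected, areaMm2, 100 * coverage);
```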

[1] Joppich M, Rausch D, Kuhlen T. Adaptive human motion prediction using multiple model approaches. In: Virtuelle und Erweiterte Realität, 10. Workshop der GI-Fachgruppe VR/AR. Shaker Verlag; 2013. p. 169–80.

Empirical Evaluation

Using our MATLAB code, we found the lower threshold for the smoothness index to be $TS_l = 0.85$ and the upper threshold to be $TS_u = \infty$. Similarly, for the intensity we found $TI_l = 235$ and $TI_u = \infty$.
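Applied to per-pixel values, these settings reduce to two lower bounds, because an upper threshold of infinity never excludes anything. A toy example illustrates this; the matrices `s` and `I` are made-up values, not measurements.

```matlab
TSl = 0.85;  TSu = Inf;     % smoothness thresholds from the text
TIl = 235;   TIu = Inf;     % intensity thresholds from the text

% Hypothetical per-pixel smoothness values and intensities (2x3 toy image)
s = [0.90 0.50 0.95;
     0.86 0.99 0.10];
I = [240  250  200;
     235  255  255];

selected = (s >= TSl) & (s <= TSu) & (I >= TIl) & (I <= TIu);
disp(selected);
```

A pixel is selected only if both lower bounds hold, so a bright pixel with low smoothness index and a high-index pixel that is too dark are both rejected.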

With these settings, we can already find a response in images taken after 42 minutes.

Ideally, one would rate the quality of the image segmentation against some ground truth, such as manual delineations. This has yet to be done for our method. However, from visual inspection, our method shows promising results.
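Such a comparison against manual delineations could, for example, use the Dice coefficient, a common overlap measure for segmentations. This is a suggestion rather than something we have done; both masks below are synthetic placeholders.

```matlab
% Hypothetical automatic result and manual delineation (ground truth)
autoMask   = false(50, 50);  autoMask(10:29, 10:29)   = true;   % 400 pixels
manualMask = false(50, 50);  manualMask(15:34, 10:29) = true;   % 400 pixels

% Dice = 2 * |A and B| / (|A| + |B|), 1 for identical masks, 0 for disjoint
overlap = nnz(autoMask & manualMask);
dice    = 2 * overlap / (nnz(autoMask) + nnz(manualMask));
fprintf('Dice coefficient: %.2f\n', dice);
```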



Achievements

  • versatile
  • robust
  • works quickly
  • gif show

Source Code

  • Matlab Code
  • C++ project
  • link github