# Image Processing

This page contains specific implementation details of the image processing functions. The next page contains an interactive image processing function viewer, which lets you adjust inputs and see the results of many different functions on several sample input images. Before covering the processing functions, this page first describes how to input resources.

### Images

Images are mostly input at the beginning of processing, but there are a few functions, such as bitwise_and(), that take additional images as inputs. When first loading an image, it can be constructed from either a path to the image or a raw numpy array. When loading an additional image into a function, it can be a path, a numpy array, or a state. Loading an image from a path simply requires a string containing an absolute or relative path to the image.

import ih.imgproc
plant = ih.imgproc.Image("/path/to/your/image.png")


Most of the time other methods of input are not required, but they can be useful in certain circumstances. OpenCV represents images as numpy arrays – effectively, a large matrix that has a blue, green, and red value at each of the indices. By allowing images to be loaded in as a numpy array directly, Image Harvest can be used in conjunction with any other image tool that also uses OpenCV or represents images as numpy arrays.

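import cv2
import ih.imgproc
# Load the image with OpenCV, then hand the numpy array to Image Harvest
im = cv2.imread("/path/to/your/image.png")
plant = ih.imgproc.Image(im)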


Finally, you can pass in a state to a function to use a previously saved image. Because states require you to save something first, you cannot instantiate an image with a state. This is particularly useful when recoloring a thresholded image:

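plant = ih.imgproc.Image("/path/to/your/image.png")
# Save the original color image as a state named "base"
plant.save("base")
plant.convertColor("bgr", "gray")
plant.threshold(127)
# Recolor the thresholded image using the saved state
plant.bitwise_and("base")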
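
To see why the final bitwise_and() brings the color back, it helps to look at a single pixel. The following is a minimal pure-Python sketch, not Image Harvest code: apply_mask is a hypothetical helper, and it assumes 8-bit channels and a mask pixel that is 255 in every channel where the image is kept and 0 everywhere else.

```python
def apply_mask(pixel, mask_pixel):
    """Bitwise AND of two (b, g, r) pixels, channel by channel."""
    return tuple(p & m for p, m in zip(pixel, mask_pixel))


original = (12, 200, 34)  # a pixel from the saved color image

# A white mask pixel (all bits set) leaves the original color untouched.
print(apply_mask(original, (255, 255, 255)))
# A black mask pixel (no bits set) always yields black.
print(apply_mask(original, (0, 0, 0)))
```

Because 255 has every bit set, ANDing with it is the identity on an 8-bit value, which is what lets a thresholded mask select pixels from the saved state.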


### ROIs

Regions of interest can also be input as either a list or a JSON file. If a region of interest is input as a list, it should be of the form:

[ystart, yend, xstart, xend]


If a ROI is input as a json file, it should look like the following:

{
"ystart": YOUR_VALUE,
"yend": YOUR_VALUE,
"xstart": YOUR_VALUE,
"xend": YOUR_VALUE
}


Each individual ROI argument has some special options. Any argument can be auto-filled by using a value of -1. For “xstart” and “ystart” this simply assigns a value of 0, but for “xend” and “yend” it fills in the width and height of the image respectively. This is rarely necessary for the starting values, but can be useful for the ending values. For example, let’s say we want an ROI that skips the first 400 pixels from the left and top sides of our image:

# List style
[400, -1, 400, -1]
# Json style
{
"ystart": 400,
"yend": -1,
"xstart": 400,
"xend": -1
}


Instead of using -1 you can also use “x” and “y” to represent the full width and height respectively. Adjusting the previous example:

# List style
[400, "y", 400, "x"]
# Json style
{
"ystart": 400,
"yend": "y",
"xstart": 400,
"xend": "x"
}


Finally, each individual argument can contain a single arithmetic operation (except for multiplication). Using this, we can create fairly complex ROIs without too much effort. For example, here is an ROI that targets only the bottom half of your image, and ignores 300 pixels on both the left and right sides:

# List style
["y / 2", "y", 300, "x - 300"]
# Json style
{
"ystart": "y / 2",
"yend": "y",
"xstart": 300,
"xend": "x - 300"
}
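
The rules above (-1 auto-fill, “x” and “y”, and a single arithmetic operation) can be sketched as a small resolver. This is only an illustration of the described behavior, not the parser Image Harvest actually uses: resolve_roi and resolve_roi_value are hypothetical names, and the use of integer division for “/” is an assumption.

```python
import re

# One operand, optionally followed by a single +, - or / and a second operand.
_ROI_EXPR = re.compile(r"^\s*(x|y|-?\d+)\s*(?:([+\-/])\s*(x|y|\d+))?\s*$")


def resolve_roi_value(value, name, height, width):
    """Resolve one ROI entry (int or str) to a pixel index."""
    match = _ROI_EXPR.match(str(value))
    if match is None:
        raise ValueError("unsupported ROI expression: %r" % (value,))
    left, op, right = match.groups()
    sizes = {"x": width, "y": height}

    def operand(token):
        return sizes[token] if token in sizes else int(token)

    if op is None:
        if left == "-1":
            # Auto-fill: starts become 0, ends become the full extent.
            if name.endswith("start"):
                return 0
            return height if name.startswith("y") else width
        return operand(left)
    a, b = operand(left), operand(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    return a // b  # assumption: integer division, since the result is an index


def resolve_roi(roi, height, width):
    """Resolve a list-style or dict-style ROI to [ystart, yend, xstart, xend]."""
    names = ["ystart", "yend", "xstart", "xend"]
    values = [roi[n] for n in names] if isinstance(roi, dict) else list(roi)
    return [resolve_roi_value(v, n, height, width) for v, n in zip(values, names)]


# Bottom half of a 1000 x 2000 (height x width) image, minus 300 px per side.
print(resolve_roi(["y / 2", "y", 300, "x - 300"], height=1000, width=2000))
```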

class ih.imgproc.Image(input, outputdir='.', writename=None, dev=False, db=None, dbid=None)[source]

An individual image. Each image is loaded in as its own instance of the Image class for processing.

adaptiveThreshold(value, adaptiveType, thresholdType, blockSize, C)[source]
Parameters: value – Intensity value for the pixels based on the thresholding conditions. adaptiveType (str) – Adaptive algorithm to use, should be either ‘mean’ or ‘gaussian’. thresholdType (str) – Threshold type, should be either ‘binary’ or ‘inverse’. blockSize (int) – The window size to consider while thresholding, should only be an odd number. C (int) – A constant subtracted from the calculated mean in each window.

Thresholds an image by considering it in several different windows instead of as a whole. This function is a wrapper to the OpenCV function adaptiveThreshold. Specifying ‘mean’ for adaptiveType calculates a simple mean of the area, whereas specifying ‘gaussian’ calculates a weighted sum based upon a Gaussian kernel. Specifying ‘binary’ for thresholdType means that a particular intensity value must exceed the threshold to be kept, whereas specifying ‘inverse’ means that it must not exceed the threshold to be kept. As with a normal thresholding function, the image must be converted to grayscale first. This can be done using the convertColor() function; however, if your image is of type ‘bgr’, this is handled automatically.
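
As a rough illustration of the ‘mean’ variant, the sketch below thresholds each pixel of a small grayscale image against the mean of its own window minus C. It is a simplified pure-Python model, not the OpenCV implementation: adaptive_mean_threshold is a hypothetical name, the image is a plain list of rows, and replicating edge pixels at the border is an assumption.

```python
def adaptive_mean_threshold(image, value=255, block_size=3, c=0, inverse=False):
    """Threshold each pixel against the mean of its block_size window minus c."""
    height, width = len(image), len(image[0])
    half = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the window, clamping coordinates so edges are replicated.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-half, half + 1)
                for dx in range(-half, half + 1)
            ]
            local_threshold = sum(window) / len(window) - c
            above = image[y][x] > local_threshold
            # 'binary' keeps pixels above the local threshold, 'inverse' the rest.
            row.append(value if above != inverse else 0)
        out.append(row)
    return out


image = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
print(adaptive_mean_threshold(image))
```

Only the bright center pixel exceeds the mean of its neighborhood, so it is the only pixel kept in the binary case.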

addWeighted(image, weight1, weight2)[source]
Parameters: image (str or np.ndarray) – The image to add. weight1 (float) – The weight to apply to the current image. weight2 (float) – The weight to apply to the additional image.

This function is a wrapper to the OpenCV function addWeighted. This function adds/blends an additional image to the current based on the provided weights. Both positive and negative weights can be used.
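
For a single 8-bit channel value, the blend can be sketched as a weighted sum clipped to the valid range. blend_channel is a hypothetical helper written for illustration; it assumes no extra constant is added to the sum and that results are saturated to [0, 255], as OpenCV does for 8-bit images.

```python
def blend_channel(value1, value2, weight1, weight2):
    """Weighted sum of two 8-bit channel values, clipped to [0, 255]."""
    total = value1 * weight1 + value2 * weight2
    return max(0, min(255, int(round(total))))


print(blend_channel(200, 100, 0.5, 0.5))   # even blend of the two values
print(blend_channel(200, 100, 1.0, 1.0))   # plain addition saturates at 255
print(blend_channel(100, 200, 1.0, -1.0))  # a negative weight subtracts, floor at 0
```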

bitwise_and(comp)[source]
Parameters: comp (str or np.ndarray) – The comparison image. Returns: The resulting mask. Return type: numpy.ndarray

Performs logical AND between the input image and the comp image. The comp input is very versatile, and can be one of three input types: an image, a path, or a saved state. An image input is a raw numpy array, and this input type will be passed through to the function without modification. If a path is specified, ih attempts to load the file as an image and pass it to the function. Finally, the input is checked against the currently saved image states. If it matches, the corresponding state is passed to the function. The function assumes that the two input images are of matching type – if they are not, an error will be thrown. By default, images loaded from a path are loaded as ‘bgr’ type images. For more information on states, see save().

bitwise_not()[source]

Inverts the image. If the given image has multiple channels (i.e. is a color image) each channel is processed independently.

bitwise_or(comp)[source]
Parameters: comp (str or np.ndarray) – The comparison image. Returns: The resulting mask. Return type: numpy.ndarray

Performs logical OR between the input image and the comp image. The comp input is very versatile, and can be one of three input types: an image, a path, or a saved state. An image input is a raw numpy array, and this input type will be passed through to the function without modification. If a path is specified, ih attempts to load the file as an image and pass it to the function. Finally, the input is checked against the currently saved image states. If it matches, the corresponding state is passed to the function. The function assumes that the two input images are of matching type – if they are not, an error will be thrown. By default, images loaded from a path are loaded as ‘bgr’ type images. For more information on states, see save().

bitwise_xor(comp)[source]
Parameters: comp (str or np.ndarray) – The comparison image. Returns: The resulting mask. Return type: numpy.ndarray

Performs exclusive logical OR between the input image and the comp image. The comp input is very versatile, and can be one of three input types: an image, a path, or a saved state. An image input is a raw numpy array, and this input type will be passed through to the function without modification. If a path is specified, ih attempts to load the file as an image and pass it to the function. Finally, the input is checked against the currently saved image states. If it matches, the corresponding state is passed to the function. The function assumes that the two input images are of matching type – if they are not, an error will be thrown. By default, images loaded from a path are loaded as ‘bgr’ type images. For more information on states, see save().

blur(ksize, anchor=(-1, -1), borderType='default')[source]
Parameters: ksize (tuple) – The size of the kernel represented by a tuple (width, height). Both numbers should be odd and positive. anchor (tuple) – The anchor point for filtering. Default is (-1, -1) which is the center of the kernel. borderType (str) – The type of border mode used to extrapolate pixels outside the image.

Smooths an image using the normalized box filter. This function is a wrapper to the OpenCV function blur. Increasing the kernel size increases the window considered when applying a blur. The anchor by default is the center of the kernel; however, you can alter the anchor to consider different areas of the kernel. When blurring on the edge of the image, values for pixels that would be outside of the image are extrapolated. The method of extrapolation depends on the specified ‘borderType’, and can be one of ‘default’, ‘constant’, ‘reflect’, or ‘replicate’.

colorFilter(logic, roi=None)[source]
Parameters: logic (str) – The logic you want to run on the image. roi (list or roi file) – The roi you want to apply the filter to.

This function applies a color filter, defined by the input logic, to a targeted region defined by the input roi. The logic string itself is fairly complicated. It supports the following tokens: ‘+’, ‘-’, ‘*’, ‘/’, ‘>’, ‘>=’, ‘==’, ‘<’, ‘<=’, ‘and’, ‘or’, ‘(’, ‘)’, ‘r’, ‘g’, ‘b’, ‘max’, and ‘min’, as well as any numeric value. The logic string must be well formed: each operation, arg1 operator arg2, must be surrounded by parentheses, and the entire statement must be surrounded by parentheses. For example, if you want to check the intensity of the pixel, your logic string would be: ‘(((r + g) + b) < 100)’. This string in particular will only keep pixels whose intensity is less than 100. Similar rules apply for the ‘and’ and ‘or’ operators. Let’s say we only want to keep pixels whose intensity is less than 100, OR whose red and blue channels are both greater than 150; the logic string would be: ‘((((r + g) + b) < 100) or ((r > 150) and (b > 150)))’. The more complex your logic is, the harder it is to read, so you may want to consider breaking up complex filtering into multiple steps for readability. Finally, despite the fact that this function evaluates arbitrary logic, it is very fast.
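
To make the “fully parenthesized” rule concrete, here is a small sketch that evaluates such a logic string for one pixel. It is an illustration only and says nothing about how colorFilter is implemented internally: keep_pixel is a hypothetical name, the function works on a single (b, g, r) triple rather than a whole image, and ‘max’ and ‘min’ are deliberately left out because their exact semantics are not described here.

```python
import operator
import re

OPERATORS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv,
    ">": operator.gt, ">=": operator.ge, "==": operator.eq,
    "<": operator.lt, "<=": operator.le,
    "and": lambda a, b: bool(a) and bool(b),
    "or": lambda a, b: bool(a) or bool(b),
}
# Multi-character operators come first so that ">=" is not split into ">" "=".
TOKEN = re.compile(r"\(|\)|>=|<=|==|[+\-*/<>]|and|or|[rgb]|\d+(?:\.\d+)?")


def keep_pixel(logic, b, g, r):
    """Evaluate a fully parenthesized logic string for one (b, g, r) pixel."""
    tokens = TOKEN.findall(logic)
    if "".join(tokens) != "".join(logic.split()):
        raise ValueError("unsupported token in logic string")
    channels = {"r": r, "g": g, "b": b}
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            # Every parenthesized group is exactly: arg1 operator arg2
            left = parse()
            op = tokens[pos]
            pos += 1
            right = parse()
            if tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
            return OPERATORS[op](left, right)
        if token in channels:
            return channels[token]
        return float(token)

    result = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens in logic string")
    return bool(result)


logic = "((((r + g) + b) < 100) or ((r > 150) and (b > 150)))"
print(keep_pixel(logic, b=200, g=0, r=200))    # bright red and blue: kept
print(keep_pixel(logic, b=100, g=100, r=200))  # neither condition holds: dropped
```

Requiring parentheses around every operation means the evaluator never has to deal with operator precedence, which keeps the grammar trivial to parse.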

contourChop(binary, basemin=100)[source]
Parameters: binary (str or np.ndarray) – The binary image to find contours of. basemin (int) – The minimum area a contour must have to be considered part of the foreground.

This function works very similarly to the contourCut() function, except that it does not crop the image, but instead removes all contours that fall below the threshold.

contourCut(binary, basemin=100, padding=[0, 0, 0, 0], resize=False, returnBound=False, roiwrite='roi.json')[source]
Parameters: binary (str or np.ndarray) – The binary image to find contours of. basemin (int) – The minimum area a contour must have to be considered part of the foreground. padding (int) – Padding added to all sides of the final roi. returnBound (bool) – If set, instead of cropping the image, simply write the detected roi. resize (bool) – Whether or not to resize the image.

This function crops an image based on the size of detected contours in the image – clusters of pixels in the image. The image is cropped such that all contours that are greater than the specified area are included in the final output image. If returnBound is set, instead of actually cropping the image, the detected roi is written to a file. Otherwise, the detected roi is passed into the crop() function, with the given resize value. This function is useful for getting the accurate height and width of a specific plant, as well as for removing outlying clusters of non-important pixels.

convertColor(intype, outtype)[source]
Parameters: intype (str) – The input image type. outtype (str) – The output image type. Returns: The converted image. Return type: numpy.ndarray. Raises: KeyError

Converts the given image between color spaces, based on the given types. Supported types are: bgr, gray, hsv, lab, and ycrcb. Note that you cannot restore color to a gray image with this function; for that you must use bitwise_and with an appropriate mask and image.

crop(roi, resize=False)[source]
Parameters: roi (list or roi file) – A list corresponding to the area of the image you want. List should be of the form [ystart, yend, xstart, xend] resize (bool) – If True, actually adjusts the size of the image, otherwise just draws over the part of the image not in the roi.

This function crops the image based on the given roi [ystart, yend, xstart, xend]. There are two crop options. By default, the function doesn’t actually resize the image; instead, it sets each pixel not in the roi to black. If resize is set to True, the function will actually crop the image down to the roi.
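
The difference between the two crop options can be sketched on a tiny grayscale image stored as a list of rows. This is a pure-Python illustration, not the library code: crop_sketch is a hypothetical name, and treating yend and xend as exclusive bounds (as numpy slicing does) is an assumption.

```python
def crop_sketch(image, roi, resize=False):
    """Crop a list-of-rows image to roi = [ystart, yend, xstart, xend]."""
    ystart, yend, xstart, xend = roi
    if resize:
        # Actually shrink the image down to the region of interest.
        return [row[xstart:xend] for row in image[ystart:yend]]
    # Keep the original size and paint everything outside the roi black.
    return [
        [value if ystart <= y < yend and xstart <= x < xend else 0
         for x, value in enumerate(row)]
        for y, row in enumerate(image)
    ]


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
print(crop_sketch(image, [1, 3, 1, 3]))               # same size, border blacked out
print(crop_sketch(image, [1, 3, 1, 3], resize=True))  # 2 x 2 result
```

Keeping the original size is convenient when later steps combine the cropped image with other images of the same dimensions.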

destroy()[source]

Destroys all currently open windows.

drawContours()[source]

A helper function that draws all detected contours in the image onto the image.

edges(threshold1, threshold2, apertureSize=3, L2gradient=False)[source]
Parameters: threshold1 (int) – First threshold for the hysteresis procedure. threshold2 (int) – Second threshold for the hysteresis procedure. apertureSize (int) – Aperture size used for the Sobel operator. Must be odd, positive, and less than 8. L2gradient (bool) – Used to calculate image gradient magnitude: if true then $$L = \sqrt{(dI/dx)^2 + (dI/dy)^2}$$, if false then $$L = |dI/dx| + |dI/dy|$$.

This function calculates the edges of an image using the Canny edge detection algorithm using the Sobel operator. This function is a wrapper to the OpenCV function Canny.
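
The two gradient-magnitude options controlled by L2gradient differ only in the norm applied to the Sobel derivatives. A minimal sketch, with gradient_magnitude as a hypothetical helper; the absolute values in the non-L2 case follow the L1 norm that OpenCV documents for Canny:

```python
import math


def gradient_magnitude(dx, dy, l2gradient=False):
    """Magnitude of an image gradient (dx, dy) under the chosen norm."""
    if l2gradient:
        return math.sqrt(dx ** 2 + dy ** 2)  # Euclidean (L2) norm
    return abs(dx) + abs(dy)                 # cheaper L1 norm


print(gradient_magnitude(3, -4, l2gradient=True))
print(gradient_magnitude(3, -4))
```

The L1 norm is cheaper to compute and never smaller than the L2 norm, so the same pair of hysteresis thresholds behaves slightly differently under each setting.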

equalizeColor()[source]

This function calls the equalizeHist() function on each individual channel of a color image, and then returns the merged result.

equalizeHist()[source]

This function is a wrapper to the OpenCV function equalizeHist. This function equalizes the histogram of a grayscale image by stretching the minimum and maximum values to 0 and 255 respectively. If this is run on a color image, it will be converted to grayscale first.

extractBins(binlist)[source]
Parameters: binlist (list) – The specified bins (color ranges) to count. Returns: The number of pixels that fall in each bin. Return type: list

This function counts the number of pixels that fall into the range as specified by each bin. This function expects the input to be a list of dictionaries as follows:

binlist = [
{"name": "bin1",
"min": [B, G, R],
"max": [B, G, R]
},
{"name": "bin2",
...
]


Each range is defined by 6 values: a minimum and a maximum blue, green, and red value. The returned list is very similar to the input list, except a ‘count’ key is added to each dictionary:

returnlist = [
{"name": "bin1",
"min": [B, G, R],
"max": [B, G, R],
"count": VALUE
},
...
]


Where ‘VALUE’ is the number of pixels that fall into the corresponding range. A list is used instead of a dictionary as the base structure to maintain order for writing to the output database. When using this function within a workflow, the order in which you specify your bins is the order in which they will show up in the database, and the name you specify for your bin will be the column name in the database.
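
The counting described above can be sketched in pure Python over a flat list of (b, g, r) pixels. extract_bins_sketch is a hypothetical stand-in for illustration, and treating both the minimum and maximum of each range as inclusive is an assumption.

```python
def extract_bins_sketch(pixels, binlist):
    """Return a copy of binlist with a 'count' key added to every bin."""
    result = []
    for spec in binlist:
        count = sum(
            1
            for pixel in pixels
            # A pixel falls in the bin only if every channel is within range.
            if all(lo <= c <= hi for c, lo, hi in zip(pixel, spec["min"], spec["max"]))
        )
        entry = dict(spec)      # copy, so the input list is left untouched
        entry["count"] = count
        result.append(entry)    # input order is preserved
    return result


pixels = [(10, 200, 10), (0, 0, 0), (20, 180, 30), (250, 250, 250)]
binlist = [
    {"name": "green", "min": [0, 150, 0], "max": [50, 255, 50]},
    {"name": "white", "min": [200, 200, 200], "max": [255, 255, 255]},
]
for entry in extract_bins_sketch(pixels, binlist):
    print(entry["name"], entry["count"])
```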

extractColorChannels()[source]

This function extracts the total number of pixels of each color value for each channel.

extractColorData(nonzero=True, returnhist=False)[source]
Parameters: nonzero (bool) – Whether or not to look at only nonzero pixels. Default true. Returns: Mean & median for each channel. Return type: list

This function calculates a normalized histogram of each individual color channel of the image, and returns the mean & median of the histograms for the channels specified. Because images are imported with the channels ordered as B,G,R, the output list is returned the same way. The returned list always looks like this: [ [BlueMean, BlueMedian], [GreenMean, GreenMedian], [RedMean, RedMedian] ]. Mean values always come before median values. If nonzero is set to true (default), the function will only calculate medians and means based on the non-black pixels. If you are connected to a database, the entire histogram is saved to the database, not just the mean and median.

extractConvexHull()[source]
Returns: The area of the convex hull. Return type: int

Returns the area of the convex hull around all non-black pixels in the image. The point of this function is not to threshold, so the contours are generated from all the pixels that fall into the range [1, 1, 1], [255, 255, 255].

extractDimensions()[source]
Returns: A list corresponding to the height and width of the image. Return type: list

Returns a list with the following form: [height, width]

extractDimsFromROI(roi)[source]
Parameters: roi (list or roi file) – The roi to calculate height from. Returns: A list corresponding to the calculated height and width of the image. Return type: list

Returns a list with the following form: [height, width]. This function differs from extractDimensions() in the way that height is calculated. Rather than calculating the total height of the image, the height is calculated from the top of the given ROI.

extractFinalPath()[source]

This function writes the absolute path of the output file to the database.

extractLabels(fname, meta_labels)[source]
Parameters: fname (str) – The output file name to write. meta_labels (dict) – A dictionary containing required meta info.

Meta labels should look like:

meta_labels = {
"label_name": roi,
"label_name2": roi
}

extractMinEnclosingCircle()[source]
Returns: The center and radius of the minimum enclosing circle. Return type: int

Returns the center and radius of the minimum enclosing circle of all non-black pixels in the image. The point of this function is not to threshold, so the contours are generated from all the pixels that fall into the range [1, 1, 1], [255, 255, 255].

extractMoments()[source]
Returns: A dictionary corresponding to the different moments of the image. Return type: dict

Calculates the moments of the image, and returns a dictionary based on them. Spatial moments are prefixed with ‘m’, central moments are prefixed with ‘mu’, and central normalized moments are prefixed with ‘nu’. This function is a wrapper to the OpenCV function moments.

extractPixels()[source]
Returns: The number of non-black pixels in the image. Return type: int

Returns the number of non-black pixels in the image. Creates a temporary binary image to do this. The point of this function is not to threshold, so the binary image is created by all pixels that fall into the range [1, 1, 1], [255, 255, 255].
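
One consequence of building the binary image from the range [1, 1, 1] to [255, 255, 255] is worth spelling out. If the range check is applied per channel, as OpenCV's inRange does (an assumption about how this function is implemented), then a pixel counts only when every channel is at least 1. The hypothetical helper below sketches that reading:

```python
def count_in_range(pixels, low=(1, 1, 1), high=(255, 255, 255)):
    """Count (b, g, r) pixels whose every channel lies within [low, high]."""
    return sum(
        1
        for pixel in pixels
        if all(lo <= c <= hi for c, lo, hi in zip(pixel, low, high))
    )


pixels = [
    (0, 0, 0),      # black: not counted
    (1, 1, 1),      # darkest pixel that is counted
    (0, 0, 255),    # pure red: the zero blue and green channels exclude it
    (10, 20, 30),   # counted
]
print(count_in_range(pixels))
```

Under this reading, fully saturated primary colors would be skipped, which rarely matters for photographs but can matter for synthetic images.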

fill(roi, color)[source]
Parameters: roi (list or roi file) – A list corresponding to the area of the image you want. List should be of the form [ystart, yend, xstart, xend] color (list) – A list corresponding to BGR values to fill the corresponding area with.

Fills the given roi with the given color.

floodFill(mask, low, high, writeColor=(255, 255, 255), connectivity=4, fixed=False, seed=(0, 0), findSeed=False, seedMask=None, binary=False)[source]
Parameters: mask (str or np.ndarray) – A binary image corresponding to the area you don’t want to fill. seed (Tuple (x, y)) – The beginning point to use for filling. low (Tuple (b, g, r) or (i,)) – Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component. high (Tuple (b, g, r) or (i,)) – Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component. writeColor (tuple (b, g, r) or (i,)) – The color to write to the filled region. Default (255, 255, 255). connectivity (int) – The number of neighboring pixels to consider for the flooding operation. Should be 4 or 8. fixed (boolean) – If True, calculates color differences relative to the seed. findSeed (boolean) – If True, picks a seed point based on contours in the seedMask image. seedMask (str or np.ndarray) – Binary image to select seed from. binary (boolean) – Specify if input image is binary.

This function is a wrapper to the OpenCV function floodFill. This function floods the region of an image based on calculated color differences from neighbors or from the seed. When flooding a binary image all input color tuples should have 1 value instead of 3.

gaussianBlur(ksize, sigmaX=0, sigmaY=0, borderType='default', roi=None)[source]
Parameters: ksize (tuple) – The size of the kernel represented by a tuple (width, height). Both numbers should be odd and positive. sigmaX (float) – The standard deviation in the x direction. If 0, the value is calculated based on the kernel size. sigmaY (float) – The standard deviation in the y direction. If 0, the value is equal to sigmaX. borderType (str) – The type of border mode used to extrapolate pixels outside the image.

This function blurs an image based on a Gaussian kernel. When blurring on the edge of the image, values for pixels that would be outside of the image are extrapolated. The method of extrapolation depends on the specified ‘borderType’, and can be one of ‘default’, ‘constant’, ‘reflect’, or ‘replicate’. This function is a wrapper to the OpenCV function GaussianBlur.

getBounds()[source]
Returns: The bounding box of the image. Return type: list

This function finds the bounding box of all contours in the image, and returns a list of the form [miny, maxy, minx, maxx].

kmeans(k, criteria, maxiter=10, accuracy=1.0, attempts=10, flags='random', labels=None)[source]
Parameters: k (int) – Number of colors in final image. criteria (str) – Determination of how the algorithm stops execution. Should be one of ‘accuracy’, ‘iteration’, or ‘either’. maxiter (int) – Maximum number of iterations of the algorithm. accuracy (float) – Minimum accuracy before algorithm finishes executing. attempts (int) – Number of times the algorithm is executed using different initial guesses. flags (str) – How to determine initial centers; should be either ‘random’ or ‘pp’.

This function is a wrapper to the OpenCV function kmeans. It adjusts the colors in the image to find the most compact ‘central’ colors. The number of colors in the resulting image is the specified value ‘k’. The colors are chosen based upon the minimum amount of adjustment in the image necessary. The criteria parameter determines when the algorithm stops. If ‘accuracy’ is specified, the algorithm runs until the specified accuracy is reached. If ‘iteration’ is specified, the algorithm runs the specified number of iterations. If ‘either’ is specified, the algorithm runs until one of the conditions is satisfied. The flags parameter determines the initial central colors, and should be either ‘random’ – to generate a random initial guess – or ‘pp’ to use center initialization by Arthur and Vassilvitskii.

knn(k, labels, remove=[])[source]
Parameters: k (int) – Number of nearest neighbors to use. labels (file) – Path to label file. More info below. remove (list) – Labels to remove from final image.

This function is a wrapper to the OpenCV function KNearest. The label file should contain training data in JSON format, using the label names as keys, and all the colors matching that label as an array value. Each color should be a list of 3 values, in BGR order. That is:

{
"plant": [
[234, 125, 100],
[100, 100, 100]
],
"pot": [
...
}


When creating your label file, make sure to use helpful names. Calling each set of colors “label1”, “label2”, etc. provides no meaningful information. The remove list is the list of matched labels to remove from the final image. The names to remove should match the names in your label file exactly. For example, let’s say you have the labels “plant”, “pot”, “track”, and “background” defined, and you only want to keep pixels that match the “plant” label. Your remove list should be specified as [“pot”, “track”, “background”].
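
How a label file and a remove list interact can be sketched with a tiny nearest-neighbor classifier. This is a pure-Python illustration rather than the OpenCV KNearest wrapper itself: classify and remove_labels are hypothetical names, the training colors are made up for the example, and painting removed pixels black is an assumption.

```python
import json
from collections import Counter

# Stand-in for the contents of a label file (colors are in BGR order).
training = json.loads("""
{
    "plant": [[40, 180, 60], [50, 200, 70]],
    "pot": [[30, 30, 30], [60, 60, 60]]
}
""")


def classify(color, training, k=1):
    """Label a BGR color by majority vote among its k nearest training colors."""
    samples = [
        (sum((a - b) ** 2 for a, b in zip(color, sample)), label)
        for label, colors in training.items()
        for sample in colors
    ]
    nearest = sorted(samples)[:k]  # smallest squared distances first
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def remove_labels(pixels, training, remove, k=1):
    """Replace every pixel whose label is in the remove list with black."""
    return [
        (0, 0, 0) if classify(pixel, training, k) in remove else pixel
        for pixel in pixels
    ]


pixels = [(45, 190, 65), (35, 35, 35)]
print(remove_labels(pixels, training, remove=["pot"]))
```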

list()[source]

Lists all saved states.

mask()[source]

This function converts the image to a color mask by performing the following operations:

1. convertColor("bgr", "gray")
2. threshold(0)
3. convertColor("gray", "bgr")

meanshift(spatial_radius, range_radius, min_density)[source]

Segments the image into clusters based on nearest neighbors. This function is a wrapper to the pymeanshift module. For details on the algorithm itself: Mean shift: A robust approach toward feature space analysis.

medianBlur(ksize)[source]
Parameters: ksize (int) – The size of the kernel (ksize x ksize). Should be odd and positive.

This function smooths an image using the median filter. The kernel is set to size (ksize, ksize). The anchor position is assumed to be the center. This function is a wrapper to the OpenCV function medianBlur.

morphology(morphType, ktype, ksize, anchor=(-1, -1), iterations=1, borderType='default')[source]
Parameters: morphType (str) – The type of morphology to perform. Should be dilate, erode, open, close, gradient, tophat, or blackhat. ktype (str) – the type of the kernel, should be rect, ellipse, or cross. ksize (tuple) – The size of the kernel represented by a tuple (width, height). Both numbers should be odd and positive. anchor (tuple) – The anchor point for filtering. Default is (-1, -1) which is the center of the kernel. iterations (int) – The number of times to perform the specified morphology. borderType (str) – The type of border mode used to extrapolate pixels outside the image.

This function performs morphological operations based on the input values. This function is a wrapper to the OpenCV function morphologyEx. When performing the morphology on the edges of the image, values for pixels that would be outside of the image are extrapolated. The method of extrapolation depends on the specified ‘borderType’, and can be one of ‘default’, ‘constant’, ‘reflect’, or ‘replicate’.

normalizeByIntensity()[source]

Normalizes each channel of the pixel by its intensity. For each pixel, the intensity is defined as $$I = R + G + B$$, where $$R,G,B$$ are the color values for that pixel. We calculate new color values by multiplying the original number by 255, and dividing by the intensity, that is, $$r = \frac{255 \cdot R}{I} , g = \frac{255 \cdot G}{I}, b = \frac{255 \cdot B}{I}$$.
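
As a worked example of the formula, the sketch below normalizes a single pixel. normalize_pixel is a hypothetical helper; returning black for a pixel with zero intensity and truncating the results to integers are both assumptions, since neither case is specified above.

```python
def normalize_pixel(b, g, r):
    """Scale a (b, g, r) pixel so that each channel becomes 255 * C / (R + G + B)."""
    intensity = r + g + b
    if intensity == 0:
        return (0, 0, 0)  # assumption: avoid dividing by zero for black pixels
    return tuple(int(255 * channel / intensity) for channel in (b, g, r))


# B = 50, G = 100, R = 100 gives I = 250, so b = 51 and g = r = 102.
print(normalize_pixel(50, 100, 100))
```

After normalization the three channels always sum to roughly 255, so two pixels with the same hue but different brightness end up with the same values.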

resize(state=None)[source]

If the image is larger than conf.maxArea, resize its total area down to conf.maxArea. This function is primarily used for viewing purposes, and as such, it does not resize the base image, but creates a copy to resize instead.

resizeSelf(scale=None, width=None, height=None)[source]
Parameters: scale (float) – Value to scale image by. width (int) – Target width of image. height (int) – Target height of image.

Resizes the current image. If scale is set, it simply resizes the width and height of the image based on the scale. If only one of width or height is set, it scales the other accordingly. If both width and height are set, it scales the image to the exact size specified.
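
The way the three arguments interact can be sketched as a function that only computes the target dimensions. target_size is a hypothetical helper, not part of Image Harvest; giving scale priority over width and height, and truncating fractional sizes, are assumptions.

```python
def target_size(height, width, scale=None, new_width=None, new_height=None):
    """Return the (height, width) an image would have after resizing."""
    if scale is not None:
        return int(height * scale), int(width * scale)
    if new_width is not None and new_height is not None:
        return new_height, new_width  # exact size, aspect ratio may change
    if new_width is not None:
        # Only the width is given: scale the height by the same factor.
        return int(height * new_width / width), new_width
    if new_height is not None:
        return new_height, int(width * new_height / height)
    return height, width


print(target_size(400, 800, scale=0.5))
print(target_size(400, 800, new_width=400))
print(target_size(400, 800, new_width=100, new_height=100))
```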

restore(name)[source]
Parameters: name (str OR any hashable type.) – The name the image is saved under.

Reloads a previously saved image from the ‘states’ variable.

rotateColor(color)[source]
Parameters: color (list) – Color shift to perform. Should be [b, g, r].

Shifts the entire color of the image based on the values in the color list.

save(name)[source]
Parameters: name (str OR any hashable type.) – The name to save the image under.

This function saves the current image in the ‘states’ variable under the specified name. It can then be reloaded using the restore() method.

show(title=None, state=None)[source]
Parameters: title (str) – The title to give the display window; if left blank, one will be created. Returns: None

Displays the image in a window. Utilizes the resize() function. If a title is not specified, the window will be named ‘windowX’ where X is the number of times show has been called.

split(channel)[source]
Parameters: channel (int) – The channel to select from the image.

This function is a wrapper to the OpenCV function split. Splits an image into individual channels, and selects a single channel to be the resulting image (remember, color images have channel order BGR). No validation is done on the channel number, so it is possible to provide a channel number that does not exist. For example, calling split on a bgr image with channel = 2 will extract the red channel from the image.

threshold(thresh, max=255, type='binary')[source]
Parameters: thresh (int) – Threshold value. max (int) – Write value for binary threshold. type (str) – Threshold type. Returns: The thresholded image. Return type: numpy.ndarray. Raises an error if the specified threshold type doesn’t exist.

Thresholds the image based on the given type. The image must be grayscale to be thresholded. If the image is of type ‘bgr’ it is automatically converted to grayscale before thresholding. Supported types are: binary, inverse, truncate, tozero, and otsu.
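
For a single grayscale value, the first four threshold types can be sketched as simple rules. This assumes the type names map onto OpenCV's THRESH_BINARY, THRESH_BINARY_INV, THRESH_TRUNC, and THRESH_TOZERO, which is not confirmed here; threshold_pixel is a hypothetical helper, and ‘otsu’ is omitted because it picks the threshold from the image histogram rather than applying a per-pixel rule.

```python
def threshold_pixel(value, thresh, max_value=255, kind="binary"):
    """Apply one threshold rule to a single grayscale value."""
    rules = {
        "binary": lambda v: max_value if v > thresh else 0,
        "inverse": lambda v: 0 if v > thresh else max_value,
        "truncate": lambda v: thresh if v > thresh else v,
        "tozero": lambda v: v if v > thresh else 0,
    }
    return rules[kind](value)  # an unknown kind raises KeyError


for kind in ("binary", "inverse", "truncate", "tozero"):
    print(kind, threshold_pixel(200, 127, kind=kind), threshold_pixel(100, 127, kind=kind))
```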

wait()[source]

Waits until a key is pressed, then destroys all windows and continues program execution.

write(name=None)[source]

Writes the current image to the given output directory, with the given name.