stan2018

100 Reputation

4 Badges

6 years, 249 days

MaplePrimes Activity


These are Posts that have been published by stan2018

Summary:Hough transform is a feature extraction algorithm widely used in digital image analysis. It identifies objects with specified shapes by letting all edge pixels in the image “vote” and extract candidates with the highest votes. The classical Hough transform is associated with the identification of straight lines, but with some modification, the algorithm can be used to detect objects of arbitrary shapes (usually circles, ellipses) in an image. Therefore, Hough transform has various applications in the field of computer vision. Samir and I implemented Hough transform to detect straight lines and circles in real world images and the results are below:

You can access the applications here for more details and try the algorithm on your own image!

Line Detection with Hough Transform

Circle Detection with Hough Transform

For people who want a bit more of the theory behind:

The voting procedure in Hough transform starts with an edge detector (we used the sobel edge detector in the applications). After locating all the edge pixels, the program will iterate through each of the pixels and record their “votes”. It is important to address that the voting procedure is completed in parameter space. For example, a straight line is usually in the form of y = a*x+b, the parameter space will be in terms of a and b. Hence y=2*x + 3 will be a point (2,3)  in the a-b space. However, notice that we cannot represent vertical lines with this system; it would take infinite memory to store a vertical line. Thus we use the polar coordinates system x*cos(θ) + y*sin(θ) - p = 0 instead, resulting in a θ -p space. Note that an image pixel (x1,y1) is represented by the intersection of all lines that satisfy x1*cos(θ) + y1*sin(θ) - p = 0, which in θ-p space is denoted as a sinusoidal curve. Hence the intersection of two different sinusoidal curves in θ-p space is essentially a unique, straight line that passes through two different image pixels. An intersection of 500 votes means that specific line passes through 500 edge pixels, implying that it is likely a solid contour line in the original image. In this way, by converting all edge pixels to sinusoidal curves in θ-p space, the algorithm is letting all pixels “vote”. In order to record the sinusoidal curves, we need an accumulator of θ * p dimensions. In the line detection application we used a matrix such that <= θ <= 180 and -sqrt(r2 +c2 ) <= p <= sqrt(r2 +c2 ), where r and c are the height (rows) and width (columns) of the image.

The voting procedure for circle detection is similar; the difference is in how we construct the accumulator to accommodate different parametrization for circles. Notice that a circle has three essential pieces of information, the x,y coordinates of centre and the radius. This suggests that our accumulator might have to be in three dimensions. In practice, circular Hough transform is usually performed with known or estimated radius, when the radius is not known, the most straightforward solution is indeed to test a range of possible radii with a 3-dimensional accumulator. However, this might be quite expensive, especially for images with high level of details.

Feel free to leave a comment if you have any question  :)  

In order to explore the use of signal processing package in image processing, @Samir Khan and I created this application that performs discrete Haar wavelet transform on images to achieve both lossy (irreversible) and lossless (reversible) compression.

Haar wavelet compression modifies the image matrix to increase the number of zero entries in the matrix, which results in a sparse matrix that can be stored more efficiently, thus reducing the file size. A Haar wavelet transform groups adjacent items in the matrix, stores the average and difference of the two numbers. Notice that this computation is reversible since knowing the values of a, b and given that x1-x2 = a; (x1+x2)/2 = b, we can solve for x1 and x2. Haar wavelet compression is taking advantage of the property that neighboring pixels in an image usually share very similar value; hence recursively applying Haar wavelet transform to the rows and columns of an image matrix significantly increases the number of zero entries. In the application we achieved a compression ratio of 1.296 (number of non-zero entries in original: number of non-zero entries in processed matrix).

The fact that Haar wavelet transform is reversible means that we can use it to perform lossless image compression: the decompressed image will be exactly the same as the image before compression. Transmission and temporary storage of the data would be made more efficient without any loss of details in the image.

But what if the file size is still too big or we simply don’t need that many details in the image? That is when lossy compression comes into use. By omitting some details/fidelity, lossy compression is able to achieve notably smaller file size. In this application, we apply a filter to the transformed image matrix, setting entries that are close to zeros to actual zeros (i.e. pick a positive number ϵ such that all x < ϵ are changed to 0 in the matrix). The value of ϵ directly impacts the quality of the decompressed image so should be chosen carefully in practice. In this application, we chose ϵ = 0.01, which results in a compression ratio of 19.38, but instead produces a very blurry image after reversing the compression.

(left: Original image, right: lossy compression with ϵ = 0.01)

The application can be accessed here for more details.

In an attempt to explore the field of image processing, @Samir Khan and I created an application (download here) that demonstrates the removal of two types of noises from an image through frequency and spatial filtering.

Periodic noises and salt & pepper noises are two common types of image noises, usually caused by errors during the image capturing or data transmission process. Periodic noises result in repetitive patterns being added onto the original image, while salt & pepper noises are the irregular appearance of dark pixels in the bright area and bright pixels in the dark area of the image. In this application, we artificially generate these noises and pollute a clean picture in order to demonstrate the removal techniques.

(Fig 1: Picture of Waterloo Office taken by Sophie Tan            Fig 2: Converted to greyscale for processing, added two noises)

In order to remove periodic noises from the image, we apply a 2D Fourier Transform to convert the image from spatial domain to frequency domain, where periodic noises can be visually detected as separate, discrete spikes and therefore easily removed.

(Fig 3 Frequency domain of the magnitude of the image)

One way to remove salt and pepper noises is to apply a median filter to the image. In this application, we run a 3 by 3 kernel across the image matrix that sorts and places the median among the 9 elements as the new matrix entry, thus resulting in the whole image being median-filtered.

Comparison of the image before and after noise removal:

Please refer to the application for more details on the implementation of the two removal techniques.

 

Hello, everyone! My name’s Sophie and I’m an intern at Maplesoft. @Samir Khan asked me to develop a couple of demonstration applications using the DeepLearning package - my work is featured on the Application Center

I thought I’d describe two critical commands used in the applications – DNNClassifier() and DNNRegressor().

The DNNClassifier calls tf.estimator.DNNClassifier from the Tensorflow Python API. This command builds a feedforward multilayer neural network that is trained with a set of labeled data in order to perform classification on similar, unlabeled data.

Dataset used for training and validating the classifier has the type DataFrame in Maple. In the Prediction of malignant/benign of breast mass example, the training set is a DataFrame with 32 columns in total, with column labels: “ID Number”, “Diagnosis”, “radius”, “texture”, etc. Note that labeling the columns of the dataset is mandatory, as later the neural network needs to identify which feature column corresponds to which list of values.

Feature columns are what come between the raw input data and the classifier model; they are required by Tensorflow to specify how the input data should be transformed before given to the model. Maple now supports three types of Feature Columns, including:

  • NumericColumn that represents real, numerical figure,
  • CategoricalColumn that denotes categorical(ordinal) data
  • BucketizedColumn that organizes continuous data into a discrete number buckets with specified boundaries.

In this application, the input data consists of 30 real, numeric values that represents physical traits of a cell nucleus computed from a digitized image of the breast mass. We create a list of NumericColumns by calling

with(DeepLearning):
fc := [seq(NumericColumn(u,shape=[1]), u in cols[3..])]:

where cols is a list of column labels and shape[1] indicates that each data input is just a single numeric value.

When we create a DNNClassifier, we need to specify the feature columns (input layer), the architecture of the neural network (hidden layers) and the number of classes (output layer). Recall that the DNNClassifier builds a feedforward multilayer neural network, hence when we call the function, we need to indicate how many hidden layers we want and how many nodes there should be on each of the layer. This is done by passing a list of non-negative integers as the parameter hidden_units when we call the function. In the example, we did:

classifier := DNNClassifier(fc, hidden_units=[20,40,20],num_classes=2):

where we set 3 hidden layer each with 20, 40, 20 nodes respectively. In addition, there are 30 input nodes (i.e. the number of feature columns) and 1 output node (i.e. binary classification). The diagram below illustrates a simpler example with an input layer with 3 nodes, 2 hidden layers with 7, 5 nodes and an output layer with 1 node.

(Created using NN-SVG by https://github.com/zfrenchee/NN-SVG)

After we built the model, we can train it by calling

classifier:-Train(train_data[3..32], train_data[2], steps = 256, num_epochs = 3, shuffle = true):

where we

  1. Give the training data (train_data[3..32]) and the corresponding labels (train_data[2]) to the model.
  2. Specified that the entire dataset will be passed to the model for three times and each iteration has 256 steps.
  3. Specified that data batches for training will be created by randomly shuffling the tensors.

Now the training process is complete, we can use the validation set to evaluate the effectiveness of our model.

classifier:-Evaluate(test_data[3..32],test_data[2], steps = 32);

The output indicates an accuracy of ~92.11% in this case. There are more indices like accuracy_basline, auc, average_loss that help us decide if we need to modify the architecture for better performance.

We then build a predictor function that takes an arbitrary set of measurements as a DataSeries and returns a prediction generated by the trained DNN classifier.

predictor := proc (ds) classifier:-Predict(Transpose(DataFrame(ds)), num_epochs = 1, shuffle = false)[1] end proc;

Now we can pass a DataSeries with 30 labeled rows to the predictor: (Recall the cols is a list of the column names)

ds := DataSeries([11.49, 14.59, 73.99, 404.9, 0.1046, 8.23E-02, 5.31E-02, 1.97E-02, 0.1779, 6.57E-02, 0.2034, 1.166, 1.567, 14.34, 4.96E-03, 2.11E-02, 4.16E-02, 8.04E-03, 1.84E-02, 3.61E-03, 12.4, 21.9, 82.04, 467.6, 0.1352, 0.201, 0.2596, 7.43E-02, 0.2941, 9.18E-02], labels = cols[3..]); 
predictor(ds);

The output indicates that the probability of this data being a class _id [0] is ~90.79%. In other words, according to our model, the probability of this breast mass cell being benign is ~90.79%.

The use of the DNNRegressor is very similar (almost identical) to that of the Classifier, the only significant difference is that while the Classifier predicts discrete labels as classes, the Regressor predicts a continuous qualitative result with the provided data (Note that CategoricalColumn is still applicable). For more details about the basic usage of the DNNRegressor, please refer to Predicting the burnt area of a forest fires with DNN Regressor.

 

Page 1 of 1