Binarizing the DFT magnitude

**wilfryjules** · 04/03/2013, 00h06

Hello everyone, dear geeks!

Here is the bomb to defuse, and it is no job for amateurs: I want to binarize the magnitude of the Fourier transform of an image, in other words apply a binary threshold to get a black-and-white image. But I think I have a format problem.

You can find the original code by searching Google for "DFT cranfield toby".

Here is the code of the function that returns the DFT magnitude (I commented it well, don't worry):

Code:

Mat create_spectrum_magnitude_display(Mat& complexImg, bool rearrange)
{
 
    Mat planes[2]; // planes is an array of 2 matrices (2 images: planes[0] = real part, planes[1] = imaginary part)
 
    /* 4. TRANSFORM THE REAL AND IMAGINARY VALUES TO MAGNITUDE
    void split(const MatND& mtx, vector<MatND>& mv)
    => Divides a multi-channel array into several single-channel arrays.
    mtx – The source multi-channel array
    mv – The destination vector of arrays; the number of arrays must match mtx.channels(). The arrays themselves will be reallocated if needed
    NB: here we separate the real part from the imaginary part:
    planes[0] = real part and planes[1] = imaginary part
 
    void magnitude(const Mat& x, const Mat& y, Mat& magnitude)
    => Calculates the magnitude of 2D vectors: magnitude(I) = sqrt(x(I)^2 + y(I)^2)
    x – The floating-point array of x-coordinates of the vectors
    y – The floating-point array of y-coordinates of the vectors; must have the same size as x
    magnitude – The destination array; will have the same size and same type as x
    */
	split(complexImg, planes);
    magnitude(planes[0], planes[1], planes[0]); 
    Mat mag = (planes[0]).clone();
 
 
	/*	5. SWITCH TO LOGARITHMIC SCALE
	It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen, so we use a logarithmic transform.
	void log(const MatND& src, MatND& dst)
    => Calculates the natural logarithm of every array element.
    src – The source array
    dst – The destination array; will have the same size and same type as src
	*/
	mag += Scalar::all(1);   // M1 = log(1 + M) <=> compute log(1 + sqrt(Re(DFT(img))**2 + Im(DFT(img))**2))
    log(mag, mag);
 
 
	/* 6. CROP AND REARRANGE 
	For visualization purposes we may also rearrange the quadrants of the result, so that the origin (0,0) corresponds to the image center.
	*/
    if (rearrange)
    {
        // re-arrange the quadrants
        shiftDFT(mag);
    }
 
 
	/* 7. NORMALIZE 
	This is done again for visualization purposes. We now have the magnitudes, but they are still outside our image display range of zero to one. We normalize the values to this range using normalize().
 
    void normalize(const MatND& src, MatND& dst, double alpha, double beta, int normType)
	=> Normalizes the array's norm or value range
    src – The source array
    dst – The destination array; will have the same size as src
    alpha – The lower range boundary for range normalization
    beta – The upper range boundary for range normalization
    normType – The normalization type 
   	*/
    normalize(mag, mag, 0, 1, CV_MINMAX);
 
    return mag;
 
}

So I want to apply a threshold, that is, binarize "mag".

The trouble is that I don't understand image formats at all, and since log() and normalize() have apparently been applied to mag, I'm not sure what format it ends up in.
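For what it's worth, here is a minimal sketch of what steps 5 and 7 do to the data, written in plain C++ without OpenCV so that it can be run on its own. The name `logNormalize` is made up for illustration and is not an OpenCV function. It suggests why `mag` should end up holding floats in [0, 1]: the log is taken on float magnitudes, and the min-max scaling then maps the smallest value to 0 and the largest to 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Plain C++ stand-in for steps 5 and 7: log(1 + m), then min-max scaling
// to [0, 1]. Hypothetical helper, not part of OpenCV.
std::vector<float> logNormalize(const std::vector<float>& magnitude)
{
    assert(!magnitude.empty());

    // Step 5: switch to a logarithmic scale.
    std::vector<float> out(magnitude.size());
    for (std::size_t i = 0; i < magnitude.size(); ++i)
        out[i] = std::log(1.0f + magnitude[i]);

    // Step 7: map [min, max] linearly onto [0, 1].
    const auto range = std::minmax_element(out.begin(), out.end());
    const float lo = *range.first;
    const float hi = *range.second;
    const float span = hi - lo;
    for (float& v : out)
        v = (span > 0.0f) ? (v - lo) / span : 0.0f;  // constant input maps to 0

    return out;
}
```

With raw magnitudes 0, 9 and 99, the result is 0, roughly 0.5 and 1, so a threshold for this image has to be chosen between 0 and 1 rather than between 0 and 255.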

Here is the code of the shift function (it swaps each quarter of the image with the diagonally opposite one, so the corners of the image end up at the center):

Code:

// Rearrange the quadrants of a Fourier image so that the origin is at
// the image center
 
void shiftDFT(Mat& fImage )
{
  	Mat tmp, q0, q1, q2, q3;
 
	/* First crop the image if it has an odd number of rows or columns.
	Bitwise AND with -2 (in two's complement, -2 = 111...10) clears the lowest bit (2^0), so an odd row or column count becomes the even number just below it. */
	fImage = fImage(Rect(0, 0, fImage.cols & -2, fImage.rows & -2)); 
	int cx = fImage.cols/2;
	int cy = fImage.rows/2;
 
	/* Rearrange the quadrants of Fourier image so that the origin is at the image center */
	q0 = fImage(Rect(0, 0, cx, cy));
	q1 = fImage(Rect(cx, 0, cx, cy));
	q2 = fImage(Rect(0, cy, cx, cy));
	q3 = fImage(Rect(cx, cy, cx, cy));
 
	/* Swap each quarter of the image with the diagonally opposite one.
	Every corner of the image thus ends up at the center, including the origin */
 
	/* Swap q0 and q3 */
	q0.copyTo(tmp);
	q3.copyTo(q0);
	tmp.copyTo(q3);
 
	/* Swap q1 and q2 */
	q1.copyTo(tmp);
	q2.copyTo(q1);
	tmp.copyTo(q2);
}
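The quadrant swap above can be sketched without OpenCV on a flat row-major buffer. This is only an illustration under stated assumptions: `swapQuadrants` and `roundDownToEven` are hypothetical names, and the buffer is assumed to have already been cropped to even dimensions, as shiftDFT does with `& -2`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Largest even number <= n for n >= 0, the same trick as `fImage.cols & -2`.
int roundDownToEven(int n)
{
    return n & -2;
}

// Swap diagonally opposite quadrants of a row-major rows x cols buffer.
// Hypothetical helper; assumes rows and cols are even.
void swapQuadrants(std::vector<float>& img, int rows, int cols)
{
    assert(rows % 2 == 0 && cols % 2 == 0);
    assert(img.size() == static_cast<std::size_t>(rows) * cols);

    const int cx = cols / 2;
    const int cy = rows / 2;
    for (int y = 0; y < cy; ++y) {
        for (int x = 0; x < cx; ++x) {
            // top-left <-> bottom-right (q0 and q3 in shiftDFT)
            std::swap(img[y * cols + x], img[(y + cy) * cols + (x + cx)]);
            // top-right <-> bottom-left (q1 and q2 in shiftDFT)
            std::swap(img[y * cols + (x + cx)], img[(y + cy) * cols + x]);
        }
    }
}
```

On a 2x2 buffer holding 1 2 / 3 4, the call leaves 4 3 / 2 1: the value that sat at the origin (top-left) moves next to the center, which is what the display wants for the zero-frequency term.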

Here is my main:

Code:

int main( int argc, char** argv )
{
 
  Mat img, imgGray;	// image object
  Mat padded;		// fourier image objects and arrays
  Mat complexImg;
  Mat planes[2], mag, mag_int, magThres;
 
  int N, M; // fourier image sizes
 
  const string originalName = "Input Image (grayscale)"; // window name
  const string spectrumMagName = "Magnitude Image (log transformed)"; // window name
  const string spectrumMagThresName = "Threshold of the Magnitude Image (log transformed)"; 
 
  bool keepProcessing = true;	// loop control flag
  int  key;						// user input
  int  EVENT_LOOP_DELAY = 40;	// delay for GUI window
                                // 40 ms equates to 1000ms/25fps = 40ms per frame
 
  // if exactly one command line argument is provided, try to read it as an image
  // (otherwise main returns -1)
 
    if (argc == 2 && !(img = imread(argv[1], CV_LOAD_IMAGE_COLOR)).empty())
 
    {
        // create window object (use flag=0 to allow resize, 1 to auto fix size)
 
          namedWindow(originalName, 0);
	  namedWindow(spectrumMagName, 0);
 
	  // start main loop
 
	  while (keepProcessing) 
	  {
		  EVENT_LOOP_DELAY = 0;
 
		  /*
           void cvtColor(const Mat& src, Mat& dst, int code)
           => Converts image from one color space to another
           src – The source image, 8-bit unsigned, 16-bit unsigned ( CV_16UC... ) or single-precision floating-point
           dst – The destination image; will have the same size and the same depth as src
           code – The color space conversion code; see the discussion
    
           The function converts the input image from one color space to another. In the case of transformation to-from RGB color space the ordering of the channels should be specified explicitly (RGB or BGR).
 
           The conventional ranges for R, G and B channel values are:
		   0 to 255 for CV_8U images
		   0 to 65535 for CV_16U images and
		   0 to 1 for CV_32F images.
 
		  */
		  cvtColor(img, imgGray, CV_BGR2GRAY);
 
 
		  /* 1. EXPAND THE IMAGE TO AN OPTIMAL SIZE
		  The performance of the DFT depends on the image size. It tends to be fastest for sizes that are products of powers of 2, 3 and 5.
		  We can use the copyMakeBorder() function to expand the borders of an image. 
		  */
		  M = getOptimalDFTSize( imgGray.rows );
	  	  N = getOptimalDFTSize( imgGray.cols );
	  	  copyMakeBorder(imgGray, padded, 0, M - imgGray.rows, 0, N - imgGray.cols, BORDER_CONSTANT, Scalar::all(0));
 
		  /* 2. MAKE PLACE FOR BOTH THE COMPLEX AND REAL VALUES
		  The result of the DFT is complex, so it takes 2 images (real + imaginary), and the range of the frequency domain is much larger than that of the spatial one. Therefore we need to store it as float.
		  That's why we convert our input image "padded" to float and add a second channel to hold the complex values.
		  static MatExpr Mat::zeros(Size size, int type)
		  => Returns a zero array of the specified size and type.
		  size – Alternative to the matrix size specification Size(cols, rows).
		  type – Created matrix type.
		  Mat A;
		  A = Mat::zeros(3, 3, CV_32F);
          In the example above, a new matrix is allocated only if A is not a 3x3 floating-point matrix. Otherwise, the existing matrix A is filled with zeros.
		  */
		  planes[0] = Mat_<float>(padded);
		  planes[1] = Mat::zeros(padded.size(), CV_32F);
 
		  /* Creates one multichannel array out of several single-channel ones. */
	  	  merge(planes, 2, complexImg);
 
		  /* 3. MAKE THE DISCRETE FOURIER TRANSFORM 
		  The result of the DFT is a complex image: "complexImg" */
		  dft(complexImg, complexImg);
 
		  // CREATE MAGNITUDE FOR OUTPUT (STEPS 4, 5, 6 and 7)
		  mag = create_spectrum_magnitude_display(complexImg, true);
 
		  // convert mag to grayscale
 
 
		  /*
	      double threshold(const Mat& src, Mat& dst, double thresh, double maxVal, int thresholdType)
		  => Applies a fixed-level threshold to each array element
		  src – Source array (single-channel, 8-bit or 32-bit floating point)
		  dst – Destination array; will have the same size and the same type as src
		  thresh – Threshold value
		  maxVal – Maximum value to use with THRESH_BINARY and THRESH_BINARY_INV thresholding types
		  thresholdType – Thresholding type (see the discussion)
     	  */
		//  threshold(mag_int, magThres, 100,  255, THRESH_BINARY);
 
		  // display image in window
		  imshow(originalName, imgGray);
		  imshow(spectrumMagName, mag);
	     // imshow(spectrumMagThresName, magThres);
 
		  // start event processing loop (very important,in fact essential for GUI)
	      // 40 ms roughly equates to 1000ms/25fps = 40ms per frame (here the delay is 0, so waitKey blocks until a key is pressed)
 
		  key = waitKey(EVENT_LOOP_DELAY);
 
		  if (key == 'x')
		  {
 
	   		// if user presses "x" then exit
 
			  	std::cout << "Keyboard exit requested : exiting now - bye!"
				  		  << std::endl;
	   			keepProcessing = false;
		  }
	  }
 
      return 0;
    }
 
    // not OK : main returns -1
 
    return -1;
}
/******************************************************************************/
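As a side note on step 1, the idea behind padding to an "optimal" size can be sketched as finding the next integer whose only prime factors are 2, 3 and 5. This is my reading of what getOptimalDFTSize() aims for, not its actual implementation, and the function names below are hypothetical.

```cpp
#include <cassert>
#include <initializer_list>

// True if n has no prime factor other than 2, 3 and 5.
bool isFiveSmooth(int n)
{
    assert(n > 0);
    for (int p : {2, 3, 5})
        while (n % p == 0)
            n /= p;
    return n == 1;
}

// Smallest size >= n of the form 2^a * 3^b * 5^c (brute-force search).
// Hypothetical helper illustrating the idea behind getOptimalDFTSize().
int nextFiveSmooth(int n)
{
    assert(n > 0);
    while (!isFiveSmooth(n))
        ++n;
    return n;
}
```

For an image with 481 rows this gives 486 (2 x 3^5), so copyMakeBorder() would add 5 rows of zeros at the bottom in the main above.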

**math_lab** · 04/03/2013, 13h14

Since the image is normalized, I'd say it contains floats between 0 and 1. So your threshold should work if you pass floats or doubles to the function (something like 0.5 and 1.0).
The problem is this log: I have no idea why you want to threshold a Fourier transform, so I don't know whether the threshold should be applied to the log value or not. If you need to go back to non-log values, there must be an OpenCV function that does it. Otherwise, you can walk through the image pixel by pixel to transform the values and apply the threshold.
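To make the float thresholding concrete, here is a small plain C++ sketch of the per-element rule used by THRESH_BINARY (dst = maxVal if src > thresh, else 0). It also shows one way around the log question: since log(1 + m) and min-max scaling are both increasing, a threshold chosen on the raw magnitude can be converted to the normalized log scale instead of undoing the log on the whole image. Both function names are hypothetical, and the raw minimum and maximum are assumed to be known.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Per-element rule of THRESH_BINARY on floats. Hypothetical helper.
std::vector<float> binaryThreshold(const std::vector<float>& src,
                                   float thresh, float maxVal)
{
    std::vector<float> dst;
    dst.reserve(src.size());
    for (float v : src)
        dst.push_back(v > thresh ? maxVal : 0.0f);
    return dst;
}

// Convert a threshold given on the raw magnitude to the [0, 1] scale
// obtained after log(1 + m) and min-max normalization.
// Assumes rawMin and rawMax are the extreme raw magnitudes of the image.
float toNormalizedLogScale(float raw, float rawMin, float rawMax)
{
    const float lo = std::log(1.0f + rawMin);
    const float hi = std::log(1.0f + rawMax);
    assert(hi > lo);
    return (std::log(1.0f + raw) - lo) / (hi - lo);
}
```

For example, with raw magnitudes ranging from 0 to 99, a raw threshold of 9 corresponds to about 0.5 on the normalized log image.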

**wilfryjules** · 05/03/2013, 02h51

THANK YOU!

You're a little genius, Mat!

Here is the solution:

Code:

 threshold(mag, magThres, 0.7 , 255, THRESH_BINARY);

To answer your question, Mat: binarizing the magnitude is very useful. The magnitude of the DFT reflects the geometry of your image, so binarizing it gives a white blob on a black background that indicates the orientation of your image. You can then run a blob analysis on it.
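One common way to read an orientation off such a blob is through second-order central moments, with the angle given by 0.5 * atan2(2 * mu11, mu20 - mu02). The sketch below is not necessarily what was used in this thread; it is a plain C++ illustration on a row-major binary mask, and `blobOrientation` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Orientation (radians) of the major axis of the non-zero pixels of a
// row-major binary mask, from second-order central moments.
// Hypothetical helper; in image coordinates y grows downward.
double blobOrientation(const std::vector<unsigned char>& mask, int rows, int cols)
{
    assert(mask.size() == static_cast<std::size_t>(rows) * cols);

    // Centroid of the white pixels.
    double n = 0.0, sumX = 0.0, sumY = 0.0;
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (mask[y * cols + x]) { n += 1.0; sumX += x; sumY += y; }
    assert(n > 0.0);
    const double meanX = sumX / n;
    const double meanY = sumY / n;

    // Second-order central moments.
    double mu20 = 0.0, mu02 = 0.0, mu11 = 0.0;
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < cols; ++x)
            if (mask[y * cols + x]) {
                const double dx = x - meanX;
                const double dy = y - meanY;
                mu20 += dx * dx;
                mu02 += dy * dy;
                mu11 += dx * dy;
            }

    return 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);
}
```

A horizontal streak gives an angle of 0, and a streak along the main diagonal of the mask gives pi/4 in image coordinates.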
