10-11-2012 06:15 AM
I am attaching the disparity image so you can see how (bad) it looks. I am sure that if you use the same sensor, your results will look much better. This image was obtained with a minimum disparity of 64 and 128 disparity levels. Best wishes for your endeavors. Tell me how it goes.
Ronak.
10-13-2012 04:21 PM - edited 10-13-2012 04:51 PM
Hello again. I have yet another set of questions. :-)
I would like to know: what are the criteria for selecting a baseline distance? And also the angle of the cameras - what is the relationship between this and depth accuracy? Should the cameras be mounted so that the optical axes are parallel, or should they form an angle? The angle also determines the minimum distance that can be measured, right?
What about a projector to enhance/add texture? A pico projector with a random pattern is a good but expensive choice. Any good alternatives?
I followed Ronak's advice and used two cameras of the same model - a pair of Logitech C210 webcams. It is not a superb camera, but it has a fixed focus and access to all properties (white balance, contrast, etc.). I am currently trying to set up a stereo system using this.
After getting the depth image, is the rectified image aligned with it? Can I overlay the texture on the point cloud directly?
Thank you in advance for any suggestions.
Best regards, K
10-15-2012 02:16 AM
Hi Klemen,
Nice to see that you have got hold of two identical cameras. Kindly read my answers below.
Q: I would like to know: what are the criteria for selecting a baseline distance? And also the angle of the cameras - what is the relationship between this and depth accuracy? Should the cameras be mounted so that the optical axes are parallel, or should they form an angle? The angle also determines the minimum distance that can be measured, right?
A: For the stereo utility, by default, you do not require a fixed baseline or tilt (or a lack of it). In other words, you can always use whatever setup you like. However, different parameters play different roles in your depth resolution (smaller is better; tentatively the inverse of accuracy) and coverage. For instance, if you increase your baseline, the overlap between the images (and hence the coverage) decreases. So, in order to have good coverage, a small baseline is desirable. You can also apply tilt, but once you do, your depth resolution increases (gets worse) for far objects and decreases (gets better) for nearer objects. In conclusion, there is no magic setup for stereo. It always depends on what kind of depth and overlap you have in mind. However, I personally prefer smaller baselines with little (if any) tilt.
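The baseline trade-off above can be sketched numerically with the standard stereo depth-resolution relation dZ = dD * Z^2 / (f * B). All numbers below (8 mm focal length, 1/5-pixel disparity accuracy on a 4.8 mm / 640-column sensor, object at ~25 in) are illustrative assumptions, not values measured from any particular setup:

```python
# Sketch of the baseline trade-off: a larger baseline gives a finer (smaller)
# depth resolution, at the cost of less overlap between the two images.
# All parameters are assumed, illustrative values.

def depth_resolution_mm(baseline_mm, distance_mm,
                        focal_mm=8.0, disparity_res_mm=0.0015):
    """dZ = dD * Z^2 / (f * B); smaller return value = finer resolution."""
    return disparity_res_mm * distance_mm ** 2 / (focal_mm * baseline_mm)

for b in (50, 100, 200, 400):                      # candidate baselines in mm
    dz = depth_resolution_mm(b, distance_mm=635.0)  # object ~25 in away
    print(f"baseline {b:3d} mm -> depth resolution {dz:.3f} mm")
```

Doubling the baseline halves dZ, which is why a longer baseline (or tilt) helps far objects even though it shrinks the common field of view.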
Q: What about a projector to enhance/add texture? A pico projector with a random pattern is a good but expensive choice. Any good alternatives?
A: It is always better, though not required if the lighting is good and you have sufficient texture, to have randomly patterned light. One thing you must avoid is light containing a repetitive pattern. In my experience, you can always carry out your task without patterned light if you manage to control your lighting and have textured objects.
Q: After getting the depth image, is the rectified image aligned with it? Can I overlay the texture on the point cloud directly?
A: Yes. The depth image, by default, is aligned with the rectified image, unless during calibration you selected the option of rendering back to the original image. You can overlay the texture on the point cloud using OpenCV/OpenGL. You can get a point cloud display either from NI or from AQSense, without texture mapping.
10-17-2012 03:39 AM
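Because the depth image is pixel-aligned with the rectified image, texture overlay reduces to indexing both images with the same mask. A minimal NumPy sketch under a simple pinhole-camera assumption (the function name, `f_px` in pixels, and `baseline_mm` are placeholders, not any library's API):

```python
import numpy as np

def disparity_to_colored_points(disparity, rectified_rgb, f_px, baseline_mm):
    """Back-project a disparity map and attach per-point color from the
    pixel-aligned rectified image. Assumes a simple pinhole model with the
    principal point at the image center."""
    h, w = disparity.shape
    ys, xs = np.mgrid[0:h, 0:w]
    valid = disparity > 0                   # zero disparity = no match found
    Z = np.zeros((h, w))
    Z[valid] = f_px * baseline_mm / disparity[valid]   # Z = f * B / d
    X = (xs - w / 2.0) * Z / f_px           # back-project to camera frame
    Y = (ys - h / 2.0) * Z / f_px
    points = np.dstack([X, Y, Z])[valid]    # (N, 3) coordinates in mm
    colors = rectified_rgb[valid]           # (N, 3) texture, same pixels
    return points, colors
```

In OpenCV the back-projection step is done by `cv2.reprojectImageTo3D` with the Q matrix from `cv2.stereoRectify`; the color lookup stays a direct per-pixel indexing, exactly because the two images are aligned.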
Hello Ronak,
I have tried to set up a stereo system using the two cameras I mentioned in the previous post. My settings were:
Cameras focus = fixed focus 400 mm (camera technical specifications), no tilt angle
a. Baseline distance = 200 mm.
b. Baseline distance = 100 mm.
Image (setup b.):
Using this setup I get the following 3D reconstruction (see attached file):
Now I have a couple of questions:
1. The surface should be straight. Why is there so much deviation? Is this because of lack of texture?
2. There is white padding around the object - is this also the result of the lack of texture, combined with the matching window size? The background is white and uniform, so getting texture there is impossible without a projector.
I am attaching everything needed to reconstruct the calibration and measurements for setup b. if need be.
You have mentioned that I should flip the images - so, looking at the cameras from the front, the right camera produces the left image and the left camera the right image? I followed your advice and this is obviously right, but it contradicts the stereo vision concepts manual (Manual 2012), where the setup is different?
You also said that to determine the relative position of both cameras, around 6-7 grid images are enough. This is after individual camera calibration (after already correcting distortions etc.), right? I think that to effectively correct the lens distortion, more grid images should be acquired.
Thank you for all your help and best regards,
K
P.s. Attaching the files in 3 steps because of the upload limit.
10-17-2012 03:40 AM
The second attachment...
10-17-2012 03:41 AM
And third...
10-18-2012 06:53 AM
Hi Klemen,
Great work. Finally you are able to get 3D. Kindly find my answers below:
Q: The surface should be straight. Why is there so much deviation? Is this because of lack of texture?
A: According to your stereo setup - 100 mm baseline, no tilt angles, 1 inch between two dots, and a 25 inch object distance - you should get a depth resolution of about 0.02 inch. For this computation, I have assumed some additional parameters: an 8 mm focal length and a 4.8 x 3.6 mm sensor size. That is, the depth values, even if correct, can be within roughly 0.02 inch of the true depth values. I see that variation in the depth image. In order to improve your resolution, you can bring the object closer or increase the focal length. Alternatively, you can increase the baseline and tilt your cameras. However, given a stereo system, you will always have a region of uncertainty. This issue is not caused by a lack of texture.
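The estimate above can be reproduced with the usual stereo relation dZ = dD * Z^2 / (f * B), using the same assumed parameters (100 mm baseline, 8 mm focal length, 4.8 x 3.6 mm sensor at 640 x 480, object at 25 in, disparity accurate to 1/5 of a pixel). With these assumptions the formula lands at a few hundredths of an inch, the same order as the figure quoted above:

```python
# Worked depth-resolution estimate under the assumptions stated above.
MM_PER_INCH = 25.4

baseline_mm = 100.0
focal_mm = 8.0
pixel_mm = 4.8 / 640           # sensor width / horizontal pixel count
dD_mm = pixel_mm / 5           # sub-pixel disparity accuracy (assumed 1/5 px)
Z_mm = 25 * MM_PER_INCH        # object distance, 25 in

dZ_mm = dD_mm * Z_mm ** 2 / (focal_mm * baseline_mm)
print(f"depth resolution ~ {dZ_mm:.2f} mm ({dZ_mm / MM_PER_INCH:.3f} in)")
```

Note the quadratic dependence on Z: halving the object distance improves the resolution by a factor of four, which is why bringing the object closer is the cheapest fix.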
Q: There is white padding around the object - is this also the result of the lack of texture, combined with the matching window size? The background is white and uniform, so getting texture there is impossible without a projector.
A: Yes, this is definitely due to lack of texture. There is not much you can do about it except use a patterned (random) light projection.
Q: You have mentioned that I should flip the images - so, looking at the cameras from the front, the right camera produces the left image and the left camera the right image? I followed your advice and this is obviously right, but it contradicts the stereo vision concepts manual (Manual 2012), where the setup is different?
A: Ohh... that was just for explaining the stereo concept, so we never really thought about it from that perspective. However, I shall inform the documentation team about this. Thank you very much for bringing it up. I really appreciate it.
Q: You also said that to determine the relative position of both cameras, around 6-7 grid images are enough. Is this after individual camera calibration (after already correcting distortions etc) right? I think to effectively correct the lens distortion, more grid images should be acquired.
A: 6-7 grid images should be sufficient for all the tasks, if you capture them properly. If you look at the stereo example, you only need to cover one V-region on either the left or the right side of the grid. In practice, 5 images should be sufficient. We advise more so that the user need not worry about angles and coverage.
I hope this will help.
Regards,
Ronak.
10-18-2012 07:05 AM - edited 10-18-2012 07:05 AM
Dear Ronak!
Thank you very much, I honestly appreciate all your help!
Best regards,
K
10-18-2012 07:49 AM
Sorry, I have just one more question regarding the depth resolution from your answer to my first question in the last post, and then (hopefully) I will leave you in peace.
The equation to calculate the depth resolution is:
dZ = (dD*Z^2)/(f*B)
dZ = depth resolution
dD = disparity resolution (in LabVIEW the disparity image is 16-bit, so is the disparity resolution 1/16, or what?)
Z = distance of object
f = focal length
B = baseline
Is this correct? Did you use the same computation?
What is the role of the sensor size in this equation that you mentioned?
To further satisfy my curiosity, I promise I will open a book and stop bothering you with such questions. I don't consider myself lazy, but your explanations have been very clear and to the point, and you know what human nature is like - always trying to take the easiest way. I have to remind myself constantly that the easiest way is not always the right way...
Best regards,
K
10-18-2012 08:04 AM
Hi Klemen,
No sweat. I would love to answer. I have chosen this path many times in my life, so I empathize with you :). Yes, I do use this equation. However, we also have a way to compute the depth resolution when the cameras are tilted. We can also compute theoretical values of many other parameters, such as spatial resolution, spatial coverage, height coverage, etc. Coming to the equation: to know dD, which is the disparity resolution, one needs to know the size of the sensor. Disparity resolution is the accuracy with which we can compute the disparity. It is generally 1/5th of the size of a pixel on the sensor grid. So, if you have a 4.8 mm horizontal sensor with 640 columns, pixel size = 4.8/640. We take the maximum of the pixel width and height and use it as dD. It has nothing to do with bit depth. I hope this helps.
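The dD computation described above is a one-liner. The sensor size (4.8 x 3.6 mm) and resolution (640 x 480) are assumed values for a camera of this class, and 1/5 px is the sub-pixel accuracy figure stated above, not a universal constant:

```python
# dD = 1/5 of the larger pixel dimension on the sensor grid,
# as described in the answer above. All numbers are assumed values.
sensor_w_mm, sensor_h_mm = 4.8, 3.6
cols, rows = 640, 480

pixel_w = sensor_w_mm / cols    # pixel width on the sensor, mm
pixel_h = sensor_h_mm / rows    # pixel height on the sensor, mm
dD = max(pixel_w, pixel_h) / 5  # disparity resolution in mm
print(f"dD = {dD:.5f} mm")
```

For a square-pixel sensor like this one both dimensions come out equal (0.0075 mm), so dD is 0.0015 mm; the bit depth of the disparity image never enters the computation.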
Regards,
Ronak.