Hi, basically I am trying to become a "graphics programmer" by understanding and writing a 3D camera from scratch on a 2D screen. A point has x, y, z coordinates in 3D space, while the screen has 2D coordinates u and v (plus w and a z index). Naturally u and v are something like u = x/z and v = y/z, and the z index should follow z in 3D space, so objects closer to the camera will have a lower index, I guess... But I can't help feeling I haven't grasped the thing bit by bit, or how each two-dimensional pixel/sprite operates based on that matrix. So it feels doable but at the same time vague as hell.
I hope I can get some enlightenment here...
Here we go again...
You need to learn what a transformation matrix is. This time it's the projection matrix.
You also need to learn how to use Google.
@previous (C)
> > easier
> it definitely will since you wont talk as long as a youtube video...if you are not a bot.
Thanks for the vote of confidence but I doubt it. The explanation needs to take a certain amount of time because there are a number of things you need to understand. Here's the most condensed explanation I can give you:
1. Start with a diagram of what you are trying to achieve -
https://www.google.com/search?q=projection+frustum&tbm=isch
2. Identify the following:
- camera position: the viewing point, which we'll call C
- near plane: the clipping plane closest to C - objects in front of this plane (between C & near) won't be visible
- far plane: the clipping plane at the back of the projection - objects between near and far will be visible and objects beyond far will not
- frustum: the truncated pyramid between the near and far planes (it looks like a trapezoid when viewed from the side), inside of which objects will be visible
- field of view (fov) angle, which we'll call θ: draw a line from C to the top edge of the near plane and another line from C to the bottom edge of the near plane, then θ is the angle between those two lines.
- n: the distance between C and the near plane
- f: the distance between C and the far plane
3. Bear in mind that by convention, the z axis is negative in the direction moving away from the camera towards the clipping planes. This may seem counter-intuitive but as you'll hopefully see, it works better.
4. I am not going to explain this step in much detail, because as I told you before, you first need to understand how matrix multiplication works. I highly recommend https://www.youtube.com/watch?v=rHLEWRxRGiM if this doesn't make sense to you yet. For now just take it as given that your goal here is to derive a 4x4 matrix (M) that transforms model space to image space. In other words, when M is multiplied by points inside the frustum (and the result is divided through by w, as described in step 5), effectively it transforms the shape of the frustum from a trapezoid into a cube bounded by (1, 1, 1) and (-1, -1, -1). Again, a diagram would help here, but you're going to have to use your imagination.
If we call the point we want to transform p = (x, y, z, 1) and the transformed point, p*, then that transformation looks like this:
M.p = p*
| X  0  0  0 |   | x |   | Xx     |
| 0  Y  0  0 | . | y | = | Yy     |
| 0  0  a  b |   | z |   | az + b |
| 0  0 -1  0 |   | 1 |   | -z     |
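To make that multiplication concrete, here is a minimal Python sketch. The helper name mat_vec and the numbers used for X, Y, a and b are placeholders made up purely for illustration; the real values are worked out in the steps that follow.

```python
def mat_vec(m, p):
    """Multiply a 4x4 matrix (a list of 4 rows) by a 4-component point."""
    return [sum(m[r][c] * p[c] for c in range(4)) for r in range(4)]

# Placeholder values only (assumed for illustration, not the final ones)
X, Y, a, b = 2.0, 3.0, 5.0, 7.0

M = [
    [X,   0.0,  0.0, 0.0],
    [0.0, Y,    0.0, 0.0],
    [0.0, 0.0,  a,   b],
    [0.0, 0.0, -1.0, 0.0],
]

p = [1.0, 1.0, -4.0, 1.0]  # (x, y, z, 1); z is negative in front of the camera
p_star = mat_vec(M, p)     # (Xx, Yy, az + b, -z)
print(p_star)
```

The last component of the result is -z, which is what the bottom row of the matrix is there to produce.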
5. So, what are X, Y, a & b, and why -1 => -z?
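- Firstly, the -1 appears in the bottom row of the matrix so that the w coordinate of the transformed point comes out as -z. Since z is negative in front of the camera, -z is positive.
- In this derivation the front of the transformed image cube has a z value of 1 and the back has a z value of -1. (OpenGL uses the opposite sign: near maps to -1 and far maps to 1, which flips the signs of a and b.)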
- -z is the distance of a point, p, AWAY from the camera, so the greater the value of -z, the further away from the camera p will be.
- The further away something is from you, the smaller it appears, right?
- To go from 4D to 3D, you divide through by the w coordinate (again, if you don't understand why, you need to go back to basics).
- So the reason -z has to be the w coordinate of the transformed point is to do with perspective - something divided by a large number = a small number & vice versa.
- Understanding this concept is key to understanding how perspective projection works, so spend as much time as you need wrapping your head around it.
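A tiny sketch of the divide, in case it helps. The function name perspective_divide and the sample points are made up for illustration; both points have the same x and y and differ only in w, which holds -z (the distance from the camera).

```python
def perspective_divide(p):
    """Go from 4D (x, y, z, w) to 3D by dividing through by w."""
    x, y, z, w = p
    return (x / w, y / w, z / w)

# Same x and y, different distance from the camera (w = -z)
near_point = (1.0, 1.0, 0.0, 2.0)
far_point = (1.0, 1.0, 0.0, 10.0)

print(perspective_divide(near_point))  # x and y end up at 0.5
print(perspective_divide(far_point))   # x and y end up at 0.1
```

The point that is five times further away lands five times closer to the center of the image, which is the "further away = smaller" effect.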
6. We can figure out what a & b are via algebra, by taking a couple of points with known values in image space.
Let p = center of the near plane: (0, 0, -n), and q = center of the far plane: (0, 0, -f).
Applying our transformation, we have:
M.p = M.(0, 0, -n, 1)
= (0, 0, -an + b, n)
and
M.q = M.(0, 0, -f, 1)
= (0, 0, -af + b, f)
As I said previously, to go from 4D to 3D, divide through by w, so:
p* = (0, 0, -an + b, n)
=> (0, 0, (-an + b) / n, n / n)
=> (0, 0, (-an + b) / n, 1)
=> (0, 0, (-an + b) / n)
and
q* = (0, 0, -af + b, f)
=> (0, 0, (-af + b) / f, f / f)
=> (0, 0, (-af + b) / f, 1)
=> (0, 0, (-af + b) / f)
We know that p* should then be (0, 0, 1) and q* must be (0, 0, -1) - the center of the front and back faces of our image cube. So now we have:
(-an + b) / n = 1
and
(-af + b) / f = -1
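Dividing each fraction through gives:
-a + b/n = 1
-a + b/f = -1
Subtracting the second equation from the first eliminates a:
b/n - b/f = 2
(bf - bn) / nf = 2
b(f - n) = 2nf
b = 2nf / (f - n)
To get a, multiply the original two equations out to -an + b = n and -af + b = -f, then subtract the second from the first to eliminate b:
(-n + f)a = n + f
a = (n + f) / (f - n)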
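If you want to sanity-check the algebra, a quick sketch like this should do it. The function names depth_terms and image_z and the values n = 1, f = 3 are my own choices for illustration. It uses the convention from this derivation (near maps to 1, far maps to -1).

```python
def depth_terms(n, f):
    """Return (a, b) for near distance n and far distance f."""
    a = (n + f) / (f - n)
    b = 2 * n * f / (f - n)
    return a, b

def image_z(z, n, f):
    """Transformed z after dividing az + b through by w = -z."""
    a, b = depth_terms(n, f)
    return (a * z + b) / -z

n, f = 1.0, 3.0
print(depth_terms(n, f))  # a and b for these distances
print(image_z(-n, n, f))  # center of the near plane
print(image_z(-f, n, f))  # center of the far plane
```

With these values a = 2 and b = 3, so the near center gives (-2 + 3) / 1 = 1 and the far center gives (-6 + 3) / 3 = -1, as required.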
7. Now for X and Y. First, note:
- Remembering the line you drew from the camera to the top edge of the near plane, if you draw another line from the camera to the center of the near plane, hopefully you can see you get a right-angled triangle.
- The angle between these two lines is θ/2 (because you've basically divided the fov in half... see how this might be easier to explain in a video)
- Basic trig: tan(angle) = opposite / adjacent
Let p = mid/top of the near plane: (0, y, -n)
tan (θ/2) = y / n
y = n.tan(θ/2)
p = (0, n.tan(θ/2), -n)
M.p = M.(0, n.tan(θ/2), -n, 1)
= (0, Y.n.tan(θ/2), -n(n+f)/(f-n) + 2nf/(f-n), n)
In 3D, this is (0, Y.tan(θ/2), -(n+f)/(f-n) + 2f/(f-n)) which we know must be (0, 1, 1) in image space. Thus:
Y.tan(θ/2) = 1
Y = 1/tan(θ/2)
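Similarly, we can work out what X is using exactly the same process, with the point at the middle of the right edge of the near plane. If the near plane is square (aspect ratio 1), it's the same:
X = 1/tan(θ/2)
For a non-square view, divide X by the aspect ratio (width / height), since θ here is the vertical fov.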
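Here is a small check of that result. The name focal_scale and the values fov = 60 degrees and n = 0.5 are assumptions picked for the example. It takes the point at the top of the near plane and confirms that it lands at y = 1 after the divide by w.

```python
import math

def focal_scale(fov_degrees):
    """Y (and X, assuming a square near plane) = 1 / tan(fov / 2)."""
    return 1.0 / math.tan(math.radians(fov_degrees) / 2.0)

fov, n = 60.0, 0.5
Y = focal_scale(fov)

top_y = n * math.tan(math.radians(fov) / 2.0)  # y at the top edge of the near plane
w = n                                          # w = -z = n on the near plane
image_y = Y * top_y / w
print(image_y)  # should be 1, give or take floating point rounding
```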
8. Finally, put it all together, our completed projection matrix is:
| 1/tan(θ/2)  0           0            0         |
| 0           1/tan(θ/2)  0            0         |
| 0           0           (n+f)/(f-n)  2nf/(f-n) |
| 0           0           -1           0         |
Note that if you set f and n as the same value, you get a division by zero... so you don't want to do that.
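Putting the whole thing into code, a rough sketch might look like this. The function names projection_matrix and project are hypothetical, a square view (aspect ratio 1) is assumed, and it follows the convention of this derivation (near maps to z = 1, far maps to z = -1), which is not the same sign convention OpenGL uses.

```python
import math

def projection_matrix(fov_degrees, n, f):
    """Build the 4x4 matrix derived above (square view assumed)."""
    if n == f:
        raise ValueError("near and far must differ (division by zero)")
    t = 1.0 / math.tan(math.radians(fov_degrees) / 2.0)
    return [
        [t,   0.0, 0.0,               0.0],
        [0.0, t,   0.0,               0.0],
        [0.0, 0.0, (n + f) / (f - n), 2 * n * f / (f - n)],
        [0.0, 0.0, -1.0,              0.0],
    ]

def project(m, point):
    """Multiply (x, y, z, 1) by m, then divide through by w."""
    x, y, z = point
    p = (x, y, z, 1.0)
    tx, ty, tz, tw = (sum(row[c] * p[c] for c in range(4)) for row in m)
    return (tx / tw, ty / tw, tz / tw)

M = projection_matrix(90.0, 1.0, 3.0)
print(project(M, (0.0, 0.0, -1.0)))  # center of the near plane
print(project(M, (0.0, 0.0, -3.0)))  # center of the far plane
print(project(M, (3.0, 3.0, -3.0)))  # top right corner of the far plane
```

With a 90 degree fov the far plane at distance 3 is 3 units from center to edge, so its top right corner should come out at roughly (1, 1, -1), the corner of the image cube.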
If you can bear to sit through a 47 minute video, here is the one which helped my own understanding of this the most:
https://www.youtube.com/watch?v=mpTl003EXCY