Manual camera positions in Meshroom?

Hello everyone, I'm new here.

I am stuck at the beginning of a project to build a custom/unique PC case based on the left engine of the Whizzing Arrow II from the show Oban Star Racers, of which I have found a low-res rotating GIF.

I have split the GIF into its 240 frames, meaning each image is rotationally offset by 1.5 degrees.
When I put the images into any photogrammetry software I can find, the calculated camera motion is at best a half-circle, at worst garbage. This is due to the lack of surface features, since the original model's texture is basically flat colours. The best result I've gotten so far is with Meshroom, which is why I'm writing here, and also because it is modular/node-based.
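Since the frames come from a uniform rotation, the camera poses can in principle be written down directly rather than estimated. A minimal sketch of what those known poses look like, assuming a turntable setup (object spinning in place is equivalent to the camera orbiting it) and with placeholder values for the orbit radius and height:

```python
import numpy as np

# Sketch: known turntable extrinsics for 240 frames at 1.5 deg/frame.
# The radius and height are placeholders; real values depend on the scene.
n_frames = 240
step_deg = 360.0 / n_frames  # 1.5 degrees per frame

def pose_for_frame(k, radius=1.0, height=0.0):
    """Camera center and yaw rotation for frame k on a horizontal circle."""
    theta = np.deg2rad(k * step_deg)
    center = np.array([radius * np.cos(theta), height, radius * np.sin(theta)])
    # Rotation about the vertical (y) axis, matching the frame's offset
    yaw = np.array([
        [ np.cos(theta), 0.0, np.sin(theta)],
        [ 0.0,           1.0, 0.0          ],
        [-np.sin(theta), 0.0, np.cos(theta)],
    ])
    return center, yaw

c0, R0 = pose_for_frame(0)     # starts on the x-axis with identity yaw
c60, R60 = pose_for_frame(60)  # 90 degrees around the circle
```

This is only the geometric side; how to hand these poses to Meshroom's SfM stage is a separate question.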

Is there any way to feed this known positional data into the Meshroom pipeline? I've seen a couple of GitHub issues opened on the topic, but as far as I can tell, all of them were closed without stating the solution.

What's more, while I know the camera coordinates (x, y, z) as well as the yaw and roll, I don't know the pitch. Does anyone know how I can calculate it from the images?

My main objective is to recover the curvature of the inlet cowl so I can 3D-print it.
If I get anything more, great. If not, that's fine - I can work with that.


Rich Radke has put out an amazing resource: recorded lectures for one of his university courses. This specific video covers how a real-world 3D point ends up represented as an image pixel, and all of the parameters that affect it. The part at timestamp 41:34 is probably the most relevant for you, but there is some additional context before that which could prove helpful. I think it could help you, and it should at least be interesting.

Let me know your thoughts!


Thank you so much for the pointer, I'll definitely give it a watch - it seems like just the kind of video I needed.


OK, I've watched a couple of his videos, and I think I understand how the problem can be solved; I'm just murky on the finer details. I'm going to lay out my understanding so far - please correct me if I'm wrong.

For a 3D point i perceived by camera j at the 2D pixel coordinates (x_ij, y_ij), the following matrix equation can be set up:
[p_ij] ~ [K] * ( [R]*[P_i] + [t] )
where:
p_ij is the 3x1 vector (x_ij, y_ij, 1), which is fully known.
K is the camera calibration matrix, with 1 unknown (the focal length alpha).
R is the rotation matrix, with 1 unknown (the Euler rotation about x, i.e. the pitch).
P_i is the 3x1 vector (X, Y, Z) of the point in world coordinates, all 3 unknown.
t is the translation vector, which is fully known.

The z-equation is "redundant", as it accounts for the scale factor hidden in the "equal up to scale" operator (~), which leaves 2 independent equations with 5 unknowns.
The system should then be solvable by setting this up for the same point seen in 3 different camera positions: 6 independent equations in 5 unknowns.
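To make the "up to scale" step concrete, here is a tiny numeric sketch of that projection with made-up numbers (the focal length, principal point, and 3D point are arbitrary placeholders); the division by the third component is exactly what uses up the z-equation:

```python
import numpy as np

# Sketch of the projection p ~ K (R P + t), with placeholder values.
alpha = 800.0                       # focal length in pixels (the unknown in K)
K = np.array([[alpha, 0.0, 320.0],  # 320/240: placeholder principal point
              [0.0, alpha, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                       # rotation (the pitch would sit in here)
t = np.array([0.0, 0.0, 5.0])       # known translation
P = np.array([0.2, -0.1, 1.0])      # 3D point in world coordinates

p = K @ (R @ P + t)                 # homogeneous image point
x, y = p[0] / p[2], p[1] / p[2]     # divide by z: the "up to scale" step,
                                    # leaving 2 independent equations per view
```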

Of course, including additional points in those same shots will yield additional sets of equations, which will probably give slightly different answers due to the low resolution of the images.
Is there an equivalent of regression, but for equations instead of values? Or do I just have to do this some n number of times and average the results?
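There is indeed an equivalent: instead of solving each 3-view system separately and averaging, the usual approach is to stack the reprojection residuals of all observations and minimize them jointly with nonlinear least squares (which is essentially what bundle adjustment does). A minimal sketch, with entirely synthetic numbers and a simplified K (no principal point), assuming known yaw per frame and an unknown shared pitch and focal length:

```python
import numpy as np
from scipy.optimize import least_squares

def project(alpha, pitch, point, yaw_deg, t):
    """Project a 3D point with unknown focal length alpha and pitch,
    given the known per-frame yaw and the known translation t."""
    K = np.array([[alpha, 0.0, 0.0],
                  [0.0, alpha, 0.0],
                  [0.0, 0.0, 1.0]])
    a = np.deg2rad(yaw_deg)
    R_yaw = np.array([[np.cos(a), 0.0, np.sin(a)],
                      [0.0, 1.0, 0.0],
                      [-np.sin(a), 0.0, np.cos(a)]])
    R_pitch = np.array([[1.0, 0.0, 0.0],
                        [0.0, np.cos(pitch), -np.sin(pitch)],
                        [0.0, np.sin(pitch), np.cos(pitch)]])
    p = K @ (R_pitch @ R_yaw @ point + t)
    return p[:2] / p[2]  # divide by z: 2 residual equations per view

# Synthetic "ground truth" and observations in four frames
t = np.array([0.0, 0.0, 5.0])
truth = (500.0, 0.1, np.array([0.3, 0.2, 1.0]))  # alpha, pitch, point
yaws = [0.0, 10.0, 20.0, 30.0]
obs = [project(truth[0], truth[1], truth[2], y, t) for y in yaws]

def residuals(params):
    """params = [alpha, pitch, X, Y, Z]; returns all stacked residuals."""
    alpha, pitch, point = params[0], params[1], params[2:5]
    return np.concatenate(
        [project(alpha, pitch, point, y, t) - o for y, o in zip(yaws, obs)]
    )

# One joint solve over every observation, from a rough initial guess
fit = least_squares(residuals, x0=[400.0, 0.0, 0.0, 0.0, 1.0])
```

Here `fit.x` approximates [alpha, pitch, X, Y, Z]. Adding more tracked points just means more residuals in the stack (plus 3 extra unknowns per point), and the noise from the low-res images is averaged out in the least-squares sense rather than by hand.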