3D trajectory reconstruction under perspective projection. (English) Zbl 1398.68592

Summary: We present an algorithm to reconstruct the 3D trajectory of a moving point from its correspondence in a collection of temporally non-coincidental 2D perspective images, given the time of capture that produced each image and the relative camera poses at each time instant. Triangulation-based solutions do not apply, as multiple views of the point may not exist at each time instant. We represent a 3D trajectory using a linear combination of compact trajectory basis vectors, such as the discrete cosine transform basis, that have been shown to be approximately object independent. We note that such basis vectors are also coordinate independent, which allows us to directly use camera poses estimated from stationary areas in the scene (in contrast to nonrigid structure from motion techniques, where cameras are simultaneously estimated). This reduces the reconstruction optimization to a linear least squares problem, allowing us to robustly handle missing data that often occur due to motion blur, texture deformation, and self-occlusion. We present an algorithm to determine the number of trajectory basis vectors individually for each trajectory via a cross-validation scheme, and refine the solution by minimizing the geometric error. The relationship between point and camera motion can cause degeneracies to occur. We geometrically analyze the problem by studying the relationship of the camera motion, point motion, and trajectory basis vectors. We define the reconstructability of a 3D trajectory under projection, and show that the estimate approaches the ground truth when reconstructability approaches infinity. This analysis enables us to precisely characterize cases when accurate reconstruction is achievable. We present qualitative results for the reconstruction of several real-world scenes from a series of 2D projections where high reconstructability can be guaranteed, and report quantitative results on motion capture sequences.
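
The linear least squares formulation described in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (dct_basis, reconstruct_trajectory), the NaN convention for missing observations, and the synthetic cameras (identity rotation and identity intrinsics) are assumptions made for illustration only. Each observed frame contributes two linear equations in the DCT coefficients, obtained from the constraint that the image point and the projected 3D point are parallel.

```python
import numpy as np


def dct_basis(num_frames, num_basis):
    """First `num_basis` columns of the orthonormal DCT-II basis, shape (F, K)."""
    t = np.arange(num_frames)[:, None]
    k = np.arange(num_basis)[None, :]
    theta = np.sqrt(2.0 / num_frames) * np.cos(np.pi * (2 * t + 1) * k / (2 * num_frames))
    theta[:, 0] = np.sqrt(1.0 / num_frames)  # constant (DC) column has its own scale
    return theta


def reconstruct_trajectory(cameras, points_2d, num_basis):
    """Least-squares 3D trajectory from at most one 2D observation per frame.

    cameras:   (F, 3, 4) known projection matrices, one per time instant
    points_2d: (F, 2) image coordinates; rows containing NaN are treated as missing
    returns:   (F, 3) reconstructed 3D positions
    """
    num_frames = len(cameras)
    theta = dct_basis(num_frames, num_basis)
    rows, rhs = [], []
    for t in range(num_frames):
        if np.any(np.isnan(points_2d[t])):
            continue  # a missing observation simply contributes no equations
        P, (u, v) = cameras[t], points_2d[t]
        # Two independent rows of the constraint x × (P [X; 1]) = 0
        A = np.stack([u * P[2] - P[0], v * P[2] - P[1]])  # shape (2, 4)
        Q, q = A[:, :3], -A[:, 3]                          # Q X_t = q
        # X_t = [theta_t . beta_x, theta_t . beta_y, theta_t . beta_z]
        rows.append(np.kron(Q, theta[t][None, :]))         # shape (2, 3K)
        rhs.append(q)
    beta, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    coeffs = beta.reshape(3, num_basis)  # rows hold the x, y, z coefficients
    return theta @ coeffs.T


# Synthetic check (all values below are made up for the illustration).
rng = np.random.default_rng(0)
F, K = 40, 5
true_traj = dct_basis(F, K) @ rng.normal(size=(K, 3))  # lies exactly in the basis span
centers = rng.uniform(-2, 2, size=(F, 3)) + np.array([0.0, 0.0, -10.0])
cameras = np.stack([np.hstack([np.eye(3), -c[:, None]]) for c in centers])
homog = np.einsum("fij,fj->fi", cameras, np.hstack([true_traj, np.ones((F, 1))]))
points_2d = homog[:, :2] / homog[:, 2:3]
points_2d[[3, 17]] = np.nan  # simulate missing data
estimate = reconstruct_trajectory(cameras, points_2d, K)
max_error = np.abs(estimate - true_traj).max()
```

The synthetic camera centers are drawn at random so that the camera path is far from the span of the low-frequency basis; a smoothly moving camera would be closer to the degenerate cases the summary mentions. The sketch fixes the number of basis vectors by hand and omits the cross-validation and geometric-error refinement steps.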

MSC:

68T45 Machine vision and scene understanding

Software:

SBA; SIFT; EPnP

References:

[1] Akhter, I., Sheikh, Y., & Khan, S. (2009). In defense of orthonormality constraints for nonrigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[2] Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2008). Nonrigid structure from motion in trajectory space. In Advances in Neural Information Processing Systems.
[3] Akhter, I; Sheikh, Y; Khan, S; Kanade, T, Trajectory space: A dual representation for nonrigid structure from motion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 1442-1456, (2011) · doi:10.1109/TPAMI.2010.201
[4] Avidan, S; Shashua, A, Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 348-357, (2000) · doi:10.1109/34.845377
[5] Bartoli, A., Gay-Bellile, V., Castellani, U., Peyras, J., Olsen, S. I., & Sayd, P. (2008). Coarse-to-fine low-rank structure-from-motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[6] Blanz, V., & Vetter, T. (1999). A morphable model for the synthesis of 3D faces. In ACM Transactions on Graphics (SIGGRAPH).
[7] Brand, M. (2001). Morphable 3D models from video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[8] Brand, M. (2005). A direct method for 3D factorization of nonrigid motion observed in 2D. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[9] Bregler, C., Hertzmann, A., & Biermann, H. (1999). Recovering non-rigid 3D shape from image streams. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[10] Dai, Y., Li, H., & He, M. (2012). A simple prior-free method for non-rigid structure-from-motion factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. · Zbl 1294.68134
[11] Del Bue, A. (2008). A factorization approach to structure from motion with shape priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[12] Del Bue, A., Lladó, X., & Agapito, L. (2006). Non-rigid metric shape and motion recovery from uncalibrated images using priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[13] Faugeras, O., Luong, Q.-T., & Papadopoulou, T. (2001). The geometry of multiple images: The laws that govern the formation of images of a scene and some of their applications. Cambridge: MIT Press. · Zbl 1002.68183
[14] Fayad, J., Agapito, L., & Del Bue, A. (2010). Piecewise quadratic reconstruction of non-rigid surface from monocular sequences. In Proceedings of the European Conference on Computer Vision.
[15] Fischler, MA; Bolles, RC, Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, 24, 381-395, (1981) · doi:10.1145/358669.358692
[16] Gotardo, PFU; Martinez, AM, Computing smooth time-trajectories for camera and deformable shape in structure from motion with occlusion, IEEE Transactions on Pattern Analysis and Machine Intelligence, 33, 2051-2065, (2011) · doi:10.1109/TPAMI.2011.50
[17] Hamidi, M; Pearl, J, Comparison of the cosine and Fourier transforms of Markov-I signal, IEEE Transactions on Acoustics, Speech, and Signal Processing, 24, 428-429, (1976) · doi:10.1109/TASSP.1976.1162839
[18] Hartley, R., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press. · Zbl 1072.68104 · doi:10.1017/CBO9780511811685
[19] Hartley, R, In defense of the eight-point algorithm, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 580-593, (1997) · doi:10.1109/34.601246
[20] Hartley, R., & Vidal, R. (2008). Perspective nonrigid shape and motion recovery. In Proceedings of the European Conference on Computer Vision.
[21] Kaminski, JY; Teicher, M, A general framework for trajectory triangulation, Journal of Mathematical Imaging and Vision, 21, 27-41, (2004) · Zbl 1478.94052 · doi:10.1023/B:JMIV.0000026555.79056.b8
[22] Lladó, X; Del Bue, A; Agapito, L, Non-rigid metric reconstruction from perspective cameras, Image and Vision Computing, 28, 1339-1353, (2010) · doi:10.1016/j.imavis.2010.01.014
[23] Longuet-Higgins, HC, A computer algorithm for reconstructing a scene from two projections, Nature, 293, 133-135, (1981) · doi:10.1038/293133a0
[24] Lourakis, MIA; Argyros, AA, SBA: A software package for generic sparse bundle adjustment, ACM Transactions on Mathematical Software, 36, 1-30, (2009) · Zbl 1364.65052 · doi:10.1145/1486525.1486527
[25] Lowe, DG, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, 60, 91-110, (2004) · doi:10.1023/B:VISI.0000029664.99615.94
[26] Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. S. (2003). An invitation to 3-D vision: From images to geometric models. New York: Springer.
[27] Moreno-Noguer, F., Lepetit, V., & Fua, P. (2007). EPnP: Efficient perspective-n-point camera pose estimation. In Proceedings of the International Conference on Computer Vision.
[28] Olsen, S., & Bartoli, A. (2007). Using priors for improving generalization in non-rigid structure-from-motion. In Proceedings of the British Machine Vision Conference.
[29] Östlund, J., Varol, A., Ngo, D. T., & Fua, P. (2012). Laplacian meshes for monocular 3D shape recovery. In Proceedings of the European Conference on Computer Vision.
[30] Ozden, KE; Cornelis, K; Van Eycken, L; Van Gool, L, Reconstructing 3D trajectories of independently moving objects using generic constraints, Computer Vision and Image Understanding, 93, 1453-1471, (2004)
[31] Paladini, M., Del Bue, A., Stosic, M., Dodig, M., Xavier, J., & Agapito, L. (2009). Factorization for non-rigid and articulated structure using metric projections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. · Zbl 1235.68280
[32] Park, H. S., Shiratori, T., Matthews, I., & Sheikh, Y. (2010). 3D reconstruction of a moving point from a series of 2D projections. In Proceedings of the European Conference on Computer Vision.
[33] Salzmann, M; Pilet, J; Ilic, S; Fua, P, Surface deformation models for nonrigid 3D shape recovery, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 1481-1487, (2007) · doi:10.1109/TPAMI.2007.1080
[34] Shashua, A., & Wolf, L. (2000). Homography tensors: On algebraic entities that represent three views of static or moving planar points. In Proceedings of the European Conference on Computer Vision.
[35] Sidenbladh, H., Black, M. J., & Fleet, D. J. (2000). Stochastic tracking of 3D human figures using 2D image motion. In Proceedings of the European Conference on Computer Vision.
[36] Snavely, N., Seitz, S. M., & Szeliski, R. (2006). Photo tourism: Exploring photo collections in 3D. ACM Transactions on Graphics (SIGGRAPH).
[37] Taylor, J., Jepson, A. D., & Kutulakos, K. N. (2010). Non-rigid structure from locally-rigid motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[38] Tomasi, C; Kanade, T, Shape and motion from image streams under orthography: A factorization method, International Journal of Computer Vision, 9, 137-154, (1992) · doi:10.1007/BF00129684
[39] Torresani, L., Yang, D., Alexander, G., & Bregler, C. (2001). Tracking and modeling non-rigid objects with rank constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[40] Torresani, L., & Bregler, C. (2002). Space-time tracking. In Proceedings of the European Conference on Computer Vision. · Zbl 1034.68685
[41] Torresani, L., Hertzmann, A., & Bregler, C. (2008). Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 878-892.
[42] Torresani, L., Hertzmann, A., & Bregler, C. (2003). Learning non-rigid 3D shape from 2D motion. In Advances in Neural Information Processing Systems.
[43] Valmadre, J., & Lucey, S. (2012). General trajectory prior for non-rigid reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[44] Vidal, R., & Abretske, D. (2006). Nonrigid shape and motion from multiple perspective views. In Proceedings of the European Conference on Computer Vision.
[45] Vidal, R., & Hartley, R. (2004). Motion segmentation with missing data by powerfactorization and generalized PCA. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[46] Wexler, Y., & Shashua, A. (2000). On the synthesis of dynamic scenes from reference views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[47] Wolf, L; Shashua, A, On projection matrices \(\mathcal{P}^{k} \to \mathcal{P}^{2}\), \(k = 3, \ldots, 6\), and their applications in computer vision, International Journal of Computer Vision, 48, 53-67, (2002) · Zbl 1012.68749 · doi:10.1023/A:1014855311993
[48] Xiao, J., & Kanade, T. (2004). Non-rigid shape and motion recovery: Degenerate deformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. · Zbl 1098.68890
[49] Xiao, J; Chai, J; Kanade, T, A closed-form solution to non-rigid shape and motion recovery, International Journal of Computer Vision, 67, 233-246, (2006) · Zbl 1477.68438 · doi:10.1007/s11263-005-3962-9
[50] Yan, J., & Pollefeys, M. (2005). A factorization-based approach to articulated motion recovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[51] Zhu, S., Zhang, L., & Smith, B. M. (2010). Model evolution: An incremental approach to non-rigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.