Show simple item record

Multi-modal RGB-D Image Segmentation from Appearance and Geometric Depth Maps

dc.creator: Salazar, Isail
dc.creator: Pertuz, Said
dc.creator: Martínez, Fabio
dc.date: 2020-05-15
dc.identifier: https://revistas.itm.edu.co/index.php/tecnologicas/article/view/1538
dc.identifier: 10.22430/22565337.1538
dc.description [en-US]: Classical image segmentation algorithms exploit the detection of similarities and discontinuities of different visual cues to define and differentiate multiple regions of interest in images. However, due to the high variability and uncertainty of image data, producing accurate results is difficult. In other words, segmentation based just on color is often insufficient for a large percentage of real-life scenes. This work presents a novel multi-modal segmentation strategy that integrates depth and appearance cues from RGB-D images by building a hierarchical region-based representation, i.e., a multi-modal segmentation tree (MM-tree). For this purpose, RGB-D image pairs are represented in a complementary fashion by different segmentation maps. Based on color images, a color segmentation tree (C-tree) is created to obtain segmented and over-segmented maps. From depth images, two independent segmentation maps are derived by computing planar and 3D edge primitives. An iterative region-merging process then locally groups the previously obtained maps into the MM-tree. Finally, the top emerging MM-tree level coherently integrates the available information from the depth and appearance maps. Experiments on the NYU-Depth V2 RGB-D dataset demonstrate that our strategy is competitive with state-of-the-art segmentation methods: on the test images, it reached average scores of 0.56 in Segmentation Covering and 2.13 in Variation of Information.
dc.description [es-ES]: Los algoritmos clásicos de segmentación de imágenes explotan la detección de similitudes y discontinuidades en diferentes señales visuales para definir regiones de interés en imágenes. Sin embargo, debido a la alta variabilidad e incertidumbre en los datos de imagen, se dificulta generar resultados acertados. En otras palabras, la segmentación basada solo en color a menudo no es suficiente para un gran porcentaje de escenas reales. Este trabajo presenta una nueva estrategia de segmentación multi-modal que integra señales de profundidad y apariencia desde imágenes RGB-D, por medio de una representación jerárquica basada en regiones, es decir, un árbol de segmentación multi-modal (MM-tree). Para ello, la imagen RGB-D es descrita de manera complementaria por diferentes mapas de segmentación. A partir de la imagen de color, se implementa un árbol de segmentación de color (C-tree) para obtener mapas de segmentación y sobre-segmentación. A partir de la imagen de profundidad, se derivan dos mapas de segmentación independientes, los cuales se basan en el cálculo de primitivas de planos y de bordes 3D. Seguidamente, un proceso jerárquico de fusión de regiones permite agrupar de manera local los mapas obtenidos anteriormente en el MM-tree. Por último, el nivel superior emergente del MM-tree integra coherentemente la información disponible en los mapas de profundidad y apariencia. Los experimentos se realizaron con el conjunto de imágenes RGB-D del NYU-Depth V2, evidenciando resultados competitivos con respecto a los métodos de segmentación del estado del arte. Específicamente, en las imágenes de prueba se obtuvieron puntajes promedio de 0.56 en la medida de Segmentation Covering y 2.13 en Variation of Information.
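As context for the scores quoted in the abstract, the sketch below shows one conventional way to compute the two reported evaluation metrics, following their standard definitions: Variation of Information (Meilă, 2005) and Segmentation Covering (Arbelaez et al., 2011), both cited in the reference list of this record. This is a minimal illustration, not the authors' implementation; it assumes the two segmentations are given as same-shaped NumPy arrays of non-negative integer region labels, and all function and variable names are ours.

```python
import numpy as np

def contingency_table(seg, gt):
    """Region-overlap counts: C[i, j] = number of pixels labeled i in `seg`
    and j in `gt`. Assumes same-shaped arrays of non-negative int labels."""
    n_s, n_g = seg.max() + 1, gt.max() + 1
    ids = seg.ravel().astype(np.int64) * n_g + gt.ravel()
    return np.bincount(ids, minlength=n_s * n_g).reshape(n_s, n_g).astype(float)

def _entropy(q):
    """Shannon entropy (in nats) of a discrete distribution, ignoring zeros."""
    q = q[q > 0]
    return float(-np.sum(q * np.log(q)))

def variation_of_information(seg, gt):
    """VI(S, G) = H(S) + H(G) - 2 I(S; G)  (Meila, 2005)."""
    p = contingency_table(seg, gt) / seg.size          # joint distribution
    p_s, p_g = p.sum(axis=1), p.sum(axis=0)            # marginals
    mutual_info = _entropy(p_s) + _entropy(p_g) - _entropy(p.ravel())
    return _entropy(p_s) + _entropy(p_g) - 2.0 * mutual_info

def segmentation_covering(seg, gt):
    """Covering of ground truth G by segmentation S (Arbelaez et al., 2011):
    (1/N) * sum over regions R in G of |R| * max_{R' in S} IoU(R, R')."""
    c = contingency_table(seg, gt)                     # |S_i ∩ G_j|
    size_s = c.sum(axis=1, keepdims=True)              # |S_i|, column vector
    size_g = c.sum(axis=0)                             # |G_j|, row vector
    iou = c / (size_s + size_g - c + 1e-12)            # pairwise IoU; epsilon guards unused label ids
    return float(np.sum(size_g * iou.max(axis=0)) / gt.size)
```

For identical label maps these functions return 0.0 (VI) and 1.0 (covering); lower VI and higher covering indicate closer agreement with the ground truth, which is the sense in which the abstract's averages of 0.56 (covering) and 2.13 (VI) on the NYU-Depth V2 test images are read.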
dc.format: application/pdf
dc.format: text/xml
dc.format: text/html
dc.language: spa
dc.language: eng
dc.publisher [en-US]: Instituto Tecnológico Metropolitano (ITM)
dc.relation: https://revistas.itm.edu.co/index.php/tecnologicas/article/view/1538/1634
dc.relation: https://revistas.itm.edu.co/index.php/tecnologicas/article/view/1538/1669
dc.relation: https://revistas.itm.edu.co/index.php/tecnologicas/article/view/1538/1724
dc.relation [ref]: P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour Detection and Hierarchical Image Segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 5, pp. 898–916, May 2011. https://doi.org/10.1109/TPAMI.2010.161
dc.relation [ref]: X. Wang, Y. Tang, S. Masnou, and L. Chen, “A Global/Local Affinity Graph for Image Segmentation,” IEEE Trans. Image Process., vol. 24, no. 4, pp. 1399–1411, Apr. 2015. https://doi.org/10.1109/TIP.2015.2397313
dc.relation [ref]: J. Han, L. Shao, D. Xu, and J. Shotton, “Enhanced Computer Vision With Microsoft Kinect Sensor: A Review,” IEEE Trans. Cybern., vol. 43, no. 5, pp. 1318–1334, Oct. 2013. https://doi.org/10.1109/TCYB.2013.2265378
dc.relation [ref]: N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from RGBD images,” in Computer Vision – ECCV 2012, Berlin: Springer, 2012, pp. 746–760. https://doi.org/10.1007/978-3-642-33715-4_54
dc.relation [ref]: X. Ren, L. Bo, and D. Fox, “RGB-(D) scene labeling: Features and algorithms,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, 2012, pp. 2759–2766. https://doi.org/10.1109/CVPR.2012.6247999
dc.relation [ref]: S. Gupta, P. Arbelaez, and J. Malik, “Perceptual organization and recognition of indoor scenes from RGB-D images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, 2013, pp. 564–571. https://doi.org/10.1109/CVPR.2013.79
dc.relation [ref]: Z. Li, X. M. Wu, and S. F. Chang, “Segmentation using superpixels: A bipartite graph partitioning approach,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, 2012, pp. 789–796. https://doi.org/10.1109/CVPR.2012.6247750
dc.relation [ref]: R. Nock and F. Nielsen, “Statistical region merging,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, pp. 1452–1458, Nov. 2004. https://doi.org/10.1109/TPAMI.2004.110
dc.relation [ref]: J. Yang, Z. Gan, K. Li, and C. Hou, “Graph-Based Segmentation for RGB-D Data Using 3-D Geometry Enhanced Superpixels,” IEEE Trans. Cybern., vol. 45, no. 5, pp. 927–940, May 2015. https://doi.org/10.1109/TCYB.2014.2340032
dc.relation [ref]: A. Richtsfeld, T. Mörwald, J. Prankl, M. Zillich, and M. Vincze, “Learning of perceptual grouping for object segmentation on RGB-D data,” J. Vis. Commun. Image Represent., vol. 25, no. 1, pp. 64–73, Jan. 2014. https://doi.org/10.1016/j.jvcir.2013.04.006
dc.relation [ref]: L. Cruz, D. Lucio, and L. Velho, “Kinect and RGBD images: Challenges and applications,” in 2012 25th SIBGRAPI Conference on Graphics, Patterns and Images Tutorials (SIBGRAPI-T), Ouro Preto, 2012, pp. 36–49. https://doi.org/10.1109/SIBGRAPI-T.2012.13
dc.relation [ref]: K. Chen, Y.-K. Lai, and S.-M. Hu, “3D indoor scene modeling from RGB-D data: a survey,” Comput. Vis. Media, vol. 1, no. 4, pp. 267–278, Dec. 2015. https://doi.org/10.1007/s41095-015-0029-x
dc.relation [ref]: D. Lin, G. Chen, D. Cohen-Or, P. A. Heng, and H. Huang, “Cascaded Feature Network for Semantic Segmentation of RGB-D Images,” in 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 1320–1328. https://doi.org/10.1109/ICCV.2017.147
dc.relation [ref]: J. McCormac, A. Handa, S. Leutenegger, and A. J. Davison, “SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation?,” in 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 2697–2706. https://doi.org/10.1109/ICCV.2017.292
dc.relation [ref]: W. Wang and U. Neumann, “Depth-aware CNN for RGB-D segmentation,” in Proceedings of the European Conference on Computer Vision (ECCV), Munich, 2018, pp. 135–150. https://doi.org/10.1007/978-3-030-01252-6_9
dc.relation [ref]: Y. Guo, Y. Liu, T. Georgiou, and M. S. Lew, “A review of semantic segmentation using deep neural networks,” Int. J. Multimed. Inf. Retr., vol. 7, no. 2, pp. 87–93, Jun. 2018. https://doi.org/10.1007/s13735-017-0141-z
dc.relation [ref]: D. Huang, J.-H. Lai, C.-D. Wang, and P. C. Yuen, “Ensembling over-segmentations: From weak evidence to strong segmentation,” Neurocomputing, vol. 207, pp. 416–427, Sep. 2016. https://doi.org/10.1016/j.neucom.2016.05.028
dc.relation [ref]: J. Smisek, M. Jancosek, and T. Pajdla, “3D with Kinect,” in Consumer Depth Cameras for Computer Vision, London: Springer, 2013, pp. 3–25. https://doi.org/10.1007/978-1-4471-4640-7_1
dc.relation [ref]: M. Maire, P. Arbelaez, C. Fowlkes, and J. Malik, “Using contours to detect and localize junctions in natural images,” in 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, 2008, pp. 1–8. https://doi.org/10.1109/CVPR.2008.4587420
dc.relation [ref]: P. Arbelaez, “Boundary extraction in natural images using ultrametric contour maps,” in 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW’06), New York, 2006, p. 182. https://doi.org/10.1109/CVPRW.2006.48
dc.relation [ref]: C. Feng, Y. Taguchi, and V. R. Kamat, “Fast plane extraction in organized point clouds using agglomerative hierarchical clustering,” in 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, 2014, pp. 6218–6225. https://doi.org/10.1109/ICRA.2014.6907776
dc.relation [ref]: R. Hulik, M. Spanel, P. Smrz, and Z. Materna, “Continuous plane detection in point-cloud data based on 3D Hough Transform,” J. Vis. Commun. Image Represent., vol. 25, no. 1, pp. 86–97, Jan. 2014. https://doi.org/10.1016/j.jvcir.2013.04.001
dc.relation [ref]: T. H. Kim, K. M. Lee, and S. U. Lee, “Learning full pairwise affinities for spectral segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 7, pp. 1690–1703, Jul. 2013. https://doi.org/10.1109/TPAMI.2012.237
dc.relation [ref]: P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “From contours to regions: An empirical evaluation,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, 2009, pp. 2294–2301. https://doi.org/10.1109/CVPR.2009.5206707
dc.relation [ref]: R. Unnikrishnan, C. Pantofaru, and M. Hebert, “Toward Objective Evaluation of Image Segmentation Algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 6, pp. 929–944, Jun. 2007. https://doi.org/10.1109/TPAMI.2007.1046
dc.relation [ref]: M. Meilă, “Comparing clusterings: an axiomatic view,” in Proceedings of the 22nd International Conference on Machine Learning, Aug. 2005, pp. 577–584. https://doi.org/10.1145/1102351.1102424
dc.relation [ref]: A. Goder and V. Filkov, “Consensus clustering algorithms: Comparison and refinement,” in Proceedings of the Meeting on Algorithm Engineering & Experiments, Jan. 2008, pp. 109–117. http://dl.acm.org/citation.cfm?id=2791204.2791215
dc.rights [en-US]: Copyright (c) 2020 TecnoLógicas
dc.rights [en-US]: http://creativecommons.org/licenses/by-nc-sa/4.0
dc.source [en-US]: TecnoLógicas; Vol. 23 No. 48 (2020); 143-161
dc.source [es-ES]: TecnoLógicas; Vol. 23 Núm. 48 (2020); 143-161
dc.source: 2256-5337
dc.source: 0123-7799
dc.subject [en-US]: Image segmentation
dc.subject [en-US]: over-segmentation
dc.subject [en-US]: RGB-D images
dc.subject [en-US]: depth information
dc.subject [en-US]: multi-modal segmentation
dc.subject [es-ES]: Segmentación de imágenes
dc.subject [es-ES]: sobre-segmentación
dc.subject [es-ES]: imágenes RGB-D
dc.subject [es-ES]: información de profundidad
dc.subject [es-ES]: segmentación multi-modal
dc.title [en-US]: Multi-modal RGB-D Image Segmentation from Appearance and Geometric Depth Maps
dc.title [es-ES]: Segmentación multi-modal de imágenes RGB-D a partir de mapas de apariencia y de profundidad geométrica
dc.type: info:eu-repo/semantics/article
dc.type: info:eu-repo/semantics/publishedVersion
dc.type [en-US]: Research Papers
dc.type [es-ES]: Artículos de investigación


Files in this item


No files are associated with this item.

This item appears in the following collection(s)
