|
[1] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “Imagenet classification with deep convolutional neural networks. “, In Neural Information Processing Systems (NIPS), 2012. [2] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. “Rich feature hierarchies for accurate object detection and semantic segmentation.” In proceedings of IEEE Conference on Computer Vision and Pattern Recognition(CVPR), 2014. [3] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. “SSD: Single shot multibox detector.” In proceedings of the European Conference on Computer Vision (ECCV), 2016. [4] K. He, G. Gkioxari, P. Dollar, and R. B. Girshick. “Mask R-CNN.” In Proceedings of the International Conference on Computer Vision (ICCV), 2017. [5] Jonathan Long, Evan Shelhamer, and Trevor Darrell. “Fully convolutional networks for semantic segmentation.” In proceedings of IEEE Conference on Computer Vision and Pattern Recognition(CVPR), 2015. [6] Kuznietsov, Yevhen, Jörg Stückler, and Bastian Leibe. "Semi-supervised deep learning for monocular depth map prediction." In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. [7] I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab. Deeper depth prediction with fully convolutional residual networks. In IEEE International Conference on 3D Vision (3DV), 2016. [8] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. “Mobilenets: Efficient convolutional neural networks for mobile vision applications.” arXiv preprint arXiv:1704.04861, 2017 [9] Vijay Badrinarayanan, Alex Kendall, and Roberto Cipolla. “Segnet: A deep convolutional encoder-decoder architecture for image segmentation.” In IEEE transactions on pattern analysis and machine intelligence, 2017. [10] Marvin Teichmann, Michael Weber, Marius Zoellner, Roberto Cipolla, and Raquel Urtasun. “Multinet: Real-time joint semantic reasoning for autonomous driving.” arXiv preprint arXiv:1612.07695, 2016. [11] Zhaowei Cai and Quanfu Fan and Rogerio Feris and Nuno Vasconcelos,” A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection”, In proceeding of the European Conference on Computer Vision (ECCV), 2016 [12] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie. Feature pyramid networks for object detection. arXiv preprint, arXiv:1612.03144, 2016. [13] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, “Going Deeper with Convolutions.”, In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 [14] C.Szegedy, V.Vanhoucke, S.Ioffe,J.Shlens, and Z.Wojna, “Rethinking the inception architecture for computer vision.” arXiv preprint, arXiv:1512.00567, 2015 [15] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille, “Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs.” arXiv preprint, arXiv: 1412.7062, 2014 [16] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille, “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.”, arXiv preprint, arXiv:1606.00915 2016 [17] Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam ,“Rethinking Atrous Convolution for Semantic Image Segmentation”, arXiv preprint, arXiv: 1706.05587 2017 [18] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam, “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation”,arXiv preprint, arXiv: 1802.02611, 2018 [19] Maoke Yang Kun Yu Chi Zhang Zhiwei Li Kuiyuan Yang, “DenseASPP for Semantic Segmentation in Street Scenes”, In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [20] Hengshuang Zhao and Jianping Shi and Xiaojuan Qi and Xiaogang Wang and Jiaya Jia, “Pyramid Scene Parsing Network”, In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 [21] Songtao Liu, Di Huang, and Yunhong Wang, “Receptive Field Block Net for Accurate and Fast Object Detection”, In proceeding of the European Conference on Computer Vision (ECCV), 2018 [22] Anderson, P., He, X., Buehler, C., Teney, D., Johnson, M., Gould, S., Zhang, L. “Bottom-up and top-down attention for image captioning and vqa.” , arXiv preprint, arXiv:1707.07998, 2017 [23] Bahdanau, D., Cho, K., Bengio.Y, “Neural machine translation by jointly learning to align and translate.”, arXiv preprint, arXiv:1409.0473, 2014 [24]Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I., “Attention is all you need.” In Neural Information Processing Systems (NIPS), 2017 [25] Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., Bengio, Y. “Graph attention ´networks.” , arXiv preprint,arXiv:1710.10903, 2017 [26] Wang F., Jiang, M. Qian, C. Yang, S. Li, C. Zhang, H., Wang X., Tang X “Residual attention network for image classification.” In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 [27] Hu J., Shen, L. Sun, G.: “Squeeze-and-excitation networks.” arXiv:1709.01507 (2017) [28] Wang, X., Girshick, R., Gupta, A., He, K. “Non-local neural networks.”, arXiv preprint, arXiv:1711.07971, 2017 [29] Dai, Jifeng and He, Kaiming and Sun Jian, “Instance-aware Semantic Segmentation via Multi-task Network Cascades”, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 [30] Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji and Yichen Wei,"Fully Convolutional Instance-aware Semantic Segmentation", In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 [31] S. Liu, X. Qi, J. Shi, H. Zhang, and J. Jia. “Multi-scale patch aggregation (MPA) for simultaneous detection and segmentation.” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. [32] Shu Liu, Lu Qi, Qin, Jianping Shi, Jiaya Jia, “Path Aggregation Network for Instance Segmentation”, In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [33] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. “U-net: Convolutional networks for biomedical image segmentation.” In International Conference on Medical image computing and computer-assisted intervention, 2015. [34] Jonas Uhrig, Marius Cordts, Uwe Franke, and T. Brox. “Pixel-level encoding and depth layering for instance-level semantic labeling.” In proceedings of the German Conference on Pattern Recognition (GCPR), 2016. [35] Iasonas Kokkinos. “Ubernet: Training a universal convolutional neural network for low-, mid-, and high-level vision using diverse datasets and limited memory.”, arXiv preprint, arXiv:1609.02132, 2016. [36] Marvin Teichmann, Michael Weber, Marius Zoellner, Roberto Cipolla, and Raquel Urtasun, “Multinet: Real-time joint semantic reasoning for autonomous driving.”, arXiv preprint,arXiv:1612.07695, 2016. [37] Davy Neven, Bert De Brabandere, Stamatios Georgoulis, Marc Proesmans and Luc Van Gool,” Fast Scene Understanding for Autonomous Driving”, In proceedings of IEEE Symposium on Intelligent Vehicles, 2017 [38] Alex Kendall, Yarin Gal, Roberto Cipolla, “Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics”, In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [39] Z. Chen, V. Badrinarayanan, C. Lee, and A. Rabinovich. “GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks”, arXiv preprint, arXiv:1711.02257 (2017) [40] H. Ha, S. Im, J. Park, H.-G. Jeon, and I. S. Kweon, “High-Quality Depth from Uncalibrated Small Motion Clip,” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. [41] N. Kong and M. J. Black, “Intrinsic depth: Improving depth transfer with intrinsic images,” In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015. [42] D. Eigen, C. Puhrsch, and R. Fergus. Depth map prediction from a single image using a multi-scale deep network. In Advances in neural information processing systems, 2014. [43] D. Xu, W. Wang, H. Tang, H. Liu, N. Sebe, and E. Ricci, “Structured attention guided convolutional neural fields for monocular depth estimation,” In proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [44] S. Xie and Z. Tu, “Holistically-Nested Edge Detection,” In proceedings of International Journal of Computer Vision, 2017. [45] B. Li, C. Shen, Y. Dai, A. van den Hengel, and M. He.” Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs”. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. [46] F. Liu, C. Shen, G. Lin, and I. Reid. Learning depth from single monocular images using deep convolutional neural fields. In IEEE transactions on pattern analysis and machine intelligence, 2016. [47] H. Fu, M. Gong, C. Wang, K. Batmanghelich, and D. Tao, “Deep ordinal regression network for monocular depth estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 [48] Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel, “Understanding the Effective Receptive Field in Deep Convolutional Neural Networks”, In Neural Information Processing Systems (NIPS), 2016 [49] F. Yu and V. Koltun. “Multi-scale context aggregation by dilated convolutions.” In proceedings of the International Conference on Learning Representations (ICLR), 2016 [50] Yanghao Li, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, “Scale-Aware Trident Networks for Object Detection”, arXiv preprint, arxiv:1901.01892, 2019 [51] H. Noh, S. Hong, and B. Han. Learning deconvolution network for semantic segmentation. In IEEE International Conference on Computer Vision, In Proceedings of the International Conference on Computer Vision (ICCV), 2015 [52] I. Laina, C. Rupprecht, “Deeper depth prediction with fully convolutional residual networks”, In 3D Vision (3DV), 2016 Fourth International Conference on. IEEE, 2016 [53] D. Eigen, C. Puhrsch, and R. Fergus, “Depth map prediction from a single image using a multi-scale deep network,” In Advances in neural information processing systems, 2014 [54] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár, “Focal Loss for Dense Object Detection”, In Proceedings of the International Conference on Computer Vision (ICCV), 2017. [55] L. Zwald and S. Lambert-Lacroix. “The berhu penalty and the grouped effect.”, arXiv preprint, arXiv:1207.6868, 2012. [56] Wang, Z. Simoncelli, E.P Bovik, “Multiscale structural similarity for image quality assessment.”, Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, 2004. [57] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The cityscapes dataset for semantic urban scene understanding. In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. |