Abstract
Successful computer-aided diagnosis systems typically rely on training datasets that contain a sufficient number of richly annotated images. However, detailed image annotation is often time-consuming and subjective, especially for medical images, and this becomes the bottleneck in collecting large datasets and building computer-aided diagnosis systems. In this article, we design a novel computer-aided endoscopy diagnosis system for the multi-class classification of electronic endoscopy medical records (EEMRs), each of which contains a set of frames. The EEMR labels are mined from the corresponding text records with an automatic text-matching strategy, so no manual labeling is required. Given unambiguous EEMR labels but ambiguous frame labels, we propose a simple but effective pooling scheme called Multi-class Latent Concept Pooling, which learns a codebook from EEMRs of different classes step by step and encodes each EEMR with a soft weighting strategy. With our method, a computer-aided diagnosis system can be easily extended to new, unseen classes and applied to the standard single-instance classification problem even when images with detailed annotations are unavailable. To validate our system, we collect 1,889 EEMRs with more than 59K frames and successfully mine labels for 348 of them. The experimental results show that the proposed system significantly outperforms state-of-the-art methods. Moreover, we apply the learned latent concept codebook to detect abnormalities in endoscopy images and compare it with a supervised learning classifier; the evaluation shows that our codebook learning method can effectively extract the true prototypes of the different classes from the ambiguous data.
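To make the idea of encoding a record by soft weighting against a codebook more concrete, the following is a minimal sketch of generic soft-assignment pooling. It is not the article's Multi-class Latent Concept Pooling: the step-by-step, class-wise codebook learning is not shown, and the function name `soft_assign_pool`, the Gaussian-style weights, the `beta` parameter, and the toy two-dimensional descriptors are all assumptions made for illustration.

```python
import numpy as np

def soft_assign_pool(frames, codebook, beta=1.0):
    """Encode a set of frame descriptors as one fixed-length vector.

    frames:   (n_frames, d) array, one descriptor per frame (hypothetical features)
    codebook: (k, d) array of concept prototypes (assumed to be given)
    beta:     softness of the assignment (assumed hyperparameter)
    """
    # Squared Euclidean distance from every frame to every codeword: (n_frames, k)
    diff = frames[:, None, :] - codebook[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    # Softmax over codewords; subtracting the row minimum keeps exp() stable
    logits = -beta * (dist2 - dist2.min(axis=1, keepdims=True))
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Average the per-frame weights so records of any length map to k dimensions
    return weights.mean(axis=0)

# Toy record: 8 frames near the first prototype, 2 frames near the second
rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [5.0, 5.0]])
record = np.vstack([rng.normal(0.0, 0.1, (8, 2)), rng.normal(5.0, 0.1, (2, 2))])
encoding = soft_assign_pool(record, codebook)
```

In this toy setting the encoding concentrates most of its mass on the first codeword, reflecting the share of frames that resemble it. A record-level classifier could then be trained on such fixed-length vectors using only record-level labels.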
Multi-Class Latent Concept Pooling for Computer-Aided Endoscopy Diagnosis