Computed tomographic (CT) colonography, or virtual colonoscopy, is a promising technique for screening for colorectal cancers by use of CT scans of the colon [1]. Current CT technology allows a single image set of the colon to be acquired in 10-20 seconds, which translates into an easier, more comfortable examination than is available with other screening tests. For CT colonography to be a clinically practical means of screening for colon cancers, it must allow a large number of images to be interpreted in a time-effective fashion, and it must facilitate the detection of polyps, which are precursors of colorectal cancers, with high accuracy. Currently, however, interpretation of an entire CT colonography examination is time-consuming, and reader performance in polyp detection varies substantially [2, 3].

To overcome these difficulties while providing high polyp detection performance, researchers are developing computer-aided detection (CAD) schemes that automatically detect suspicious lesions in CT colonography images [4]. CAD for CT colonography reports the locations of suspected polyps to radiologists, offering a second opinion that has the potential to improve radiologists’ detection performance.

Polyps appear as bulbous, caplike structures that adhere to the colonic wall and protrude into the lumen, whereas folds appear as elongated, ridgelike structures, and the colonic wall appears as a large, nearly flat, cuplike structure. Therefore, most CAD schemes employ a model-based approach for the detection of polyp candidates, in which shape analysis methods that differentiate among these distinct types of shapes play an essential role [5-9]. Nevertheless, many naturally occurring normal colonic structures occasionally imitate such shapes, and therefore the resulting polyp candidates typically include many false positives. These false positives are often reduced by first extracting a set of image features from the segmented polyp regions and then applying a statistical classifier to the feature space to discriminate false positives from actual polyps [8, 10-13]. Such CAD schemes tend to show a high sensitivity in the detection of polyps; however, they tend to produce a much larger number of false positives than human readers do [4].

The overall goal of this study is to achieve a high performance in the detection of polyps on CT colonographic images by effectively incorporating an appearance-based object recognition approach into a model-based CAD scheme. The specific contribution of this study is a fast kernel feature analysis that, in combination with a shape-based polyp detection method, can efficiently differentiate polyps from false positives and thus improve polyp detection performance. The key idea behind the proposed algorithm is to reconstruct the feature space by use of a feature mapping that maps the original, raw feature space into a higher-dimensional feature space. Such a high-dimensional feature space is expected to have greater classification power than the original feature space, as suggested by the Vapnik-Chervonenkis theory [14]. We evaluated our fast kernel feature analysis on texture-based features that were extracted from the polyp candidates generated by our shape-based CAD scheme. The main contribution of this paper lies in the appearance-based approach, which improves on sparse kernel feature analysis for the classification of texture-based features. The method is tested on real CT colonography data to show that the improved algorithm is faster than sparse kernel feature analysis while achieving comparable accuracy.
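The idea of mapping raw features into a higher-dimensional space can be illustrated with a toy example. The sketch below is not the feature mapping used in this paper; the function names `lift` and `ring`, the ring radii, and the threshold of 5 are all hypothetical choices made for illustration. Two concentric rings of points cannot be separated by a straight line in the plane, but after appending the squared radius as a third coordinate, a single threshold on that coordinate separates them.

```python
import numpy as np

def lift(points):
    """Hypothetical feature map phi: (x1, x2) -> (x1, x2, x1^2 + x2^2)."""
    points = np.asarray(points, dtype=float)
    radius_sq = np.sum(points ** 2, axis=1, keepdims=True)
    return np.hstack([points, radius_sq])

def ring(radius, n=8):
    """Return n evenly spaced points on a circle of the given radius."""
    angles = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return radius * np.column_stack([np.cos(angles), np.sin(angles)])

# Toy data (an assumption for illustration, not CT colonography features).
inner = ring(1.0)   # stands in for one class
outer = ring(3.0)   # stands in for the other class

lifted_inner = lift(inner)
lifted_outer = lift(outer)

# In the lifted space the third coordinate is about 1 for the inner ring
# and about 9 for the outer ring, so the plane z = 5 separates the classes.
threshold = 5.0
separable = bool(lifted_inner[:, 2].max() < threshold < lifted_outer[:, 2].min())
print("linearly separable after lifting:", separable)
```

Kernel methods typically avoid computing such a mapping explicitly and instead work with inner products in the mapped space, which is what a kernel function evaluates.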

A kernel function provides a flexible and effective learning mechanism, and the choice of a kernel function should reflect prior knowledge about the problem at hand. However, it is often difficult to exploit prior knowledge about patterns when choosing a kernel function, and how to choose the best kernel function for a given data set remains an open question. According to the no free lunch theorem [40] in machine learning, no kernel function is superior in general; the performance of a kernel function depends on the application. The three kernel functions used in this study (Gaussian, polynomial, and linear) were chosen because they are known to perform well in the field of bioinformatics [40-45].
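For reference, the three kernel families named above can be written in a few lines of NumPy. This is a minimal sketch using common textbook parameterizations; the hyperparameters `degree`, `offset`, and `sigma`, their default values, and the toy feature matrix are assumptions for illustration and are not taken from this paper.

```python
import numpy as np

def linear_kernel(X, Y):
    """k(x, y) = <x, y>."""
    return X @ Y.T

def polynomial_kernel(X, Y, degree=2, offset=1.0):
    """k(x, y) = (<x, y> + offset) ** degree (assumed parameterization)."""
    return (X @ Y.T + offset) ** degree

def gaussian_kernel(X, Y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)) (assumed parameterization)."""
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * (X @ Y.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

# Toy feature vectors, one per row (hypothetical, not texture features).
features = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])

gram_linear = linear_kernel(features, features)
gram_polynomial = polynomial_kernel(features, features)
gram_gaussian = gaussian_kernel(features, features)

for name, gram in [("linear", gram_linear),
                   ("polynomial", gram_polynomial),
                   ("gaussian", gram_gaussian)]:
    print(name)
    print(np.round(gram, 4))
```

Each function returns a Gram matrix whose entry (i, j) is the kernel value between row i of `X` and row j of `Y`. Which kernel and which hyperparameters work best remains data dependent, consistent with the no free lunch argument above.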


The remainder of this paper is organized as follows. Section II provides a brief overview of our proposed appearance-based recognition scheme and describes how kernel feature analysis of the texture-based features of polyps complements model-based polyp detection schemes. Section III provides a brief review of existing kernel-based feature extraction methods. In Section IV, we present the proposed kernel feature analysis for the detection of polyp candidates. In Section V, we propose a composite data-dependent kernel for improving the accuracy. In Section VI, we evaluate the reconstruction and classification performance of the proposed kernel feature analysis algorithm based on the texture-based features of polyps on CT colonographic images. Section VII presents the conclusion.