The proposed optimized-kernel KPCA and AKFA, together with the original AKFA and KPCA, were evaluated using CT image data sets of colonic polyps comprising true polyps (TP) and false polyps (FP) detected by our CAD system [5]. We obtained studies of 146 patients who had undergone a colon-cleansing regimen in preparation for same-day optical colonoscopy. Each patient was scanned in both supine and prone positions, resulting in a total of 292 CT studies. Helical single-slice and multi-slice CT scanners (GE HiSpeed CTi, LightSpeed QX/I, and LightSpeed Ultra; GE Medical Systems, Milwaukee, WI) were used, with collimations of 1.25-5.0 mm, reconstruction intervals of 1.0-5.0 mm, X-ray tube currents of 50-260 mA, and voltages of 120-140 kVp. In-plane voxel sizes were 0.51-0.94 mm, and the CT image matrix size was 512 x 512. Of the 146 patients, 108 were normal cases and 38 were abnormal cases, with a total of 61 colonoscopy-confirmed polyps larger than 6 mm. Twenty-eight polyps were 6-9 mm, and 33 polyps were larger than 10 mm (including 7 lesions larger than 30 mm). The CAD scheme processed the supine and prone volumetric data sets of each patient independently to yield polyp candidates, and it detected polyp candidates in the 292 CT colonography data sets with a 98% by-polyp detection sensitivity.
The volumes of interest (VOIs) representing each polyp candidate were computed as follows. The CAD scheme provided a segmented region for each candidate. The center of the VOI was placed at the center of mass of the region, and the size of the VOI was chosen so that the entire region was covered. Finally, the VOI was resampled to 16 x 16 x 16 voxels. The VOIs so computed comprise the data set DB1, a sample of which is shown in Figure 1. There were a total of 131 true polyps (some of the larger lesions had multiple detections) and 8008 false polyps. The same procedure was carried out with VOIs of 12 x 12 x 12 voxels to build the data set DB2, which consists of 39 true polyps and 149 false polyps.
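The VOI construction described above can be sketched as follows. This is an illustrative reimplementation only, not the CAD system's actual code (the function name `extract_voi` and the use of trilinear interpolation are assumptions): a cube is centered at the candidate's center of mass, sized to cover the segmented region, and resampled to a fixed grid.

```python
import numpy as np
from scipy.ndimage import center_of_mass, zoom

def extract_voi(volume, mask, out_size=16):
    """Hypothetical sketch: cut a cube covering the segmented region
    (boolean `mask`) out of the CT `volume`, centered at the region's
    center of mass, and resample it to out_size^3 voxels."""
    center = np.round(center_of_mass(mask)).astype(int)
    # Half-width chosen so the cube covers the whole segmented region.
    coords = np.argwhere(mask)
    half = int(np.max(np.abs(coords - center))) + 1
    slices = tuple(slice(max(c - half, 0), min(c + half, s))
                   for c, s in zip(center, volume.shape))
    cube = volume[slices]
    # Resample to the fixed VOI size (trilinear interpolation assumed).
    factors = [out_size / s for s in cube.shape]
    return zoom(cube, factors, order=1)
```

For DB2, the same call with `out_size=12` would yield 12 x 12 x 12 VOIs.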
The computation times, in seconds, are shown in Table I. All results were obtained using the Statistical Pattern Recognition Toolbox [17] on Matlab 7.0.1 (R14) for the Gram-matrix calculation and the KPCA algorithm, running on the Partners Research Computing cluster [18]. The cluster had 26 working nodes on HP servers; each node had 72 GB of storage (380 GB on the head node), two 3 GHz AMD Opteron 32/64-bit CPUs, and 4 GB of RAM. The nodes communicated via a GigE switch and used an NFS mount. The eigenspace dimension was set to 70 for measuring computation time. For the three algorithms, KPCA, SKFA, and AKFA, Table I indicates that the computation time of KPCA increased rapidly with the data size n. At n = 3500, the computation times of KPCA and SKFA were 9.4 and 30.5 times longer, respectively, than that of AKFA. When computation time is plotted against data size n on common-logarithm scales, the results fit the expected curves, validating the complexity analyses in the methodology sections. This clearly shows that our proposed AKFA was much faster than the existing KPCA and SKFA methods, especially for large data sizes.
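The log-log check mentioned above works because a polynomial-time cost t = c * n^p appears as a straight line of slope p when both axes are logarithmic. A minimal sketch with synthetic timings (the cubic constant below is invented for illustration; these are not the paper's measurements):

```python
import numpy as np

# Synthetic O(n^3) timings standing in for measured computation times.
n = np.array([500.0, 1000.0, 2000.0, 3500.0])
t = 1e-9 * n ** 3

# Fit a line to log10(t) vs log10(n); the slope estimates the exponent.
slope, intercept = np.polyfit(np.log10(n), np.log10(t), 1)
# A slope near 3 confirms cubic growth of the cost with n.
```

The same fit applied to the measured times for KPCA, SKFA, and AKFA would recover the exponents predicted by the complexity analyses.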
C. Evaluation of Classification Accuracy: Optimized Kernel versus Unoptimized Kernel for the DB2 Data Set
To analyze how the optimized kernel affects the classification performance for polyp candidates, we used the k-nearest neighbor classifier on the image vectors in the reduced eigenspace. The data set DB2 was used in the experiments described in this section. The training and test data were selected according to the arrangement given in Table II. We first evaluated the performance of the classifier on the feature spaces obtained by KPCA and AKFA, and then on the kernel-optimized feature spaces obtained by KPCA (Case 1) and AKFA (Case 2).
Case 1: KPCA. The k-nearest neighbor classifier was applied to the data after feature extraction with the KPCA algorithm using the data-dependent kernel, which extracted a total of 75 features during training. The data set described in Arrangement 1 of Table II was used, and 1-10 nearest neighbors were considered. The classification accuracy against the number of nearest neighbors is given in Table III. When 9 nearest neighbors were used, the test data in the reduced eigenspace were grouped as given in Table II, resulting in a classification accuracy of 97.50%.
Case 2: AKFA. The AKFA algorithm was applied to the same training data set given in Table II to extract 75 features. The test data were then classified using the k-nearest neighbor classifier with 1-10 nearest neighbors; the results are summarized in Table III.
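The classification step shared by both cases can be sketched as follows. This is a pure-NumPy illustration, not the toolbox code used in the experiments: each test vector, already mapped into the reduced eigenspace by the feature extractor, takes the majority label of its k nearest training vectors.

```python
import numpy as np

def knn_classify(train_feats, train_labels, test_feats, k=9):
    """Illustrative k-nearest neighbor classifier in the reduced
    eigenspace (Euclidean distance, majority vote)."""
    preds = []
    for x in test_feats:
        # Distances from this test vector to all training vectors.
        dists = np.linalg.norm(train_feats - x, axis=1)
        # Labels of the k nearest training vectors.
        nearest = train_labels[np.argsort(dists)[:k]]
        # Majority vote among the k neighbors.
        preds.append(np.bincount(nearest).argmax())
    return np.array(preds)
```

In the experiments, `train_feats` and `test_feats` would be the 75-dimensional feature vectors produced by KPCA or AKFA, with labels 0/1 for false/true polyps.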
Table II: Arrangement of training and test data for classification. The data set DB2 comprised 39 true-polyp and 149 false-polyp vectors.

Arrangement 1    Class    Proportion of total data set    Number of vectors    Total
Training set     TP       80.00%                          31                   148
                 FP       78.30%                          117
Test set         TP       20.00%                          8                    40
                 FP       21.70%                          32
Table III: Classification accuracy (%) for each feature extraction algorithm against the number k of nearest neighbors, with and without kernel optimization.

       With kernel optimization      Without kernel optimization
k      KPCA        AKFA              KPCA        AKFA
1      100.00      97.50             97.50       95.00
2      97.50       92.50             100.00      90.00
3      97.50       95.00             95.50       95.00
4      100.00      100.00            100.00      90.00
5      100.00      100.00            100.00      92.50
6      95.00       92.50             97.50       92.50
7      97.50       92.50             97.50       92.50
8      95.00       95.00             95.00       90.00
9      97.50       92.50             95.00       92.50
10     92.50       90.00             92.50       87.50
The reconstruction-error results for kernel-optimized KPCA and AKFA, and for the original KPCA and AKFA, on this data set are summarized in Table IV. The results show that kernel-optimized KPCA and AKFA have reconstruction ability similar to that of the original KPCA and AKFA, as evidenced by the small reconstruction errors. However, more training data, coupled with the ability to extract more features, would have yielded a more accurate representation of the data in the reduced eigenspace, and therefore comparable results for KPCA and AKFA.
Table IV: Mean square reconstruction error for DB2.

Feature extraction algorithm    Mean square error (%)
KPCA                            6.74
AKFA                            10.74
WKO-KPCA                        6.84
WKO-AKFA                        10.99
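For kernel PCA, the mean square reconstruction error of Table IV can in principle be obtained from the eigenvalues of the centered Gram matrix: the variance not captured by the retained components equals the mean of the discarded eigenvalues. A minimal sketch, assuming a Gaussian kernel and not reproducing the paper's exact implementation or kernel parameters:

```python
import numpy as np

def kpca_reconstruction_error(X, n_components, gamma=1.0):
    """Mean squared feature-space reconstruction error of kernel PCA:
    the sum of the discarded eigenvalues of the centered Gram matrix,
    divided by the number of samples. Gaussian kernel assumed."""
    n = len(X)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)                  # Gaussian Gram matrix
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                           # double-centered Gram matrix
    w = np.linalg.eigvalsh(Kc)               # ascending eigenvalues
    discarded = np.sort(w)[::-1][n_components:]
    return discarded.sum() / n
```

Retaining more components can only decrease this error, and retaining all n components drives it to zero, which matches the intuition behind the comparable errors of the optimized and unoptimized variants in Table IV.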