
Inductive Sparse Subspace Clustering

X. Peng, L. Zhang and Z. Yi

Sparse Subspace Clustering (SSC) has achieved state-of-the-art clustering quality by performing spectral clustering over an ℓ1-norm based similarity graph. However, SSC is a transductive method which cannot handle data that were not used to construct the graph (out-of-sample data). For each new datum, SSC requires solving n optimization problems in O(n) variables in order to rerun the algorithm over the whole data set, where n is the number of data points. SSC is therefore inefficient for fast online clustering and scalable grouping. In this letter, we propose an inductive spectral clustering algorithm, called inductive Sparse Subspace Clustering (iSSC), which makes SSC feasible for clustering out-of-sample data. iSSC adopts the assumption that high-dimensional data actually lie on a low-dimensional manifold, so that out-of-sample data can be grouped in the embedding space learned from in-sample data. Experimental results show that iSSC is promising in clustering out-of-sample data.


Fig. 1 A key observation. (a) Data points sampled from two 2-dimensional manifolds (trefoil knots) embedded into a 3-dimensional space; (b) a plan view of the sampled data; (c) the embedding of the sampled data. It is easy to see that out-of-sample data points can be grouped into the correct cluster once they are projected into the embedding space.

function,

min_{c_i} ||c_i||_1   s.t.   ||y_i − Y_i c_i||_2 < δ,   y_i ∈ R^m,    (1)

Introduction: Spectral clustering is one of the most popular subspace clustering algorithms. It aims to find a cluster membership of data points and the corresponding low-dimensional representation by utilizing the spectrum of a Laplacian matrix. The entries of the Laplacian matrix depict the similarity among data points; thus, the construction of the similarity graph lies at the heart of spectral clustering. In a similarity graph, each vertex denotes a data point and the connection weight between two points represents their similarity. Recently, Elhamifar and Vidal [1] constructed a similarity graph using ℓ1-minimization based coefficients and performed spectral clustering over the graph, a method named Sparse Subspace Clustering (SSC). It automatically selects the nearby points for each datum by utilizing the principle of sparsity, without pre-determining the size of the neighborhood. SSC has achieved impressive performance in image clustering and motion segmentation. However, it requires solving n optimization problems over n data points and calculating the eigenvectors of an n × n matrix, resulting in a very high computational complexity; in general, the time complexity of SSC is proportional to the cube of the data size. Thus, even a medium-sized data set raises scalability issues for SSC. In addition, SSC is a transductive algorithm which cannot handle data not used to construct the graph (out-of-sample data). For each new datum, SSC must rerun the algorithm over the whole data set, which makes SSC inefficient for fast online clustering and scalable grouping. To address the scalability issue and the out-of-sample problem in SSC, we propose an inductive clustering algorithm called inductive Sparse Subspace Clustering (iSSC). Our motivation derives from a widely accepted assumption in manifold learning that high-dimensional data actually lie on a low-dimensional manifold.
Therefore, we can obtain the cluster membership of out-of-sample data by assigning them to the nearest cluster in the embedding space learned from well-sampled in-sample data. In other words, we resolve the out-of-sample problem in SSC with a subspace learning method. On the other hand, for a large-scale data set, we randomly split it into two parts, in-sample data and out-of-sample data, so that the scalability issue can be addressed as an out-of-sample problem. Except in some specified cases, lower-case bold letters represent column vectors and upper-case bold ones represent matrices. A^T denotes the transpose of the matrix A, A^{−1} denotes its (pseudo-)inverse, and I is reserved for the identity matrix. Inductive Sparse Subspace Clustering Algorithm: The basic idea of our approach is as follows. Suppose two data sets Y ∈ R^{m×p} (in-sample data) and X ∈ R^{m×n} (out-of-sample data) are drawn from multiple underlying manifolds, each of which corresponds to a subspace. Provided Y is sufficient, such that the manifolds are well sampled, we expect to learn an embedding space from Y and group X in that space, since it is more compact and discriminative than the original space (see Fig. 1). We make SSC feasible for clustering out-of-sample data in a "subspace clustering, subspace learning and extension" manner. The first two steps are offline processes which only involve in-sample data, and the last one groups the out-of-sample data in an online way. To obtain the cluster membership of the in-sample data Y, iSSC first constructs a similarity graph by minimizing the following objective

arXiv:1303.0362v1 [cs.LG] 2 Mar 2013

where c_i is the sparse representation of the data point y_i over the dictionary Y_i = [y_1 … y_{i−1} 0 y_{i+1} … y_p], and δ ≥ 0 is the error tolerance. After obtaining the coefficients of Y, iSSC performs normalized spectral clustering over the sparse coefficients to get the cluster membership of Y, as SSC does. However, SSC cannot efficiently cope with out-of-sample data. Motivated by the assumption in manifold learning, we aim to group out-of-sample data in the embedding space. In this letter, following the embedding procedure of the Neighborhood Preserving Embedding algorithm (NPE) [3], we perform subspace learning to compute the projection matrix W via
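The letter solves (1) with the Homotopy optimizer [2]. As an illustration only, the per-point sparse coding step can be sketched with a plain ISTA (proximal-gradient) solver for a Lagrangian surrogate of (1); the function name `sparse_codes` and the parameters `lam` and `n_iter` are our own, not from the paper:

```python
import numpy as np

def sparse_codes(Y, lam=0.01, n_iter=500):
    """ISTA sketch of (1): for each column y_i, solve the lasso surrogate
    min_c lam*||c||_1 + 0.5*||y_i - Y_i c||_2^2, where the i-th dictionary
    atom is zeroed out so that y_i cannot represent itself."""
    m, p = Y.shape
    C = np.zeros((p, p))
    step = 1.0 / np.linalg.norm(Y, 2) ** 2    # 1/L with L an upper Lipschitz bound
    for i in range(p):
        Yi = Y.copy()
        Yi[:, i] = 0.0                        # exclude y_i from its own dictionary
        c = np.zeros(p)
        for _ in range(n_iter):
            grad = Yi.T @ (Yi @ c - Y[:, i])  # gradient of the smooth term
            c = c - step * grad
            c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft-threshold
        C[:, i] = c
    return C
```

Stacking the solutions column-wise yields the coefficient matrix C used in the subsequent steps; the diagonal of C is zero by construction.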

min_W ||W^T Y − W^T Y C||_2^2   s.t.   W^T Y Y^T W = I,    (2)

where C ∈ R^{p×p} is the collection of the sparse representations of Y produced by (1), and the constraint term aims at scale invariance. The solution of (2) is given by the eigenvectors associated with the largest eigenvalues of the following generalized eigenvalue problem:

W^T Y (C + C^T − C^T C) Y^T W = λ W^T Y Y^T W.    (3)
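In the NPE formulation that (2)-(3) follow, the columns of W are generalized eigenvectors of Y M Y^T w = λ Y Y^T w with M = C + C^T − C^T C. A minimal sketch (the function name and the small ridge term that keeps the right-hand matrix positive definite are our own additions, not the authors'):

```python
import numpy as np
from scipy.linalg import eigh

def projection_matrix(Y, C, d, reg=1e-8):
    """Sketch of (2)-(3): solve the generalized eigenproblem
    Y M Y^T w = lambda Y Y^T w, M = C + C^T - C^T C, and keep the
    d eigenvectors with the largest eigenvalues as columns of W."""
    M = C + C.T - C.T @ C                     # symmetric by construction
    A = Y @ M @ Y.T
    B = Y @ Y.T + reg * np.eye(Y.shape[0])    # small ridge for numerical stability
    vals, vecs = eigh(A, B)                   # ascending generalized eigenvalues
    return vecs[:, -d:][:, ::-1]              # top-d eigenvectors, descending order
```

`scipy.linalg.eigh` returns B-orthonormal eigenvectors, so W^T (Y Y^T) W ≈ I, matching the constraint in (2).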

Once the optimal W is obtained, iSSC transforms the out-of-sample data X into the embedding space via W^T X, and then assigns X to the nearest cluster in that space. The steps of iSSC can be summarized as follows:
1. For the in-sample data Y, calculate the sparse representation coefficients C by solving

min_{c_i} ||c_i||_1   s.t.   ||y_i − Y_i c_i||_2 < δ.
2. Construct the Laplacian matrix L = S^{−1/2} A S^{−1/2} from the affinity matrix A = |C| + |C|^T, where S = diag{s_i} with s_i = Σ_{j=1}^{p} a_ij, a_ij being an entry of A.
3. Obtain the eigenvector matrix V ∈ R^{p×k}, which consists of the first k normalized eigenvectors of L corresponding to its k smallest eigenvalues.
4. Get the segmentation of Y by performing the k-means clustering algorithm on the rows of V.
5. Suppose the desired dimensionality of the embedding space is d; the projection matrix W ∈ R^{m×d} is given by the eigenvectors with the d largest eigenvalues of the following eigenvalue problem:

M Z^T = λ Z^T,

where M = C + C^T − C^T C and Z = W^T Y.
6. Project the out-of-sample data X into the d-dimensional space via W^T X.
7. Search for the nearest neighbor of each point of X among Y in the embedding space, and assign it to the cluster that the neighbor belongs to.
Computational Complexity Analysis: Suppose the in-sample data Y ∈ R^{m×p} are drawn from k subspaces. We need O(t1 m p^3 + t2 p k^2) to perform SSC over Y, where t1 and t2 are the numbers of iterations of the Homotopy optimizer [2] and of the k-means clustering algorithm, respectively. Moreover, we need O(p^3) to compute the projection matrix W. To group the out-of-sample data X ∈ R^{m×n}, we need O(dmn) to obtain their d-dimensional representations and O(dpn) to search for the nearest neighbors of X among Y in the embedding space.
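Steps 2-7 above can be sketched as follows; this is an illustrative reimplementation under our own naming, not the authors' MATLAB code. Note that the leading eigenvectors of L = S^{−1/2} A S^{−1/2} used here are exactly those for the k smallest eigenvalues of the normalized Laplacian I − S^{−1/2} A S^{−1/2}:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def issc_cluster(C, Y, X, W, k):
    """Steps 2-7: spectral clustering of in-sample data from the sparse
    coefficients C, then nearest-neighbour assignment of out-of-sample
    data in the embedding space defined by W."""
    # Steps 2-3: normalized affinity L = S^{-1/2} A S^{-1/2}, A = |C| + |C|^T
    A = np.abs(C) + np.abs(C).T
    s = A.sum(axis=1)
    S_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(s, 1e-12)))
    L = S_inv_sqrt @ A @ S_inv_sqrt
    _, vecs = eigh(L)                            # ascending eigenvalues
    V = vecs[:, -k:]                             # k leading eigenvectors of L
    V = V / np.maximum(np.linalg.norm(V, axis=1, keepdims=True), 1e-12)
    # Step 4: k-means on the rows of V gives the in-sample labels
    in_labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(V)
    # Steps 6-7: project both sets and assign each out-of-sample point to
    # the cluster of its nearest in-sample neighbour in the embedding space
    Yw, Xw = W.T @ Y, W.T @ X
    dists = np.linalg.norm(Xw[:, :, None] - Yw[:, None, :], axis=0)  # n x p
    return in_labels, in_labels[dists.argmin(axis=1)]
```

Data are stored column-wise (one point per column), matching the letter's notation.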

ELECTRONICS LETTERS   31st January 2013   Vol. 00   No. 00

Putting everything together, the computational complexity of iSSC is O(t1 m p^3 + t2 p k^2 + dpn), owing to d < m and m < p, where m < p derives from the conditions of compressive sensing theory. Clearly, under the same conditions, iSSC is more efficient than SSC, whose time complexity is about O(t1 m n^3 + t2 n k^2).
Baselines and Evaluation Metrics: We present experimental results of our approach over three real-world data sets, i.e., Extended Yale Database B (ExYaleB) [4], Pendigit¹ and USPS². ExYaleB contains 2414 facial images of 38 subjects. We cropped the images from 192 × 168 to 48 × 42 and extracted 114 features by using PCA to retain 98% of the energy of the cropped data. Pendigit and USPS are two handwritten digit data sets distributed over 10 classes, where Pendigit contains 10992 samples with 54 features and USPS consists of 11000 samples with 256 dimensions. We compared iSSC with three state-of-the-art inductive clustering algorithms, i.e., Nyström-based spectral clustering [5], Spectral Embedding Clustering (SEC) [6] and Approximate Kernel K-means (AKK) [7]. Note that Nyström-based spectral clustering and SEC each have two variants, which we denote as Nyström, Nyström_Orth, SEC_K and SEC_R. The approximate affinity matrix of Nyström is non-orthogonal, while that of Nyström_Orth is column-orthogonal; SEC_K performs k-means to get the clustering results, and SEC_R adopts the spectral rotation method to obtain the final cluster assignment matrix. Furthermore, we also report the results of k-means clustering as a baseline. The MATLAB code of iSSC can be downloaded at https://www.dropbox.com/s/ju6y9qe7w8lcdyp/CodeAndData.zip.
We adopted two widely used metrics, Accuracy and Normalized Mutual Information (NMI), to measure the clustering quality of the tested methods. A value of 1 for Accuracy or NMI indicates that the predicted clustering membership totally matches the ground truth, whereas 0 indicates a total mismatch.
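These two metrics can be computed as follows; `clustering_accuracy` is a hypothetical helper implementing the usual best-label-matching (Hungarian) definition of clustering Accuracy, and NMI is taken from scikit-learn:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score  # NMI metric

def clustering_accuracy(true, pred):
    """Accuracy under the best one-to-one matching between predicted
    cluster labels and ground-truth labels (Hungarian method)."""
    true, pred = np.asarray(true), np.asarray(pred)
    k = max(true.max(), pred.max()) + 1
    count = np.zeros((k, k), dtype=int)       # count[t, p]: co-occurrences
    for t, p in zip(true, pred):
        count[t, p] += 1
    row, col = linear_sum_assignment(-count)  # negate to maximize matches
    return count[row, col].sum() / true.size
```

Both metrics are invariant to permutations of the cluster labels, which is why a direct label comparison cannot be used.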
In all experiments, the parameters of each algorithm were tuned to achieve its best Accuracy. Specifically, iSSC adopted the Homotopy optimizer [2] to solve the ℓ1-minimization problem. The optimizer needs two user-specified parameters, the sparsity parameter λ and the error tolerance parameter δ. We searched for a good combination over λ ∈ {10^{−7}, 10^{−6}, 10^{−5}} and δ ∈ {10^{−3}, 10^{−2}, 10^{−1}}. Moreover, iSSC groups out-of-sample data in a low-dimensional space which preserves 98% of the energy of the embedding space learned from the in-sample data. For the other competing methods, we set the value ranges for the different parameters by following the configurations in [5, 6, 7].
Results: To examine the effectiveness of the algorithms, we randomly selected half of the images (1212) from ExYaleB as in-sample data and used the remaining samples as out-of-sample data. In a similar way, we formed two data sets by choosing 1000 samples from Pendigit and from USPS as in-sample data and using the rest as out-of-sample data, respectively.
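The random in-sample/out-of-sample split described above can be sketched as follows (a hypothetical helper, with data stored column-wise as in the letter):

```python
import numpy as np

def random_split(D, n_in, seed=0):
    """Randomly pick n_in columns of the data matrix D as in-sample data
    and use the remaining columns as out-of-sample data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(D.shape[1])
    return D[:, idx[:n_in]], D[:, idx[n_in:]], idx[:n_in], idx[n_in:]
```

Returning the index sets as well makes it easy to recover the ground-truth labels of both partitions for evaluation.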

Table 3: Performance comparison of different algorithms on USPS.

Algorithms           Accuracy  NMI     Time(s)
iSSC (1e-7, 0.01)    52.93%    52.90%  41.52
Nyström (14)         47.66%    44.42%  15.91
Nyström_Orth (0.5)   50.70%    44.60%  183.37
SEC_K (1e-9, 3, 1)   47.63%    42.28%  43.38
SEC_R (1e-6, 4, 1)   11.70%    1.44%   19.78
AKK (0.3)            48.49%    46.79%  16.81
k-means              46.54%    45.61%  250.82

• In all the tests, iSSC demonstrates an elegant balance between running time and clustering quality. Although iSSC is not the fastest algorithm, it outperforms the other tested methods by considerable margins in Accuracy and NMI. For example, iSSC achieved a 33.97% gain in Accuracy and a 16.20% gain in NMI over the second best algorithm on the ExYaleB database.
• The accelerated kernel-based method (AKK) is more competitive when applied to clustering handwritten digit data than facial data. Moreover, AKK performed very close to the k-means algorithm, which is consistent with the results in [7].
Conclusion: In this letter, we have presented an inductive spectral clustering algorithm, called inductive Sparse Subspace Clustering (iSSC). The algorithm, which is an out-of-sample extension of the Sparse Subspace Clustering algorithm (SSC) [1], scales linearly with the problem size, so it can be applied to fast online learning. Experimental results on facial image and digit image clustering indicate the effectiveness of iSSC compared with state-of-the-art approaches.
Acknowledgment: This work has been supported by ...
References

1 Elhamifar, E., and Vidal, R.: 'Sparse Subspace Clustering: Algorithm, Theory, and Applications', to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
2 Yang, A., Ganesh, A., Sastry, S., and Ma, Y.: 'Fast L1-Minimization Algorithms and an Application in Robust Face Recognition: A Review' (EECS Department, University of California, Berkeley, 2010).
3 He, X., Cai, D., Yan, S., and Zhang, H.: 'Neighborhood Preserving Embedding', IEEE International Conference on Computer Vision (IEEE, 2005).
4 Georghiades, A.S., Belhumeur, P.N., and Kriegman, D.J.: 'From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose', IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23, (6), pp. 643-660.
5 Chen, W.-Y., Song, Y., Bai, H., Lin, C.-J., and Chang, E.Y.: 'Parallel Spectral Clustering in Distributed Systems', IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33, (3), pp. 568-586.
6 Nie, F., Zeng, Z., Tsang, I.W., Xu, D., and Zhang, C.: 'Spectral Embedded Clustering: A Framework for In-Sample and Out-of-Sample Spectral Clustering', IEEE Transactions on Neural Networks, 2011, 22, (11), pp. 1796-1808.
7 Chitta, R., Jin, R., Havens, T.C., and Jain, A.K.: 'Approximate Kernel K-Means: Solution to Large Scale Kernel Clustering', ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2011).

Table 1: Performance comparison of different algorithms on ExYaleB.

Algorithms           Accuracy  NMI     Time(s)
iSSC (1e-6, 1e-3)    59.69%    62.77%  24.88
Nyström (12)         25.72%    46.57%  9.33
Nyström_Orth (2)     21.71%    41.74%  58.87
SEC_K (1e+12, 5, 1)  11.02%    11.09%  34.91
SEC_R (1e+9, 4, 1)   5.97%     4.31%   19.96
AKK (0.4)            8.00%     9.01%   9.94
k-means              9.03%     11.20%  37.05

Table 2: Performance comparison of different algorithms on Pendigit.

Algorithms           Accuracy  NMI     Time(s)
iSSC (1e-6, 0.1)     84.94%    71.17%  28.03
Nyström (0.8)        76.09%    68.10%  8.54
Nyström_Orth (6)     75.72%    67.04%  39.24
SEC_K (1e-6, 4, 1)   76.91%    63.84%  24.51
SEC_R (1e-9, 1, 1)   11.01%    1.18%   18.14
AKK (0.9)            77.22%    69.48%  9.80
k-means              77.05%    69.21%  30.21

Tables 1-3 report the clustering quality and the time costs of the tested algorithms over the data sets. In parentheses, we also show the tuned parameters with which the best Accuracy was achieved. From the results, we have the following observations:

¹ http://archive.ics.uci.edu/ml/datasets.html
² http://www.cs.nyu.edu/~roweis/data.html
