Federated PCA with Adaptive Rank Estimation
In many online machine learning and data science tasks such as data summarisation and feature compression, $d$-dimensional vectors are usually distributed across a large number of clients in a decentralised network and collected in a streaming fashion. This is increasingly common in modern applications due to the sheer volume of data generated and the clients' constrained resources. In this setting, some clients are required to compute an update to a centralised target model independently using local data while other clients aggregate these updates with a low-complexity merging algorithm. However, some clients with limited storage might not be able to store all of the data samples if $d$ is large, nor compute procedures requiring at least $\Omega(d^2)$ storage-complexity such as Principal Component Analysis, Subspace Tracking, or general Feature Correlation. In this work, we present a novel federated algorithm for PCA that is able to adaptively estimate the rank $r$ of the dataset and compute its $r$ leading principal components when only $O(dr)$ memory is available. This inherent adaptability implies that $r$ does not have to be supplied as a fixed hyper-parameter which is beneficial when the underlying data distribution is not known in advance, such as in a streaming setting. Numerical simulations show that, while using limited-memory, our algorithm exhibits state-of-the-art performance that closely matches or outperforms traditional non-federated algorithms, and in the absence of communication latency, it exhibits attractive horizontal scalability.
NurtureToken New!

Token crowdsale for this paper ends in

Buy Nurture Tokens

Authors

Are you an author of this paper? Check the Twitter handle we have for you is correct.

Andreas Grammenos (add twitter)
Rodrigo Mendoza-Smith (add twitter)
Cecilia Mascolo (edit)
Jon Crowcroft (add twitter)
Ask The Authors

Ask the authors of this paper a question or leave a comment.

Read it. Rate it.
#1. Which part of the paper did you read?

#2. The paper contains new data or analyses that is openly accessible?
#3. The conclusion is supported by the data and analyses?
#4. The conclusion is of scientific interest?
#5. The result is likely to lead to future research?

Github
User:
None (add)
Repo:
None (add)
Stargazers:
0
Forks:
0
Open Issues:
0
Network:
0
Subscribers:
0
Language:
None
Youtube
Link:
None (add)
Views:
0
Likes:
0
Dislikes:
0
Favorites:
0
Comments:
0
Other
Sample Sizes (N=):
Inserted:
Words Total:
Words Unique:
Source:
Abstract:
None
07/18/19 06:02PM
10,632
2,813
Tweets
tforcworc: student&colleague's cunning work https://t.co/W7Ibkz1Rov
JoshARhoads: RT @arxiv_org: Federated PCA with Adaptive Rank Estimation. https://t.co/Cg0AYEe02T https://t.co/taLOxQ3lhw
arxiv_org: Federated PCA with Adaptive Rank Estimation. https://t.co/Cg0AYEe02T https://t.co/taLOxQ3lhw
StatsPapers: Federated PCA with Adaptive Rank Estimation. https://t.co/888CSGm4O8
arxivml: "Federated PCA with Adaptive Rank Estimation", Andreas Grammenos, Rodrigo Mendoza-Smith, Cecilia Mascolo, Jon Crowc… https://t.co/2aKpF83Ogf
arxiv_cs_LG: Federated PCA with Adaptive Rank Estimation. Andreas Grammenos, Rodrigo Mendoza-Smith, Cecilia Mascolo, and Jon Crowcroft https://t.co/djUsb9gAtC
Images
Related