Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings

Kan Xu,u00a0Xuanyi Zhao,u00a0Hamsa Bastani,u00a0Osbert Bastani

Sparse regression has recently been applied to enable transfer learning from very limited data. We study an extension of this approach to unsupervised learningu2014in particular, learning word embeddings from unstructured text corpora using low-rank matrix factorization. Intuitively, when transferring word embeddings to a new domain, we expect that the embeddings change for only a small number of wordsu2014e.g., the ones with novel meanings in that domain. We propose a novel group-sparse penalty that exploits this sparsity to perform transfer learning when there is very little text data available in the target domainu2014e.g., a single article of text. We prove generalization bounds for our algorithm. Furthermore, we empirically evaluate its effectiveness, both in terms of prediction accuracy in downstream tasks as well as in terms of interpretability of the results.