arxiv:1110.2058

Convergence Rates for Mixture-of-Experts

Published on Oct 10, 2011

Abstract

In the mixture-of-experts (ME) model, where a number of submodels (experts) are combined, two longstanding problems remain: (i) how many experts should be chosen, given the size of the training data? (ii) given a total number of parameters, is it better to use a few very complex experts or to combine many simple experts? In this paper, we provide some insights into these problems through a theoretical study of an ME structure in which m experts are mixed, each expert being a polynomial regression model of order k. We study the convergence rate of the maximum likelihood estimator (MLE), in terms of how fast the Kullback-Leibler divergence of the estimated density from the true density decreases as the sample size n increases. The convergence rate is found to depend on both m and k, and certain choices of m and k are found to produce optimal convergence rates. These results therefore shed light on the two aforementioned problems: how to choose m, and how m and k should be traded off to achieve good convergence rates.
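
To make the setup concrete, here is a minimal sketch of the kind of ME structure the abstract describes: m experts, each a polynomial regression of order k in a scalar input, combined by a softmax gate and fit by maximum likelihood. This is an illustrative implementation, not the authors' code; the Gaussian noise model, the linear gate, and all names (poly_features, fit_me, and so on) are assumptions made for this sketch.

```python
# Illustrative sketch (assumed setup, not the paper's code): a mixture of m
# polynomial-regression experts of order k with a softmax gate, estimated by
# maximizing the likelihood numerically.
import numpy as np
from scipy.optimize import minimize

def poly_features(x, k):
    """Design matrix [1, x, x^2, ..., x^k] for a 1-D input."""
    return np.vander(x, N=k + 1, increasing=True)

def neg_log_likelihood(theta, x, y, m, k):
    n, d = x.shape[0], k + 1
    # Unpack: gate weights (m x 2), expert coefficients (m x d), log-noise scales (m,)
    gate_w = theta[:2 * m].reshape(m, 2)
    beta = theta[2 * m:2 * m + m * d].reshape(m, d)
    log_sigma = theta[2 * m + m * d:]
    X_gate = np.column_stack([np.ones(n), x])          # linear gate in x (assumption)
    logits = X_gate @ gate_w.T                         # (n, m)
    log_gate = logits - np.logaddexp.reduce(logits, axis=1, keepdims=True)
    mu = poly_features(x, k) @ beta.T                  # (n, m) expert means
    sigma = np.exp(log_sigma)                          # (m,)
    log_norm = (-0.5 * ((y[:, None] - mu) / sigma) ** 2
                - np.log(sigma) - 0.5 * np.log(2 * np.pi))
    # log p(y_i | x_i) = logsumexp_j [ log g_j(x_i) + log N(y_i; mu_ij, sigma_j^2) ]
    return -np.sum(np.logaddexp.reduce(log_gate + log_norm, axis=1))

def fit_me(x, y, m, k, seed=0):
    """Fit the ME model by direct minimization of the negative log-likelihood."""
    rng = np.random.default_rng(seed)
    theta0 = 0.1 * rng.standard_normal(2 * m + m * (k + 1) + m)
    return minimize(neg_log_likelihood, theta0, args=(x, y, m, k), method="L-BFGS-B")

if __name__ == "__main__":
    # Toy data with an unknown true regression function.
    rng = np.random.default_rng(1)
    n = 500
    x = rng.uniform(-2, 2, n)
    y = np.sin(2 * x) + 0.2 * rng.standard_normal(n)
    # Compare a few (m, k) choices at fixed n; the per-sample negative
    # log-likelihood is an empirical proxy for the KL divergence up to a constant.
    for m, k in [(2, 1), (4, 1), (2, 3)]:
        res = fit_me(x, y, m, k)
        print(f"m={m}, k={k}: NLL/n = {res.fun / n:.3f}")
```

The loop at the end mirrors the question posed in the abstract: at a fixed sample size n, one can vary m (number of experts) and k (polynomial order) and observe how the fitted density's quality changes, which is the trade-off the paper analyzes theoretically.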
