Iterative Scaling and Coordinate Descent Methods for Maximum Entropy Models

نویسندگان

  • Fang-Lan Huang
  • Cho-Jui Hsieh
  • Kai-Wei Chang
  • Chih-Jen Lin
چکیده

Maximum entropy (Maxent) is useful in natural language processing and many other areas. Iterative scaling (IS) methods are one of the most popular approaches to solve Maxent. With many variants of IS methods, it is difficult to understand them and see the differences. In this paper, we create a general and unified framework for iterative scaling methods. This framework also connects iterative scaling and coordinate descent methods. We prove general convergence results for IS methods and analyze their computational complexity. Based on the proposed framework, we extend a coordinate descent method for linear SVM to Maxent. Results show that it is faster than existing iterative scaling methods.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Iterative Scaling and Coordinate Descent Methods for Maximum Entropy

Maximum entropy (Maxent) is useful in many areas. Iterative scaling (IS) methods are one of the most popular approaches to solve Maxent. With many variants of IS methods, it is difficult to understand them and see the differences. In this paper, we create a general and unified framework for IS methods. This framework also connects IS and coordinate descent (CD) methods. Besides, we develop a CD...

متن کامل

Notes on CG and LM-BFGS Optimization of Logistic Regression

It has been recognized that the typical iterative scaling methods [?, ?] used to train logistic regression classification models (maximum entropy models) are quite slow. Goodman has suggested the use of a component-wise optimization of GIS [?], which he has measured to be faster on many tasks. However, in general, the iterative scaling methods pale in comparison to conjugate gradient ascent (fo...

متن کامل

Convex relaxation methods for graphical models: Lagrangian and maximum entropy approaches

Graphical models provide compact representations of complex probability distributions of many random variables through a collection of potential functions defined on small subsets of these variables. This representation is defined with respect to a graph in which nodes represent random variables and edges represent the interactions among those random variables. Graphical models provide a powerf...

متن کامل

A Comparison of Algorithms for Maximum Entropy Parameter Estimation

Conditional maximum entropy (ME) models provide a general purpose machine learning technique which has been successfully applied to fields as diverse as computer vision and econometrics, and which is used for a wide variety of classification problems in natural language processing. However, the flexibility of ME models is not without cost. While parameter estimation for ME models is conceptuall...

متن کامل

Avoiding communication in primal and dual block coordinate descent methods

Primal and dual block coordinate descent methods are iterative methods for solving regularized and unregularized optimization problems. Distributed-memory parallel implementations of these methods have become popular in analyzing large machine learning datasets. However, existing implementations communicate at every iteration which, on modern data center and supercomputing architectures, often ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of Machine Learning Research

دوره 11  شماره 

صفحات  -

تاریخ انتشار 2010