Finding the smallest binarization of a CFG is NP-hard

Author

  • Carlos Gómez-Rodríguez
Abstract

Grammar binarization is the process and result of transforming a grammar to an equivalent form whose rules contain at most two symbols in their right-hand side. Binarization is used, explicitly or implicitly, by a wide range of parsers for context-free grammars and other grammatical formalisms. Non-trivial grammars can be binarized in multiple ways, but in order to optimize the parser's computational cost, it is convenient to choose a binarization that is as small as possible. While several authors have explored heuristics to obtain compact binarizations, none of them guarantee that the resulting grammar has minimum size. However, to our knowledge, no hardness results for this problem have been published. In this article, we address this issue and prove that the problem of finding a minimum binarization of a given context-free grammar is NP-hard, by reduction from vertex cover. We also provide a lower bound on the approximability of this problem.
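To illustrate why the choice of binarization affects grammar size, here is a minimal Python sketch. It is not the construction from the article: the toy grammar, the `binarize` helper, the fresh nonterminal names `X1`, `X2`, ..., and the use of rule count as the size measure are all assumptions made for illustration. The helper folds adjacent symbol pairs either from the left or from the right, reusing one new nonterminal for each distinct pair.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, Tuple[str, ...]]  # (left-hand side, right-hand side symbols)


def binarize(rules: List[Rule], fold_left: bool) -> List[Rule]:
    """Binarize every rule by repeatedly folding two adjacent symbols.

    Hypothetical helper: fresh nonterminals are named X1, X2, ... and are
    assumed not to clash with symbols already in the grammar. Identical
    pairs share one fresh nonterminal, which is where savings come from.
    """
    pair_to_nt: Dict[Tuple[str, str], str] = {}
    out: List[Rule] = []

    def fresh(pair: Tuple[str, str]) -> str:
        # Create the rule X -> pair only the first time this pair is seen.
        if pair not in pair_to_nt:
            pair_to_nt[pair] = f"X{len(pair_to_nt) + 1}"
            out.append((pair_to_nt[pair], pair))
        return pair_to_nt[pair]

    for lhs, rhs in rules:
        symbols = list(rhs)
        while len(symbols) > 2:
            if fold_left:
                # Fold the two leftmost symbols into one nonterminal.
                symbols = [fresh((symbols[0], symbols[1]))] + symbols[2:]
            else:
                # Fold the two rightmost symbols into one nonterminal.
                symbols = symbols[:-2] + [fresh((symbols[-2], symbols[-1]))]
        out.append((lhs, tuple(symbols)))
    return out


# Toy grammar (an assumption): both rules end in the same pair "b c".
grammar: List[Rule] = [("A", ("a", "b", "c")), ("B", ("d", "b", "c"))]

left = binarize(grammar, fold_left=True)
right = binarize(grammar, fold_left=False)

for name, result in (("fold from the left", left), ("fold from the right", right)):
    print(f"{name}: {len(result)} rules")
    for lhs, rhs in result:
        print(f"  {lhs} -> {' '.join(rhs)}")
```

On this toy grammar, folding from the right lets both rules share the nonterminal for `b c`, so it yields one rule fewer than folding from the left. Neither fixed strategy is optimal in general; the hardness result in the abstract concerns picking the best combination of folds across all rules.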


Similar resources

A polynomial-time algorithm for a class of minimum concave cost flow problems

We study the minimum concave cost flow problem over a two-dimensional grid network (CFG), where one dimension represents time (1 ≤ t ≤ T) and the other dimension represents echelons (1 ≤ l ≤ L). The concave function over each arc is given by a value oracle. We give a polynomial-time algorithm for finding the optimal solution when the network has a fixed number of echelons and all sources lie a...


A genetic algorithm with intelligent chaotic mutation and heuristic multi-point crossover for solving the graph coloring problem

Graph coloring is a way of coloring the vertices of a graph such that no two adjacent vertices have the same color. The graph coloring problem (GCP) is about finding the smallest number of colors needed to color a given graph. The smallest number of colors needed to color a graph G is called its chromatic number. GCP is a well-known NP-hard problem and, therefore, heuristic algorithms are usually...


Solving the Traveling Salesman Problem by an Efficient Hybrid Metaheuristic Algorithm

The traveling salesman problem (TSP) is the problem of finding the shortest tour through all the nodes that a salesman has to visit. The TSP is probably the most famous and extensively studied problem in the field of combinatorial optimization. Because this problem is an NP-hard problem, practical large-scale instances cannot be solved by exact algorithms within acceptable computational times. ...


Just-in-time subgrammar extraction for HPSG

We define the basic problem of subgrammar extraction for head-driven phrase structure grammars (HPSG) in the following way: Given a large HPSG grammar G and a set of words W, find a small subgrammar of G that accepts the same set of sentences from W as G, and for each of them produces the same parse trees. The set of words W is obtained from a piece of text. Additionally, we assume that this opera...



Journal:
  • J. Comput. Syst. Sci.

Volume 80

Publication date: 2014