Generalisation Enhancement via Input Space Transformation: A GP Approach

Authors

  • Ahmed Kattan
  • Michael Kampouridis
  • Alexandros Agapitos
Abstract

This paper proposes a new approach to improve the generalisation of standard regression techniques when there are hundreds or thousands of input variables. The input space X is composed of observational data of the form (xi, y(xi)), i = 1...n, where each xi denotes a k-dimensional input vector of design variables and y is the response. Genetic Programming (GP) is used to transform the original input space X into a new input space Z = (zi, y(zi)) that has a smaller input vector and is easier to map to its corresponding responses. GP is designed to evolve a function that receives the original input vector xi from the original input space as input and returns a new vector zi as output. Each element of the newly evolved vector zi is generated by an evolved mathematical formula that extracts statistical features from the original input space. To achieve this, we designed GP trees to produce multiple outputs. Empirical evaluation on 20 different problems revealed that the new approach is able to significantly reduce the dimensionality of the original input space and improve the performance of standard approximation models such as Kriging, Radial Basis Function Networks, Linear Regression, and GP (used as a regression technique). In addition, the results demonstrate that the new approach is better than standard dimensionality reduction techniques such as Principal Component Analysis (PCA). Moreover, the results show that the proposed approach is able to improve the performance of standard Linear Regression and make it competitive with other stochastic regression techniques.
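As a rough illustration of the idea in the abstract, the sketch below maps each k-dimensional input vector xi to a short vector zi of statistical features and then fits ordinary Linear Regression on the transformed space Z. Everything here is an assumption made for illustration: the `transform` function is a hand-written stand-in for a multi-output GP tree (no evolution is performed), the three features are arbitrary choices, and the data are synthetic. It is not the authors' implementation.

```python
import numpy as np

def transform(x):
    """Hypothetical stand-in for an evolved multi-output GP tree:
    maps a k-dimensional input vector x to a 3-element vector z."""
    return np.array([
        np.mean(x),              # z1: mean of the inputs
        np.std(x),               # z2: standard deviation of the inputs
        np.max(x) - np.min(x),   # z3: range of the inputs
    ])

def fit_linear(Z, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([Z, np.ones(len(Z))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, Z):
    """Apply the fitted linear model to transformed inputs."""
    A = np.column_stack([Z, np.ones(len(Z))])
    return A @ coef

# Synthetic data (assumption): n samples, k input variables each.
rng = np.random.default_rng(0)
n, k = 200, 500
X = rng.normal(size=(n, k))
# Synthetic response that depends only on summary statistics of x.
y = 2.0 * X.mean(axis=1) + 0.5 * X.std(axis=1)

# Transform the original input space X into the smaller space Z.
Z = np.array([transform(x) for x in X])   # shape (n, 3)

coef = fit_linear(Z, y)
mse = np.mean((predict_linear(coef, Z) - y) ** 2)
print("Z shape:", Z.shape, "training MSE:", mse)
```

In this toy setting the response is, by construction, a linear function of two of the extracted features, so a linear model on Z fits it closely even though the original space has 500 variables. In the approach described in the abstract, the formulas that produce each element of zi would be evolved by GP rather than chosen by hand.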

Similar resources

A Hybrid Meta-heuristic Approach to Cope with State Space Explosion in Model Checking Technique for Deadlock Freeness

Model checking is an automatic technique for software verification through which all reachable states are generated from an initial state to find errors and desirable patterns. In the model checking approach, the behavior and structure of the system should be modeled. A graph transformation system is a graphical formal modeling language to specify and model the system. However, modeling of large s...

On the Analysis of Simple Genetic Programming for Evolving Boolean Functions

This work presents a first step towards a systematic time and space complexity analysis of genetic programming (GP) for evolving functions with desired input/output behaviour. Two simple GP algorithms, called (1+1) GP and (1+1) GP*, equipped with minimal function (F) and terminal (L) sets are considered for evolving two standard classes of Boolean functions. It is rigorously proved that both al...

Improving the Generalisation Ability of Genetic Programming with Semantic Similarity based Crossover

This paper examines the impact of semantic control on the ability of Genetic Programming (GP) to generalise via a semantic based crossover operator (Semantic Similarity based Crossover SSC). The use of validation sets is also investigated for both standard crossover and SSC. All GP systems are tested on a number of real-valued symbolic regression problems. The experimental results show that whi...

Forecasting time series with Hyper-Volume Error Separation (HVES)

Time series prediction is a crucial task in many areas but the development of effective modeling and simulation methods to understand or predict the behavior of time dependent phenomena remains particularly difficult. In this paper we propose to use a Genetic Programming (GP) approach as a robust method for coping with problems in which finding a solution and its representation is difficult but...

A data-driven approach to speech enhancement using Gaussian process

This paper presents a novel data-driven approach to single channel speech enhancement employing Gaussian process (GP). Our approach is based on applying GP regression to estimate the residual gain with the input features being the a priori and a posteriori signal-to-noise ratios (SNRs). The residual gain is defined as the difference between the optimal gain and that obtained from the minimum me...

Publication date: 2014