Extracting Principal Components from Pseudo-random Data by Using Random Matrix Theory
Abstract
In a stock market, numerous stock prices move with a high level of randomness and some regularity. Some stocks are strongly correlated with others, and a strong correlation among prominent stocks should produce a visible global pattern. However, the networks of such correlations are unstable and the patterns are only temporary. Under such conditions, a detailed description of the network may not be very useful, since the situation changes quickly and past knowledge is no longer valid in the new environment. If, however, we have a methodology to extract, in a very short time, the major components that characterize the motion of the market, it should give us a powerful tool to describe the temporal characteristics of the market and help us set up a time-varying model to predict its future moves. Recently, there has been wide interest in a possible candidate for such a methodology, which compares the eigenvalue spectrum of the equal-time correlation matrix between pairs of price time series of different stocks with that of the corresponding matrix computed from random time series [1-4]. Plerou et al. [1,2] applied this technique to the daily close prices of stocks in the NYSE and S&P500. We follow the same line of study used in Ref. [1] and apply it to the intra-day price correlations of American stocks to extract principal components. We spell out the process explicitly to set up our algorithm, RMT_PCM, for application to intra-day price correlations. Based on this approach, we show how to track trend changes using the results of a year-by-year analysis. Here we extract significant principal components by picking a few distinctly large eigenvalues of the cross-correlation matrix of stock pairs, in comparison with the known spectrum of the corresponding random matrix derived in random matrix theory (RMT). The criterion for separating signal from noise is the maximum value of the theoretical spectrum of the corresponding random matrix. We test the method using 1-hour data extracted from the NYSE-TAQ database of tick-wise stock prices, as well as daily close prices, and show that the results correctly reflect the actual trend of the market.
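The selection criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' RMT_PCM algorithm. It assumes that the "maximum value of the theoretical spectrum" is the upper edge of the Marchenko-Pastur distribution, lambda_plus = (1 + 1/sqrt(Q))^2 with Q = T/N for N series of length T, which the abstract does not state explicitly. The function names `mp_upper_edge` and `significant_components` and the synthetic data with one planted common factor are hypothetical and serve only as illustration.

```python
import numpy as np


def mp_upper_edge(n_series, n_obs):
    """Upper edge of the Marchenko-Pastur spectrum (assumed noise threshold).

    Holds for the correlation matrix of independent unit-variance series
    in the limit of large n_series and n_obs with Q = n_obs / n_series fixed.
    """
    q = n_obs / n_series
    return (1.0 + 1.0 / np.sqrt(q)) ** 2


def significant_components(returns):
    """Return eigenvalues above the noise edge, their eigenvectors, and the edge.

    returns: array of shape (T, N), one column per stock.
    """
    n_obs, n_series = returns.shape
    # Standardize each series so that z.T @ z / T is the correlation matrix.
    z = (returns - returns.mean(axis=0)) / returns.std(axis=0)
    corr = z.T @ z / n_obs
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    edge = mp_upper_edge(n_series, n_obs)
    keep = eigvals > edge
    order = np.argsort(eigvals[keep])[::-1]  # largest eigenvalue first
    return eigvals[keep][order], eigvecs[:, keep][:, order], edge


# Synthetic example: independent noise plus one common "market" factor.
rng = np.random.default_rng(0)
T, N = 2000, 100
noise = rng.standard_normal((T, N))
market = rng.standard_normal((T, 1))
returns = noise + 0.5 * market

vals, vecs, edge = significant_components(returns)
print("noise edge:", edge)
print("eigenvalues above the edge:", vals)
```

In this synthetic setting only the eigenvalue associated with the planted common factor should exceed the threshold. With real price data, the returns would first have to be computed from the price series, and the number of retained components depends on the data.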
Similar resources
APPLICATION OF THE RANDOM MATRIX THEORY ON THE CROSS-CORRELATION OF STOCK PRICES
The analysis of cross-correlations is extensively applied to understand interconnections in stock markets. A variety of methods are used to study stock cross-correlations, including Random Matrix Theory (RMT), Principal Component Analysis (PCA) and Hierarchical Structures. In this work, we analyze cross-correlations between price fluctuations of 20 company stocks...
DISCUSSION PAPERS IN STATISTICS AND ECONOMETRICS SEMINAR OF ECONOMIC AND SOCIAL STATISTICS UNIVERSITY OF COLOGNE No. 2/07 Tyler’s M-Estimator, Random Matrix Theory, and Generalized Elliptical Distributions with Applications to Finance
In recent publications standard methods of random matrix theory were applied to principal components analysis of high-dimensional financial data. We discuss the fundamental results and potential shortcomings of random matrix theory in the light of the stylized facts of empirical finance. Especially, our arguments are based on the impact of nonlinear dependencies such as tail dependence. After a...
Random matrix theory and estimation of high-dimensional covariance matrices
This project aims to present significant results of random matrix theory with regard to principal component analysis, including Wigner’s semicircular law and the Marčenko-Pastur law describing the limiting distribution of large dimensional random matrices. The work is based on large dimensional data assumptions, where both the number of variables and the sample size tend to infinity, while their rati...
Tyler's M-estimator, random matrix theory, and generalized elliptical distributions with applications to finance
In recent publications standard methods of random matrix theory were applied to principal components analysis of high-dimensional financial data. We discuss the fundamental results and potential shortcomings of random matrix theory in the light of the stylized facts of empirical finance. Especially, our arguments are based on the impact of nonlinear dependencies such as tail dependence. After a...
Extracting Information from Interval Data Using Symbolic Principal Component Analysis
We address the definition of symbolic variance and covariance for random interval-valued variables, and present four known symbolic principal component estimation methods using a common insightful framework. In addition, we provide a simple explicit formula for the scores of the symbolic principal components, equivalent to the representation by Maximum Covering Area Rectangle. Furthermore, the ...