Applying Multi-Dimensional Analysis to a Russian Webcorpus: Searching for Evidence of Genres
Abstract
The paper presents an application of Multidimensional (MD) analysis, initially developed for the analysis of register variation in English (Biber, 1988), to the investigation of a genre-diverse corpus built from modern texts of the Russian Web. The analysis rests on the idea that each linguistic feature has different frequencies in different registers, and that statistically stable co-occurrence of linguistic features across texts can be used for automatic identification of texts with similar communicative functions. Using a software tool that counts a set of linguistic features in Russian texts and performing factor analysis in R, we identified six dimensions of variation. These dimensions show significant similarities with Biber's original dimensions of variation. We studied the distribution of texts in the space defined by these dimensions and investigated their link to 17 externally defined Functional Text Dimensions (Forsyth and Sharoff, 2014), which were assigned to each text of the corpus by a group of annotators. The results show that dimensions of linguistic feature variation can be used for better understanding of the genre structure of the Russian Web.
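To illustrate how a text can be placed on a dimension of variation once factor loadings are available, here is a minimal Python sketch. It assumes the scoring convention usually associated with Biber (1988): feature counts are normalized per 1,000 tokens, standardized across texts, and then summed, with negatively loading features subtracted. The feature names, counts, and loading signs are invented for illustration and are not taken from the paper. The factor analysis step itself (done in R by the authors) is not reproduced, and the paper's own scoring procedure may differ.

```python
from statistics import mean, pstdev

def rates_per_1000(counts, n_tokens):
    """Normalize raw feature counts to a rate per 1,000 tokens."""
    return {f: 1000.0 * c / n_tokens for f, c in counts.items()}

def zscores(rows):
    """Standardize each feature across texts (mean 0, sd 1)."""
    out = [dict() for _ in rows]
    for f in rows[0].keys():
        vals = [r[f] for r in rows]
        mu, sd = mean(vals), pstdev(vals)
        for o, v in zip(out, vals):
            # A feature with no variation carries no information.
            o[f] = (v - mu) / sd if sd > 0 else 0.0
    return out

def dimension_score(z, positive, negative):
    """Sum z-scores of positively loading features, subtract negative ones."""
    return sum(z[f] for f in positive) - sum(z[f] for f in negative)

# Hypothetical toy corpus: (feature counts, text length in tokens).
texts = [
    ({"first_person": 30, "nominalization": 5}, 1000),
    ({"first_person": 10, "nominalization": 20}, 1000),
    ({"first_person": 5, "nominalization": 50}, 2000),
]

rates = [rates_per_1000(counts, n) for counts, n in texts]
z = zscores(rates)

# Hypothetical loading signs for one dimension (assumed, not from the paper).
scores = [dimension_score(row, ["first_person"], ["nominalization"]) for row in z]
```

In this toy setup, texts with many first-person forms and few nominalizations end up at the positive pole of the dimension, and texts with the opposite profile at the negative pole.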