Unsupervised Author Identification and Characterization
Abstract
Author identification is an active research topic, especially in the Internet age. In previous work we proposed a novel approach to this problem based on relational representations that take into account the structure of sentences. Here we present a tool that computes and visualizes a numerical and graphical characterization of authors and texts based on several linguistic features. This tool, which extends a previous language analysis tool, complements the author identification technique, which relies on a clustering procedure whose outcomes (i.e., the authors’ models) are not human-readable. Both approaches are unsupervised, which allows them to tackle problems to which other state-of-the-art systems are not applicable.
Similar resources
A Relational Unsupervised Approach to Author Identification
In recent decades, speaking and writing habits have changed. Many works have addressed the author identification task by exploiting frequentist approaches, numeric techniques or writing style analysis. Following the last approach, we propose a technique for author identification based on First-Order Logic. Specifically, we translate the complex data represented by natural language text to complex (relat...
Large Deformation Characterization of Mouse Oocyte Cell Under Needle Injection Experiment
To better understand the mechanical properties of biological cells, their material behavior must be characterized and investigated. In this paper, a hyperelastic Neo-Hookean material model is used to characterize the mechanical properties of a mouse oocyte cell. For modeling, the cell is assumed to behave as a continuous, isotropic, nonlinear, and homogeneous material. Then, by...
Decision Support in Knowledge Acquisition: Concept Characterization Using Genetic Algorithms
We demonstrate the use of an unsupervised learning technique called genetic algorithms to discover the association between a concept and its key attributes in concept characterization. The resulting concept-attribute associations are important domain concepts that knowledge engineers can use to structure interviews with the experts or to prepare representative data for inductive inference. Examples based...
Classification and Identification of Geological Facies Using Seismic Data and Competitive Neural Networks
Geological facies interpretation is essential for reservoir studies. Classifying and identifying seismic traces is a powerful approach to geological facies classification and distinction. The use of neural networks as classifiers is increasing in different sciences, such as seismology. They are computationally efficient and well suited to pattern identification. They can simply learn new algori...
Vote/Veto Classification, Ensemble Clustering and Sequence Classification for Author Identification
The Author Identification task for PAN 2012 consisted of three different sub-tasks: traditional authorship attribution, authorship clustering and sexual predator identification. We developed three machine learning approaches for these tasks. For the two authorship-related tasks we created various sets of feature spaces, where individual differences in writing styles are assumed to surface in ju...