Mohtasseb Billah, Haytham and Ahmed, Amr (2012) Two-layered Blogger identification model integrating profile and instance-based methods. Knowledge and Information Systems, 31 (1). pp. 1-21. ISSN 0219-1377
Full content URL: http://dx.doi.org/10.1007/s10115-011-0398-0
Full text not available from this repository.
This paper introduces a two-layered framework that improves authorship identification results, relative to earlier work, for larger numbers of bloggers. Previous studies fall mainly into two categories, profile-based and instance-based methods, each with its own advantages and limitations. The two-layered framework presented here integrates both approaches and offers a new solution to a key problem in authorship identification, namely the drop in accuracy as the number of authors increases. The paper begins by describing the regular instance-based core model and the features investigated. It then introduces a new psycholinguistic profile representation of authors, presents the extraction of similarity groups over these profiles, and applies blogger identification using the two-layered approach. The results confirm that the proposed two-layered approach improves on both our regular classifier and a selected baseline for an extended number of users.
|Keywords:||Blog mining, Authorship identification, User representation, Group extraction, Profile modeling, Online Diaries mining|
|Subjects:||G Mathematical and Computer Sciences > G710 Speech and Natural Language Processing; G Mathematical and Computer Sciences > G400 Computer Science; G Mathematical and Computer Sciences > G720 Knowledge Representation|
|Divisions:||College of Science > School of Computer Science|
|Deposited On:||30 Jan 2012 15:51|