Mohtasseb Billah, Haytham and Ahmed, Amr (2012) Two-layered Blogger identification model integrating profile and instance-based methods. Knowledge and Information Systems, 31 (1). pp. 1-21. ISSN 0219-1377
Full text not available from this repository.

Abstract
This paper introduces a two-layered framework that improves the result of authorship identification within larger sample numbers of bloggers as compared with earlier work. Previous studies are mainly divided into two categories: profile-based and instance-based methods. Each of these approaches has its advantages and limitations. The two-layered framework presented here integrates the two previous approaches and presents a new solution to a key problem in authorship identification, namely the drop in accuracy experienced as the number of authors increases. The paper begins by illustrating the regular instance-based core model and the investigated features. It then introduces a new psycholinguistic profile representation of authors, presents similarity grouping extraction over profiles, and applies blogger identification utilizing the two-layered approach. The results confirm the improvement introduced by the proposed two-layered approach against our regular classifier, as well as a selected baseline, for an extended number of users.
| Item Type: | Article |
|---|---|
| Keywords: | Blog mining, Authorship identification, User representation, Group extraction, Profile modeling, Online Diaries mining |
| Subjects: | G Mathematical and Computer Sciences > G710 Speech and Natural Language Processing; G Mathematical and Computer Sciences > G400 Computer Science; G Mathematical and Computer Sciences > G720 Knowledge Representation |
| Divisions: | College of Sciences > Faculty of Science > Lincoln School of Computer Science |
| Depositing User: | Amr Ahmed |
| Date Deposited: | 30 Jan 2012 15:51 |
| Last Modified: | 09 Jan 2013 11:00 |
| URI: | http://eprints.lincoln.ac.uk/id/eprint/4890 |
