Profile specific Document Weighted approach using a New Term Weighting Measure for Author Profiling
Authors: T. Raghunadha Reddy, B. Vishnu Vardhan, P. Vijayapal Reddy
Author profiling is a text classification technique that predicts demographic features of authors, such as age, gender, native language, location, and educational background, by analyzing their writing styles. Term weight measures identify discriminating terms for document classification by assigning suitable weights to the terms. In this work, a new supervised term weight measure is proposed to quantify the significance of each term in a document. The proposed measure is compared with four benchmark term weight measures: TF, TFIDF, tf * rf, and iqf * qf * icf. The experimental results show that the proposed measure achieves the best performance among all the term weight measures considered. Existing models fail to capture the relationship between terms and documents. To overcome the problem of independence among the terms within a document, this work proposes a new model that uses second order representations between documents and profiles. In the second order representation, the relationship among the terms within a document is established first, and then the relationship between the documents and the profiles is derived. The performance of the proposed model is compared with that of the existing model using various classifiers on a reviews corpus. The results show that the proposed approach with the new term weight measure outperforms the existing model in predicting the gender, age, and location of the authors.
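The two-step second order representation described above can be illustrated with a minimal Python sketch. It rests on several assumptions: the abstract does not define the proposed term weight measure, so plain TF-IDF is used as a stand-in, and the aggregation scheme (summing term weights per profile, then combining them per document) is only one plausible reading of the description, not the authors' exact formulation. The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Plain TF-IDF is a stand-in here; the paper's own measure is not
    specified in the abstract.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def term_profile_vectors(weights, labels, profiles):
    """Step 1: relate each term to each profile.

    A term's vector holds its summed weight over the documents of each
    profile, normalized so the entries add up to 1.
    """
    acc = defaultdict(lambda: [0.0] * len(profiles))
    for doc_weights, label in zip(weights, labels):
        j = profiles.index(label)
        for term, value in doc_weights.items():
            acc[term][j] += value
    vectors = {}
    for term, vec in acc.items():
        total = sum(vec)
        vectors[term] = [v / total if total else 0.0 for v in vec]
    return vectors


def document_profile_vector(doc_weights, term_vecs, n_profiles):
    """Step 2: relate a document to each profile.

    The document vector is the weighted sum of the term-profile vectors
    of its terms, normalized to sum to 1. Unseen terms are ignored.
    """
    vec = [0.0] * n_profiles
    for term, w in doc_weights.items():
        for j, v in enumerate(term_vecs.get(term, [0.0] * n_profiles)):
            vec[j] += w * v
    total = sum(vec)
    return [v / total if total else 0.0 for v in vec]


# Hypothetical toy corpus: tokenized reviews labeled with a gender profile.
docs = [
    ["love", "shopping", "dress"],
    ["football", "beer", "match"],
    ["dress", "shoes", "love"],
    ["match", "goal", "football"],
]
labels = ["female", "male", "female", "male"]
profiles = ["female", "male"]

weights = tfidf(docs)
term_vecs = term_profile_vectors(weights, labels, profiles)
doc_vecs = [document_profile_vector(w, term_vecs, len(profiles)) for w in weights]
```

Each document ends up as a short vector with one entry per profile, which can then be passed to any standard classifier. Swapping `tfidf` for a supervised weighting function would leave the remaining steps unchanged.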