01 - 07
Text Classification on Dataset of Marine and Fisheries Sciences Domain using Random Forest Classifier
Authors: Desi Ramayanti, Umniy Salamah
The number of text documents containing opinions and suggestions is increasing, and interpreting them one by one is challenging. If these documents are processed and properly interpreted, however, they can provide a general overview of a particular case, organization, or object. This research focuses on text classification in the marine and fisheries domain by analyzing Twitter data related to the Ministry of Marine Affairs and Fisheries, Republic of Indonesia. Using the random forest algorithm, this research classifies text documents as complaint or non-complaint based on existing social media data, in order to support follow-up on the complaints. Related work on the random forest algorithm, including Bosch, Zisserman, and Muñoz (2007); Schroff, Criminisi, and Zisserman (2008); Kuznetsova, Leal-Taixé, and Rosenhahn (2013); Shotton et al. (2013); and Joshi, Monnier, Betke, and Sclaroff (2017), was used as reference to complete this research. The phases of this research are data acquisition, data pre-processing, feature selection, classification, and classifier evaluation. As a result, we found that the best performance is achieved with the following parameter values: 'bootstrap': False, 'min_samples_leaf': 1, 'n_estimators': 10, 'min_samples_split': 3, 'criterion': 'entropy', 'max_features': 3, 'max_depth': None. The best score achieved in this experiment with those parameter values is 0.956063268893, and the computational time required to tune the parameters is 109.399434 seconds.
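The reported parameter names match those of scikit-learn's RandomForestClassifier, so the best configuration can be illustrated with a minimal sketch under that assumption; the abstract itself does not name the library. The toy tweets, the TF-IDF features, and the fixed random_state below are hypothetical stand-ins for illustration only, not the study's data, feature selection step, or settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy tweets; the study's Twitter dataset is not reproduced here.
texts = [
    "fishing permit still not processed after three months",
    "illegal trawlers keep damaging our nets, please act",
    "fuel subsidy for small boats never arrived",
    "port office staff ignored my report",
    "great seafood festival at the harbor today",
    "the ministry released new fish stock statistics",
    "congratulations to the coastal cleanup volunteers",
    "training on seaweed farming starts next week",
]
labels = ["complaint"] * 4 + ["non-complaint"] * 4

# Parameter values reported in the abstract as giving the best score.
best_params = {
    "bootstrap": False,
    "min_samples_leaf": 1,
    "n_estimators": 10,
    "min_samples_split": 3,
    "criterion": "entropy",
    "max_features": 3,
    "max_depth": None,
}

# TF-IDF is an assumed feature representation, chosen only for this sketch.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(random_state=0, **best_params),
)
model.fit(texts, labels)

# Classify a new, unseen (hypothetical) tweet.
predictions = model.predict(["my boat license is delayed again"])
print(predictions)
```

With so few training examples the prediction carries no evidential weight; the sketch only shows how the reported values would be passed to a classifier. The reported score of 0.956063268893 comes from the authors' own data and tuning procedure and is not expected to be reproduced here.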