Aspect-Based Sentiment Analysis on Indonesian Presidential Election Using Deep Learning

The 2019 Indonesian presidential election has been a hot topic of conversation since 2018. To predict its winner, prior research applied machine learning algorithms such as Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN) to an Aspect-Based Sentiment Analysis (ABSA) dataset and achieved fairly good accuracy. This study proposes a deep learning approach using the BERT (Bidirectional Encoder Representations from Transformers) and RoBERTa (A Robustly Optimized BERT Pretraining Approach) models. The results indicate that the indobenchmark BERT and roberta-base-indonesian single-label classification models on the target feature with preprocessing produce the best accuracy of 98.02%. The indolem and indobenchmark BERT single-label classification models on the target feature without preprocessing produce the best accuracy of 98.02%. The indobenchmark BERT single-label classification model on the aspect feature with preprocessing produces the best accuracy of 74.26%. The indolem BERT single-label classification model on the aspect feature without preprocessing produces the best accuracy of 74.26%. The indolem BERT single-label classification model on the sentiment feature produces the best accuracy of 93.07% with preprocessing and 94.06% without preprocessing. The indobenchmark BERT multi-label classification model, both with and without preprocessing, produces the best accuracy of 98.66%.


INTRODUCTION
The 2019 Indonesian presidential election has been a hot topic of conversation on the internet since 2018. Social media platforms such as Twitter and Facebook play an important role because political parties use them to campaign and advertise their candidates. Through social media, political parties can obtain information that cannot be gathered from traditional media or polling results, and this information can be used to predict election results (Suciati et al., 2019). To predict the winner of the 2019 Indonesian presidential election, a dataset has been presented using Aspect-Based Sentiment Analysis (ABSA) that focuses on candidate character. ABSA can help organizations become customer-centric, so presidential candidates can listen to and understand the voices of the people, and evaluate and learn from their input and expectations. Data mining (DM) is a set of techniques and procedures for discovering knowledge from various data sources, such as data warehouses, relational databases, or flat files, using statistical techniques for predictive analysis. This knowledge can be expressed as rules and patterns that help users and organizations analyze collected data and support decisions (Hamid Mughal, 2018) (Manjarres et al., 2018). Natural Language Processing (NLP), also known as computational linguistics, involves engineering computational models and processes to solve practical problems in understanding human language (Otter et al., 2021). From a scientific perspective, NLP aims to model the cognitive mechanisms underlying human language comprehension and production. From an engineering perspective, NLP is concerned with developing practical applications that facilitate interaction between computers and human language.
Sentiment analysis, also known as opinion mining, is a field within NLP that studies the computational treatment of opinions, emotions, and subjectivity in text, often in negative, neutral, and positive categories. Sentiments are collected, analyzed, and then summarized to produce real-time feedback (Hoang et al., 2019) (Sun et al., 2019). ABSA is a text mining method that organizes text into targets and aspects and then labels each with a sentiment polarity. This differs from conventional sentiment analysis, which only recognizes the overall polarity of a text (Manik et al., 2020). Polarity is the difference between the number of positive words and the number of negative words in a text, divided by the total number of sentiment words (Budiharto & Meiliana, 2018). Machine learning (ML) is a branch of Artificial Intelligence (AI) closely related to (and often overlapping with) computational statistics, which also focuses on making predictions with computers. Like DM, ML can be unsupervised and can be used to learn baseline behavior profiles for various entities and then detect meaningful anomalies (Xin et al., 2018). ML focuses on classification and regression based on known, previously learned features of training data: the machine learns patterns from the dataset, generates its own rules, and can then recognize new data. Previous research on the 2019 presidential election ABSA dataset used machine learning algorithms such as Support Vector Machine (SVM), Naive Bayes (NB), and K-Nearest Neighbors (KNN) and produced fairly good accuracy on the target, aspect, and sentiment classes.
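The polarity measure described above can be sketched as a small function. This is a minimal illustration only, assuming whitespace tokenization and tiny hypothetical sentiment lexicons; it is not the implementation used in the cited work.

```python
def polarity(text, positive_words, negative_words):
    """Polarity = (positive count - negative count) / total sentiment words."""
    tokens = text.lower().split()
    pos = sum(t in positive_words for t in tokens)
    neg = sum(t in negative_words for t in tokens)
    sentiment_total = pos + neg
    if sentiment_total == 0:
        return 0.0  # no sentiment words at all: treat the text as neutral
    return (pos - neg) / sentiment_total

# Tiny hypothetical lexicons for illustration
POS = {"good", "great", "honest"}
NEG = {"bad", "corrupt"}
print(polarity("a good and honest but corrupt candidate", POS, NEG))  # 2 pos, 1 neg -> 1/3
```

A score near +1 indicates strongly positive text, near -1 strongly negative, and 0 balanced or neutral.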

Research Dataset
The dataset used in this study is the 2019 presidential election ABSA dataset, which consists of 2019 instances and 5 features. Source: (Said & Manik, 2022)

Preprocessing
The first preprocessing step in this research is data cleaning, which converts all text to lowercase and removes URLs (http://), tabs and newlines, non-ASCII characters, punctuation, extra whitespace, reply threads, numbers, and delimiters such as commas (,) and periods (.), but does not remove username tokens (@), emojis, or hashtag tokens (#), as shown in Figure 2. The last step is encoding. Two categorical encoding techniques are used. The first is Label Encoding, which converts labels to numeric form by assigning a unique number (starting from 0) to each data class. The second is One Hot Encoding, which turns each categorical variable into a new column containing "0" or "1" according to which column the label or category belongs to (Waasiu et al., 2021). The choice between these two encoding techniques depends on which classification technique is used when implementing the BERT and RoBERTa models (single-label or multi-label classification).
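The cleaning and encoding steps above can be sketched as follows. This is a simplified sketch: the regular expressions and example labels are assumptions, not the paper's exact rules, and usernames (@) and hashtags (#) are deliberately preserved as the text specifies.

```python
import re

def clean_text(text):
    """Lowercase and strip URLs, tabs/newlines, numbers, punctuation
    (keeping @ and # tokens), and extra whitespace."""
    text = text.lower()
    text = re.sub(r"http\S+", " ", text)    # URLs
    text = re.sub(r"[\t\n\r]", " ", text)   # tabs and newlines
    text = re.sub(r"\d+", " ", text)        # numbers
    text = re.sub(r"[^\w\s@#]", " ", text)  # punctuation, but keep @ and #
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace

def label_encode(labels):
    """Assign each class a unique integer starting from 0 (sorted for determinism)."""
    mapping = {c: i for i, c in enumerate(sorted(set(labels)))}
    return [mapping[l] for l in labels], mapping

def one_hot_encode(labels):
    """One column per class; a single 1 marks each row's class."""
    encoded, mapping = label_encode(labels)
    return [[1 if i == e else 0 for i in range(len(mapping))] for e in encoded]

print(clean_text("Pemilu 2019!!! cek http://example.com @user #pilpres"))
```

Label encoding suits single-label classification, while the one-hot form maps naturally onto the multi-label setup used later in the paper.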

Implementation of BERT and RoBERTa
Before training, the dataset is adjusted to the input representation accepted by BERT and RoBERTa. This requires a tokenizer that splits sentences into tokens and generates the appropriate input; sentences are processed by the tokenizer to form the input representation for both BERT and RoBERTa. In this study, the BERT models used are indolem and indobenchmark, while the RoBERTa model used is roberta-base-indonesian.
Each sentence is broken down into words using WordPiece, and each word receives the ID defined for it in the model's vocabulary. Each sentence also receives special tokens at its beginning and end. In BERT, these tokens are [CLS] at the beginning of the sentence and [SEP] at the end, while in RoBERTa they are <s> at the beginning and </s> at the end.
After that, the sentence is padded to the predetermined maximum length using [PAD] in BERT and <pad> in RoBERTa. Because the maximum length is set to 32, [PAD] and <pad> fill the remaining positions whenever a sentence is shorter than 32 tokens. Segment embeddings are added to distinguish the first sentence from the second sentence or padding: positions in the first sentence are marked with 1 and padding positions with 0. The BERT tokenizer can tell the first sentence apart from padding by the [SEP] token, which separates the two segments.
Positional embeddings are also added for each token to indicate the position of each word in the sentence. This way, if the same word appears both at the beginning and at the end of a sentence with different meanings, the model does not treat the two tokens identically.
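The input construction described above (special tokens, padding to length 32, and a mask distinguishing real tokens from padding) can be illustrated with a simplified sketch. This is an assumption-laden toy: real BERT/RoBERTa tokenizers split words further with WordPiece/BPE and map tokens to vocabulary IDs, which is omitted here.

```python
def build_inputs(words, model="bert", max_len=32):
    """Illustrate BERT/RoBERTa input construction: special tokens at the
    start and end, padding up to max_len, and a mask (1 = real token,
    0 = padding). Simplified: no WordPiece/BPE subword splitting."""
    bos, eos, pad = (("[CLS]", "[SEP]", "[PAD]") if model == "bert"
                     else ("<s>", "</s>", "<pad>"))
    tokens = [bos] + list(words) + [eos]
    tokens = tokens[:max_len]          # truncate overly long inputs
    mask = [1] * len(tokens)
    while len(tokens) < max_len:       # fill the blanks up to max_len
        tokens.append(pad)
        mask.append(0)
    return tokens, mask

tokens, mask = build_inputs(["jokowi", "pemimpin", "jujur"], model="roberta")
```

Here the three-word sentence becomes five real tokens (<s>, the words, </s>) followed by 27 <pad> tokens, with the mask marking which positions the model should attend to.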
Figure 4. The tokenization process using the BERT and RoBERTa models. Source: (Said & Manik, 2022)

Fine Tuning
This study performs fine tuning with hyperparameters on BERT and RoBERTa, where the best accuracy results are obtained using the hyperparameters shown in Table 3 (Said & Manik, 2022).
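Fine tuning a BERT-family model typically combines a small set of hyperparameters with a linear warmup-then-decay learning-rate schedule. The sketch below is illustrative only: except for the maximum sequence length of 32 stated earlier, all values are hypothetical placeholders, not the contents of Table 3.

```python
# Hypothetical hyperparameters (only max_seq_length comes from this paper)
HYPERPARAMS = {
    "max_seq_length": 32,   # from the tokenization setup described above
    "learning_rate": 2e-5,  # hypothetical: a common BERT fine-tuning value
    "batch_size": 16,       # hypothetical
    "epochs": 4,            # hypothetical
    "warmup_steps": 100,    # hypothetical
}

def lr_at_step(step, total_steps, base_lr, warmup_steps):
    """Linear warmup followed by linear decay, the schedule commonly used
    when fine-tuning BERT-family models."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps          # ramp up from 0
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)  # decay to 0
```

During warmup the learning rate grows linearly to its base value, which helps stabilize the randomly initialized classification head before full-strength updates reach the pretrained encoder.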

RESULTS AND DISCUSSION
Prior to classification, the dataset is divided into training, validation, and testing datasets. The training dataset is used to train the model, the validation dataset is used to minimize overfitting, and the testing dataset is used as a final test of the accuracy of the network trained on the training dataset.
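The split described above can be sketched as follows. The 10%/10% proportions are assumptions for illustration; the paper's actual split ratios are not restated here.

```python
import random

def train_val_test_split(data, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle and partition a dataset into training, validation, and
    testing subsets. Validation monitors overfitting during training;
    testing gives the final accuracy estimate."""
    items = list(data)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# With the paper's 2019 instances, this yields 1617 train / 201 val / 201 test
train, val, test = train_val_test_split(range(2019))
```

Keeping the test partition untouched until the final evaluation ensures the reported accuracy reflects generalization rather than memorization.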
After the data preprocessing process, fine tuning is carried out on the data using the hyperparameters above. Table 4 shows that indobenchmark BERT and roberta-base-indonesian achieve the best accuracy on the target feature with preprocessing, 98.02%. Table 5 shows that indobenchmark BERT achieves the best accuracy on the aspect feature with preprocessing, 74.26%. Table 6 shows that indolem BERT achieves the best accuracy on the sentiment feature with preprocessing, 93.07%. Table 7 shows that indolem and indobenchmark BERT achieve the best accuracy on the target feature without preprocessing, 98.02%. Table 8 shows that indolem BERT achieves the best accuracy on the aspect feature without preprocessing, 74.26%. Table 9 shows that indolem BERT achieves the best accuracy on the sentiment feature without preprocessing, 94.06%. Table 10 and Table 11 show that indobenchmark BERT achieves the best accuracy with the multi-label classification technique, both with and without preprocessing.

CONCLUSION
The BERT and RoBERTa models are able to perform target, aspect, and sentiment analysis tasks well, using both single-label and multi-label classification. Indobenchmark BERT and roberta-base-indonesian single-label classification on the target feature with preprocessing produced the best accuracy of 98.02%, as did indolem and indobenchmark BERT without preprocessing. Indobenchmark BERT single-label classification on the aspect feature with preprocessing produced the best accuracy of 74.26%, as did indolem BERT without preprocessing. Indolem BERT single-label classification on the sentiment feature produced the best accuracy of 93.07% with preprocessing and 94.06% without preprocessing. Indobenchmark BERT multi-label classification produced the best accuracy of 98.66%, both with and without preprocessing.