Friday, October 22, 2021

 
These articles have been peer-reviewed and accepted for publication in JICT, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the JICT standard. Additionally, titles, authors, abstracts and keywords may change before publication.
 

 
ESTIMATION OF INFORMATION MEASURES FOR POWER-FUNCTION DISTRIBUTION IN PRESENCE OF OUTLIERS AND THEIR APPLICATIONS
1Amal Soliman Hassan, 1Elsayed Ahmed Elsherpieny & 2Rokaya Elmorsy Mohamed
1Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt
2Department of Mathematics, Statistics and Insurance, Sadat Academy for Management Sciences, Egypt
amal52_soliman; Elsayed.Elsherpieny @cu.edu.eg; Rokaya.Elmorsy@sadatacademy.edu.eg 
 
 
ABSTRACT
 
Entropy measurement plays an important role in the field of information theory. Furthermore, the estimation of entropy is an important problem in statistics and machine learning. In this study, we estimate the Rényi and q-entropies of a power-function distribution in the presence of s outliers using classical and Bayesian procedures. In the classical method, we obtain the maximum likelihood estimators of the entropies and assess their performance through a numerical study. In the Bayesian method, we obtain the Bayesian estimators of the entropies under uniform and gamma priors based on different loss functions. The Bayesian estimators are computed empirically using a Monte Carlo simulation based on the Gibbs sampling algorithm. The simulated data sets are analyzed to investigate the accuracy of the estimates. The study results show that the precision of the maximum likelihood and Bayesian estimates of both entropies improves as the sample size and the number of outliers increase. The absolute biases and the mean squared errors of the estimates in the presence of outliers exceed those of the corresponding estimates in the homogeneous case (no outliers). Further, the Bayesian estimates of the Rényi and q-entropies under the squared error loss function are preferable to the other Bayesian estimates in the majority of cases. Finally, the analysis results of real data examples are consistent with those of the simulated data.
 
Keywords: Bayesian estimators, Maximum likelihood estimators, Outliers, Power-function distribution, Rényi entropy.
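
As a rough illustration of the classical (plug-in) approach described in the abstract above, the sketch below computes maximum likelihood estimates of both entropies in the homogeneous case only (no outliers). It assumes the one-parameter power-function density f(x) = θx^(θ-1) on (0, 1) and takes the q-entropy in its Tsallis form; both are assumptions, since the abstract does not state the parametrization, and the outlier model and the Bayesian estimators are not covered. All function names are hypothetical.

```python
import math
import random


def mle_theta(sample):
    """MLE of theta for the assumed density f(x) = theta * x**(theta - 1) on (0, 1)."""
    return -len(sample) / sum(math.log(x) for x in sample)


def integral_of_density_power(theta, order):
    """Closed form of the integral of f(x)**order over (0, 1)."""
    denom = order * (theta - 1.0) + 1.0
    if denom <= 0.0:
        raise ValueError("integral diverges for this theta and order")
    return theta ** order / denom


def renyi_entropy(theta, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1)."""
    return math.log(integral_of_density_power(theta, alpha)) / (1.0 - alpha)


def q_entropy(theta, q):
    """q-entropy in the Tsallis form (q > 0, q != 1); an assumed definition."""
    return (1.0 - integral_of_density_power(theta, q)) / (q - 1.0)


def simulate(theta, n, seed=0):
    """Draw n values by inverting the CDF F(x) = x**theta."""
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which keeps log(x) finite.
    return [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]


# Plug-in estimates: by the invariance property of the MLE, the entropy
# estimates are the entropy formulas evaluated at theta_hat.
sample = simulate(theta=2.0, n=5000)
theta_hat = mle_theta(sample)
renyi_hat = renyi_entropy(theta_hat, alpha=2.0)
q_hat = q_entropy(theta_hat, q=2.0)
```

For θ = 1 the density is uniform and both entropies are zero, which gives a quick sanity check of the closed forms.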
 


BUTTERFLY TRIPLE SYSTEM ALGORITHM BASED ON GRAPH THEORY
Raja'i Aldiabat, Haslinda Ibrahim & Sharmila Karim
School of Quantitative Sciences, Universiti Utara Malaysia, Malaysia
rajae227@yahoo.com; linda@uum.edu.my; mila@uum.edu.my
 
 
ABSTRACT
 
In combinatorial design theory, clustering elements into sets of three is at the heart of classifying data, and it has recently received considerable attention in the fields of network algorithms, cryptography, design and analysis of algorithms, statistics, and information theory. This article provides insight into formulating an algorithm for a new type of triple system, called a Butterfly triple system. In this algorithm, a starter of a cyclic near-resolvable ((v-1)/2)-cycle system of the 2-fold complete graph 2K_v is employed to construct the starter of a cyclic ((v-1)/2)-star decomposition of 2K_v. These starters are then decomposed into triples and classified as a starter of cyclic Butterfly triples. The obtained starter set generates a triple system of order v. The case v ≡ 9 (mod 12) is presented as a specific reference to demonstrate the development of the Butterfly triple system.
 
Keywords: Cyclic triple system, Graph decompositions, λ-fold complete graph.
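
The idea of a starter set that generates a triple system cyclically can be sketched in general terms. The code below is not the Butterfly construction from the article; it only develops a set of base triples modulo v and checks that every pair of points is covered exactly λ times, which is the defining property of a λ-fold triple system. The starter used, {0, 1, 3} and {0, 1, 5} on Z_7, is a small illustrative example chosen here, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def develop(starter, v):
    """Translate every base triple by 0, 1, ..., v-1 modulo v."""
    return [
        tuple(sorted((x + shift) % v for x in triple))
        for triple in starter
        for shift in range(v)
    ]


def pair_coverage(blocks):
    """Count how many blocks contain each unordered pair of points."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(block, 2):
            counts[pair] += 1
    return counts


def is_lambda_fold_triple_system(blocks, v, lam):
    """True if every pair of the v points lies in exactly lam blocks."""
    counts = pair_coverage(blocks)
    return all(counts[pair] == lam for pair in combinations(range(v), 2))


# Illustrative starter on Z_7: each base triple covers every nonzero
# difference once, so the two together cover every pair twice (lambda = 2).
starter = [(0, 1, 3), (0, 1, 5)]
blocks = develop(starter, v=7)
is_two_fold = is_lambda_fold_triple_system(blocks, v=7, lam=2)
```

A check of this kind is useful for validating any candidate starter set before relying on the system it generates.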
 

 
BAYESIAN TWO-SIDED COMPLETE GROUP CHAIN SAMPLING PLAN FOR BINOMIAL DISTRIBUTION USING BETA PRIOR THROUGH QUALITY REGIONS
1Waqar Hafeez & 1,2Nazrina Aziz
1School of Quantitative Sciences, Universiti Utara Malaysia, Malaysia
2Institute of Strategic Industrial Decision Modelling (ISIDM), Universiti Utara Malaysia, Malaysia
waqar_hafeez@ahsqs.uum.edu.my; nazrina@uum.edu.my
 
 
ABSTRACT
 
Acceptance sampling is a technique for statistical quality assurance based on the inspection of a random sample to decide the lot disposition: accept or reject. Producer’s risk and consumer’s risk are inevitable in acceptance sampling. Most conventional plans focus only on minimizing the consumer’s risk. This study focuses on minimizing both producer’s and consumer’s risks through the quality region. Where historical knowledge is available, experts concur that the Bayesian approach is the best way to reach the correct decision. In this study, a Bayesian two-sided complete group chain sampling plan (BTSCGChSP) is proposed for the average probability of acceptance. The binomial distribution with a beta prior is used to derive the probability of lot acceptance. For selected design parameters in BTSCGChSP, the acceptable quality level (AQL) and limiting quality level (LQL) are considered to estimate quality regions that are directly associated with producer’s and consumer’s risks, respectively. Four quality regions are considered: (i) quality decision region (QDR), (ii) probabilistic quality region (PQR), (iii) limiting quality region (LQR) and (iv) indifference quality region (IQR). For comparison with the existing BGChSP, operating characteristic curves are used for the same parameter values and probability of lot acceptance. The findings show that BTSCGChSP yields a smaller proportion of defectives than BGChSP for the same probability of acceptance. If quality regions are found for the same values of consumer’s and producer’s risks, then the BTSCGChSP region will contain fewer defectives than the BGChSP region. Hence, for industrial practitioners, the proposed plan is a better substitute for the existing BGChSP and other traditional GCSP.
 
Keywords: Acceptance sampling, Bayesian group chain, beta distribution, binomial distribution, quality region.
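
The role of the beta prior in the abstract above can be illustrated with the simplest possible case. The sketch below is not the BTSCGChSP acceptance formula, which the abstract does not give; it assumes a plain single-sample plan (inspect n items, accept the lot if at most c are defective) and averages the binomial acceptance probability over a Beta(a, b) prior on the proportion defective, which yields a beta-binomial sum. The names n, c, a and b are illustrative.

```python
import math


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(d, n, a, b):
    """P(d defectives in n items) when p ~ Beta(a, b) and d | p ~ Binomial(n, p)."""
    return math.comb(n, d) * math.exp(log_beta(d + a, n - d + b) - log_beta(a, b))


def average_acceptance_probability(n, c, a, b):
    """Average probability of accepting the lot: P(d <= c) under the prior."""
    return sum(beta_binomial_pmf(d, n, a, b) for d in range(c + 1))


# With a uniform prior Beta(1, 1), every count 0..n is equally likely,
# so the acceptance probability is (c + 1) / (n + 1).
p_accept_uniform = average_acceptance_probability(n=10, c=2, a=1.0, b=1.0)

# A prior concentrated on low defect rates raises the acceptance probability.
p_accept_good_lots = average_acceptance_probability(n=10, c=2, a=1.0, b=20.0)
```

Evaluating such a function over a grid of prior means is one way to trace an operating characteristic curve for a Bayesian plan.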
 

 
PRE-TRAINED BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS CHECKPOINTS FOR INDONESIAN ABSTRACTIVE TEXT SUMMARIZATION
Henry Lucky & Derwin Suhartono
Computer Science Department, Bina Nusantara University, Jakarta, Indonesia
henry.lucky@binus.ac.id; dsuhartono@binus.edu
 
 
ABSTRACT
 
Text summarization aims to shorten a text by removing less useful information so that readers can obtain information quickly and precisely. In Indonesian abstractive text summarization, research has mostly focused on multi-document summarization, whose methods do not work optimally for single-document summarization. As the public summarization datasets and existing work in English focus on single-document summarization, this study focuses on Indonesian single-document summarization. Since abstractive text summarization studies in English frequently use Bidirectional Encoder Representations from Transformers (BERT), and an Indonesian BERT checkpoint is available, we use Indonesian BERT in our study. This study investigates the use of Indonesian BERT in abstractive text summarization on the IndoSum dataset using the BERTSum model. The investigation further covers various combinations of model encoders, model embedding sizes, and model decoders. Evaluation results show that models with a larger embedding size and a Generative Pre-Training (GPT)-like decoder can improve the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score and the BERTScore of the model results.
 
Keywords: Abstractive text summarization, BERTSum model, BERTScore, GPT-like decoder, ROUGE score
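
To make the ROUGE evaluation mentioned in the abstract above concrete, the sketch below computes a simplified ROUGE-1 score (unigram overlap) between a candidate summary and a reference summary. It assumes lowercase whitespace tokenization and no stemming, so its numbers will not match those of an official ROUGE implementation; the example sentences are invented for illustration.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap between two texts."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand_counts & ref_counts).values())
    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2.0 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1(
    candidate="the cat on the mat",
    reference="the cat sat on the mat",
)
```

Recall is the fraction of reference unigrams recovered by the candidate, which is why ROUGE is described as recall-oriented; precision and F1 are commonly reported alongside it.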
 

 
SELECTIVE IMAGE SEGMENTATION MODELS USING THREE DISTANCE FUNCTIONS
Siti Aminah Abdullah & Abdul Kadir Jumaat
Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Malaysia
2020534771@student.uitm.edu.my; abdulkadir@tmsk.uitm.edu.my
 
 
ABSTRACT
 
Image segmentation can be defined as partitioning an image into multiple meaningful segments for further processing. Global segmentation is concerned with segmenting the whole object of an observed image. Meanwhile, the selective segmentation model is concerned with segmenting a specific object required to be extracted. The Convex Distance Selective Segmentation (CDSS) model, which uses the Euclidean distance function as the fitting term, was proposed in 2015. However, the Euclidean distance function takes time to compute. This paper proposes a reformulation of the CDSS minimization problem by replacing the fitting term with three popular distance functions, namely Chessboard, City Block, and Quasi-Euclidean. The proposed models are CDSSNEW1, CDSSNEW2, and CDSSNEW3, which apply the Chessboard, City Block, and Quasi-Euclidean distance functions, respectively. In this study, the Euler-Lagrange (EL) equations of the proposed models were derived and solved using the Additive Operator Splitting method. Then, MATLAB code was developed to implement the proposed models. The accuracy of the segmented image was evaluated using the Jaccard (JSC) and Dice Similarity Coefficients (DSC). The execution time was recorded to measure the efficiency of the models. Numerical results showed that the proposed CDSSNEW1 model based on the Chessboard distance function could segment the specific object successfully for all grayscale images with the fastest execution time compared to the other models.
 
Keywords: Active contour, convex distance selective segmentation, convex functional, selective variational image segmentation.
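
The distance functions and accuracy measures named in the abstract above have simple closed forms, sketched below for a pair of pixel coordinates and for segmentations represented as sets of pixels. The Quasi-Euclidean formula follows the common definition used in image-processing toolboxes, which is assumed here to match the article's usage. The sketch does not implement the CDSS models or the Additive Operator Splitting solver, and it is written in Python rather than the MATLAB used by the authors.

```python
import math


def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def chessboard(p, q):
    """Largest coordinate difference (no square root needed)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def city_block(p, q):
    """Sum of coordinate differences."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def quasi_euclidean(p, q):
    """Piecewise-linear approximation of the Euclidean distance."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    if dx > dy:
        return dx + (math.sqrt(2.0) - 1.0) * dy
    return (math.sqrt(2.0) - 1.0) * dx + dy


def jaccard(segmented, truth):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two pixel sets."""
    return len(segmented & truth) / len(segmented | truth)


def dice(segmented, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) of two pixel sets."""
    return 2.0 * len(segmented & truth) / (len(segmented) + len(truth))


origin, pixel = (0, 0), (3, 4)
distances = {
    "euclidean": euclidean(origin, pixel),
    "chessboard": chessboard(origin, pixel),
    "city_block": city_block(origin, pixel),
    "quasi_euclidean": quasi_euclidean(origin, pixel),
}

# Two toy segmentations that share two of their three pixels.
seg = {(0, 0), (0, 1), (1, 1)}
ref = {(0, 1), (1, 1), (1, 2)}
overlap_scores = {"jaccard": jaccard(seg, ref), "dice": dice(seg, ref)}
```

The Chessboard and City Block distances use only absolute differences, a maximum, or a sum, which is consistent with the abstract's point that the Euclidean distance is the more expensive fitting term.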
 


ENSEMBLE FEED-FORWARD NEURAL NETWORK AND SUPPORT VECTOR MACHINE FOR PREDICTION OF MULTICLASS MALARIA INFECTION
1Rasheed Gbenga Jimoh, 2Opeyemi Aderiike Abisoye & 3Muhammed Mubashir Babatunde Uthman
1Department of Computer Science, University of Ilorin, Nigeria
2Department of Computer Science, Federal University of Technology, Nigeria
3Department of Epidemiology and Community Health, University of Ilorin, Nigeria
jimoh_rasheed@unilorin.edu.ng; o.abisoye@futminna.edu.ng; uthman.mb@unilorin.edu.ng
 
 
ABSTRACT
 
Globally, recent research has focused on developing appropriate and robust algorithms to provide a robust health care system that is versatile and accurate. Existing malaria models are plagued by a low rate of convergence, over-fitting, limited generalization due to their restriction to binary-case prediction, and proneness to local minimum errors in finding reliable testing output, owing to the complexity of features in the feature space, which is black-box in nature. This study adopts a stacking method of heterogeneous ensemble learning of Artificial Neural Network (ANN) and Support Vector Machine (SVM) algorithms to predict multiclass, symptomatic and climatic malaria infection. ANN produced 48.33% Accuracy (Acc), 60.61% Sensitivity (Ss) and 45.58% Specificity (Sp). SVM with a Gaussian kernel function (rbf) gave a better performance result of 85.60% Accuracy (Acc), 84.06% Sensitivity (Ss) and 86.09% Specificity (Sp). Consequently, to improve prediction performance, a stacking method was introduced to ensemble SVM with ANN. The proposed ensemble malaria model was tuned on different thresholds, and at a threshold value of 0.60 the ensemble model gave an optimum Accuracy (Acc) of 99.86%, Sensitivity (Ss) of 100%, Specificity (Sp) of 98.68% and a mean square error of 0.14. The experimental results indicate that stacked multiple classifiers produce better results than a single model. This research demonstrated the efficiency of a heterogeneous stacking ensemble model on the effects of climatic variations on multiclass malaria infection classification. Also, the model reduces the complexity, over-fitting, low rate of convergence and proneness to local minimum error problems of multiclass malaria infection prediction in comparison with previous related models.
 
Keywords: Artificial Neural Network, Data Mining, Ensemble, Malaria Infection, Support Vector Machine
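
The accuracy, sensitivity and specificity figures reported in the abstract above are derived from a confusion matrix after a decision threshold is applied to model scores. The sketch below shows that calculation for a single positive class (a one-vs-rest view, which is an assumption, since the article treats a multiclass problem). It does not implement the ANN, SVM or stacking ensemble, and the labels and scores are invented toy data; only the threshold value 0.60 comes from the abstract.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, TN, FP, FN after thresholding scores (score >= threshold is positive)."""
    tp = tn = fp = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        elif predicted == 1 and label == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def evaluate(y_true, scores, threshold):
    """Accuracy, sensitivity (true positive rate) and specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, scores, threshold)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy data: 1 = infected, 0 = not infected; scores are hypothetical model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.90, 0.70, 0.55, 0.20, 0.65, 0.10, 0.40, 0.61]
metrics = evaluate(y_true, scores, threshold=0.60)
```

Repeating the evaluation over several thresholds shows how the choice of threshold trades sensitivity against specificity, which is the kind of tuning the abstract describes.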
 

Universiti Utara Malaysia Press
Universiti Utara Malaysia, 06010 UUM Sintok
Kedah Darul Aman, MALAYSIA
Phone: +604-928 4816, Fax: +604-928 4792

Creative Commons License

All articles published in Journal of Information and Communication Technology (JICT) are licensed under a Creative Commons Attribution 4.0 International License.