Tuesday, March 02, 2021

 
These articles have been peer-reviewed and accepted for publication in JICT but are pending final changes and have not yet been assigned to issues, so they may not appear here in their final order of publication. The content therefore conforms to our standards, but the presentation (e.g. typesetting and proofreading) is not necessarily up to the JICT standard. Titles, authors, abstracts and keywords may also change before publication.
 

  
 ADAPTIVE VARIABLE EXTRACTIONS WITH LDA FOR CLASSIFICATION OF MIXED VARIABLES, AND APPLICATIONS TO MEDICAL DATA
1Hashibah Hamid, 2Nor Idayu Mahat & 3Safwati Ibrahim
1&2 School of Quantitative Sciences, UUM College of Arts and Sciences, 06010 UUM Sintok, Kedah
3Institute of Engineering Mathematics, Universiti Malaysia Perlis, 02600 UniMAP Arau, Perlis
hashibah@uum.edu.my; noridayu@uum.edu.my; safwati@unimap.edu.my
 
 
ABSTRACT
 
This paper examines a strategy for extracting mixed variables when building a linear discriminant analysis (LDA) model. Two methods were employed to extract the crucial variables from a dataset containing both categorical and continuous variables: multiple correspondence analysis (MCA) and principal component analysis (PCA). Neither method, however, can be applied directly to mixed variables because each is restricted to a particular data structure. This paper therefore introduces some adjustments, including a strategy for managing mixed variables so that they are comparable in value, which allows MCA and PCA to be performed on the mixed variables simultaneously. The variables extracted under this strategy were then used to construct the LDA model, which was subsequently applied to classify new objects. The suggested models were tested on three real sets of medical data. The results indicated that combining MCA and PCA extraction with LDA reduced the model's size and improved classification performance, as it minimised the leave-one-out error rate. The proposed models, together with the adapted strategy, thus outperformed the full LDA model. The indicators used to extract and retain variables in the model, namely the cumulative variance explained (CVE), the eigenvalue, and a non-significant change in the CVE (constant change), can serve as a useful reference or guideline for practitioners facing similar issues in future.
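The abstract does not spell out the authors' exact adjustment strategy, but the overall pipeline it describes can be sketched as follows. This is a minimal, illustrative Python example (using scikit-learn and synthetic data, not the paper's medical datasets): continuous variables are standardised and categorical ones one-hot encoded so the mixed variables are on comparable scales, components are retained up to a CVE threshold, and the resulting LDA model is scored by its leave-one-out error rate.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
n = 60

# Synthetic mixed data: 2 continuous variables (shifted by class),
# 1 three-level categorical variable, and a binary class label.
y = rng.integers(0, 2, n)
X_cont = rng.normal(loc=y[:, None], scale=1.0, size=(n, 2))
X_cat = rng.integers(0, 3, (n, 1))

# Put mixed variables on a comparable scale (one illustrative choice):
# standardise the continuous part, one-hot encode the categorical part.
Xc = StandardScaler().fit_transform(X_cont)
Xd = OneHotEncoder().fit_transform(X_cat).toarray()
X = np.hstack([Xc, Xd])

# Retain components until the cumulative variance explained (CVE) >= 80%.
pca = PCA().fit(X)
cve = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cve, 0.80) + 1)
Z = PCA(n_components=k).fit_transform(X)

# Leave-one-out error rate of LDA built on the extracted variables.
errors = 0
for train, test in LeaveOneOut().split(Z):
    lda = LinearDiscriminantAnalysis().fit(Z[train], y[train])
    errors += int(lda.predict(Z[test])[0] != y[test])
loo_error = errors / n
```

The 80% CVE threshold here is only a placeholder; the paper also considers the eigenvalue and a non-significant change in CVE as retention criteria.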
 
Keywords: Classification, Linear discriminant analysis, Multiple correspondence analysis, Mixed variables, Principal component analysis.
 

 
MITIGATING SLOW HYPERTEXT TRANSFER PROTOCOL DISTRIBUTED DENIAL OF SERVICE ATTACKS IN SOFTWARE DEFINED NETWORKS
Oluwatobi Shadrach Akanji, Opeyemi Aderiike Abisoye, Mohammed Awwal Iliyasu
Department of Computer Science, Federal University of Technology Minna, Nigeria
akanji.pg914786@st.futminna.edu.ng; o.abisoye@futminna.edu.ng; awwaliliyasu@gmail.com
 
 
ABSTRACT
 
Distributed denial of service (DDoS) attacks have been one of the most persistent forms of attack on information technology infrastructure connected to public networks, owing to the ease of access to DDoS attack tools. Researchers have developed several techniques to curb volumetric DDoS, which overwhelms the target with a large number of request packets. However, comparatively little research has addressed the mitigation of slow DDoS, to which attackers have resorted because it mimics the behaviour of a slow legitimate client and thereby causes service unavailability. This paper provides the scholarly community with an approach to boosting the service availability of web servers under slow Hypertext Transfer Protocol (HTTP) DDoS attacks through attack detection using a genetic algorithm and a support vector machine, which facilitates attack mitigation in a software-defined networking (SDN) environment simulated in GNS3. The genetic algorithm was used to select the NetFlow features that indicate the presence of an attack and to determine the appropriate regularisation parameter, C, and gamma parameter for the support vector machine classifier. The results showed that the classifier achieved a detection accuracy, area under the receiver operating curve (AUC), true positive rate, false positive rate and false negative rate of 99.89%, 99.89%, 99.95%, 0.18% and 0.05%, respectively. An algorithm for the subsequent implementation of the selective adaptive bubble burst mitigation mechanism is also presented. This study contributes to ongoing research on detecting and mitigating slow HTTP DDoS attacks, with emphasis on the use of machine learning classification and meta-heuristic algorithms.
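As a rough sketch of the tuning idea described above, not the authors' implementation, a genetic algorithm can jointly evolve a feature mask together with the SVM's C and gamma parameters, using cross-validated accuracy as the fitness. The example below substitutes scikit-learn synthetic data for real NetFlow records, and the population size, mutation scheme and search ranges are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Stand-in for labelled NetFlow records: 12 features, attack/benign label.
X, y = make_classification(n_samples=200, n_features=12, n_informative=4,
                           random_state=1)

def fitness(ind):
    """Cross-validated accuracy of an SVM built from one GA individual:
    a boolean feature mask plus log10 of the C and gamma parameters."""
    mask, log_c, log_g = ind
    if not mask.any():
        return 0.0
    clf = SVC(C=10.0 ** log_c, gamma=10.0 ** log_g)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def random_individual():
    return (rng.random(12) < 0.5, rng.uniform(-2, 2), rng.uniform(-3, 0))

def mutate(ind):
    mask, log_c, log_g = ind
    mask = mask.copy()
    flip = rng.integers(12)
    mask[flip] = not mask[flip]           # toggle one selected feature
    return (mask, log_c + rng.normal(0, 0.3), log_g + rng.normal(0, 0.3))

# Tiny elitist GA: keep the fittest individuals, refill by mutation.
population = [random_individual() for _ in range(10)]
for _ in range(5):
    population.sort(key=fitness, reverse=True)
    elite = population[:3]
    population = elite + [mutate(elite[rng.integers(3)]) for _ in range(7)]

best = max(population, key=fitness)
best_accuracy = fitness(best)
```

A production version would add crossover, a larger population, and fitness terms penalising false negatives, since those matter most for attack detection.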
 
Keywords: Genetic Algorithm, Slow DDoS Mitigation, Slow Distributed Denial of Service, Software Defined Network, Support Vector Machine.
 

 
A SYNTACTIC-BASED SENTENCE VALIDATION TECHNIQUE FOR MALAY TEXT SUMMARIZER
1Suraya Alias, 1Mohd Shamrie Sainin, 2Siti Khaotijah Mohammad
1Faculty of Computing and Informatics, Universiti Malaysia Sabah, Malaysia
2School of Computer Sciences, Universiti Sains Malaysia, Malaysia 
suealias@ums.edu.my; shamrie@ums.edu.my; sitijah@usm.my
 
 
ABSTRACT
 
In the automatic text summarization domain, a sentence compression (SC) technique is applied to the summary sentence to remove unnecessary words or phrases. The purpose of SC is to preserve the important information in the sentence and to remove the unnecessary words without sacrificing the sentence's grammar. The development of Malay natural language processing (NLP) tools is still under study, with limited open access. The issue is the lack of a benchmark dataset in the Malay language for evaluating the quality of summaries and validating the compressed sentences produced by a summarizer model. Hence, this paper outlines a syntactic-based sentence validation technique for Malay sentences that refers to the Malay grammar pattern. We propose a new derived set of syntactic rules based on the main Malay word classes to validate a Malay sentence that undergoes the SC procedure. We experimented on a Malay dataset of 100 news articles covering the natural disaster and events domain to find the optimal compression rate and its effect on the summary content. An automatic evaluation using ROUGE (Recall-Oriented Understudy for Gisting Evaluation) produced an average F-measure of 0.5826 and an average recall of 0.5925 at the optimal compression rate with a confidence (Conf) value of 0.5. Furthermore, a manual evaluation by a group of Malay experts gave the compressed summary sentences a grammaticality score of 4.11 and a readability score of 4.12 out of 5. These results demonstrate the reliability of the proposed technique for validating Malay sentences, with promising summary content and readability.
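The ROUGE recall and F-measure quoted above are standard unigram-overlap measures. A minimal ROUGE-1 computation looks like the sketch below; the Malay word strings are made up for illustration and are not from the paper's dataset.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """ROUGE-1: unigram recall, precision and F-measure between a
    candidate (e.g. compressed) summary and a reference summary."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())    # clipped unigram matches
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f1

# A compressed sentence scored against its uncompressed reference:
# 3 of the 5 reference words survive, so recall is 0.6.
r, p, f = rouge_1("banjir melanda kampung", "banjir besar melanda kampung itu")
```

Compression trades recall against precision: the shorter the compressed sentence, the higher its precision tends to be and the lower its recall, which is why the paper searches for an optimal compression rate.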
 
Keywords: Malay Text Summarization, Sentence Compression, Syntactic rules, POS, Parser.
 

 
A DISCOURSE-BASED INFORMATION RETRIEVAL FOR TAMIL LITERARY TEXTS
1Anita Ramalingam & 2Subalalitha Chinnaudayar Navaneethakrishnan
1,2Department of Computer Science and Engineering, SRM Institute of Science and Technology, India
anitaramalingam17@gmail.com; subalalitha@gmail.com
 
 
ABSTRACT
 
Tamil literature contains many valuable thoughts that can help the human community lead a successful and happy life. Tamil literary works are abundantly available and widely searched on the World Wide Web (WWW), but existing search systems follow a keyword-matching strategy that fails to satisfy user needs. This necessitates a focused information retrieval system that semantically analyses Tamil literary text and thereby improves search performance. This paper proposes a novel information retrieval framework that uses discourse processing techniques to aid the semantic analysis and representation of Tamil literary text. The proposed framework was tested using two ancient literary works, the Thirukkural and the Naladiyar, written around 300 BCE. The Thirukkural comprises 1330 couplets, each 7 words long, while the Naladiyar consists of 400 quatrains, each 15 words long. Tested with all 1330 Thirukkural couplets and 400 Naladiyar quatrains, the proposed system achieved a mean average precision (MAP) score of 89%. Its performance was compared with Google Tamil search and with a keyword-based search, which is a reduced version of the proposed framework. Google Tamil search achieved a MAP score of 56% and the keyword-based method a MAP score of 62%, which shows that discourse processing techniques improve the search performance of an information retrieval system.
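The MAP scores reported above follow the usual information retrieval definition: precision averaged over the rank positions of the relevant documents, then averaged over queries. A minimal sketch of an inverted index and the MAP computation is shown below; the document ids and the toy rankings are hypothetical, not drawn from the Thirukkural or Naladiyar test collections.

```python
def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index

def average_precision(ranked, relevant):
    """Precision averaged over the ranks at which relevant docs appear."""
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(results):
    """MAP over (ranked list, relevant set) pairs, one pair per query."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)

# A two-word toy collection and two toy queries; the first ranking
# places a non-relevant document second, lowering its AP.
index = build_inverted_index({"d1": "a b", "d2": "b", "d3": "c"})
map_score = mean_average_precision([
    (["d1", "d2", "d3"], {"d1", "d3"}),   # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d2"}),               # AP = 1.0
])
```

In the paper's setup, the discourse and keyword-based systems would each produce the ranked lists, and the same relevance judgements would yield the 89%, 62% and 56% MAP figures quoted above.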
 
Keywords: Discourse Parser, Morphological Analyzer, Inverted indexing, Ranking, Tamil Information Retrieval.
 
 

Universiti Utara Malaysia Press
 Universiti Utara Malaysia, 06010 UUM Sintok
Kedah Darul Aman, MALAYSIA
Phone: +604-928 4816, Fax: +604-928 4792

Creative Commons License

All articles published in Journal of Information and Communication Technology (JICT) are licensed under a Creative Commons Attribution 4.0 International License.