Software, Volume 3, Issue 4 (December 2024) – 9 articles

Cover Story: Mathematics of Arrays (MoA) provides a formalism to design and verify algorithms based on arrays of data, at any level of detail. These simple data structures are ubiquitous in technical and scientific computation as well as in machine learning contexts. MoA is designed to minimise copying of data, thus optimising memory use and computation time, and it provides a mapping of arbitrary resources onto arrays. This paper demonstrates how MoA's array operations fit naturally with an array-oriented programming language like modern Fortran, including memory management, and it explores several questions about the resource usage of implementing the MoA operations in Fortran. The regular and concise encoding facilitates optimisation for specific hardware configurations.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
8 pages, 837 KiB  
Communication
Dental Loop Chatbot: A Prototype Large Language Model Framework for Dentistry
by Md Sahadul Hasan Arian, Faisal Ahmed Sifat, Saif Ahmed, Nabeel Mohammed, Taseef Hasan Farook and James Dudley
Software 2024, 3(4), 587-594; https://doi.org/10.3390/software3040029 - 17 Dec 2024
Abstract
The Dental Loop Chatbot was developed as a real-time, evidence-based guidance system for dental practitioners using a fine-tuned large language model (LLM) and Retrieval-Augmented Generation (RAG). This paper outlines the development and preliminary evaluation of the chatbot as a scalable clinical decision-support tool designed for resource-limited settings. The system’s architecture incorporates Quantized Low-Rank Adaptation (QLoRA) for efficient fine-tuning, while dynamic retrieval mechanisms ensure contextually accurate and relevant responses. This prototype lays the groundwork for future triaging and diagnostic support systems tailored specifically to the field of dentistry.
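The retrieval step of a RAG pipeline like the one the abstract describes can be sketched as follows. This is a minimal, hypothetical illustration: the passages, vectors, and function names are invented toy values, and the chatbot's actual embedding model and retriever are not described in the abstract.

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec, passages, k=2):
    # passages: list of (text, vector) pairs; return the top-k most
    # similar texts to be supplied as context to the language model
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# toy bag-of-words vectors standing in for learned embeddings
passages = [
    ("Periodontitis management guidelines", [1, 0, 1, 0]),
    ("Root canal irrigation protocols",      [0, 1, 0, 1]),
    ("Gingival inflammation treatment",      [1, 0, 0, 1]),
]
context = retrieve([1, 0, 1, 0], passages, k=2)
```

In a real deployment the retrieved passages are concatenated into the LLM prompt, which is what grounds the generated answer in the dental evidence base.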

18 pages, 568 KiB  
Article
A Fuzzing Tool Based on Automated Grammar Detection
by Jia Song and Jim Alves-Foss
Software 2024, 3(4), 569-586; https://doi.org/10.3390/software3040028 - 14 Dec 2024
Abstract
Software testing is an important step in the software development life cycle to ensure the quality and security of software. Fuzzing is a security testing technique that finds vulnerabilities automatically without accessing the source code. We built a fuzzer called JIMA-Fuzzing, an effective fuzzing tool that utilizes grammar detected from sample inputs. Based on the detected grammar, JIMA-Fuzzing selects a portion of the valid user input and fuzzes that portion. For example, the tool may greatly increase the size of the input, truncate the input, replace numeric values with new values, replace words with numbers, etc. This paper discusses how JIMA-Fuzzing works and presents evaluation results from testing against the DARPA Cyber Grand Challenge (CGC) dataset. JIMA-Fuzzing is capable of extracting grammar from sample input files, meaning that it does not require access to the source code to generate effective fuzzing files. This feature allows it to work with proprietary or non-open-source programs and significantly reduces the effort needed from human testers. In addition, compared to fuzzing tools guided by symbolic execution or taint analysis, JIMA-Fuzzing takes much less computing power and time to analyze sample input and generate fuzzing files. However, it relies on good sample inputs and works primarily on programs that require user interaction/input.
(This article belongs to the Special Issue Software Reliability, Security and Quality Assurance)
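The mutation strategies the abstract lists (enlarging the input, truncating it, replacing numeric values) can be sketched as simple transformations on a sample input. The function names and strategy selection below are invented for illustration; JIMA-Fuzzing's real implementation is not shown in the abstract.

```python
import random
import re

def enlarge(data, factor=10):
    # blow up the input to probe buffer-size handling
    return data * factor

def truncate(data):
    # cut the input short to probe incomplete-input handling
    return data[: max(1, len(data) // 2)]

def replace_numbers(data, rng):
    # swap every integer literal for a boundary-ish value
    return re.sub(r"\d+", lambda m: str(rng.choice([0, -1, 2**31 - 1])), data)

def mutate(sample, rng):
    # pick one strategy at random and apply it to the sample input
    strategy = rng.choice([enlarge, truncate, lambda d: replace_numbers(d, rng)])
    return strategy(sample)

rng = random.Random(42)
fuzzed = replace_numbers("width=640 height=480", rng)
```

A grammar-aware fuzzer applies such mutations only to the grammatical slots it has identified (here, the numeric fields), so most generated inputs still pass the target program's parser.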

20 pages, 682 KiB  
Article
RbfCon: Construct Radial Basis Function Neural Networks with Grammatical Evolution
by Ioannis G. Tsoulos, Ioannis Varvaras and Vasileios Charilogis
Software 2024, 3(4), 549-568; https://doi.org/10.3390/software3040027 - 11 Dec 2024
Abstract
Radial basis function networks are a machine learning tool that can be applied to a wide range of classification and regression problems across various research topics of the modern world. However, in many cases, the initial training method used to fit the parameters of these models can produce poor results, either due to unstable numerical operations or its inability to effectively locate the lowest value of the error function. The current work proposes a novel method that constructs the architecture of this model and estimates the values of each of its parameters with the incorporation of Grammatical Evolution. The proposed method was coded in ANSI C++, and the produced software was tested for its effectiveness on a wide range of datasets. The experimental results confirmed the ability of the new method to solve difficult problems, and in the vast majority of cases, the error in classification or function approximation was significantly lower than when the original training method was applied.
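The model class the paper constructs is a standard Gaussian radial basis function network; its forward pass can be sketched as below. The centers, widths, and output weights here are fixed toy values, whereas RbfCon evolves the architecture and parameter values with Grammatical Evolution.

```python
import math

def rbf_forward(x, centers, widths, weights):
    # output = sum_i w_i * exp(-||x - c_i||^2 / (2 * sigma_i^2))
    total = 0.0
    for c, s, w in zip(centers, widths, weights):
        dist2 = sum((xj - cj) ** 2 for xj, cj in zip(x, c))
        total += w * math.exp(-dist2 / (2.0 * s * s))
    return total

# two Gaussian units on a 2-D input; all values are illustrative
centers = [[0.0, 0.0], [1.0, 1.0]]
widths  = [0.5, 0.5]
weights = [1.0, -1.0]
y = rbf_forward([0.0, 0.0], centers, widths, weights)
```

Classical training fixes the number of units up front and fits centers and weights numerically; evolving all three jointly is what lets the proposed method sidestep the unstable fitting step the abstract mentions.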

15 pages, 341 KiB  
Article
Implementing Mathematics of Arrays in Modern Fortran: Efficiency and Efficacy
by Arjen Markus and Lenore Mullin
Software 2024, 3(4), 534-548; https://doi.org/10.3390/software3040026 - 30 Nov 2024
Abstract
Mathematics of Arrays (MoA) concerns the formal description of algorithms working on arrays of data and their efficient and effective implementation in software and hardware. Since (multidimensional) arrays are one of the most important data structures in Fortran, as witnessed by the language's native support for them and the numerous operations and functions that take arrays as inputs and outputs, it is natural to examine how Fortran can be used as an implementation language for MoA. This article presents the first results, both in terms of code and of performance, regarding this union. It may serve as a basis for further research, both with respect to the formal theory of MoA and to improving the practical implementation of array-based algorithms.
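Central to MoA's copy-avoidance is an index calculus: a multidimensional index maps to a flat offset via the array's shape, so reshaping and slicing can be reasoned about without moving data. The sketch below shows the row-major version of this mapping; the function names are illustrative and not taken from the paper (and Fortran itself is column-major).

```python
def strides(shape):
    # row-major strides: the last axis varies fastest
    s = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        s[i] = s[i + 1] * shape[i + 1]
    return s

def ravel_index(idx, shape):
    # flat offset of a multidimensional index: dot(idx, strides)
    return sum(i * st for i, st in zip(idx, strides(shape)))

# element (1, 2) of a 3x4 array lives at flat offset 1*4 + 2 = 6
flat = ravel_index([1, 2], [3, 4])
```

Because every array operation reduces to arithmetic on such offsets, a compiler can fuse and reorder operations for a given memory hierarchy, which is the optimisation opportunity the cover story alludes to.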

20 pages, 1489 KiB  
Article
Analysing Quality Metrics and Automated Scoring of Code Reviews
by Owen Sortwell, David Cutting and Christine McConnellogue
Software 2024, 3(4), 514-533; https://doi.org/10.3390/software3040025 - 29 Nov 2024
Abstract
Code reviews are an important part of the software development process, and a wide variety of approaches is used to perform them. While it is generally agreed that code reviews are beneficial and result in higher-quality software, there has been little work investigating best practices and approaches, or exploring which factors impact code review quality. Our approach first analyses current best practices and procedures for undertaking code reviews, along with an examination of metrics often used to analyse a review’s quality and of current offerings for automated code review assessment. A maximum of one thousand code review comments per project were mined from GitHub pull requests across seven open-source projects that have previously been analysed in similar studies. Several identified metrics are tested across these projects using Python’s Natural Language Toolkit, including stop word ratio, overall sentiment, and detection of code snippets through the GitHub markdown language. Comparisons are drawn with regard to each project’s culture and the language used in the code review process, with pros and cons for each. The results show that the stop word ratio remained consistent across all projects, with only one project exceeding an average of 30%, and that the percentage of positive comments was also broadly similar across the projects. The suitability of these metrics is also discussed with regard to the creation of a scoring framework and the development of an automated code review analysis tool. We conclude that the software developed is an effective means of comparing practices and cultures across projects and can provide benefits by promoting a positive review culture within an organisation. However, rudimentary sentiment analysis and detection of GitHub code snippets may not be sufficient to assess a code review’s overall usefulness, as many terms that are important in a programmer’s lexicon, such as ‘error’ and ‘fail’, cause a review to be classed as negative. Code snippets included outside of the markdown language are also excluded from the analysis. Recommendations for future work are suggested, including the development of a more robust sentiment analysis system that can detect emotions such as frustration, and the creation of a programming dictionary to exclude programming terms from sentiment analysis.
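Two of the metrics the abstract names, stop word ratio and code snippet detection via GitHub markdown, can be sketched as below. The stop word list here is a small illustrative subset; the paper uses NLTK's full list and its sentiment tooling.

```python
import re

# truncated stop word list, for illustration only
STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "it", "this"}

def stop_word_ratio(comment):
    # fraction of a comment's words that are stop words
    words = re.findall(r"[a-z']+", comment.lower())
    if not words:
        return 0.0
    return sum(w in STOP_WORDS for w in words) / len(words)

def has_code_snippet(comment):
    # GitHub markdown: fenced blocks ```...``` or inline `code`
    return bool(re.search(r"```.*?```", comment, re.S)
                or re.search(r"`[^`]+`", comment))

ratio = stop_word_ratio("This is a null check, move it to the top of the function")
```

The limitation noted in the abstract is visible here: a snippet pasted without backticks never matches, and a word-list sentiment pass would flag "error" regardless of how constructively it is used.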

16 pages, 3578 KiB  
Article
Implementation and Performance Evaluation of Quantum Machine Learning Algorithms for Binary Classification
by Surajudeen Shina Ajibosin and Deniz Cetinkaya
Software 2024, 3(4), 498-513; https://doi.org/10.3390/software3040024 - 28 Nov 2024
Abstract
In this work, we studied the use of Quantum Machine Learning (QML) algorithms for binary classification and compared their performance with classical Machine Learning (ML) methods. QML merges principles of Quantum Computing (QC) and ML, offering improved efficiency and potential quantum advantage in data-driven tasks and when solving complex problems. In binary classification, where the goal is to assign data to one of two categories, QML uses quantum algorithms to process large datasets efficiently. Quantum algorithms like Quantum Support Vector Machines (QSVM) and Quantum Neural Networks (QNN) exploit quantum parallelism and entanglement to enhance performance over classical methods. This study focuses on two common QML algorithms, Quantum Support Vector Classifier (QSVC) and QNN. We used the Qiskit software and conducted the experiments with three different datasets. Data preprocessing included dimensionality reduction using Principal Component Analysis (PCA) and standardization using scalers. The results showed that quantum algorithms demonstrated competitive performance against their classical counterparts in terms of accuracy, while QSVC performed better than QNN. These findings suggest that QML holds potential for improving computational efficiency in binary classification tasks. This opens the way for more efficient and scalable solutions in complex classification challenges and shows the complementary role of quantum computing.
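The preprocessing the abstract describes, standardization followed by PCA, is what compresses the data onto the handful of features a small quantum circuit can encode. A condensed sketch (PCA via SVD on toy random data; the paper uses Qiskit and real datasets):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))          # 50 samples, 6 features (toy data)

# standardize: zero mean, unit variance per feature
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA to 2 components via SVD of the standardized data;
# rows of Vt are the principal directions
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
X2 = Xs @ Vt[:2].T                    # project onto the top-2 components
```

With two retained components, each sample can be fed to a two-qubit feature map, which is the usual reason PCA appears in QML pipelines.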

25 pages, 349 KiB  
Article
A Brief Overview of the Pawns Programming Language
by Lee Naish
Software 2024, 3(4), 473-497; https://doi.org/10.3390/software3040023 - 19 Nov 2024
Abstract
This paper describes the Pawns programming language, currently under development, which uses several novel features to combine the functional and imperative programming paradigms. It supports pure functional programming (including algebraic data types, higher-order programming and parametric polymorphism), where the representation of values need not be considered. It also supports lower-level C-like imperative programming with pointers and the destructive update of all fields of the structs used to represent the algebraic data types. All destructive update of variables is made obvious in Pawns code, via annotations on statements and in type signatures. Type signatures must also declare sharing between any arguments and result that may be updated. For example, if two arguments of a function are trees that share a subtree and the subtree is updated within the function, both variables must be annotated at that point in the code, and the sharing and update of both arguments must be declared in the type signature of the function. The compiler performs extensive sharing analysis to check that the declarations and annotations are correct. This analysis allows destructive update to be encapsulated: a function with no update annotations in its type signature is guaranteed to behave as a pure function, even though the value returned may have been constructed using destructive update within the function. Additionally, the sharing analysis helps support a constrained form of global variables that also allows destructive update to be encapsulated and safe update of variables with polymorphic types to be performed.

31 pages, 445 KiB  
Article
Software Development and Maintenance Effort Estimation Using Function Points and Simpler Functional Measures
by Luigi Lavazza, Angela Locoro and Roberto Meli
Software 2024, 3(4), 442-472; https://doi.org/10.3390/software3040022 - 29 Oct 2024
Abstract
Functional size measures are widely used for estimating software development effort. After the introduction of Function Points, a few “simplified” measures have been proposed, aiming to make measurement simpler and applicable when fully detailed software specifications are not yet available. However, some practitioners believe that, when considering “complex” projects, traditional Function Point measures support more accurate estimates than simpler functional size measures, which do not account for greater-than-average complexity. In this paper, we aim to produce evidence that confirms or disproves such a belief via an empirical study that separately analyzes projects that involved developments from scratch and extensions and modifications of existing software. Our analysis shows that there is no evidence that traditional Function Points are generally better at estimating more complex projects than simpler measures, although some differences appear in specific conditions. Another result of this study is that functional size metrics—both traditional and simplified—do not seem to effectively account for software complexity, as estimation accuracy decreases with increasing complexity, regardless of the functional size metric used. To improve effort estimation, researchers should look for a way of measuring software complexity that can be used in effort models together with (traditional or simplified) functional size measures.
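A traditional IFPUG-style unadjusted Function Point count is a weighted sum over five function types, which is the "more detailed" measurement the simplified variants try to avoid. The weights below are the standard average-complexity weights; the counts are invented for illustration.

```python
# IFPUG average-complexity weights per function type
AVG_WEIGHTS = {
    "EI": 4,    # external inputs
    "EO": 5,    # external outputs
    "EQ": 4,    # external inquiries
    "ILF": 10,  # internal logical files
    "EIF": 7,   # external interface files
}

def unadjusted_fp(counts):
    # unadjusted FP = sum over function types of (count * weight)
    return sum(AVG_WEIGHTS[t] * n for t, n in counts.items())

ufp = unadjusted_fp({"EI": 10, "EO": 6, "EQ": 4, "ILF": 3, "EIF": 2})
```

Simplified measures typically drop the per-item complexity classification (or even some function types) so that a count can be produced from early, coarse specifications, which is precisely the trade-off the paper's empirical study examines.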

31 pages, 1991 KiB  
Article
Opening Software Research Data 5Ws+1H
by Anastasia Terzi and Stamatia Bibi
Software 2024, 3(4), 411-441; https://doi.org/10.3390/software3040021 - 26 Sep 2024
Abstract
Open Science describes the movement of making any research artifact available to the public, fostering sharing and collaboration. While sharing source code is a popular Open Science practice in software research and development, there is still a lot of work to be done to achieve the openness of the whole research and development cycle, from the conception to the preservation phase. In this direction, the software engineering community faces significant challenges in adopting open science practices due to the complexity of the data, the heterogeneity of the development environments, and the diversity of the application domains. In this paper, through a discussion of the 5Ws+1H questions (Why, Who, What, When, Where, and How), known as Kipling’s framework, we aim to provide a structured guideline to motivate and assist the software engineering community on the journey to data openness. We also demonstrate the practical application of these guidelines through a use case on opening research data.
