Search Results (30)

Search Parameters:
Keywords = hardware performance counter

24 pages, 960 KB  
Article
Design of Constant Modulus Radar Waveform for PSD Matching Based on MM Algorithm
by Hao Zheng, Chaojie Qiu, Chenyu Liang and Junkun Yan
Remote Sens. 2025, 17(11), 1937; https://doi.org/10.3390/rs17111937 - 3 Jun 2025
Viewed by 420
Abstract
The power spectral density (PSD) shape of the transmit waveform plays an important role in several fields of radar, such as electronic counter-countermeasures (ECCM), target detection, and target classification. In addition, radar hardware generally requires the waveform to have constant modulus (CM) characteristics. Synthesizing a discrete-time CM waveform from a given PSD is therefore a significant problem, and several algorithms have been proposed in the existing literature to address it. In this paper, a novel algorithm based on the majorization–minimization (MM) framework is proposed to solve this problem. The proposed algorithm provably converges to a stationary point, and the error-reduction property holds without requiring the discrete Fourier transform (DFT) matrix to be unitary. To accelerate convergence, three acceleration schemes are developed for the proposed algorithm. Under a specific stopping condition, one of the proposed acceleration schemes is more computationally efficient than existing algorithms and more robust to the choice of initial point. Moreover, when the DFT matrix is not unitary, numerical results show that the proposed acceleration scheme achieves better matching performance than existing algorithms. Full article

16 pages, 3050 KB  
Article
Reliability Improvement of 28 nm Intel FPGA Ring Oscillator PUF for Chip Identification
by Zulfikar Zulfikar, Hubbul Walidainy, Aulia Rahman and Kahlil Muchtar
Cryptography 2025, 9(2), 36; https://doi.org/10.3390/cryptography9020036 - 29 May 2025
Viewed by 1132
Abstract
The Ring Oscillator Physical Unclonable Function (RO-PUF) is a hardware security primitive that creates a secure and distinct identifier by exploiting the unique physical properties of ring oscillators. Their unique responses, low hardware overhead, and difficulty of reproduction are among the security benefits that make them valuable in secure authentication systems. Numerous developments, such as temperature-compensation methods, aging mitigation, and improved architecture and layout, have been proposed to increase their security, reliability, and efficiency; however, these improvements usually come at the cost of additional complex circuitry. This work focuses on improving the stability, in terms of reliability, of the RO-PUF with enhanced challenge–response pairs (CRPs) by exploiting the existing on-chip hard processor. Only the ring oscillators and their counters are implemented inside the chip; the built-in processor performs the remaining processing through an intermediate Q-factor computation and a new frequency mapping. As a result, reliability improves significantly, to 95.8%, compared with previous methods. Proper use of the limited on-chip resources is emphasized by taking advantage of the hard processor present inside the new FPGA chip. Full article
(This article belongs to the Section Hardware Security)
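The reliability reported above (95.8%) is conventionally measured as the fraction of response bits that stay stable across repeated evaluations of the same chip. Below is a minimal sketch of that metric, assuming a simple pairwise frequency comparison for response-bit generation and made-up frequency values; the paper's actual Q-factor and frequency-mapping steps are not reproduced here.

```python
def ro_puf_response(freqs):
    """Derive one response bit per RO pair: 1 if the first oscillator
    in the pair is faster, else 0."""
    return [1 if freqs[i] > freqs[i + 1] else 0
            for i in range(0, len(freqs) - 1, 2)]

def reliability(reference, samples):
    """Average percentage of response bits matching the reference across
    repeated measurements (100% minus intra-chip Hamming distance)."""
    n = len(reference)
    total = 0.0
    for s in samples:
        matches = sum(r == b for r, b in zip(reference, s))
        total += matches / n
    return 100.0 * total / len(samples)

# Made-up pair frequencies (MHz); the last pair is nearly matched,
# so noise flips its bit between measurements.
ref = ro_puf_response([310.0, 305.0, 298.0, 301.0, 312.0, 311.9])
noisy = ro_puf_response([310.1, 304.9, 297.8, 301.2, 311.9, 312.0])
rel = reliability(ref, [ref, noisy])
```

Because the final oscillator pair is close in frequency, measurement noise flips its bit and reliability drops below 100%, which is exactly the instability the paper's method targets.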

15 pages, 2874 KB  
Article
Optimized Hybrid Central Processing Unit–Graphics Processing Unit Workflow for Accelerating Advanced Encryption Standard Encryption: Performance Evaluation and Computational Modeling
by Min Kyu Yang and Jae-Seung Jeong
Appl. Sci. 2025, 15(7), 3863; https://doi.org/10.3390/app15073863 - 1 Apr 2025
Cited by 1 | Viewed by 1116
Abstract
This study addresses the growing demand for scalable data encryption by evaluating the performance of AES (Advanced Encryption Standard) encryption and decryption using CBC (Cipher Block Chaining) and CTR (Counter Mode) modes across various CPU (Central Processing Unit) and GPU (Graphics Processing Unit) hardware models. The objective is to highlight GPU acceleration benefits and propose an optimized hybrid CPU–GPU workflow for large-scale data security. Methods include benchmarking encryption performance with provided data, mathematical models, and computational analysis. The results indicate significant performance gains with GPU acceleration, particularly for large datasets, and demonstrate that the hybrid CPU–GPU approach balances speed and resource utilization efficiently. Full article
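The GPU advantage in CTR mode follows from its structure: each keystream block depends only on the key, nonce, and block counter, so blocks can be computed independently and in parallel, whereas CBC encryption chains each block to the previous ciphertext and must run serially. Below is a toy sketch of that CTR structure, using a hash-based stand-in for the AES block function; this is an illustration only, not vetted cryptography, and real code would use an established AES library.

```python
import hashlib

BLOCK = 16  # bytes, matching the AES block size

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for AES-encrypting nonce||counter under the key.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()[:BLOCK]

def ctr_crypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same XOR operation. Each block's
    # keystream is independent, which is why CTR maps well onto many
    # GPU threads, unlike the serial chaining in CBC encryption.
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, i // BLOCK)
        chunk = data[i:i + BLOCK]
        out.extend(b ^ k for b, k in zip(chunk, ks))
    return bytes(out)

key, nonce = b"example-key-0001", b"nonce123"
msg = b"counter mode parallelizes across blocks"
ct = ctr_crypt(key, nonce, msg)
pt = ctr_crypt(key, nonce, ct)  # the same call decrypts
```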

18 pages, 8322 KB  
Article
Evaluating Large Language Model Application Impacts on Evasive Spectre Attack Detection
by Jiajia Jiao, Ling Jiang, Quan Zhou and Ran Wen
Electronics 2025, 14(7), 1384; https://doi.org/10.3390/electronics14071384 - 29 Mar 2025
Cited by 1 | Viewed by 579
Abstract
This paper investigates the impact of different Large Language Models (DeepSeek, Kimi and Doubao) on the attack detection success rate of evasive Spectre attacks while accessing text, image, and code tasks. By running different Large Language Models (LLMs) tasks concurrently with evasive Spectre attacks, a unique dataset with LLMs noise was constructed. Subsequently, clustering algorithms were employed to reduce the dimension of the data and filter out representative samples for the test set. Finally, based on a random forest detection model, the study systematically evaluated the impact of different task types on the attack detection success rate. The experimental results indicate that the attack detection success rate follows the pattern of “code > text > image” in both the evasive Spectre memory attack and the evasive Spectre nop attack. To further assess the influence of different architectures on evasive Spectre attacks, additional experiments were conducted on an NVIDIA RTX 3060 GPU. The results reveal that, on the RTX 3060, the attack detection success rate for code tasks decreased, while those for text and image tasks increased compared to the 2080 Ti. This finding suggests that architectural differences impact the manifestation of Hardware Performance Counters (HPCs), influencing the attack detection success rate. Full article

23 pages, 1784 KB  
Article
FPGA Implementation of Reaction Systems
by Zeyi Shang, Sergey Verlan, Jing Lu, Zhe Wei and Min Zhou
Electronics 2024, 13(24), 4929; https://doi.org/10.3390/electronics13244929 - 13 Dec 2024
Cited by 1 | Viewed by 962
Abstract
A reaction system (RS) is a qualitative computing model inspired by the biochemical reactions taking place inside biological cells. It concerns the interactions and causality among reactions rather than the concrete concentrations of chemical entities. Many biochemical processes and models can be represented as reaction systems, so that the complex relations and ultimate products of a variety of reactions can be revealed qualitatively. A reaction system works in parallel mode, and software simulation of such models can suffer the penalty of inefficient parallelism given the limited performance of CPUs/GPUs, especially for large-scale models. Considering the potential applications of reaction systems in disease diagnosis and drug development, hardware implementation provides a better way to accelerate the computations involved. In this paper, an FPGA implementation method for reaction systems, called RSFIM, is proposed. Two small-scale models, the reaction systems of intermediate-filament self-assembly and heat shock response, are implemented on FPGA, achieving a computing speed of 2×10^8 steps per second. For large-scale models, the ErbB reaction system is implemented, obtaining a speedup of 7.649×10^4 over its fastest GPU simulation to date. The reaction system binary counter, a quantitative model, is also implemented through a Boolean interpretation of the reaction system's qualitative character. FPGA implementation of reaction systems opens a novel research line for speeding up simulations of reaction systems and other biological models from the perspective of parallel digital circuits. Full article
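The semantics that RSFIM maps onto parallel circuits can be stated compactly: a reaction is a triple (reactants, inhibitors, products); it is enabled on a state when all its reactants are present and none of its inhibitors are; and the next state is the union of the products of all enabled reactions, which fire simultaneously. A minimal software sketch of one step, with a hypothetical two-reaction model:

```python
def rs_step(state, reactions):
    """One parallel step of a reaction system.
    Each reaction is (reactants, inhibitors, products) as frozensets."""
    result = set()
    for reactants, inhibitors, products in reactions:
        if reactants <= state and not (inhibitors & state):
            result |= products  # all enabled reactions fire in parallel
    return result

# Hypothetical example: 'a' produces 'b' unless 'b' inhibits it;
# 'b' unconditionally produces 'a'.
reactions = [
    (frozenset({"a"}), frozenset({"b"}), frozenset({"b"})),
    (frozenset({"b"}), frozenset(),      frozenset({"a"})),
]
s1 = rs_step({"a"}, reactions)  # only the first reaction is enabled
s2 = rs_step(s1, reactions)     # only the second reaction is enabled
```

Entities not produced by any enabled reaction simply vanish from the next state (no permanence), which is the defining qualitative feature of the model.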

22 pages, 3889 KB  
Article
Malware Classification Using Few-Shot Learning Approach
by Khalid Alfarsi, Saim Rasheed and Iftikhar Ahmad
Information 2024, 15(11), 722; https://doi.org/10.3390/info15110722 - 11 Nov 2024
Cited by 2 | Viewed by 2642
Abstract
Malware detection targeting the microarchitecture of processors has recently come to light as a potentially effective way to improve computer system security. Hardware Performance Counter data are used by machine learning algorithms in security mechanisms, such as hardware-based malware detection, to categorize and detect malware. It is crucial to determine whether or not a file contains malware: the rise in malware has caused businesses to lose vital data and face other issues, and malware can quickly inflict substantial damage on a system by slowing it down and encrypting large amounts of data on a personal computer. This study provides extensive details on a flexible framework for machine learning and deep learning techniques using few-shot learning. Malware detection is performed using DT, RF, LR, SVM, and FSL techniques; these algorithms make it simple to differentiate between malware-free and malicious files, with the goal of reducing the number of false positives. Two different datasets from an online platform are used, and the main focus of this work is few-shot learning techniques. The proposed model achieves a 97% accuracy rate, which is much higher than that of the other techniques. Full article
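The few-shot idea can be illustrated with a nearest-centroid (prototype) classifier: a handful of labeled HPC feature vectors per class define class prototypes, and a new sample takes the label of the closest prototype. This is a generic sketch of the FSL principle with invented feature values, not the paper's actual model or datasets:

```python
import math

def prototypes(support):
    """support: {label: [feature vectors]} -> {label: mean vector}."""
    protos = {}
    for label, vecs in support.items():
        n = len(vecs)
        protos[label] = [sum(v[i] for v in vecs) / n
                         for i in range(len(vecs[0]))]
    return protos

def classify(x, protos):
    """Assign x the label of the nearest prototype (Euclidean distance)."""
    return min(protos, key=lambda lbl: math.dist(x, protos[lbl]))

# Invented HPC features: [cache-miss rate, branch-miss rate, IPC].
# Only two "shots" per class, the few-shot regime.
support = {
    "benign":  [[0.02, 0.010, 1.8], [0.03, 0.012, 1.7]],
    "malware": [[0.20, 0.002, 0.9], [0.24, 0.003, 0.8]],
}
protos = prototypes(support)
label = classify([0.22, 0.002, 0.85], protos)
```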

19 pages, 2027 KB  
Article
T-Smade: A Two-Stage Smart Detector for Evasive Spectre Attacks Under Various Workloads
by Jiajia Jiao, Ran Wen and Yulian Li
Electronics 2024, 13(20), 4090; https://doi.org/10.3390/electronics13204090 - 17 Oct 2024
Cited by 3 | Viewed by 1333
Abstract
Evasive Spectre attacks insert additional nop or memory-delay instructions to degrade the attack detection success rate of otherwise effective hardware performance counter based detectors. Interestingly, detection performance worsens under different workloads: the attack detection success rate is only 59.8% for realistic applications, and much lower, 27.52%, under a memory stress test. This paper therefore proposes T-Smade, a two-stage smart detector designed for evasive Spectre attacks (e.g., evasive Spectre nop and evasive Spectre memory) under various workloads. T-Smade uses a first-stage detector to identify the type of workload and then selects the appropriate second-stage detector, which uses four hardware performance counter events to characterize the high cache miss rate and low branch miss rate of Spectre attacks. More importantly, the second-stage detector adds a dimension that reuses the cache miss rate and branch miss rate to exploit the characteristics of various workloads and detect evasive Spectre attacks effectively. Furthermore, to generalize well to unseen evasive Spectre attacks, the proposed classification detector is trained on raw data from Spectre attacks and non-attacks under different workloads using simple multi-layer perceptron models. Comprehensive results demonstrate that T-Smade raises the average attack detection success rate for evasive Spectre nop under different workloads from 27.52% to 95.42%, and for evasive Spectre memory from 59.8% up to 100%. Full article
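The two-stage structure can be sketched as follows: stage one identifies the workload type, and that choice selects a workload-specific stage-two detector over HPC features such as cache miss rate and branch miss rate. The thresholds and samples below are illustrative placeholders only; the paper trains multi-layer perceptron models rather than fixed thresholds.

```python
def stage1_workload(sample):
    """Illustrative workload identification from an HPC feature."""
    return "memory-stress" if sample["mem_bandwidth"] > 0.7 else "realistic"

def make_stage2(cache_thr, branch_thr):
    """Detector tuned per workload: Spectre-like execution shows a high
    cache miss rate combined with a low branch miss rate."""
    def detect(sample):
        return (sample["cache_miss"] > cache_thr
                and sample["branch_miss"] < branch_thr)
    return detect

# Stage-two detectors with thresholds adapted to each workload type
# (placeholder values, chosen for illustration).
DETECTORS = {
    "realistic":     make_stage2(cache_thr=0.15, branch_thr=0.02),
    "memory-stress": make_stage2(cache_thr=0.40, branch_thr=0.02),
}

def two_stage_detect(sample):
    return DETECTORS[stage1_workload(sample)](sample)

attack = {"mem_bandwidth": 0.9, "cache_miss": 0.55, "branch_miss": 0.005}
benign = {"mem_bandwidth": 0.9, "cache_miss": 0.35, "branch_miss": 0.030}
```

The point of the first stage is visible in the benign sample: under memory stress a 0.35 cache miss rate is normal, so the raised threshold avoids the false positive a single fixed-threshold detector would produce.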

23 pages, 4213 KB  
Article
Leveraging Bit-Serial Architectures for Hardware-Oriented Deep Learning Accelerators with Column-Buffering Dataflow
by Xiaoshu Cheng, Yiwen Wang, Weiran Ding, Hongfei Lou and Ping Li
Electronics 2024, 13(7), 1217; https://doi.org/10.3390/electronics13071217 - 26 Mar 2024
Cited by 4 | Viewed by 2676
Abstract
Bit-serial neural network accelerators address the growing need for compact and energy-efficient deep learning tools. Traditional neural network accelerators, while effective, often grapple with issues of size, power consumption, and versatility in handling a variety of computational tasks. To counter these challenges, this paper introduces an approach that hinges on the integration of bit-serial processing with advanced dataflow techniques and architectural optimizations. Central to this approach is a column-buffering (CB) dataflow, which significantly reduces access and movement requirements for the input feature map (IFM), thereby enhancing efficiency. Moreover, a simplified quantization process effectively eliminates biases, streamlining the overall computation process. Furthermore, this paper presents a meticulously designed LeNet-5 accelerator leveraging a convolutional layer processing element array (CL PEA) architecture incorporating an improved bit-serial multiply–accumulate unit (MAC). Empirically, our work demonstrates superior performance in terms of frequency, chip area, and power consumption compared to current state-of-the-art ASIC designs. Specifically, our design utilizes fewer hardware resources to implement a complete accelerator, achieving a high performance of 7.87 GOPS on a Xilinx Kintex-7 FPGA with a brief processing time of 284.13 μs. The results affirm that our design is exceptionally suited for applications requiring compact, low-power, and real-time solutions. Full article
(This article belongs to the Section Artificial Intelligence Circuits and Systems (AICAS))
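A bit-serial MAC consumes one operand bit per cycle: for each activation bit (LSB first), it conditionally adds the weight, shifted by the bit position, into the accumulator, so an N-bit multiply takes N cycles but needs only an adder and a shifter. A behavioral sketch of that shift-add loop for unsigned operands (the column-buffering dataflow and quantization details are outside this snippet):

```python
def bit_serial_mac(acc, activation, weight, bits=8):
    """Accumulate activation * weight into acc, processing one
    activation bit per 'cycle' (LSB first), using only shift and add."""
    for cycle in range(bits):
        if (activation >> cycle) & 1:
            acc += weight << cycle
    return acc

acc = bit_serial_mac(0, 13, 11)   # 13 * 11 = 143
acc = bit_serial_mac(acc, 5, 7)   # accumulate 5 * 7 = 35
```

Signed operands would additionally need two's-complement handling (subtracting rather than adding the contribution of the sign bit), which hardware implementations fold into the same adder.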

22 pages, 1848 KB  
Review
GNSS Carrier-Phase Multipath Modeling and Correction: A Review and Prospect of Data Processing Methods
by Qiuzhao Zhang, Longqiang Zhang, Ao Sun, Xiaolin Meng, Dongsheng Zhao and Craig Hancock
Remote Sens. 2024, 16(1), 189; https://doi.org/10.3390/rs16010189 - 2 Jan 2024
Cited by 14 | Viewed by 7792
Abstract
A multipath error is one of the main sources of GNSS positioning errors. It cannot be eliminated by forming double-difference and other methods, and it has become an issue in GNSS positioning error processing, because it is mainly related to the surrounding environment of the station. To address multipath errors, three main mitigation strategies are employed: site selection, hardware enhancements, and data processing. Among these, data processing methods have been a focal point of research due to their cost-effectiveness, impressive performance, and widespread applicability. This paper focuses on the review of data processing mitigation methods for GNSS carrier-phase multipath errors. The paper begins by elucidating the origins and mitigation strategies of multipath errors. Subsequently, it reviews the current research status pertaining to data processing methods using stochastic and functional models to counter multipath errors. The paper also provides an overview of filtering techniques for extracting multipath error models from coordinate sequences or observations. Additionally, it introduces the evolution and algorithmic workflow of sidereal filtering (SF) and multipath hemispherical mapping (MHM), from both coordinate and observation domain perspectives. Furthermore, the paper emphasizes the practical significance and research relevance of multipath error processing. It concludes by delineating future research directions in the realm of multipath error mitigation. Full article
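Coordinate-domain sidereal filtering (SF), one of the techniques reviewed above, exploits the near-repetition of satellite geometry after roughly one sidereal day (about 86,164 s): the multipath signature in the previous day's coordinate residuals is shifted by that lag and subtracted from the current day's series. A minimal sketch with synthetic, noise-free data; real implementations estimate the exact orbit-repeat lag rather than assuming a fixed one.

```python
SIDEREAL_SHIFT = 4  # in samples; stands in for ~86,164 s at the data rate

def sidereal_filter(day1_residuals, day2_series, shift):
    """Subtract the day-1 multipath template, shifted by the sidereal
    lag, from the day-2 coordinate series."""
    n = len(day1_residuals)
    return [d2 - day1_residuals[(i - shift) % n]
            for i, d2 in enumerate(day2_series)]

# Synthetic multipath pattern that repeats one sidereal day later.
template = [0.0, 2.0, -1.0, 0.5, 0.0, 1.5, -0.5, 0.25]
day1 = template[:]
day2 = [template[(i - SIDEREAL_SHIFT) % len(template)]
        for i in range(len(template))]
corrected = sidereal_filter(day1, day2, SIDEREAL_SHIFT)
```

With noise-free repeating multipath the correction is exact; in practice residual noise and imperfect geometry repetition limit how much of the error SF removes, which motivates the MHM-style mapping methods also reviewed here.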

17 pages, 1006 KB  
Article
Efficient Integrity-Tree Structure for Convolutional Neural Networks through Frequent Counter Overflow Prevention in Secure Memories
by Jesung Kim, Wonyoung Lee, Jeongkyu Hong and Soontae Kim
Sensors 2022, 22(22), 8762; https://doi.org/10.3390/s22228762 - 13 Nov 2022
Viewed by 2098
Abstract
Advancements in convolutional neural networks (CNNs) have resulted in remarkable success in various computing fields. However, protecting data against external security attacks has become increasingly important because the inference process in CNNs exploits sensitive data. Secure memory is a hardware-based protection technique that can protect the sensitive data of CNNs, but naively applying it to a CNN application causes significant performance and energy overhead. Furthermore, ensuring secure memory becomes more difficult in environments that require area efficiency and low-power execution, such as the Internet of Things (IoT). In this paper, we investigated memory access patterns for CNN workloads and analyzed their effects on secure memory performance. According to our observations, most CNN workloads write intensively to narrow memory regions, which can cause a considerable number of counter overflows: on average, 87.6% of total writes occur in 6.8% of the allocated memory space, and in the extreme case, 93.9% of total writes occur in 1.4% of it. Based on these observations, we propose an efficient integrity-tree structure called Countermark-tree that is suitable for CNN workloads. The proposed technique reduces overall energy consumption by 48%, improves performance by 11.2% compared to VAULT-128, and requires an integrity-tree size similar to that of VAULT-64, a state-of-the-art technique. Full article
(This article belongs to the Special Issue Energy Management System for Internet of Things)

15 pages, 390 KB  
Article
Cross-World Covert Channel on ARM Trustzone through PMU
by Xinyao Li and Akhilesh Tyagi
Sensors 2022, 22(19), 7354; https://doi.org/10.3390/s22197354 - 28 Sep 2022
Cited by 4 | Viewed by 2326
Abstract
The TrustZone technology is incorporated in a majority of recent ARM Cortex-A and Cortex-M processors widely deployed in the IoT world. Security-critical code executing inside a so-called secure world is isolated from the rest of the application executing within a normal world, providing a hardware-isolated area in the processor, a trusted execution environment (TEE), for sensitive data and code. This paper demonstrates a vulnerability in the secure world in the form of a cross-world (secure world to normal world) covert channel. Performance counters, i.e., Performance Monitoring Unit (PMU) events, are used to convey information from the secure world to the normal world: an encoding program generates an appropriate PMU event footprint given a secret S, and a corresponding decoding program reads the PMU footprint and infers S using machine learning (ML). The ML model can be trained entirely on data collected from the PMU in user space. Lack of synchronization between PMU start and PMU read adds noise to the encoding/decoding models; to account for this noise, this study proposes three synchronization capabilities between the client and trusted applications: synchronous, semi-synchronous, and asynchronous. Previously proposed PMU-based covert channels deploy L1 and LLC cache PMU events, whose latencies of 100-1000 cycles limit channel bandwidth. We propose instead to encode with microarchitecture-level events with latencies of 10-100 cycles, captured through the PMU, for a potential 100× higher bandwidth. A series of experiments evaluates the proposed covert channels under the various synchronization models on a TrustZone-supported Cortex-A processor using the OP-TEE framework. The switch from signaling based on PMU cache events to PMU microarchitectural events yields approximately 15× higher covert channel bandwidth: the proposed finer-grained microarchitecture event encoding achieves throughput on the order of 11 Kbits/s, compared with roughly 760 bits/s in previous work. Full article
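The encode/decode principle can be simulated without hardware: the encoder performs a secret-bit-dependent amount of work that inflates a chosen event count, and the decoder thresholds the observed counts. The simulation below uses a made-up noise model standing in for unsynchronized PMU reads; a real channel reads genuine PMU registers and, as in the paper, may need an ML decoder when the noise overlaps the signal levels.

```python
import random

random.seed(7)  # deterministic demo

def encode(bit):
    """Secret-dependent event footprint: bit=1 triggers extra events
    (e.g., extra instructions or micro-ops) on top of a baseline."""
    base = 100
    extra = 400 if bit else 0
    noise = random.randint(-30, 30)  # models unsynchronized PMU reads
    return base + extra + noise

def decode(count, threshold=300):
    """Simple threshold decoder; an ML model replaces this when the
    noise is too large for a fixed threshold."""
    return 1 if count > threshold else 0

secret = [1, 0, 1, 1, 0, 0, 1, 0]
counts = [encode(b) for b in secret]
recovered = [decode(c) for c in counts]
```

Here the signal separation (400 events) dwarfs the noise (±30), so the threshold decoder recovers every bit; shrinking that margin is what forces the ML-based decoding the paper describes.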

17 pages, 3942 KB  
Article
BIOS-Based Server Intelligent Optimization
by Xianxian Qi, Jianfeng Yang, Yiyang Zhang and Baonan Xiao
Sensors 2022, 22(18), 6730; https://doi.org/10.3390/s22186730 - 6 Sep 2022
Cited by 2 | Viewed by 2044
Abstract
Servers are the infrastructure of enterprise applications, and improving server performance under fixed hardware resources is an important issue. Performance tuning at the application layer is common, but it is not systematic and requires prior knowledge of the running application. Some works perform tuning by dynamically adjusting the hardware prefetching configuration with a predictive model. Similarly, we design a BIOS (Basic Input/Output System)-based dynamic tuning framework for a Taishan 2280 server, comprising dynamic identification and static optimization. We simulate five workload scenarios (CPU-intensive, etc.) with benchmark tools and perform scenario recognition dynamically using performance monitor counters (PMCs). The adjustable configuration space provided by the Kunpeng processor reaches 2^N (N > 100); we therefore propose a joint BIOS optimization algorithm using a deep Q-network. Configuration optimization is modeled as a Markov decision process that starts from a feasible solution and improves it gradually; to strengthen continuous optimization, a state-machine-controlled neighborhood search method is added. To assess its performance, we compare our algorithm with a genetic algorithm and particle swarm optimization. Our algorithm improves performance by up to 1.10× over the experience-based configuration and performs better at reducing the probability of server downtime. The dynamic tuning framework in this paper is extensible, can be trained to adapt to different scenarios, and is well suited to servers with many adjustable configurations. Compared with heuristic intelligent search algorithms, the proposed joint BIOS optimization algorithm generates fewer infeasible solutions and is less easily disturbed by initialization. Full article
(This article belongs to the Section Sensor Networks)

17 pages, 6762 KB  
Article
Designing a Custom CPU Architecture Based on Hardware RTOS and Dynamic Preemptive Scheduler
by Ionel Zagan and Vasile Gheorghiță Găitan
Mathematics 2022, 10(15), 2637; https://doi.org/10.3390/math10152637 - 27 Jul 2022
Cited by 4 | Viewed by 3942
Abstract
The current trend in real-time operating systems involves executing many tasks using a limited hardware platform. Thus, a single processor system has to execute multiple tasks with different priorities in different real-time system (RTS) work modes. Hardware schedulers can greatly reduce event trigger latency and successfully remove most of the scheduling overhead, providing more computing cycles for applications. In this paper, we present a hardware-accelerated RTOS based on the replication of resources such as program counters, general purpose registers (GPRs) and pipeline registers. The implementation of this new concept, based on real-time event handling implemented in hardware, is intended to meet the current rigorous requirements imposed by critical real-time systems. The most important attribute of this FPGA implementation is the time required for task context switching, which is only one clock cycle or three clock cycles when working with the atomic instructions used in the case of inter-task synchronization and communication mechanisms. The main contribution of this article is its focus on mutexes and the speed of response associated with related events. Thus, fast switching between threads is also validated, considering the handling of events in the hardware using HW_nMPRA_RTOS (HW-RTOS). The proposed architecture implements inter-task synchronization and communication mechanisms with high performance, improving the overall response time when the mutex or message is expected to relate to a higher-priority task. Full article
(This article belongs to the Special Issue Numerical Methods in Real-Time and Embedded Systems)

20 pages, 871 KB  
Article
Efficient Intersection Management Based on an Adaptive Fuzzy-Logic Traffic Signal
by Victor Manuel Madrigal Arteaga, José Roberto Pérez Cruz, Antonio Hurtado-Beltrán and Jan Trumpold
Appl. Sci. 2022, 12(12), 6024; https://doi.org/10.3390/app12126024 - 14 Jun 2022
Cited by 15 | Viewed by 3538
Abstract
Traffic signals may generate bottlenecks due to an unfair timing balance. Facing this problem, adaptive traffic signal controllers have been proposed to compute the phase durations according to conditions monitored from on-road sensors. However, high hardware requirements, as well as complex setups, make the majority of these approaches infeasible for most cities. This paper proposes an adaptive traffic signal fuzzy-logic controller which uses the flow rate, retrieved from simple traffic counters, as a unique input requirement. The controller dynamically computes the cycle duration according to the arrival flow rates, executing a fuzzy inference system guided by the reasoning: the higher the traffic flow, the longer the cycle length. The computed cycle is split into different phases proportionally to the arrival flow rates according to Webster’s method for signalization. Consequently, the controller only requires determining minimum/maximum flow rates and cycle lengths to establish if–then mappings, allowing the reduction of technical requirements and computational overhead. The controller was tested through a microsimulation model of a real isolated intersection, which was calibrated with data collected from a six-month traffic study. Results revealed that the proposed controller with fewer input requirements and lower computational costs has a competitive performance compared to the best and most used approaches, being a feasible solution for many cities. Full article
(This article belongs to the Section Transportation and Future Mobility)
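The split step follows Webster's proportioning: once a cycle length is chosen, the green time remaining after lost time is divided among the phases in proportion to their arrival flows. In the sketch below, the fuzzy inference that picks the cycle length is replaced by a simple linear interpolation between the configured minimum and maximum, an assumption for illustration of the "higher flow, longer cycle" reasoning:

```python
def cycle_length(flow, flow_min, flow_max, c_min, c_max):
    """Stand-in for the fuzzy rule base: the higher the total flow,
    the longer the cycle, clamped to [c_min, c_max]."""
    t = max(0.0, min(1.0, (flow - flow_min) / (flow_max - flow_min)))
    return c_min + t * (c_max - c_min)

def webster_split(cycle, lost_time, phase_flows):
    """Split the effective green time proportionally to the per-phase
    arrival flows (Webster's proportioning)."""
    green_total = cycle - lost_time
    total_flow = sum(phase_flows)
    return [green_total * f / total_flow for f in phase_flows]

# Hypothetical intersection: 900 veh/h total, two phases at 600 and 300.
C = cycle_length(flow=900, flow_min=300, flow_max=1500, c_min=40, c_max=120)
greens = webster_split(C, lost_time=10, phase_flows=[600, 300])
```

The only inputs are flow rates and the configured min/max bounds, which mirrors the paper's point that simple traffic counters suffice.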

16 pages, 8774 KB  
Article
KMC3 and CHTKC: Best Scenarios, Deficiencies, and Challenges in High-Throughput Sequencing Data Analysis
by Deyou Tang, Daqiang Tan, Weihao Xiao, Jiabin Lin and Juan Fu
Algorithms 2022, 15(4), 107; https://doi.org/10.3390/a15040107 - 24 Mar 2022
Viewed by 3272
Abstract
Background: K-mer frequency counting is an upstream process in many bioinformatics data analysis workflows. KMC3 and CHTKC are representative partition-based and non-partition-based k-mer counting algorithms, respectively. This paper evaluates the two algorithms and presents their best applicable scenarios and potential improvements using multiple hardware contexts and datasets. Results: KMC3 uses less memory and runs faster than CHTKC on a regularly configured server. CHTKC is efficient on high-performance computing platforms with abundant memory, many threads, and low IO bandwidth. Tested across various datasets, KMC3 is less sensitive to the number of distinct k-mers and is more efficient for tasks with relatively low sequencing quality and long k-mers. CHTKC performs better than KMC3 on counting assignments with large-scale datasets, high sequencing quality, and short k-mers. Both algorithms are affected by IO bandwidth, and reducing the influence of the IO bottleneck is critical, as our tests show improvement from filtering and compressing consecutive first-occurring k-mers in KMC3. Conclusions: KMC3 is more competitive for counting on ordinary hardware resources, and CHTKC is more competitive for counting k-mers in super-scale datasets on higher-performance computing platforms. Reducing the influence of the IO bottleneck is essential for optimizing k-mer counting algorithms, and filtering and compressing low-frequency k-mers is critical to relieving the IO impact. Full article
(This article belongs to the Special Issue Performance Optimization and Performance Evaluation)
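K-mer counting itself is simple to state: slide a window of length k along each read and tally every substring seen; KMC3 and CHTKC differ in how they partition and store that tally at scale. A minimal in-memory sketch using a hash table, the structure that a non-partitioned counter such as CHTKC scales up:

```python
from collections import Counter

def count_kmers(reads, k):
    """Tally all length-k substrings across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

reads = ["ACGTACGT", "CGTACG"]
kmers = count_kmers(reads, k=3)
```

A real counter additionally canonicalizes each k-mer against its reverse complement and spills partitions to disk when the table exceeds memory, which is where the IO-bandwidth sensitivity discussed above comes from.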
