Search Results (114)

Search Parameters:
Keywords = memory compilation

26 pages, 6665 KB  
Article
Using Entity-Aware LSTM to Enhance Streamflow Predictions in Transboundary and Large Lake Basins
by Yunsu Park, Xiaofeng Liu, Yuyue Zhu and Yi Hong
Hydrology 2025, 12(10), 261; https://doi.org/10.3390/hydrology12100261 - 2 Oct 2025
Viewed by 272
Abstract
Hydrological simulation of large, transboundary water systems like the Laurentian Great Lakes remains challenging. Although deep learning has advanced hydrologic forecasting, prior efforts are fragmented, lacking a unified basin-wide model for daily streamflow. We address this gap by developing a single Entity-Aware Long Short-Term Memory (EA-LSTM) model, an architecture that distinctly processes static catchment attributes and dynamic meteorological forcings, trained without basin-specific calibration. We compile a cross-border dataset integrating daily meteorological forcings, static catchment attributes, and observed streamflow for 975 sub-basins across the United States and Canada (1980–2023). With a temporal training/testing split, the unified EA-LSTM attains a median Nash–Sutcliffe Efficiency (NSE) of 0.685 and a median Kling–Gupta Efficiency (KGE) of 0.678 in validation, substantially exceeding a standard LSTM (median NSE 0.567, KGE 0.555) and the operational NOAA National Water Model (median NSE 0.209, KGE 0.440). Although skill is reduced in the smallest basins (median NSE 0.554) and during high-flow events (median PBIAS −29.6%), the performance is robust across diverse hydroclimatic settings. These results demonstrate that a single, calibration-free deep learning model can provide accurate, scalable streamflow prediction across an international basin, offering a practical path toward unified forecasting for the Great Lakes and a transferable framework for other large, data-sparse watersheds.
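For readers unfamiliar with the two skill scores quoted in this abstract, the following is a minimal sketch of how NSE and KGE are commonly computed. It assumes the original 2009 formulation of KGE (correlation, variability ratio, bias ratio); the abstract does not say which KGE variant the authors used, and the function names are illustrative only.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def kge(obs, sim):
    """Kling-Gupta Efficiency (2009 form, assumed here): 1 is a perfect fit."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim)) / n
    r = cov / (std_o * std_s)   # linear correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both scores equal 1 for a perfect simulation, and a simulation that always returns the observed mean has an NSE of 0, which is why the reported medians of 0.685 (EA-LSTM) and 0.209 (National Water Model) are directly comparable.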

18 pages, 960 KB  
Article
Fus: Combining Semantic and Structural Graph Information for Binary Code Similarity Detection
by Yanlin Li, Taiyan Wang, Lu Yu and Zulie Pan
Electronics 2025, 14(19), 3781; https://doi.org/10.3390/electronics14193781 - 24 Sep 2025
Viewed by 223
Abstract
Binary code similarity detection (BCSD) plays an important role in software security. Recent deep learning-based methods have made great progress. Existing methods based on a single feature, such as semantics or graph structure, struggle to handle changes caused by the architecture or compilation environment. Methods fusing semantics and graph structure suffer from insufficient learning of the function, resulting in low accuracy and robustness. To address this issue, we propose Fus, a method that integrates semantic information from the pseudo-C code and structural features from the Abstract Syntax Tree (AST). The pseudo-C code and AST are robust against compilation and architectural changes and can represent the function well. Our approach consists of three steps. First, we preprocess the assembly code to obtain the pseudo-C code and AST for each function. Second, we employ a Siamese network with CodeBERT models to extract semantic embeddings from the pseudo-C code and Tree-Structured Long Short-Term Memory (Tree LSTM) to encode the AST. Finally, function similarity is computed by summing the respective semantic and structural similarities. The evaluation results show that our method outperforms the state-of-the-art methods in most scenarios, and its performance is particularly strong in large-scale scenarios. In the vulnerability search task, Fus achieves the highest recall. These results demonstrate the accuracy and robustness of our method.
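The final step described in this abstract, summing a semantic similarity and a structural similarity, can be sketched as follows. This is only an illustration: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the embedding vectors stand in for the CodeBERT and Tree-LSTM outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def fused_similarity(sem_a, sem_b, struct_a, struct_b):
    """Sum of semantic and structural similarity (cosine is an assumed choice).

    sem_*    : hypothetical semantic embeddings of two functions
    struct_* : hypothetical structural (AST) embeddings of the same functions
    """
    return cosine(sem_a, sem_b) + cosine(struct_a, struct_b)
```

With cosine similarity the fused score ranges from -2 to 2, so two functions rank as most similar only when both the semantic and the structural views agree.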

23 pages, 1764 KB  
Article
Parallelization of the Koopman Operator Based on CUDA and Its Application in Multidimensional Flight Trajectory Prediction
by Jing Lu, Lulu Wang and Zeyi Shang
Electronics 2025, 14(18), 3609; https://doi.org/10.3390/electronics14183609 - 11 Sep 2025
Viewed by 410
Abstract
This paper introduces a parallelized approach to reconstruct Koopman computational graphs from the perspective of parallel computing to address the computational efficiency bottleneck in approximating Koopman operators within high-dimensional spaces. We propose the KPA (Koopman Parallel Accelerator), a parallelized algorithm that restructures the Koopman computational workflow to transform sequential time-step computations into parallel tasks. KPA leverages GPU parallelism to improve execution efficiency without compromising model accuracy. To validate the algorithm’s effectiveness, we apply KPA to a flight trajectory prediction scenario based on the Koopman operator. Within the CUDA kernel implementation of KPA, several optimization techniques, such as shared memory, tiling, double buffering, and data prefetching, are employed. We compare our implementation against two baselines: the original Koopman neural operator for trajectory prediction implemented in TensorFlow (TF-baseline) and its XLA-compiled variant (TF-XLA). The experimental results demonstrate that KPA achieves a 2.47× speedup over TF-baseline and a 1.09× speedup over TF-XLA when predicting a 1422-dimensional flight trajectory. Additionally, an ablation study on block size and the number of streaming multiprocessors (SMs) reveals that the best performance is obtained with a block size of 16 × 16 and SM = 8. The results demonstrate that KPA can significantly accelerate Koopman operator computations, making it suitable for high-dimensional, large-scale, or real-time applications.

16 pages, 480 KB  
Study Protocol
A Cognitive Training Programme on Cancer-Related Cognitive Impairment (CRCI) in Breast Cancer Patients Undergoing Active Treatment: A RCT Study Protocol
by Samuel Jiménez Sánchez, Celia Sánchez Gómez, Susana Sáez Gutiérrez, Sara Jiménez García-Tizón, Juan Luis Sánchez González, María Isabel Rihuete Galve, Emilio Fonseca Sánchez and Eduardo José Fernández Rodríguez
J. Clin. Med. 2025, 14(14), 5047; https://doi.org/10.3390/jcm14145047 - 16 Jul 2025
Viewed by 850
Abstract
Background: In light of increasing breast cancer survival rates, it is essential to address cancer-related cognitive impairment (CRCI), a common yet often underestimated symptom. Methods: A randomised controlled trial is proposed involving 50 newly diagnosed participants, divided into a control group (CG) and an intervention group (IG). Both groups will receive an educational leaflet, while the IG will also take part in an individualised cognitive training programme based on everyday cognition (80 sessions distributed across four periods, compiled in a training dossier). Cognitive, emotional, and functional variables will be assessed before and after the intervention: cognitive function (MoCA test), everyday cognition (PECC), anxiety (Hamilton), functionality (LB), sleep quality (PSQI), quality of life (ECOG), and subjective memory complaints (FACT-COG). Expected results: Findings may guide future interventions and tailored protocols to alleviate CRCI in breast cancer patients undergoing active treatment. Ethics and dissemination: This study was approved by the Ethics Committee of the University of Salamanca (PI 2023 12 1478-TD).
(This article belongs to the Section Obstetrics & Gynecology)

14 pages, 586 KB  
Review
Cues of Trained Immunity in Multiple Sclerosis Macrophages
by Elisa Popa, Hélène Cheval and Violetta Zujovic
Cells 2025, 14(14), 1054; https://doi.org/10.3390/cells14141054 - 10 Jul 2025
Viewed by 984
Abstract
Multiple sclerosis (MS) is a complex autoimmune disease with both genetic and environmental influences, yet its underlying mechanisms remain only partially understood. In this review, we compile evidence suggesting that trained immunity, a form of innate immune memory, may play a crucial role in the autoimmune component of MS. By examining key findings from immunology, neuroinflammation, and MS pathophysiology, we explore how innate immune cells, particularly monocytes and macrophages, could contribute to disease onset and progression through persistent pro-inflammatory responses. Understanding the impact of trained immunity in MS could open new avenues for therapeutic strategies targeting the innate immune system.
(This article belongs to the Section Cells of the Nervous System)

54 pages, 2065 KB  
Review
Edge Intelligence: A Review of Deep Neural Network Inference in Resource-Limited Environments
by Dat Ngo, Hyun-Cheol Park and Bongsoon Kang
Electronics 2025, 14(12), 2495; https://doi.org/10.3390/electronics14122495 - 19 Jun 2025
Cited by 2 | Viewed by 3988
Abstract
Deploying deep neural networks (DNNs) in resource-limited environments, such as smartwatches, IoT nodes, and intelligent sensors, poses significant challenges due to constraints in memory, computing power, and energy budgets. This paper presents a comprehensive review of recent advances in accelerating DNN inference on edge platforms, with a focus on model compression, compiler optimizations, and hardware–software co-design. We analyze the trade-offs between latency, energy, and accuracy across various techniques, highlighting practical deployment strategies on real-world devices. In particular, we categorize existing frameworks based on their architectural targets and adaptation mechanisms and discuss open challenges such as runtime adaptability and hardware-aware scheduling. This review aims to guide the development of efficient and scalable edge intelligence solutions.

28 pages, 9320 KB  
Article
Embedded Sensor Data Fusion and TinyML for Real-Time Remaining Useful Life Estimation of UAV Li Polymer Batteries
by Jutarut Chaoraingern and Arjin Numsomran
Sensors 2025, 25(12), 3810; https://doi.org/10.3390/s25123810 - 18 Jun 2025
Cited by 2 | Viewed by 1178
Abstract
The accurate real-time estimation of the remaining useful life (RUL) of lithium-polymer (LiPo) batteries is a critical enabler for ensuring the safety, reliability, and operational efficiency of unmanned aerial vehicles (UAVs). Nevertheless, achieving such prognostics on resource-constrained embedded platforms remains a considerable technical challenge. This study proposes an end-to-end TinyML-based framework that integrates embedded sensor data fusion with an optimized feedforward neural network (FFNN) model for efficient RUL estimation under strict hardware limitations. The system collects voltage, discharge time, and capacity measurements through a lightweight data fusion pipeline and leverages the Edge Impulse platform with the EON™ Compiler for model optimization. The trained model is deployed on a dual-core ARM Cortex-M0+ Raspberry Pi RP2040 microcontroller, communicating wirelessly with a LabVIEW-based visualization system for real-time monitoring. Experimental validation on an 80-gram UAV equipped with a 1100 mAh LiPo battery demonstrates a mean absolute error (MAE) of 3.46 cycles and a root mean squared error (RMSE) of 3.75 cycles. Model testing results show an overall accuracy of 98.82%, with a mean squared error (MSE) of 55.68, a mean absolute error (MAE) of 5.38, and a variance score of 0.99, indicating strong regression precision and robustness. Furthermore, the quantized (int8) version of the model achieves an inference latency of 2 ms, with memory utilization of only 1.2 KB RAM and 11 KB flash, confirming its suitability for real-time deployment on resource-constrained embedded devices. Overall, the proposed framework effectively demonstrates the feasibility of combining embedded sensor data fusion and TinyML to enable accurate, low-latency, and resource-efficient real-time RUL estimation for UAV battery health management.
(This article belongs to the Section Intelligent Sensors)

23 pages, 1475 KB  
Article
Learning Online MEMS Calibration with Time-Varying and Memory-Efficient Gaussian Neural Topologies
by Danilo Pietro Pau, Simone Tognocchi and Marco Marcon
Sensors 2025, 25(12), 3679; https://doi.org/10.3390/s25123679 - 12 Jun 2025
Viewed by 3295
Abstract
This work devised an on-device learning approach to self-calibrate Micro-Electro-Mechanical Systems-based Inertial Measurement Units (MEMS-IMUs), integrating a digital signal processor (DSP), an accelerometer, and a gyroscope in the same package. The accelerometer and gyroscope stream their data in real time to the DSP, which runs artificial intelligence (AI) workloads. The real-time sensor data are subject to errors, such as time-varying bias and thermal stress. The traditional calibration method, based on a linear model, can compensate for these drifts but unfortunately does not work with nonlinear errors. The algorithm devised by this study to reduce such errors adopts Radial Basis Function Neural Networks (RBF-NNs). This method does not rely on the classical adoption of the backpropagation algorithm. Due to its low complexity, it is deployable within kibibyte-scale memory and runs in software on the DSP, thus performing interleaved in-sensor learning and inference by itself. This avoids using any off-package computing processor. The learning process is performed periodically to achieve consistent sensor recalibration over time. The devised solution was implemented in both a 32-bit floating-point data representation and a 16-bit quantized integer version. Both of these were deployed into the Intelligent Sensor Processing Unit (ISPU), integrated into the LSM6DSO16IS Inertial Measurement Unit (IMU), which is a programmable 5–10 MHz DSP on which the programmer can compile and execute AI models. It integrates 32 KiB of program RAM and 8 KiB of data RAM. No permanent memory is integrated into the package. The two (fp32 and int16) RBF-NN models occupied less than 21 KiB out of the 40 available, working in real time and independently in the sensor package. The models, respectively, compensated between 46% and 95% of the accelerometer measurement error and between 32% and 88% of the gyroscope measurement error. Finally, it has also been used for attitude estimation of a micro aerial vehicle (MAV), achieving an error of only 2.84°.
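To make the idea of a backpropagation-free RBF network concrete, the sketch below fits a one-dimensional nonlinear error curve with fixed Gaussian centers and a least-mean-squares (LMS) update on the output weights only. The abstract does not describe the authors' actual learning rule, so the LMS update, the fixed centers, the width, the learning rate, and the toy target `y = x^2` are all assumptions chosen for illustration.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian radial basis activations for a scalar input."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def predict(x, weights, centers, width):
    """Network output: weighted sum of the Gaussian activations."""
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))

def lms_update(x, target, weights, centers, width, lr):
    """One online update of the output weights (no backpropagation needed)."""
    phi = rbf_features(x, centers, width)
    err = target - sum(w * p for w, p in zip(weights, phi))
    return [w + lr * err * p for w, p in zip(weights, phi)]

def mean_abs_error(samples, weights, centers, width):
    return sum(abs(y - predict(x, weights, centers, width)) for x, y in samples) / len(samples)

# Hypothetical setup: 7 fixed centers on [-1, 1], toy nonlinear target y = x^2.
centers = [-1.0 + i / 3.0 for i in range(7)]
width = 0.4
samples = [(i / 10.0 - 1.0, (i / 10.0 - 1.0) ** 2) for i in range(21)]

weights = [0.0] * len(centers)
err_before = mean_abs_error(samples, weights, centers, width)
for _ in range(200):                      # repeated passes emulate periodic relearning
    for x, y in samples:
        weights = lms_update(x, y, weights, centers, width, lr=0.1)
err_after = mean_abs_error(samples, weights, centers, width)
```

Because only the seven output weights are stored and updated, the state of such a model is a handful of numbers, which is consistent with the memory budget described in the abstract (32 KiB of program RAM plus 8 KiB of data RAM, 40 KiB in total).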
(This article belongs to the Special Issue Sensors and IoT Technologies for the Smart Industry)

24 pages, 3425 KB  
Article
A Neural Network Compiler for Efficient Data Storage Optimization in ReRAM-Based DNN Accelerators
by Hsu-Yu Kao, Liang-Ying Su, Shih-Hsu Huang and Wei-Kai Cheng
Electronics 2025, 14(12), 2352; https://doi.org/10.3390/electronics14122352 - 8 Jun 2025
Cited by 1 | Viewed by 884
Abstract
ReRAM-based DNN accelerators have emerged as a promising solution to mitigate the von Neumann bottleneck. While prior research has introduced tools for simulating the hardware behavior of ReRAM’s non-linear characteristics, there remains a notable gap in high-level design automation tools capable of efficiently deploying DNN models onto ReRAM-based accelerators with simultaneous optimization of execution time and memory usage. In this paper, we propose a neural network compiler built on the open-source TVM framework to address this challenge. The compiler incorporates both layer fusion and model partitioning techniques to enhance data storage efficiency. The core contribution of our work is an algorithm that determines the optimal mapping strategy by jointly considering layer fusion and model partitioning under hardware resource constraints. Experimental evaluations demonstrate that the proposed compiler adapts effectively to varying hardware resource limitations, enabling efficient storage optimization and supporting early-stage design space exploration.
(This article belongs to the Special Issue Research on Key Technologies for Hardware Acceleration)

22 pages, 1892 KB  
Review
Determining Factors for the Development of Critical Thinking in Higher Education
by Dora Lucia Jaramillo Gómez, Annie Julieth Álvarez Maestre, Abad Ernesto Parada Trujillo, Carlos Alfredo Pérez Fuentes, Dago Hernando Bedoya Ortiz and Ruth Katherine Sanabria Alarcón
J. Intell. 2025, 13(6), 59; https://doi.org/10.3390/jintelligence13060059 - 22 May 2025
Cited by 4 | Viewed by 6883
Abstract
This study arises from the growing need to train professionals capable of confronting and analyzing the overabundance of information in an increasingly complex world, where critical thinking is seen as an indispensable skill for informed decision making and problem solving. To this end, a systematic narrative review methodology was applied to the scientific literature, compiling data from various international databases. The results reveal that physiological factors (memory, attention, nutrition and physical activity), psychological factors (cognitive biases, fear of ambiguity, and metacognition), sociocultural factors (diversity, inequality, and cultural norms), technological factors (digitalization, use of AI, and digital literacy), and educational factors (active pedagogical strategies and collaborative work) play a determining role in the development of critical thinking in higher education. The discussion emphasizes the complex interaction between these factors and underscores the need for holistic approaches that strengthen both cognitive competencies and emotional well-being. In conclusion, we recommend designing comprehensive training interventions that consider the identified factors, promoting inclusive and reflective environments, aimed at developing critical, autonomous graduates capable of facing contemporary challenges.

24 pages, 1425 KB  
Article
Long Short-Term Memory Networks for State of Charge and Average Temperature State Estimation of SPMeT Lithium–Ion Battery Model
by Brianna Chevalier, Junyao Xie and Stevan Dubljevic
Processes 2025, 13(5), 1528; https://doi.org/10.3390/pr13051528 - 15 May 2025
Viewed by 629
Abstract
Lithium–ion batteries are the dominant battery type for emerging technologies in the efforts to slow climate change. Accurate and quick estimations of state of charge (SOC) and internal cell temperature are vital to battery-management systems to enable the effective operation of portable electronics and electric vehicles. Therefore, a long short-term memory (LSTM) recurrent neural network is proposed which completes the state estimation of SOC and internal average cell temperature (Tavg) of lithium–ion batteries under varying current loads. The network is trained and evaluated using data compiled from a newly developed extended single-particle model coupled with a thermal dynamic model. Results are promising, with root mean square error values typically under 2% for SOC and 1.2 K for Tavg, while maintaining quick training and testing times. In addition, we compare a single-feature network with a multi-feature network, as well as two different approaches to data partitioning.
(This article belongs to the Section Process Control and Monitoring)

15 pages, 4254 KB  
Proceeding Paper
A Custom Convolutional Neural Network Model-Based Bioimaging Technique for Enhanced Accuracy of Alzheimer’s Disease Detection
by Gogulamudi Pradeep Reddy, Duppala Rohan, Shaik Mohammed Abdul Kareem, Yellapragada Venkata Pavan Kumar, Kasaraneni Purna Prakash and Malathi Janapati
Eng. Proc. 2025, 87(1), 47; https://doi.org/10.3390/engproc2025087047 - 14 Apr 2025
Cited by 1 | Viewed by 758
Abstract
Alzheimer’s disease (AD), a serious neurological illness, severely impacts memory, behavior, and personality, posing a growing concern worldwide due to the aging population. Early and accurate detection is crucial as it enables preventive measures. However, current diagnostic methods are often inaccurate in identifying the disease in its early stages. Although deep learning-based bioimaging has shown promising results in medical image classification, challenges remain in achieving the highest accuracy for detecting AD. Existing approaches, such as ResNet50, VGG19, InceptionV3, and AlexNet, have shown potential, but they often lack reliability and accuracy due to several issues. To address these gaps, this paper suggests a novel bioimaging technique by developing a custom Convolutional Neural Network (CNN) model for detecting AD. This model is designed with optimized layers to enhance feature extraction from medical images. The experiment’s first phase involved the construction of the custom CNN structure with three max-pooling layers, three convolutional layers, two dense layers, and one flatten layer. The Adam optimizer and categorical cross-entropy were adopted to compile the model. The model was trained for 100 epochs with the patience set to 10 epochs. The second phase involved augmentation of the dataset images and adding a dropout layer to the custom CNN model. Moreover, fine-tuned hyperparameters and advanced regularization methods were integrated to prevent overfitting. A comparative analysis of the proposed model with conventional models was performed on the dataset both before and after the data augmentation. The results validate that the proposed custom CNN model significantly outperforms pre-existing models, achieving the highest validation accuracy of 99.53% after data augmentation while maintaining the lowest validation loss of 0.0238. Its precision, recall, and F1 score remained consistently high across all classes, with perfect scores for the Moderate Demented and Non-Demented categories after augmentation, indicating superior classification capability.
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)

18 pages, 4883 KB  
Article
FPGA Programming Challenges When Estimating Power Spectral Density and Autocorrelation in Coherent Doppler Lidar Systems for Wind Sensing
by Sameh Abdelazim, David Santoro and Fred Moshary
Sensors 2025, 25(3), 973; https://doi.org/10.3390/s25030973 - 6 Feb 2025
Cited by 2 | Viewed by 1424
Abstract
In this paper, we present the logic designs of two FPGA hardware programming algorithms implemented for a Coherent Doppler Lidar system used in wind sensing. The first algorithm divides the received time-domain signals into segments, each corresponding to a specific spatial resolution. It then calculates the power spectrum for each segment and accumulates these spectra over 10,000 pulse returns. The second algorithm computes the autocorrelation of the received signals and accumulates the results over the same number of pulses. Both signal pre-processing algorithms are initially developed as logic designs and compiled using the Xilinx System Generator toolset to produce a hardware VLSI image. This image is subsequently programmed into an FPGA. However, the hardware implementation of these algorithms presents several challenges: (1) bit growth: multiplication operations in the binary number system significantly increase the number of bits, complicating hardware implementation. (2) Memory constraints: onboard RAM arrays of sufficient size are lacking for accumulating vectors of the calculated Fast Fourier Transforms (FFTs) or autocorrelations. (3) Signal drive issues: large fan-out in the logic design leads to significant capacitance, restricting the driving capabilities of transistor output signals. This article discusses the solutions devised to overcome these challenges. Additionally, it presents atmospheric wind measurements obtained using the two algorithms.
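The two pre-processing algorithms described in this abstract can be mirrored in software as a reference model. The sketch below is a plain-Python illustration, not the authors' FPGA logic: it uses a naive DFT instead of a hardware FFT core, floating-point instead of fixed-point arithmetic, and hypothetical function and parameter names.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (stand-in for a hardware FFT core)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def accumulate_power_spectra(pulses, seg_len):
    """Split each pulse return into range-gate segments and sum |DFT|^2 per gate."""
    n_seg = len(pulses[0]) // seg_len
    acc = [[0.0] * seg_len for _ in range(n_seg)]
    for pulse in pulses:
        for g in range(n_seg):
            seg = pulse[g * seg_len:(g + 1) * seg_len]
            for k, value in enumerate(dft(seg)):
                acc[g][k] += abs(value) ** 2
    return acc

def accumulate_autocorrelation(pulses, seg_len, max_lag):
    """Sum the per-gate autocorrelation (lags 0..max_lag) over all pulse returns."""
    n_seg = len(pulses[0]) // seg_len
    acc = [[0.0] * (max_lag + 1) for _ in range(n_seg)]
    for pulse in pulses:
        for g in range(n_seg):
            seg = pulse[g * seg_len:(g + 1) * seg_len]
            for lag in range(max_lag + 1):
                acc[g][lag] += sum(seg[t] * seg[t + lag] for t in range(seg_len - lag))
    return acc
```

The accumulators in both functions grow with every pulse, which is the software analogue of the bit-growth and memory constraints listed in the abstract: summing over 10,000 returns requires wider words and more storage than a single spectrum.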
(This article belongs to the Special Issue Integrated Sensor Systems for Environmental Applications)

17 pages, 261 KB  
Article
Echoes of Albany: The Transatlantic Reflections of Anne Grant in Memoirs of an American Lady
by Rob Sutton
Humanities 2025, 14(2), 20; https://doi.org/10.3390/h14020020 - 29 Jan 2025
Viewed by 1301
Abstract
This essay explores the mid-eighteenth-century travel experience of Scottish writer Anne Macvicar Grant [1755–1838]. Grant is perhaps best known for her late eighteenth- and early nineteenth-century travel writing and anthropological discourse focussed primarily upon the Scottish Highlands. Yet, the majority of Grant’s childhood was spent in Albany, New York. After she had established herself as a writer and published various texts dealing with her more recent experience in the Scottish Highlands, in 1808, Grant published Memoirs of an American Lady, a semi-biographical account of her childhood spent in the multicultural contact zone of a British military outpost. There are two key issues that this essay explores. First, I discuss the process of memory. Unlike intentional travelogues of the time, Grant’s text was not compiled with the aid of a diary or ledger. Grant’s entire account comprises memories of events that occurred over forty years in the past. Part of this essay therefore discusses how the fragility of human memory may undermine the traveller’s account. While it may be anticipated that this first issue is detrimental to the account of the traveller, the second key issue that I explore is arguably advantageous to Grant’s account. The extent to which Grant, throughout her life, immersed herself within various marginalised communities undoubtedly allows for the production of a more nuanced and balanced account of external cultures than was the custom at the time. What complicates this account is the mixing of memory and cultural immersion. In her writing around the Scottish Highlands, Grant frequently relies upon her experience of certain cultures as a child to explain and convey her understanding of the different marginalised communities she encounters as an adult. Integral to this essay is the fact that this mixing of memory and cultural exposure also occurs the opposite way around. In the Memoirs, the writer’s recollections of the Mohawk or the Kanien’kehà:ka people and colonial Dutch communities as a child seem to be coloured and shaped by her more recent experience of the Highland people.
(This article belongs to the Special Issue Eighteenth-Century Travel Writing: New Directions)

29 pages, 1433 KB
Article
Sparse Convolution FPGA Accelerator Based on Multi-Bank Hash Selection
by Jia Xu, Han Pu and Dong Wang
Micromachines 2025, 16(1), 22; https://doi.org/10.3390/mi16010022 - 27 Dec 2024
Viewed by 1636
Abstract
Reconfigurable processor-based acceleration of deep convolutional neural network (DCNN) algorithms has emerged as a widely adopted technique, with particular attention on sparse neural network acceleration as an active research area. However, many computing devices that claim high computational power still struggle to execute neural network algorithms with optimal efficiency, low latency, and minimal power consumption. Consequently, there remains significant potential for further exploration into improving the efficiency, latency, and power consumption of neural network accelerators across diverse computational scenarios. This paper investigates three key techniques for hardware acceleration of sparse neural networks. The main contributions are as follows: (1) Neural network inference tasks are typically executed on general-purpose computing devices, which often fail to deliver high energy efficiency and are not well-suited for accelerating sparse convolutional models. In this work, we propose a specialized computational circuit for the convolutional operations of sparse neural networks. This circuit is designed to detect and eliminate the computational effort associated with zero values in the sparse convolutional kernels, thereby enhancing energy efficiency. (2) The data access patterns in convolutional neural networks introduce significant pressure on the high-latency off-chip memory access process. Due to issues such as data discontinuity, the data reading unit often fails to fully exploit the available bandwidth during off-chip read and write operations. In this paper, we analyze bandwidth utilization in the context of convolutional accelerator data handling and propose a strategy to improve off-chip access efficiency. Specifically, we leverage a compiler optimization plugin developed for Vitis HLS, which automatically identifies and optimizes on-chip bandwidth utilization. (3) In coefficient-based accelerators, the synchronous operation of individual computational units can significantly hinder efficiency. Previous approaches have achieved asynchronous convolution by designing separate memory units for each computational unit; however, this method consumes a substantial amount of on-chip memory resources. To address this issue, we propose a shared feature map cache design for asynchronous convolution in the accelerators presented in this paper. This design resolves address access conflicts when multiple computational units concurrently access a set of caches by utilizing a hash-based address indexing algorithm. Moreover, the shared cache architecture reduces data redundancy and conserves on-chip resources. Using the optimized accelerator, we successfully executed ResNet50 inference on an Intel Arria 10 1150GX FPGA, achieving a throughput of 497 GOPS, or an equivalent computational power of 1579 GOPS, with a power consumption of only 22 watts.
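The third contribution, hash-based address indexing across a set of shared cache banks, can be illustrated with a small model of bank conflicts. The abstract does not specify the hash function, so the XOR-folding hash below is a hypothetical stand-in, compared against plain modulo interleaving; the bank count and access pattern are also illustrative assumptions.

```python
N_BANKS = 4          # assumed number of cache banks (power of two)
BANK_BITS = 2        # log2(N_BANKS)

def modulo_bank(addr):
    """Baseline: the low address bits select the bank."""
    return addr % N_BANKS

def xor_fold_bank(addr):
    """Hypothetical hash: XOR all BANK_BITS-wide chunks of the address together."""
    bank = 0
    while addr:
        bank ^= addr & (N_BANKS - 1)
        addr >>= BANK_BITS
    return bank

def max_bank_load(addrs, bank_fn):
    """Largest number of concurrent accesses that land in the same bank."""
    load = [0] * N_BANKS
    for a in addrs:
        load[bank_fn(a)] += 1
    return max(load)

# Four computational units reading with a stride equal to the bank count.
strided = [0, 4, 8, 12]
worst_modulo = max_bank_load(strided, modulo_bank)    # every access hits bank 0
worst_hashed = max_bank_load(strided, xor_fold_bank)  # accesses spread over all banks
```

In this toy pattern the modulo mapping serializes all four accesses on one bank, while the hashed mapping lets them proceed in parallel, which is the kind of conflict reduction a shared multi-bank cache relies on.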