1. Introduction
The development of Advanced Driver Assistance Systems (ADAS) has become a critical objective in automotive research, targeting improvements in road safety, traffic efficiency, and driver comfort through functionalities such as lane departure warning, obstacle detection, pedestrian recognition, and adaptive cruise control [1,2]. Among the various sensing modalities, vision-based systems are particularly notable for their rich contextual awareness, including lane marking identification, traffic sign detection, and real-time obstacle tracking [2,3]. However, embedding computer vision in automotive systems imposes significant challenges related to computational demand, energy consumption, and latency.
While traditional solutions using GPUs or multicore CPUs deliver significant processing power, they frequently exhibit high power draw and nondeterministic latency, rendering them less suitable for constrained embedded automotive platforms [2,4]. FPGA-based embedded computer vision offers a promising alternative by enabling parallel processing pipelines, deterministic latency, and energy-efficient operation [2,5]. Notably, implementations such as YOLOv3-Tiny on development boards like the Kria KV260 achieve detection accuracies exceeding 90% with low power consumption (~5 W), demonstrating the viability of FPGA-based vision for ADAS applications [5,6].
In parallel, robust ADAS requires multi-sensor integration. Techniques combining LiDAR, ultrasonic, and Hall-effect sensors enhance vehicle perception and control reliability [7]. While full-scale automotive systems typically employ CAN or LIN buses for ECU communication, academic and prototype platforms often favor simpler serial protocols such as I2C and UART for their ease of implementation and educational value [8,9]. Despite these simpler interfaces, the underlying architecture remains scalable and conceptually consistent with industrial communication standards.
Several projects have advanced FPGA-driven sensing and actuation: neuro-fuzzy ADAS sensors for personalized driving behavior models [10] and eco-driving promoters using real-time, on-vehicle FPGA computing platforms [11]. Similarly, modular FPGA systems have been deployed for pavement defect detection and traffic sign recognition [12,13]. Nevertheless, comprehensive platforms that integrate vision, sensor fusion, real-time control via STM32, and wirelessly interfaced GUIs for teleoperation are still scarce.
To address this gap, this manuscript presents the design, integration, and experimental validation of a hybrid ADAS prototype built on a 1:5 scale gasoline-powered vehicle. It features the following:
FPGA-based computer vision, utilizing YOLOv3-Tiny for detection and lane tracking.
STM32-based sensing and actuation, handling LiDAR (via I2C), ultrasonic, and Hall-effect sensors and Pulse Width Modulation-driven servomotors.
Synchronized communication via UART, facilitating data transfer between vision and control units.
Wireless GUI interface, enabling real-time monitoring, teleoperation, and system feedback.
This modular, cost-effective, energy-efficient, and extensible framework positions the platform as a strong candidate for academic training, rapid prototyping, and applied ADAS research while maintaining architectural concepts transferable to industrial environments through future integration with CAN/LIN standards.
The primary purpose of this study is to design and validate a low-cost, modular, and scalable ADAS research platform that enables both academic training and applied development in autonomous vehicle technologies. Unlike previous works focused on single-purpose systems or resource-intensive solutions, this study addresses the lack of integrated, accessible platforms combining real-time vision processing, multi-sensor data acquisition, and electronic control suitable for small-scale prototyping. The main novelty lies in the integration of FPGA-based embedded computer vision with STM32-controlled sensing and actuation within a unified, synchronized architecture, all implemented on a gasoline-powered 1:5 scale vehicle. This approach offers an energy-efficient, educational, and easily extensible system that bridges the gap between academic research platforms and industrial ADAS architectures. The contributions of this work include the hardware and software integration methodology, the demonstration of system scalability and modularity, and the experimental validation highlighting its potential as a practical tool for both research and education in intelligent mobility systems.
It is important to highlight that the present work focuses on the detection of pedestrians, vehicles, and lane markings, rather than directly addressing obstacle avoidance. This distinction is crucial, as the primary objective of this initial development stage is to establish a perception system capable of supporting additional functionalities in future phases—such as obstacle avoidance, autonomous navigation, dynamic path planning, or other applications for academic or research purposes.
The remainder of this paper is organized as follows: Section 2 details the materials and methods, including the system architecture, hardware components, and the implementation of the computer vision and control modules. Section 3 presents the experimental setup and evaluation procedures used to validate the proposed platform. Section 4 discusses the obtained results in terms of detection accuracy, processing latency, and overall system performance, including an analysis of its practical applicability. Finally, Section 5 summarizes the main conclusions and outlines potential directions for future work, emphasizing the scalability and educational value of the developed ADAS platform.
3. Results
This section details the design, integration, and experimental validation of the embedded vision and electronic control system for autonomous vehicles. It provides a comprehensive overview of the system architecture and prototype assembly, followed by the characterization of key subsystems. The combined architecture leverages FPGA-based image processing and STM32 microcontroller-based vehicle control, coordinated through serial communication and supported by a graphical interface for real-time supervision of system variables. The results demonstrate the feasibility and performance of the proposed embedded vision solution in a controlled, scaled autonomous vehicle context.
3.1. Vision System Results
The embedded vision system implemented on the Kria KV260 successfully performed real-time lane segmentation and detection of vehicles, pedestrians, and lane markings. Lane markings were identified using a sliding window algorithm, while a YOLOv3 neural network handled the identification of these three classes. The YOLOv3 architecture, based on the Darknet-53 backbone, consists of 53 convolutional layers with skip connections and residual blocks, optimized for speed and accuracy in embedded systems. The network was fine-tuned using transfer learning from pre-trained weights on the COCO (Common Objects in Context) dataset and retrained on a custom dataset using a learning rate of 0.001, batch size of 16, and the Adam optimizer over 100 epochs.
The custom dataset consisted of approximately 4500 annotated images, with 1500 high-resolution images per class (vehicles, pedestrians, and lane markings), captured from real-world urban and semi-urban environments in Pachuca and San Agustín Tlaxiaca, Mexico. Images were resized to 416 × 416 pixels and annotated using bounding boxes in Roboflow, exported in YOLO format. The dataset was split into training (80%), testing (15%), and validation (5%) sets. Data augmentation techniques such as horizontal flipping, random brightness variation, and image scaling were applied to improve generalization.
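The 80/15/5 partition described above can be sketched as follows. This is an illustrative helper, not the authors' actual tooling: the file names and the fixed random seed are hypothetical.

```python
import random

def split_dataset(items, train=0.80, test=0.15, val=0.05, seed=42):
    """Shuffle and split a list of annotated image paths into
    train/test/validation subsets (80/15/5, as used for the dataset)."""
    assert abs(train + test + val - 1.0) < 1e-9
    rng = random.Random(seed)  # fixed seed for a reproducible split
    shuffled = items[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * train)
    n_test = round(n * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

# 4500 images in total, as in the custom dataset (names are placeholders)
images = [f"img_{i:04d}.jpg" for i in range(4500)]
train_set, test_set, val_set = split_dataset(images)
```

With 4500 images this yields 3600 training, 675 test, and 225 validation samples.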
This training process enabled the neural network to achieve a mean average precision (mAP) of 87.4% and an inference speed suitable for real-time deployment on FPGA hardware. Wireless transmission of results to a mobile device facilitated system validation and visual monitoring of segmented lanes and detected vehicles, pedestrians, and lane markings, reinforcing the embedded system’s applicability to real-world ADAS applications.
To validate the performance of the system in conditions representative of real-world scenarios, a scaled testing environment was implemented. The testbed consisted of a 2 m wide cycle lane, clearly marked with solid and dashed lane delimiters, mimicking urban road geometries. This controlled environment enabled systematic experimentation under repeatable conditions, facilitating the assessment of lane detection accuracy; reliability in recognizing vehicles, pedestrians, and lane markings; and system responsiveness. The enclosed nature of the site also minimized external interference, providing a stable platform for iterative development and performance tuning of the ADAS prototype. In Figure 3, the system’s image identification results can be observed.
The dataset size of 4500 images, with 1500 per class, is relatively limited compared to typical large-scale deep learning datasets. To mitigate this limitation, extensive data augmentation techniques were employed to enhance the model’s generalization capability. Although the validation set comprised only 5% of the data, it was complemented by a dedicated 15% test set to ensure a comprehensive evaluation of the model’s performance. Furthermore, the system’s effectiveness was validated in a controlled, scaled testing environment replicating real-world road conditions, thereby extending the assessment beyond the limitations of the dataset. Future work will focus on expanding and diversifying the dataset with additional real-world samples to further improve the robustness and generalization capacity of the model.
The vision subsystem relies on a CMOS camera coupled to the FPGA for real-time image acquisition and processing. The performance of the detection model was found to be strongly influenced by ambient lighting conditions. While satisfactory detection rates were achieved in well-lit environments, reduced detection accuracy was observed in low-light scenarios, primarily due to the lower contrast of lane markings. To ensure stable illumination during initial testing, the camera was covered to control light exposure and minimize glare effects, enabling reliable image acquisition during daylight conditions, specifically between 9:00 a.m. and 5:00 p.m., under dry weather conditions. In this initial development stage, this strategy allowed consistent detection of lanes, pedestrians, and vehicles under controlled lighting conditions. In future stages, testing will be extended to varied illumination scenarios, including low-light and night-time environments, as well as the evaluation of alternative camera systems with enhanced dynamic range and sensitivity. These improvements are expected to expand the operational capabilities of the vision module and increase its robustness under real-world conditions.
Integration with the STM32-based control unit was verified through UART communication. Based on the detected lane position and object proximity, the FPGA sent structured decision data to the microcontroller to adjust motor speed and steering. This closed-loop interaction demonstrated the feasibility of real-time collaborative control between the vision and electronic control subsystems.
Performance testing on the scaled vehicle platform showed stable operation at speeds up to 25 km/h, with reliable visual detection at distances up to 6 m.
3.1.1. YOLOv3 and Segmentation Performance
To measure the performance of this computer vision system, the neural network and the lane detection algorithm were executed simultaneously for 5 min, recording the FPS (frames per second) value every 5 s to monitor the behavior of the network, as well as to observe its highest, lowest, and average peaks.
Figure 4 shows the results obtained, allowing us to observe that, on average, the FPS value of the network remains at approximately 3 FPS, with a minimum of 2.3 and a maximum of 3.7 at certain moments.
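The logging procedure above (one FPS sample every 5 s over a 5 min run, i.e., 60 samples) reduces to simple summary statistics. The sketch below uses a short hypothetical log rather than the actual recorded series:

```python
def fps_summary(samples):
    """Return (min, max, mean) of a list of recorded FPS values."""
    return min(samples), max(samples), sum(samples) / len(samples)

# Hypothetical log consistent with the reported behaviour
# (min 2.3 FPS, max 3.7 FPS, average near 3 FPS)
log = [3.0, 2.3, 3.7, 3.1, 2.9]
lo, hi, avg = fps_summary(log)
```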
3.1.2. Segmentation Algorithm Accuracy
To evaluate the accuracy of the segmentation algorithm, two different metrics were used:
For the first metric, calculation was performed using 80 images, considering the distance between corners of the segmented area, since the greatest discrepancies between the segmentation algorithm and the ground truth were observed in that region. The mean absolute difference was 33.6 pixels. Given the image resolution (640 × 480 pixels), this is considered an acceptable value for the developed algorithm. Finally, the maximum difference was 75 pixels, and the minimum was 5 pixels. Additionally, the percentage of vertices (Cj) whose distance to the true segmentation contour is less than 20 pixels was calculated. This threshold was selected based on the mean absolute difference and on visual inspection, which confirmed that distances below this value were negligible.
The results indicate that 16% of the points had a non-significant difference, according to the selected criteria. Table 1 presents the results of the distance-based segmentation evaluation.
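As a sketch of the distance-based metric, the following computes the mean absolute corner distance and the fraction of vertices within the 20-pixel threshold. The corner coordinates below are invented for illustration, and predicted/ground-truth vertices are assumed to be paired one-to-one:

```python
import math

def corner_error_stats(predicted, ground_truth, threshold=20.0):
    """Mean absolute corner distance (px) and fraction of vertices
    whose distance to the ground-truth corner is below the threshold."""
    dists = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    mean_err = sum(dists) / len(dists)
    within = sum(d < threshold for d in dists) / len(dists)
    return mean_err, within

# Hypothetical corners of a segmented lane region (640x480 frame)
pred = [(100, 200), (500, 210), (120, 460), (480, 455)]
gt   = [(110, 200), (470, 210), (120, 420), (478, 455)]
mean_err, frac = corner_error_stats(pred, gt)
```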
For the second metric, the surfaces segmented by the algorithm in the same 80 images were compared against the ground truth, the latter being defined as the correct area that follows the road path. Four regions were considered: true positive, false positive, false negative, and true negative surfaces. Using these values, the true positive rate, true negative rate, false positive rate, and false negative rate were calculated, and the resulting metrics are shown in Table 2.
The segmentation metrics reported in Table 2, derived from a pixel-wise comparison between the algorithm’s output and the ground truth lane area over 80 images, demonstrate that the lane detection algorithm reliably identifies relevant lane pixels with a true positive rate (TPR) of 81.28% and a true negative rate (TNR) of 84.54%. This indicates a strong capability to distinguish lane markings from the background, effectively minimizing misclassification. In practical terms, in a real scenario, such segmentation accuracy could support stable lane tracking within the controlled test environment, enabling the vehicle to maintain lane position and execute basic navigational maneuvers. While some misclassifications occur, the system compensates for these through sensor fusion with LiDAR and ultrasonic data, which provide redundancy and contextual information to guide safe vehicle control. Therefore, despite segmentation errors, the combined sensor and control architecture ensures robust decision-making appropriate for the prototype’s scale and intended use. These results highlight the platform’s utility as a research and educational tool to evaluate and improve ADAS algorithms under realistic but safe operating conditions.
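A minimal sketch of the pixel-wise comparison behind these rates, assuming binary masks flattened to sequences (the toy 8-pixel masks below are illustrative only, not the evaluation data):

```python
def segmentation_rates(pred_mask, gt_mask):
    """Pixel-wise TPR and TNR from binary lane masks (1 = lane pixel)."""
    tp = fp = fn = tn = 0
    for p, g in zip(pred_mask, gt_mask):
        if p and g:
            tp += 1          # lane pixel correctly detected
        elif p and not g:
            fp += 1          # background misclassified as lane
        elif g:
            fn += 1          # lane pixel missed
        else:
            tn += 1          # background correctly rejected
    return tp / (tp + fn), tn / (tn + fp)

# Toy flattened masks for illustration
pred = [1, 1, 0, 0, 1, 0, 1, 0]
gt   = [1, 1, 0, 0, 0, 0, 1, 1]
tpr, tnr = segmentation_rates(pred, gt)
```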
3.2. Electronic Control System Results
The electronic control subsystem was developed using an STM32 NUCLEO-F401RE microcontroller, responsible for managing the acquisition, processing, and transmission of sensor data, as well as the actuation of vehicle dynamics based on control signals. The system integrates five core components: a TF-Luna LiDAR sensor via I2C for frontal detection, four HC-SR04 ultrasonic sensors for lateral and near-field distance monitoring, an A3144 Hall-effect sensor for wheel rotation measurement and speed estimation, and two servomotors—one dedicated to steering and the other to acceleration and braking.
The current architecture based on the STM32 board can efficiently handle up to four UART-based sensors if using hardware serial ports. However, if additional UART devices are added via software serial port (bit-banging), CPU overhead and timing constraints may introduce data loss or instability. For I2C sensors like the LiDAR, multiple devices can be supported, as long as they have unique addresses, but the total bus bandwidth (100–400 kHz) can become a bottleneck at high sampling rates. Overall, no major processing bottlenecks are expected unless high-frequency data fusion is performed on a low-end STM32.
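The I2C bandwidth concern can be made concrete with a rough utilization estimate. The 9-byte read size and the 20-bit per-transaction protocol overhead below are assumptions for illustration, not measured values:

```python
def i2c_utilization(bytes_per_read, reads_per_s, bus_hz, overhead_bits=20):
    """Rough fraction of I2C bus time consumed by one sensor.
    Each byte costs 9 clock cycles (8 data bits + ACK); overhead_bits
    approximates start/stop conditions and address phases (assumed)."""
    bits_per_read = bytes_per_read * 9 + overhead_bits
    return bits_per_read * reads_per_s / bus_hz

# e.g., ~9 data bytes per range reading at 250 Hz on a 400 kHz bus
util = i2c_utilization(9, 250, 400_000)
```

Under these assumptions a single LiDAR at 250 Hz consumes only a few percent of a 400 kHz bus, but several such devices at high sampling rates would begin to saturate it, consistent with the bottleneck noted above.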
The TF-Luna LiDAR module demonstrated stable operation at up to a 250 Hz sampling frequency, effectively detecting obstacles in the 0.2 to 8 m range. Its narrow field of view (2°–5°) provided reliable frontal measurements, even under moderate ambient noise. The ultrasonic sensors operated with a precision of ±0.3 cm within their 2–400 cm range and were calibrated to detect proximity asymmetries on both sides of the vehicle, assisting in lateral stabilization and in avoiding vehicles and pedestrians during motion.
For vehicle actuation, two high-torque servomotors were deployed. The KM-3318 servo, controlling the front axle, maintained directional stability with low angular error. The KM-2013 servo modulated throttle input, allowing the vehicle to shift between five discrete speed levels based on real-time feedback from the Hall-effect sensor. The latter was mounted on the transmission shaft and enabled continuous velocity monitoring through pulse counting. Under test conditions, the system achieved a velocity resolution sufficient to maintain control accuracy within ±0.5 km/h across all five speed levels.
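Speed estimation by pulse counting follows directly from the shaft geometry. The pulses-per-revolution and effective circumference values below are illustrative, not the prototype's actual drivetrain parameters:

```python
def speed_kmh(pulse_count, window_s, pulses_per_rev, wheel_circ_m):
    """Estimate speed (km/h) from Hall-effect pulses counted over a
    time window, given pulses per revolution and the effective
    circumference of the rotating element (illustrative values)."""
    revs_per_s = pulse_count / pulses_per_rev / window_s
    return revs_per_s * wheel_circ_m * 3.6  # m/s -> km/h

# e.g., 50 pulses in 1 s, 2 magnets per revolution, 0.25 m circumference
v = speed_kmh(50, 1.0, 2, 0.25)
```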
All sensor data and actuation states were serialized into an 8-field structured UART frame, which was transmitted from STM32 to the Kria KV260 FPGA. The FPGA, functioning as the decision-making unit, parsed the incoming data to synchronize it with the visual information captured by the onboard camera system. This low-latency communication architecture ensured tight coupling between perception and control, enabling smooth transitions and responsive navigation behaviors in the scaled test vehicle.
Regarding sensor noise and signal interference, adjustments were made to the sensor acquisition timing to minimize measurement errors, particularly in the ultrasonic modules. Due to their working principle based on acoustic wave propagation, ultrasonic sensors are susceptible to signal overlaps and environmental reflections, especially when multiple sensors operate simultaneously. To address this, response times were staggered, and sequential triggering was implemented, ensuring that individual measurements did not interfere with each other. In contrast, LiDAR and Hall-effect sensors did not present significant interference issues, given their different physical measurement principles.
However, in the case of the Hall-effect sensor, mechanical vibrations from the vehicle chassis initially affected reading stability. This issue was mitigated through the installation of physical dampers and mounting isolation elements, which reduced signal fluctuations caused by vibrations during movement. These strategies contributed to improving the reliability of sensor data acquisition under operating conditions.
To enhance the usability and monitoring capabilities of the system, a custom graphical interface was developed. This interface receives the same UART frame, decodes it in real time, and presents both sensor data and actuator states to the user.
3.3. Graphical User Interface
To facilitate visualization and interaction with the embedded control system, a custom graphical user interface (GUI) was developed. This interface serves as both a real-time monitoring dashboard and a bidirectional control panel, enabling the user to visualize sensor data and simultaneously send actuation commands to the vehicle.
The GUI communicates wirelessly with the Kria KV260 FPGA via a dedicated Wi-Fi module. The FPGA, acting as an intermediary, forwards sensor data received from the STM32 microcontroller through UART and dispatches user-generated control commands back to the STM32 for real-time actuation. This bidirectional communication loop ensures synchronous coordination between visual processing, sensor monitoring, and user control. The interface decodes a structured data frame composed of eight comma-separated fields, with the order shown in Table 3.
The incoming frame is decoded using a custom parser within the FPGA, which extracts each field using the comma delimiter to identify start and end boundaries. These values are then transmitted to the GUI and displayed numerically and graphically, allowing the user to interpret the current state of the vehicle, including its orientation, motor throttle level, object proximity on all sides, frontal distance, and estimated speed in kilometers per hour.
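A sketch of such a comma-delimited parser, as it might appear on the GUI side, is shown below. The field names and their order are assumptions based on the quantities listed in the text (steering, throttle, four ultrasonic distances, frontal LiDAR range, speed); the exact layout defined in Table 3 may differ:

```python
def parse_frame(frame: str) -> dict:
    """Split the 8-field comma-separated UART frame into named values.
    Field names and order are hypothetical stand-ins for Table 3."""
    keys = ("steering_deg", "throttle_level",
            "us_front_left_cm", "us_front_right_cm",
            "us_rear_left_cm", "us_rear_right_cm",
            "lidar_cm", "speed_kmh")
    parts = frame.strip().split(",")
    if len(parts) != len(keys):
        raise ValueError(f"expected 8 fields, got {len(parts)}")
    return dict(zip(keys, map(float, parts)))

# Example frame with plausible values
sample = "12.5,2,35.0,33.5,80.2,79.9,150,4.3"
state = parse_frame(sample)
```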
In addition to visualization, the GUI includes control elements—such as sliders and directional inputs—that allow the user to modify the steering angle and throttle speed. These commands are encoded in a compatible format and sent from the GUI to the FPGA and then forwarded to the STM32 for execution via Pulse Width Modulation control of the corresponding servomotors.
The GUI enhances the operational usability of the system by providing an intuitive and efficient interface that bridges perception and actuation layers. It also supports debugging and performance evaluation by logging data during test runs.
Figure 5 presents the developed interface, highlighting the layout of sensor readings, control and communication indicators, and real-time visualization for the vision system monitoring.
In the current development stage, system safety is primarily managed through operation within controlled environments and the availability of manual override via the graphical user interface (GUI). When inconsistent or unreliable sensor data are detected—such as communication loss or erratic readings—the system triggers an alert informing the operator of the anomaly and prompting a system reset if communication does not re-establish automatically. This basic safety mechanism is suitable for the current prototype phase, where testing occurs under supervised conditions. Future development will focus on implementing automated fault detection and recovery protocols, such as continuous monitoring of sensor status, data consistency checks, and automatic fallback strategies, to ensure safe operation without requiring human intervention, especially in more autonomous or unsupervised scenarios.
Figure 6 presents the final physical implementation of the system titled FPGA–STM32-Embedded Vision and Control Platform for ADAS Development on a 1:5 Scale Vehicle, in accordance with the functional design described in Figure 2. This configuration highlights the main components: the camera for image acquisition, the FPGA–STM32-based processing unit for real-time data analysis through the GUI, and the arrangement of sensors and actuators responsible for vehicle control. This physical representation demonstrates the integration of the various modules and subsystems that comprise the experimental platform, developed for educational and research applications in the field of advanced driver assistance systems (ADAS).
4. Discussion
The embedded ADAS prototype presented in this work was comparatively analyzed against recent FPGA-based object detection systems reported in the literature. These studies adopt diverse methodologies, including quantization, hardware parallelism, streaming architectures, and model compression techniques. Such a comparative framework allows for a robust evaluation of the technical advantages and trade-offs of the proposed system.
Implementations such as those in [23,24,25] employ optimized versions of YOLOv3 with quantization strategies like INT8 and 4W4A, achieving high inference rates up to 45 frames per second (FPS). However, these designs are primarily constrained to benchmarking detection models and do not incorporate control strategies or real-world actuator feedback. In contrast, the present work integrates object detection, lane segmentation, and vehicular actuation in a unified embedded platform, providing both perception and control capabilities in real-time.
Architectures described in [26,27,28] utilize modular streaming accelerators and dataflow-oriented pipelines, demonstrating high energy efficiency and scalability. Nonetheless, they often operate in static test environments with preprocessed datasets, limiting their validation under dynamic, real-world conditions. The system proposed herein includes on-board sensory feedback (ultrasonic, LiDAR, and Hall-effect sensors) and is validated through experimental deployment on a gasoline-powered 1:5 scale vehicle, highlighting its functional robustness in outdoor scenarios.
Long-term experiments were not conducted during this initial stage of system validation, as the primary objective was to establish a functional prototype and evaluate its performance under controlled conditions. That said, extended operational testing is planned as a key component of future work, focusing on system durability, sensor stability over time, and component-level reliability under continuous operation. With respect to sensor drift, noise, and potential hardware failures, the current system architecture incorporates basic signal filtering and sequential acquisition strategies to minimize interference and improve data stability, particularly for ultrasonic modules. Sensor fusion using complementary sensors (LiDAR, ultrasonic, and Hall-effect) also contributes to mitigating individual sensor errors. Still, advanced fault-tolerant strategies, such as redundancy, sensor health monitoring, or adaptive recalibration methods, will be explored in subsequent development phases to enhance system robustness and long-term operational reliability.
Several studies [29,30,31] explore resource-efficient inference through weight quantization, filter pruning, and reuse of convolutional modules. While these methods yield significant improvements in throughput and silicon utilization, they may reduce detection accuracy in multi-class scenarios or variable lighting and occlusion conditions. In contrast, this work leverages a YOLOv3-Tiny model trained with a custom dataset acquired from local environments, thus ensuring higher relevance and adaptability to urban traffic contexts.
Although newer YOLO architectures such as YOLOv5, YOLOv7, and YOLOv8 offer improved accuracy and model efficiency, their integration within this study was constrained by hardware compatibility limitations. Specifically, the Xilinx Kria KV260 platform used for vision processing in this project supports pre-optimized hardware acceleration pipelines for a limited set of models. Among these, YOLOv3-Tiny is one of the few officially supported and fully deployable models with existing FPGA IP blocks and reference designs. This constraint strongly influenced the model selection process.
Despite being an earlier version, YOLOv3-Tiny remains highly relevant for embedded and real-time applications due to its lightweight architecture, low latency, and efficient hardware footprint, which are critical for resource-constrained systems. In our implementation, YOLOv3-Tiny enabled reliable people, car, and lane detection within the timing and power consumption requirements of the platform, validating its suitability for the project’s objectives.
Nevertheless, the authors acknowledge that newer architectures optimized for edge devices (e.g., NanoDet, YOLOv5-N, MobileNet-SSD) offer promising improvements and should be explored in future works. Future versions of the platform will consider migrating to custom-accelerated pipelines or deploying on platforms with broader deep learning framework support to allow experimentation with more recent, high-performance models.
High-performance systems employing newer FPGA platforms and detection models—such as those described in [32,33,34]—demonstrate superior detection metrics and real-time throughput. However, their reliance on costly hardware (e.g., Xilinx ZCU104, Virtex-7) and proprietary IP cores increases development complexity and hinders replicability in academic settings. The current system, by comparison, prioritizes accessibility, using the Xilinx Kria KV260 board, sourced from AMD (formerly Xilinx, Inc.), headquartered in San Jose, California, United States, and an STM32 microcontroller, sourced from STMicroelectronics, headquartered in Geneva, Switzerland, to maintain a favorable cost–performance ratio suited for educational use and prototype development.
Concerning scalability for deploying newer iterations of YOLO, the current hardware platform, based on the Xilinx Kria KV260, presents specific limitations. The pre-optimized hardware acceleration pipelines available for this FPGA platform support only a restricted set of neural network architectures, including YOLOv3-Tiny. While the FPGA fabric theoretically offers computational resources to implement more advanced models, significant reconfiguration at the hardware description level would be necessary, which falls outside the intended scope of rapid prototyping and educational use. Consequently, migrating to alternative processing platforms or hybrid FPGA–CPU systems will be considered in future development stages to support the deployment of more recent and computationally demanding detection architectures.
Additional contributions, such as those in [35,36,37], propose advanced techniques, including shared memory schemes, low-power neural execution, and reconfigurable processing pipelines. While these strategies optimize power consumption and silicon efficiency, they typically omit system-level integration with actuators and teleoperation interfaces. The proposed system addresses this gap by incorporating UART-based communication for sensor–actuator synchronization and offering a graphical user interface (GUI) for command issuance and data visualization.
Although the present system employs UART and I2C communication protocols for data transfer between modules, which are relatively simple compared to the CAN (Controller Area Network) and LIN (Local Interconnect Network) buses used in full-scale automotive systems, this design decision enhances modularity and learning accessibility. The architecture remains scalable and conceptually consistent with industry standards, allowing students and researchers to understand and replicate critical interactions in vehicular communication systems. As such, the prototype is especially well-suited for academic environments where the emphasis is on foundational concepts, control strategies, and perception–actuation integration.
In comparison to existing FPGA-based ADAS implementations, this work stands out due to its focus on educational and prototype-level deployment using a low-cost, modular architecture. For instance, the ADAS platform described in Sensors (2024) integrates a Xilinx ZCU104 FPGA to accelerate YOLOv3-based pavement defect detection, transmitting results via UART-to-CAN conversion for real-time communication [9]. While that study targets infrastructure inspection, its combination of FPGA vision processing and automotive interface exemplifies comparable integration strategies. Similarly, a study presented a dedicated SSD pedestrian detection accelerator implemented on a Zynq device, demonstrating high inference speed through network compression and hardware optimization [38]. In addition, a road segmentation system using LiDAR data and FPGA-based CNN inference achieved processing latencies under 17 ms per scan, evidencing the viability of embedded LiDAR–CNN solutions [39]. The current platform differs by unifying computer vision (pedestrians, vehicles, and lanes), sensor fusion, and actuator control on a small-scale vehicle using the Kria KV260 and STM32, thus bridging the gap between accessible academic prototypes and industrial ADAS frameworks.
In the work developed by Yecheng Lyu et al. [40], an optimized system is presented for real-time road segmentation using LiDAR data processed through convolutional neural networks (CNNs) implemented on an FPGA, achieving a latency of only 16.9 ms per scan and high accuracy validated with the KITTI dataset. In contrast, the system proposed in this paper introduces a modular, scalable, and low-cost ADAS platform that integrates computer vision, heterogeneous sensors, and embedded control, all implemented on a 1:5 scale vehicle with real-time monitoring and teleoperation capabilities. While the approach by Lyu et al. focuses on a specific task with high computational efficiency, the system presented here offers greater flexibility for experimentation and validation of multiple ADAS functionalities. Thus, both approaches are complementary: one prioritizes performance in a critical process for vehicle autonomy, and the other provides a versatile environment for applied research and practical training.
The work presented by Javier García López et al. [
41] focuses on adapting a deep neural network based on VoxelNet for vehicle detection in LiDAR point clouds, leveraging FPGA technology to achieve real-time performance and high accuracy on complex datasets. In contrast, the proposed research develops a modular, scalable educational ADAS platform implemented on a 1:5 scale vehicle, integrating FPGA-accelerated processing with microcontroller-based control and employing YOLOv3-Tiny for image-based object detection alongside multiple sensors. While the former emphasizes optimizing 3D detection on FPGA hardware, the latter offers comprehensive hardware–software integration for rapid prototyping and practical training, albeit with inherent limitations due to the reduced scale of the prototype.
Compared with other FPGA-based detection systems reported in the literature, the proposed platform prioritizes functionality integration over inference throughput. While it does not surpass high-end architectures in terms of speed or quantization efficiency, it offers significant advantages in cost-effectiveness, didactic utility, and full-system integration. As such, the prototype serves as an effective foundation for ADAS experimentation, embedded systems training, and future research into autonomous mobility technologies.
In contrast to recent developments in autonomous vehicle technologies, the proposed system offers a modular and low-cost integration of real-time perception and control using an STM32 microcontroller and a Kria KV260 FPGA. Unlike the approach presented by Ahmad et al. [
42], which focuses on the detection of small, biased GPS spoofing attacks using time-series analysis and inertial sensor fusion, our system does not rely on GNSS signals and is therefore inherently resilient to such threats. However, this design choice also limits its applicability in large-scale outdoor environments where satellite-based navigation is necessary. On the other hand, the review conducted by Abdollahi et al. [
43] highlights the role of autonomous vehicles in pavement condition assessment and urban infrastructure monitoring. While their work provides a comprehensive overview of sensing technologies and analytical tools, it lacks the hardware-level integration and real-world validation demonstrated in our system. Accordingly, the proposed architecture bridges a relevant gap by delivering an experimentally validated solution that combines vision-based object detection, multisensor data acquisition, and actuation control. This system serves not only as a platform for ADAS experimentation but also as a scalable foundation for future autonomous applications. Nevertheless, future iterations may benefit from integrating secure positioning mechanisms and expanding sensing capabilities for infrastructure-oriented tasks.
Recent studies have demonstrated the potential of embedded devices optimized for real-time inference, such as the DNN model developed by Park et al. [
44], which highlights the importance of reducing computational complexity to achieve efficiency without sacrificing accuracy. In this regard, the implementation of YOLOv3-Tiny on the platform achieves competitive precision, with a mean average precision (mAP) of 87.4%, although the average frame rate of 3 FPS indicates room for improvement compared to systems targeting higher frequencies for real-time applications [
45].
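As an illustration of how throughput figures such as the 3 FPS reported above are typically obtained, the sketch below times a stand-in inference callable over a batch of frames and returns the average frame rate. The `infer` callable and warm-up count are hypothetical placeholders, not the platform's actual DPU interface.

```python
import time

def measure_fps(infer, frames, warmup=2):
    """Average frames-per-second of an inference callable over a frame list.

    `infer` stands in for the detector forward pass (e.g. YOLOv3-Tiny on
    the accelerator); here any callable taking one frame works.
    """
    for f in frames[:warmup]:          # discard warm-up runs (caches, clocks)
        infer(f)
    start = time.perf_counter()
    for f in frames:                   # timed loop over all frames
        infer(f)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Toy stand-in call: ~50 ms per frame gives a ceiling of about 20 FPS.
fps = measure_fps(lambda f: time.sleep(0.05), frames=list(range(10)))
print(f"{fps:.1f} FPS")
```

Averaging over a warmed-up batch, rather than timing a single frame, avoids counting one-time initialization costs in the reported rate.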
Regarding the specific functionality of lane and object detection and segmentation, the method based on sliding window segmentation and YOLOv3-Tiny detection shows robust results under controlled conditions, comparable to the effectiveness of commercial and infrastructure-based lane-centering technologies, as described by Kadav et al. [
46]. Although infrastructure-based solutions may offer greater robustness under adverse conditions, the flexibility and lower cost of embedded and vision-based systems represent a significant advantage for research and educational scenarios. Additionally, the creation of a custom dataset with manual annotations from real urban environments strengthens the system’s adaptability to specific contexts, a feature less emphasized in other works relying on standardized datasets [
47].
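For context, the sliding-window step mentioned above can be sketched as follows. This is a generic version of the classic histogram-and-window lane search over a binary bird's-eye mask; the window count, margin, and pixel threshold are arbitrary illustrative values, not the parameters used on the platform.

```python
import numpy as np

def sliding_window_lane(mask, n_windows=9, margin=20, min_pix=10):
    """Trace one lane line in a binary bird's-eye mask (H x W of 0/1).

    Starts from the histogram peak of the lower half of the image and
    re-centres a fixed-width window upwards, collecting lane pixels.
    """
    h, w = mask.shape
    # Base column: peak of the column-sum histogram of the lower half.
    x = int(np.argmax(mask[h // 2:].sum(axis=0)))
    win_h = h // n_windows
    ys, xs = [], []
    for i in range(n_windows):
        y_lo, y_hi = h - (i + 1) * win_h, h - i * win_h
        x_lo, x_hi = max(0, x - margin), min(w, x + margin)
        py, px = np.nonzero(mask[y_lo:y_hi, x_lo:x_hi])
        if len(px) >= min_pix:
            x = x_lo + int(px.mean())   # re-centre on the detected pixels
        ys.extend(py + y_lo)
        xs.extend(px + x_lo)
    # Second-order polynomial fit x = f(y), as in the usual lane pipeline.
    return np.polyfit(ys, xs, 2) if len(xs) > min_pix else None
```

Fitting the polynomial as x = f(y) (rather than y = f(x)) keeps near-vertical lane lines well conditioned.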
Finally, the integration of bidirectional wireless communication and a graphical user interface (GUI) for real-time monitoring and teleoperation constitutes a practical and educational contribution that complements the technical architecture. Unlike systems focusing exclusively on hardware–software optimization for real-time detection, such as those presented by Zaharia et al. [
48] and Sarvajcz et al. [
45], the proposed approach incorporates a comprehensive framework that facilitates supervision and control on physical test platforms, a key requirement for training and experimental development in ADAS. However, sensitivity to varying lighting conditions and the low frame rate suggest that future work should consider improvements in the visual acquisition system and algorithmic optimizations to extend operability in real-world and higher-speed scenarios, common challenges in the current literature [
45,
47].
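To make the communication layer concrete, the following sketch shows one way a bidirectional serial link can frame telemetry samples for transport between the vehicle and the GUI. The sync byte, field layout, and checksum scheme are illustrative assumptions, not the protocol actually implemented on the platform.

```python
import struct

SYNC = 0xA5  # hypothetical start-of-frame byte, not the platform's value

def pack_telemetry(speed_cmps, steer_deg, dist_cm):
    """Frame one sample: sync byte, payload length, payload, 8-bit checksum."""
    payload = struct.pack("<hhh", speed_cmps, steer_deg, dist_cm)
    body = bytes([SYNC, len(payload)]) + payload
    checksum = sum(body) & 0xFF        # simple additive checksum
    return body + bytes([checksum])

def unpack_telemetry(frame):
    """Validate and decode a frame produced by pack_telemetry."""
    if frame[0] != SYNC or (sum(frame[:-1]) & 0xFF) != frame[-1]:
        raise ValueError("bad frame")
    n = frame[1]
    return struct.unpack("<hhh", frame[2:2 + n])

# Round trip: speed 120 cm/s, steering -5 deg, obstacle distance 87 cm.
sample = unpack_telemetry(pack_telemetry(120, -5, 87))
```

Even a minimal sync-length-checksum frame of this kind lets the receiver resynchronize after dropped bytes, which is the main robustness concern on a wireless serial link.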
The current validation was performed on a 1:5 scale vehicle under controlled experimental conditions, which inherently limits the generalizability of the results to real-world automotive environments. The scaled platform was chosen to provide a safe, low-cost, and flexible testing environment during the early stages of development: the primary goal at this stage was to establish a modular platform suitable for academic and prototype-level work, where safety and cost constraints necessitate initial testing in a simplified, scaled setting.
Nevertheless, extensive field testing under diverse environmental conditions—such as varying lighting, weather, and road textures—is necessary for a comprehensive assessment of the system’s robustness and reliability. Expanding the testing scenarios to include both advanced simulation environments and real-world field conditions will allow for evaluation of the platform’s performance in dynamic and uncontrolled settings.
Moreover, transitioning from scaled vehicles to full-scale platforms is identified as a key next step, where the modular and scalable nature of the proposed system will facilitate adaptation to larger testbeds. This progression will enable the assessment of the platform’s applicability to industrial-level ADAS development, addressing current limitations in environmental diversity and operational complexity.
In summary, the proposed system distinguishes itself through its multi-layer integration of perception, control, and human interaction, facilitated by a hybrid FPGA–STM32 architecture and complemented by a wireless graphical interface. While it may not match specialized industrial systems in communication complexity or raw inference speed, it provides a versatile, low-cost, and scalable platform ideal for academic instruction, rapid prototyping, and experimental ADAS research. Its communication architecture, while simplified, is aligned with core automotive principles, making it a didactic tool that bridges conceptual understanding with practical system integration. Future development may incorporate higher-speed or automotive-standard buses, further quantization optimization, and adoption of newer detection models to enhance real-time performance and functionality.