1. Introduction
The history of Integrated Circuit (IC) design is marked by innovation and technological strides. It began in the late 1950s, building on the earlier invention of the transistor [1]. Texas Instruments pioneered the first IC in 1958, integrating two transistors on a germanium bar [2]. Until the arrival of Computer-Aided Design (CAD) tools in 1966, ICs were manually drawn on paper [3].
The evolution of Hardware Description Languages (HDLs) started in the early 1970s with Register Transfer Level (RTL) descriptions, allowing for thousands of transistors per IC [4]. DEC’s PDP-16 RT-Level Modules [5], Instruction Set Processor Specifications [6], and the Incremental System Programming Language [7] were significant contributions. In the late 1970s, programmable logic devices increased the demand for standard languages, and in 1985, Gateway Design Automation and Intermetrics introduced Verilog and the VHSIC Hardware Description Language (VHDL), respectively [8,9].
Alongside Verilog and VHDL, C-based approaches to hardware description emerged, giving rise to High-Level Synthesis (HLS). Introduced in 1999, SystemC provided a class library on top of standard C++ for hardware modeling and HDL generation, simplifying the IC development process through HLS [10]. Today, HLS tools such as LegUp, Xilinx Vivado HLS, and Intel’s HLS Compiler transform C++ into HDL.
In parallel, the journey of Artificial Intelligence (AI) started in the 1960s and advanced during the 1970s and 1980s with the foundational concepts and algorithms of deep learning and Artificial Neural Networks (ANNs) [11,12]. During the late 1980s and early 1990s, the Machine Learning (ML) and AI community experienced a wave of enthusiasm as it was discovered that ANNs could tackle certain problems in novel ways. These networks had the distinct advantage of processing raw and diverse data types and developing hierarchical structures autonomously during the training phase for predictive tasks. However, the computational power available at the time was insufficient for large-scale problems, limiting their application to smaller, simpler tasks [12,13,14].
It was not until the end of the 2000s that technological advancements, propelled by Moore’s Law, equipped computers with the necessary power to train extensive ANNs on substantial, real-world challenges, such as the ImageNet project [15]. This advancement was largely due to the advent of general-purpose computing on Graphics Processing Units (GPUs), which offered superior floating-point performance compared to Central Processing Units (CPUs) [16]. This shift enabled ANNs to achieve remarkable results on complex problems of significant importance.
The last decade has been transformative for ML, especially with the rise of deep learning techniques that utilize ANNs. These advancements have significantly enhanced the accuracy of systems in various domains [17]. Notable progress has been made in fields such as computer vision [18,19,20,21], speech recognition [22,23], language translation [24], and other complex natural language processing tasks [25,26,27,28,29,30]. This progress is attributed to the collective efforts and breakthroughs documented in key research papers.
Additionally, reinforcement learning shows promise in automating the design of custom Application-Specific Integrated Circuits (ASICs) by solving NP-hard (nondeterministic polynomial-time hard) optimization problems that currently rely on human expertise. This approach could revolutionize the synthesis, placement, and routing processes in chip design, potentially outperforming human teams by rapidly generating efficient layouts [31,32,33]. Google’s preliminary experiments with this technology have yielded encouraging results, suggesting a future where machine learning accelerates and enhances the ASIC design process [14].
Research conducted by International Business Strategies Inc. in 2014, 2018, and 2022 categorizes IC design costs into seven components: Intellectual Property (IP), Architecture, Verification, Physical Design, Software, Prototyping, and Validation. These studies reveal that design costs fluctuate significantly due to two primary factors: the prevailing technology at the time and the process node targeted for fabrication. For instance, the design cost for a 28 nm circuit was approximately USD 140 million in 2014, fell to USD 51.3 million in 2018, and decreased further to USD 48 million in 2022. Based on the 2018 and 2022 analyses, the estimated distribution of costs is as follows: IP at 6.85%, Architecture at 5.24%, Verification at 21.24%, Physical Design at 10.2%, Software at 43.32%, Prototyping at 5.24%, and Validation at 7.92%. These percentages provide a framework for approximating the allocation of expenses in IC design.
Advancements in machine learning could streamline the entire ASIC design process, from high-level synthesis to low-level logic placement and routing. This automation could drastically cut design time from months to weeks and change the economic calculus by reducing costs in Prototyping, Verification, and Architecture; combined with open-source tools and IPs, design costs would be reduced further. This could make it feasible to create customized chips, an option currently reserved for high-volume, high-value scenarios.
Today, commercial LLMs such as OpenAI’s ChatGPT [34], Google’s Bard [35], and Microsoft’s AI chatbot [36] have been used to introduce innovative HDL generation methods. These methods involve feeding the LLM with the system specifications, from which it automatically produces HDL code. This synergy between AI and IC development promises enhanced efficiency and opens new frontiers in the field. Nevertheless, state-of-the-art models fall short in their ability to effectively comprehend and rectify the errors introduced by these tools, making it challenging to autonomously generate comprehensive designs and testbenches with minimal initial human intervention [37,38,39].
This work combines different processes to increase the attainable complexity of an IC while reducing the amount of work required. The primary research question revolves around the capability of contemporary commercial LLMs to produce Convolutional Neural Network (CNN) hardware designs that are not only synthesizable but also manufacturable using the first open-source Process Design Kit (PDK), SKY130A.
The development of AI by AI—a CNN IC engineered for MNIST dataset classification—involves the use of an LLM, Vivado HLS, Verilog, OpenLane, and Caravel. AI by AI was entirely crafted by OpenAI’s ChatGPT-4. It began as a TensorFlow (TF) CNN architecture, which was then downscaled from Python to C++ and translated to Verilog using Vivado HLS. The layout was generated with OpenLane, resulting in a layout IP of the CNN. The journey culminated with the integration of the CNN IP into Caravel, a template System on Chip (SoC) ready for manufacturing through ChipIgnite shuttles, a multi-project wafer program by Efabless, using the SKY130A PDK [40,41]. Throughout this paper, we delve deeply into the development of the AI by AI IC from TF to tape-out.
The remainder of this work is organized as follows: Section 2 provides an overview of the employed tools, outlining both their advantages and disadvantages; Section 3 explains the workflow and conversation flow; Section 4 describes the implementation of the AI by AI IC; Section 5 shows the obtained results; Section 6 presents the discussion; and, finally, Section 7 concludes this work.
4. Development of AI by AI
The development of AI by AI consists of a series of dialogues with ChatGPT-4, following the conversational structure outlined in Figure 3. For access to the complete conversations, the generated code, and the entire project, please refer to the following GitHub repository: https://github.com/Baungarten-CINVESTAV/AI_by_AI (accessed on 4 March 2024).
Table 4 provides the ChatGPT URL of each conversation and the main topic covered in each, all accessed on 4 March 2024.
This section is structured into five distinct subsections, as visually represented in Figure 4. Each subsection details the relevant prompts, primary challenges, key considerations, and the step-by-step development process. The journey commences with the creation of the CNN using TF and culminates with the generation of the GDSII file ready for manufacturing.
4.1. CNN with TF
The CNN was designed for image inference tasks on the renowned MNIST dataset [52]. To harness the power of cloud computing, we opted for Google Colab [53], primarily due to its integration of the TF libraries and the capacity to use GPUs.
The noteworthy prompts that emerged during the interactions with ChatGPT-4 included:
The approach taken involved implementing a compact network using the following layers: a 4 × 3 × 3 Conv2D, a 4 × 4 MaxPool, an 8 × 3 × 3 Conv2D, a 2 × 2 MaxPool, a flatten layer, and, finally, a dense layer, together with the use of the half-precision floating-point format to optimize resource usage. Figure 5 illustrates the CNN created.
The implemented CNN utilizes a total of 666 parameters: 36 weights and 4 biases for the initial convolutional layer, 288 weights and 8 biases for the second convolutional layer, and, finally, 320 weights and 10 biases for the dense layer. In terms of memory consumption, this results in a total of 1.332 kB required solely for storing the weights and biases. At the end of the training phase, the model showed an accuracy of 99.4%. Part of the TF code of the CNN generated by the AI can be found below.
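The exact listing is available in the accompanying repository; as a point of reference only, the following is a minimal Keras sketch consistent with the architecture and parameter counts described above. The padding mode, activations, and training settings are assumptions inferred from those counts, not the verbatim ChatGPT-4 output.

```python
# Minimal Keras sketch of the described network (assumed 'valid' padding
# and ReLU/softmax activations reproduce the 666-parameter count; this is
# not the exact code generated by ChatGPT-4).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(4, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),       # 36 weights + 4 biases
    tf.keras.layers.MaxPooling2D(pool_size=(4, 4)),
    tf.keras.layers.Conv2D(8, (3, 3), activation='relu'),  # 288 weights + 8 biases
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),                             # 2 x 2 x 8 = 32 features
    tf.keras.layers.Dense(10, activation='softmax')        # 320 weights + 10 biases
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()  # reports 666 trainable parameters

# For the hardware flow, the trained weights and biases are then exported
# (e.g., as NumPy files) and handled in IEEE 754 half precision downstream.
```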
4.2. Forward Function in Python
Implementing the inference function in Python without the TF library is a critical step in the process because, as we approach lower-level languages or avoid the use of libraries, the answers contain a higher number of errors. To address this problem, we provide the LLM with examples in a higher-level language. In this case, ChatGPT-4 is instructed to use the pre-existing network, created with TF, to write the inference function using the weights and biases from the previously saved NumPy files.
Key prompts from interactions with ChatGPT-4:
The previous chat generated the six essential secondary functions required for the inference implementation (relu, softmax, conv2d_forward, maxpool2d_forward, flatten, and dense_forward), as well as a main function named forward, which calls the secondary functions internally. The following code shows the definition of the forward function and how it was used to perform the test phase.
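The full generated code is available in the repository; the sketch below is an illustrative, NumPy-only reconstruction of how such a forward function composes the six secondary functions. The helper bodies, array shapes, and the params dictionary are assumptions, not the exact ChatGPT-4 output.

```python
# NumPy-only sketch of a forward pass composing the six secondary functions
# named above (illustrative reconstruction, not the generated code).
import numpy as np

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

def conv2d_forward(x, w, b):
    # x: (H, W, C_in), w: (3, 3, C_in, C_out), b: (C_out,), 'valid' padding
    h, wd, _ = x.shape
    kh, kw, _, c_out = w.shape
    out = np.zeros((h - kh + 1, wd - kw + 1, c_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(c_out):
                out[i, j, k] = np.sum(x[i:i+kh, j:j+kw, :] * w[:, :, :, k]) + b[k]
    return out

def maxpool2d_forward(x, size):
    h, w, c = x.shape
    out = np.zeros((h // size, w // size, c))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j, :] = x[i*size:(i+1)*size, j*size:(j+1)*size, :].max(axis=(0, 1))
    return out

def flatten(x):
    return x.reshape(-1)

def dense_forward(x, w, b):
    return x @ w + b

def forward(image, params):
    # params: weights/biases loaded from the previously saved NumPy files
    x = relu(conv2d_forward(image, params['w1'], params['b1']))
    x = maxpool2d_forward(x, 4)
    x = relu(conv2d_forward(x, params['w2'], params['b2']))
    x = maxpool2d_forward(x, 2)
    x = flatten(x)
    return softmax(dense_forward(x, params['w3'], params['b3']))
```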
4.3. From Python to C++
Utilizing a lower-level programming language necessitated a more explicit approach to crafting the prompts. This involved providing the entire code of the seven previously generated functions and demanded a higher number of iterations.
Main prompts obtained during interactions with ChatGPT-4:
After the “From Python to C code” conversations mentioned in Table 4, we achieved a successful implementation of all the layers of the CNN in a short time. The C++ code presented below shows how the forward function is called N times for the test phase.
Part of the forward_pass function is presented below. Each of the layers, both convolutional and maxpool, was implemented through a series of for loops, where variable i represents the pixel coordinate in x, variable j represents the pixel coordinate in y, and variable k represents the filter number. The variables di and dj, in turn, index the kernel, which is a 3 × 3 kernel for the first convolutional layer.
The C++ code provided by the AI can easily be scaled and customized to create various convolutional layers by changing only the loop bounds: the first two for loops represent the size of the image, the third represents the number of filters in the layer, and the last two represent the size of the kernel, as the sketch below illustrates. This versatility opens the opportunity to construct a wide range of CNNs, all with code provided by the AI.
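The generated code itself is C++; the following Python sketch only mirrors the roles of i, j, k, di, and dj described above, and the shapes shown are illustrative assumptions for the first convolutional layer.

```python
# Sketch mirroring the loop structure described for the generated C++
# forward_pass (illustrative bounds for the first 3 x 3 convolution).
H_OUT, W_OUT = 26, 26   # first two loops: bounds derived from the image size
N_FILTERS = 4           # third loop: number of filters in the layer
K = 3                   # last two loops: kernel size

def conv_layer(image, kernel, bias, out):
    for i in range(H_OUT):              # pixel coordinate in x
        for j in range(W_OUT):          # pixel coordinate in y
            for k in range(N_FILTERS):  # filter number
                acc = bias[k]
                for di in range(K):     # kernel row offset
                    for dj in range(K): # kernel column offset
                        acc += image[i + di][j + dj] * kernel[di][dj][k]
                out[i][j][k] = acc
    return out
```

Scaling this pattern to another layer only requires adjusting the five loop bounds, which is what makes the generated code straightforward to reuse.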
4.4. Vivado HLS Considerations
The C++ code generated by the AI uses floating-point data types. Although Vivado HLS supports this data type, the hardware implementation relies on a restricted Floating-Point Unit (FPU) IP, so its use is limited to Xilinx boards.
To address this issue, C++ functions that use 16-bit integer data types but perform the floating-point operations at the bit level were developed through a series of LLM conversations, following the IEEE 754 half-precision floating-point format.
A total of eight functions were developed: addition, subtraction, multiplication, division, exponential, softmax, relu, and max. The addition, multiplication, and division functions can be found in Appendix A.
The main prompts obtained during interactions with ChatGPT-4 are:
The generated functions are then used to perform the floating-point operations, replacing the arithmetic operators of the existing solution; e.g., instead of executing the multiply-and-accumulate operation directly in the forward function, the multiplication of the pixel and the kernel value is performed by the multiply_custom_float function, and the summation of the convolution is performed by the add function.
Due to variations in the rounding methods of the floating-point operations, the accuracy experienced a 1.4 percentage point reduction, from 99.4% to 98%. However, this error can be avoided if the created floating-point functions use exactly the same rounding algorithm as TF.
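The generated addition, multiplication, and division functions appear in Appendix A. As an illustration only of the bit-level technique described above, and of how truncation rather than round-to-nearest-even can account for part of the accuracy gap, the sketch below multiplies two IEEE 754 half-precision values stored as 16-bit integers. It handles normalized numbers only and is not the generated HLS code.

```python
def multiply_half(a: int, b: int) -> int:
    """Illustrative bit-level multiply of two IEEE 754 half-precision values
    stored as 16-bit integers: normalized inputs only, zeros/subnormals
    flushed to zero, no NaN/Inf inputs, truncation instead of
    round-to-nearest-even."""
    sign = ((a >> 15) ^ (b >> 15)) & 0x1
    exp_a, exp_b = (a >> 10) & 0x1F, (b >> 10) & 0x1F
    if exp_a == 0 or exp_b == 0:          # zero (or subnormal) operand -> zero
        return sign << 15
    man_a = (a & 0x3FF) | 0x400           # restore the implicit leading 1
    man_b = (b & 0x3FF) | 0x400
    prod = man_a * man_b                  # 22-bit product, value in [1, 4)
    exp = exp_a + exp_b - 15              # add exponents, remove one bias
    if prod & (1 << 21):                  # product in [2, 4): renormalize
        prod >>= 11
        exp += 1
    else:                                 # product in [1, 2)
        prod >>= 10
    if exp >= 0x1F:                       # exponent overflow -> infinity
        return (sign << 15) | 0x7C00
    if exp <= 0:                          # exponent underflow -> zero
        return sign << 15
    return (sign << 15) | (exp << 10) | (prod & 0x3FF)
```

For example, multiply_half(0x3C00, 0x4000), i.e., 1.0 × 2.0, returns 0x4000 (2.0); the truncating right shifts are where this sketch diverges from TF's rounding behavior.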
4.5. Integration of the CNN with Caravel
Integrating the CNN with the Caravel SoC template involves the creation of a single macro encompassing the logic of all the modules generated by HLS. Because the logic density of the design occupies the majority of the user area, an external memory, connected to Caravel via the GPIO ports, was employed for image storage. Meanwhile, the CNN was linked to the Caravel RISC-V processor using the LA ports, as Figure 6 illustrates. This connection allowed the RISC-V processor to manage the initiation of the inference process with the signal la_data_in[2] and system resets with the signal la_data_in[1], and to receive the inference result from the CNN through the signal la_data_out[31:28]; Table 5 shows the connections between AI by AI and Caravel.
The Verilog code provided to the OpenLane layout tool is simply an instantiation of the IP generated by HLS connected to the Caravel ports; Appendix B shows this instantiation.
5. Results
After establishing the connections between Caravel and the CNN, a testbench of the entire SoC was developed using the training data set to evaluate the performance of the CNN. Because the RISC-V processor manages the SoC, several registers were configured in C++ to enable the use of the LA ports, allowing communication between the CNN and the RISC-V processor, as well as the GPIOs that connect the external SRAM to the SoC.
Figure 7 illustrates the SoC testbench, the image stored in memory, and the C++ code programmed into the RISC-V processor. The figure depicts the processor’s handling of the reset signal, the start of the process, the waiting period for the done signal, and the resulting inference values. After 1000 iterations, the system yields the same results as the HLS test, with an accuracy of 98%, confirming that it works as intended.
Table 6 presents the layout specifications of the AI by AI system with the SKY130A PDK, including the gate count, die area, latency, maximum frequency, and power consumption.
The outcome of the RTL-to-GDSII conversion process, along with its integration with the RISC-V provided by Caravel, is visually presented in Figure 8. It illustrates two distinct areas: the user area, containing a flat implementation of the CNN, and the management area, housing the processor and its associated peripherals.
6. Discussion
The findings of this research highlight significant aspects, such as:
The current limitations of LLMs in generating HDL code.
Establishing a workflow that utilizes LLMs to generate and downscale systems from TF to HDL.
Introducing a new approach for converting HLS to GDSII using open-source PDKs and tools.
Achieving the fabrication of a CNN IC entirely created by AI.
Setting a precedent for current AI-generated systems by providing specific system information, such as core area, cells per square millimeter, latency, power consumption, number of flip-flops, and total number of cells.
Offering open-source access to the entire project, from the initial conversation with the AI to the final GDSII files generated.
These findings directly address our central research question, “Are contemporary commercial LLMs capable of producing synthesizable and manufacturable CNN hardware designs using the first open-source PDK (SKY130A)?”, by providing new understanding and evidence that current commercial LLMs are not capable of directly creating a CNN in HDL; however, they are capable of creating synthesizable HLS code that can be used to generate an IC with open-source tools. The paper elucidates the development of AI by AI, an innovative IC harnessing the power of AI. Our methodology involved the transformation of AI-generated TF code into Verilog, progressing through layout implementation and seamless integration with a RISC-V processor via Caravel. This process ultimately enabled us to propel AI by AI into the manufacturing phase through the ChipIgnite program.
AI by AI stands as a pioneering achievement, being the first CNN IC of its kind to be entirely conceptualized by AI and fabricated with the open-source PDK SKY130A. Our approach harmoniously merges cutting-edge technologies, such as commercial LLMs, with more traditional ones like HLS and Verilog, creating an innovative workflow for developing intricate digital systems, particularly CNNs, and exploring the capabilities of current LLMs. Frameworks like Caravel and multi-project wafer programs such as ChipIgnite have simplified the layout development and fabrication process and made it cost-effective.
While current commercial LLMs may not yet excel in rapidly and accurately producing Verilog and VHDL code, they have matured enough to proficiently handle programming tasks. The sequential transition from higher abstraction to lower abstraction languages, supplemented by tools like HLS, empowers us to generate functional Verilog code that seamlessly integrates into the silicon-level implementation process. This combination of technologies and methodologies has opened new horizons for AI-driven IC development.