Draft:LLM-aided Design

LLM-aided design refers to the use of large language models (LLMs) as intelligent collaborators throughout the end-to-end process of modern system design, including conceptualization, prototyping, verification, and optimization. This emerging interdisciplinary paradigm integrates advances in natural language processing (NLP), program synthesis, and AI-driven reasoning to augment human capabilities in domains such as electronic design automation (EDA), software engineering, hardware design, and cyber-physical systems.

Unlike traditional automation tools, LLMs, especially transformer-based architectures such as GPT-4[1], Claude[2], LLaMA[3], and domain-specialized variants such as CodeLlama[4] or ChipGPT[5], are capable of interpreting, generating, and refining both structured and unstructured data, including natural language specifications, hardware description language (HDL) code, constraint definitions, tool scripts, and design documentation. LLM-aided design thus represents a shift from tool-assisted engineering to a form of co-design in which machine intelligence participates actively in architectural exploration, logic synthesis, formal verification, and post-silicon validation. Situated at the intersection of artificial intelligence, computer-aided design (CAD), and systems engineering, the field is reshaping the boundaries of human–machine collaboration in design-centric disciplines.

Introduction

Traditional engineering methodologies, while rigorous and refined over decades, are often limited by their manual nature. System and product design tasks, whether in hardware or software, have long depended on human engineers to bridge the semantic gap between design intent and machine-executable specifications, frequently through ad hoc translation steps and toolchains. The process is time-consuming, error-prone, and heavily reliant on domain-specific expertise. In recent years, engineering design has seen a transformative convergence of artificial intelligence (AI) and domain-specific modeling. Among the most impactful developments is the rise of large language models (LLMs) as versatile co-designers capable of interpreting natural language, synthesizing domain-specific artifacts, and reasoning about complex system architectures. LLMs such as GPT-4[1], Claude[2], and LLaMA[3] can understand and generate code, documents, and designs from natural language descriptions. This capacity opens a new avenue in which human designers collaborate with AI systems to accelerate innovation, improve design correctness, and reduce time-to-market. Instead of requiring deep HDL or CAD knowledge, designers can express intent in natural language and rely on the model to output Verilog, VHDL, HLS C, or firmware code.

What distinguishes LLM-aided design from earlier paradigms of automated design is the deep contextualization and generalization capability of these models. Unlike rule-based or template-driven systems, LLMs can reason across modalities, encode domain-specific heuristics, and learn from heterogeneous data sources—including textbooks, design specifications, formal logic, and empirical codebases—without requiring extensive retraining for each new application. This makes them particularly attractive for fast-paced, cross-domain design scenarios such as system-on-chip integration, embedded systems, robotic controls, and cyber-physical system modeling.

Importantly, LLMs are not simply text generators. When embedded in interactive workflows that involve simulation tools, compilers, and validators, these models can engage in prompt-correct-refine cycles, mimicking expert behavior and incorporating verification feedback at each step. LLM-aided design thus adds a new layer to the engineering process: one in which models do not merely execute instructions but actively participate in design reasoning. This has enabled real-world applications such as HLS code repair using template retrieval, formal assertion generation, and flow-control automation. It has also given rise to circuit foundation models (CFMs), domain-adapted LLMs that can reason about and generate artifacts across the entire RTL-to-GDSII pipeline.
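
This prompt-correct-refine cycle can be made concrete with a short sketch. The Python loop below is illustrative only: llm stands for any chat-style model exposed as a string-to-string callable, the helper names are hypothetical, and the open-source Icarus Verilog compiler (iverilog) supplies the corrective feedback.

    import pathlib
    import subprocess
    import tempfile

    def compile_verilog(source: str) -> str:
        """Compile a candidate module with iverilog; return its error log ('' if clean)."""
        with tempfile.TemporaryDirectory() as tmp:
            src = pathlib.Path(tmp) / "top.v"
            src.write_text(source)
            result = subprocess.run(
                ["iverilog", "-o", str(pathlib.Path(tmp) / "top.out"), str(src)],
                capture_output=True, text=True)
            return result.stderr

    def prompt_correct_refine(spec: str, llm, max_iters: int = 5) -> str:
        """Ask the model for RTL, then feed compiler errors back until it compiles."""
        prompt = f"Write a synthesizable Verilog module for this specification:\n{spec}"
        for _ in range(max_iters):
            rtl = llm(prompt)  # llm: str -> str (assumed interface)
            errors = compile_verilog(rtl)
            if not errors:
                return rtl  # converged: the module compiles cleanly
            prompt = ("The Verilog below fails to compile.\n"
                      f"Errors:\n{errors}\nCode:\n{rtl}\n"
                      "Return a corrected, complete module.")
        raise RuntimeError("no compilable RTL produced within the iteration budget")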

Background and Foundations of LLM-Aided Design

The integration of large language models (LLMs) into electronic design automation (EDA) represents a pivotal evolution in how hardware systems are conceived, specified, and validated. While EDA has historically been defined by deterministic workflows, rule-based synthesis tools, and extensive manual intervention, the emergence of LLMs has introduced a new design paradigm—one driven by probabilistic reasoning, semantic abstraction, and human-language interaction. This shift aligns with the broader trajectory of artificial intelligence, where general-purpose models have increasingly been specialized for domain-specific tasks, including those traditionally considered the preserve of expert engineers.

Historical Evolution: From Transformers to Circuit Reasoning

The foundation of LLM-aided design lies in the transformer architecture introduced by Vaswani et al. (2017)[6]. Transformers rapidly displaced recurrent architectures such as RNNs and LSTMs[7] in natural language processing owing to their ability to model long-range dependencies with self-attention. The architecture formed the basis for the GPT series, beginning with GPT-1 and continuing through GPT-4o, with each iteration demonstrating stronger zero-shot reasoning, code generation, and language understanding.

By 2020, GPT-3 captured the AI community's attention with its ability to generate working code—including simple HTML, Python, and even Verilog. This prompted researchers in hardware design to hypothesize that the structural similarities between programming languages and hardware description languages (HDLs) could be exploited by LLMs for logic design and verification tasks. Initial experiments using Codex[8] and GPT-3 to write Verilog or assist in debugging highlighted the potential but also exposed critical limitations: lack of syntax guarantees, hallucinations, and incompatibility with downstream synthesis tools.

These limitations catalyzed a new direction: the creation of domain-specific foundation models tailored to EDA. These models, referred to as circuit foundation models, are trained or fine-tuned on HDL corpora, simulation traces, synthesis logs, and constraint files. By 2023, tools like RTLLM[9] and ChipGPT[5] began to deliver on the promise of LLM-aided design through carefully engineered prompting strategies, feedback loops, and domain-aligned datasets.

Timeline of LLM Trends in EDA
Year | Milestone
2017 | Transformer architecture introduced by Vaswani et al.[6]
2020 | GPT-3 exhibits rudimentary HDL generation capability.
2021 | Prompt-based Verilog code generation appears in exploratory tools.
2022 | ChipGPT[5] and RTLLM[9] pioneer structured, feedback-driven generation pipelines.
2023 | Domain-specific fine-tuning (VeriGen[10], RTLCoder[11]); agent frameworks RTLFixer[12] and MEIC[13] become practical.
2024 | Vision-language fusion (LayoutCopilot[14]) and analog LLMs (LaMAGIC[15], AnalogCoder[16]) expand to new design domains.
2025 | Multi-agent architectures and graph-text fusion (DRC-Coder[17], VerilogCoder[18]) reshape design verification.

Paradigms: Decoder vs. Encoder Models in Co-Design

Two core paradigms now define the architectural ecosystem of LLMs in design automation:

1. Decoder-Based Autoregressive Models: Based on architectures like GPT[1] and CodeLlama[4], these models excel in generation-centric tasks. They translate natural language specifications into HDL, generate testbenches, produce timing constraints, and even repair buggy RTL. Prompt chaining and few-shot learning have made them particularly effective in synthesis-aligned code generation.

2. Encoder-Based Graph Reasoning Models: Inspired by models such as BERT and adapted into graph neural networks (e.g., ChipFormer[19]), these are optimized for inference tasks over structural representations like netlists or IRs. They predict congestion, estimate timing, identify bottlenecks, and perform logic equivalence checks.

The design ecosystem is increasingly embracing hybrid strategies, where decoder models generate artifacts and encoder models verify or optimize them—forming a closed co-design loop. This dual architecture mirrors human design workflows where generation and validation are tightly interleaved.
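
This generate-then-verify loop can be sketched in a few lines of Python. The sketch is minimal and rests on stated assumptions: generate stands in for a decoder model emitting RTL, and score for an encoder-style quality predictor returning a value in [0, 1]; both are injected callables, not any specific tool's API.

    def co_design_loop(spec: str, generate, score, threshold: float = 0.9,
                       max_tries: int = 8):
        """Decoder proposes RTL; an encoder-style model scores it; keep the best."""
        best_rtl, best_score = None, -1.0
        for _ in range(max_tries):
            rtl = generate(spec)    # generation step (decoder model)
            quality = score(rtl)    # validation step (encoder model), e.g.
                                    # predicted synthesizability or timing slack
            if quality > best_score:
                best_rtl, best_score = rtl, quality
            if best_score >= threshold:
                break               # good enough: close the loop
        return best_rtl, best_score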

Methodological Landscape of LLM-Aided Design

LLM-aided design spans multiple stages of the hardware-software co-design pipeline, encompassing natural language specification, HDL synthesis, analog circuit design, formal verification, and layout generation. While foundational techniques such as prompting, supervised fine-tuning (SFT), and retrieval-augmented generation (RAG) underpin much of the field, their practical application diverges significantly based on the nature of the task. To provide a comprehensive view, the following summary table classifies typical LLM methodologies by EDA task domain, with recent domain-specific LLMs and tools as representatives:

Methodology by Task Domain in LLM-Aided Design
Representative LLMs/Tools | LLM Methodology Used | Task Domain
RTLLM[9], VeriGen[10], RTLFixer[12] | Prompt engineering, self-refinement, score-based SFT | Specification to HDL
ChatEDA[20], ChipNeMo[21] | Instruction tuning, retrieval-augmented generation | Constraint generation
AutoSVA[22], LLM4DV[23] | Coverage-driven generation | Testbench & assertions
LayoutCopilot[14], ChatEDA[20] | Vision-language models, TCL script generation | Floorplan/layout synthesis
AnalogCoder[16], LaMAGIC[15], LLaNA[24] | Topology suggestion, layout constraints, Bayesian tuning | Analog circuit synthesis

Core Methodologies

LLM-aided design encompasses a rich spectrum of methodologies that leverage large language models to support or automate critical phases of the electronic design automation (EDA) and hardware design stack. The following core methodologies illustrate how recent tools and frameworks apply LLMs across that stack:

Specification to HDL Translation

LLMs can generate synthesizable RTL (Verilog, VHDL) directly from natural language specifications. This process is significantly enhanced using:

  • Prompt engineering and hierarchical prompting, to scaffold structured code generation (a prompt-scaffold sketch follows this list),
  • Context window expansion, to provide multi-level module and signal context,
  • Self-refinement and feedback from compiler logs, allowing the LLM to iteratively repair and converge to synthesizable HDL,
  • Score-based supervised fine-tuning (SFT), as seen in tools like RTLLM[9], VeriGen[10], and RTLFixer[12], to improve alignment with design objectives and functional correctness.
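
As an illustration of hierarchical prompting, the sketch below builds one scaffolded prompt per submodule plus a final integration prompt. The exact template wording is an assumption for illustration; real pipelines such as RTLLM's add syntax and functional feedback on top of this scaffolding.

    from textwrap import dedent

    def hierarchical_prompts(top_spec: str, submodules: dict) -> list:
        """Build scaffolded prompts: one per submodule, then an integration step.

        submodules maps a module name to its natural-language spec; the wording
        shown to the model here is a hypothetical template.
        """
        prompts = []
        for name, spec in submodules.items():
            prompts.append(dedent(f"""\
                You are writing one Verilog module of a larger design.
                Overall design: {top_spec}
                Module to write now: {name}
                Module spec: {spec}
                Emit only this module, with explicit port declarations."""))
        prompts.append(
            f"Integrate the modules {sorted(submodules)} into a top-level "
            f"Verilog module implementing: {top_spec}")
        return prompts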

Testbench and Assertion Generation

Verification environments, SystemVerilog assertions (SVA), property checks, and test stimuli can be automatically synthesized from examples and coverage goals using:

  • Coverage-driven generation, in which LLMs target specific coverage goals while maintaining stimulus diversity (a stimulus-loop sketch follows this list),
  • Tools such as AutoSVA[22] and LLM4DV[23], which have demonstrated higher assertion coverage and better bug exposure than traditional constrained-random verification methods.
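
The coverage-driven idea can be sketched as a feedback loop in which uncovered bins steer the next round of generated stimuli. All names below (run_sim, the bin format) are assumptions; tools such as LLM4DV implement far richer variants.

    def coverage_driven_stimuli(dut_spec: str, llm, run_sim, target: float = 0.95):
        """Iteratively request stimuli until coverage reaches the target or plateaus.

        run_sim(tests) is assumed to return (coverage_fraction, uncovered_bins).
        """
        tests, coverage, uncovered = [], 0.0, ["<all bins>"]
        while coverage < target:
            prompt = (f"DUT spec: {dut_spec}\n"
                      f"Uncovered coverage bins: {uncovered}\n"
                      "Propose 10 new test vectors, one per line, targeting these bins.")
            tests.extend(llm(prompt).splitlines())
            new_coverage, uncovered = run_sim(tests)
            if new_coverage <= coverage:
                break  # coverage plateau: stop rather than loop forever
            coverage = new_coverage
        return tests, coverage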

HDL Debugging and Repair

LLMs assist in both syntactic repair (fixing compilation errors) and semantic repair (correcting logical and functional behavior), leveraging:

  • Template libraries and error log parsing,
  • Similarity search from past fixes,
  • Retrieval-Augmented Generation (RAG) pipelines such as RTLFixer[12] and MEIC[13], which iteratively improve code until it passes lint, synthesis, or formal checks (a retrieval-and-repair sketch follows this list),
  • Probabilistic retry mechanisms and guided patch insertion.
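
A minimal retrieval-and-repair sketch follows, assuming a small database of past fixes and a simple string-similarity retriever; production systems such as RTLFixer use richer error taxonomies and ReAct-style agents.

    from difflib import SequenceMatcher

    def retrieve_fixes(error_log: str, fix_db: list, k: int = 3) -> list:
        """Return the k past fixes whose error text best matches the new log.

        fix_db entries are assumed to look like
        {'error': ..., 'before': ..., 'after': ...}.
        """
        return sorted(
            fix_db,
            key=lambda fix: SequenceMatcher(None, error_log, fix["error"]).ratio(),
            reverse=True)[:k]

    def rag_repair(rtl: str, error_log: str, fix_db: list, llm) -> str:
        """Prompt the model with similar past fixes as few-shot repair examples."""
        examples = "\n\n".join(
            f"Error: {f['error']}\nBefore:\n{f['before']}\nAfter:\n{f['after']}"
            for f in retrieve_fixes(error_log, fix_db))
        prompt = (f"Past Verilog fixes:\n{examples}\n\n"
                  f"Fix this code.\nError: {error_log}\nCode:\n{rtl}")
        return llm(prompt)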

HLS Code Refinement

Standard C/C++ is often incompatible with HLS constraints (e.g., recursion, pointers). LLMs identify and rewrite such constructs by:

  • Detecting and rewriting non-HLS-friendly patterns using prompt-repair pipelines,
  • Generating test harnesses and compiler hints (e.g., `#pragma HLS unroll`), as illustrated after this list,
  • Tools like GPT4AIGChip[25] convert ML kernels into synthesizable HLS by combining structural abstraction and loop pattern rewrites.
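
The pattern-detection and pragma-insertion steps can be approximated textually, as in the sketch below. A real flow would work on a compiler AST rather than regular expressions; the patterns and the pragma placement (inside the loop body, as Vivado-style HLS expects) are illustrative assumptions.

    import re

    # Naive textual flags for constructs most HLS tools reject; a compiler
    # front end would do this robustly, so this regex pass is only a sketch.
    HLS_UNFRIENDLY = {
        "dynamic allocation": re.compile(r"\b(malloc|calloc|realloc|free)\b"),
        "function pointer": re.compile(r"\(\s*\*\s*\w+\s*\)\s*\("),
    }

    def flag_hls_issues(c_source: str) -> list:
        """Name the suspicious patterns found in the C source."""
        return [name for name, pat in HLS_UNFRIENDLY.items() if pat.search(c_source)]

    def add_unroll_pragmas(c_source: str, factor: int = 4) -> str:
        """Insert an unroll pragma as the first statement of each for-loop body."""
        out = []
        for line in c_source.splitlines():
            out.append(line)
            stripped = line.lstrip()
            if stripped.startswith("for") and stripped.rstrip().endswith("{"):
                indent = line[: len(line) - len(stripped)]
                out.append(f"{indent}    #pragma HLS unroll factor={factor}")
        return "\n".join(out)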

Constraint Generation

Constraint files are essential for synthesis, placement, and timing correctness. LLMs like ChatEDA[20] and ChipNeMo[21] support this through:

  • Instruction tuning, enabling fine-grained command generation (e.g., for SDC, XDC formats),
  • Retrieval-Augmented Generation (RAG), which pulls prior constraints from similar designs or databases to ensure domain-consistent generation (a retrieval-plus-generation sketch follows this list),
  • Generating multi-domain timing, placement, and IO constraints with contextual accuracy.
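
A compact sketch of retrieval-augmented constraint generation: prior constraints from similar designs are retrieved by specification similarity and offered as context. The database layout is an assumption; create_clock, set_input_delay, and set_output_delay are standard SDC commands.

    from difflib import SequenceMatcher

    def generate_sdc(design_spec: str, constraint_db: list, llm, k: int = 2) -> str:
        """constraint_db entries are assumed as {'spec': ..., 'sdc': ...}."""
        similar = sorted(
            constraint_db,
            key=lambda e: SequenceMatcher(None, design_spec, e["spec"]).ratio(),
            reverse=True)[:k]
        examples = "\n".join(e["sdc"] for e in similar)
        prompt = (f"SDC constraints from similar past designs:\n{examples}\n\n"
                  "Write SDC timing constraints (create_clock, set_input_delay, "
                  f"set_output_delay) for this design:\n{design_spec}")
        return llm(prompt)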

Floorplan and Layout Synthesis

Physical design requires careful placement and routing. LLM-vision hybrid models such as LayoutCopilot[14] and ChatEDA[20] employ:

  • Vision-language modeling to interpret and manipulate layout imagery (DEF/GDSII),
  • TCL script generation, customized for tools like Innovus and ICC2 (a minimal natural-language-to-TCL sketch follows this list),
  • Automatic power grid and macro placement proposals, based on learned design intents.
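
A minimal natural-language-to-TCL step might look like the sketch below. The command vocabulary given to the model is a generic example (placeInstance and addStripe are Innovus-style names), the translation is delegated entirely to the model, and none of this is a vendor's documented API.

    def layout_command(request: str, llm) -> str:
        """Translate a designer request into a single TCL command via the model."""
        prompt = (
            "Translate the request into one TCL command for a place-and-route tool.\n"
            "Example command vocabulary: create_floorplan, place_design, "
            "route_design, addStripe (power straps), placeInstance (macros).\n"
            f"Request: {request}\n"
            "TCL:")
        return llm(prompt).strip()

    # e.g. layout_command("place macro sram0 at (100, 200), orientation R0", llm)
    # might yield something like: placeInstance sram0 100 200 R0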

Analog Circuit Synthesis

Analog design poses unique challenges due to its sensitivity and lack of digital abstraction. Tools like AnalogCoder[16], LLaNA[24], and LaMAGIC[15] use:

  • Topology suggestion via LLMs, based on specification matching (gain, slew, bandwidth),
  • Layout constraint prediction, such as symmetry, matching, and parasitic awareness,
  • Bayesian optimization and tuning, informed by LLM predictions, for transistor sizing and performance trade-offs (a simplified sizing-search sketch follows this list).
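
The sketch below pairs an LLM-proposed starting point with a simple random local search standing in for a full Bayesian optimizer; the sizing text format and the injected simulate callable (returning a scalar figure of merit) are assumptions for illustration.

    import random

    def size_circuit(spec: str, llm, simulate, iters: int = 50):
        """LLM seeds a sizing; random local search refines it against simulate().

        simulate(sizing) -> float, higher is better (e.g., a weighted figure of
        merit over gain, bandwidth, and power). Random search stands in here for
        the Bayesian optimizer used by real flows.
        """
        seed = llm(f"Propose transistor W/L sizings (um) as 'name=value' lines for: {spec}")
        sizing = {name.strip(): float(value)
                  for name, value in
                  (line.split("=", 1) for line in seed.splitlines() if "=" in line)}
        best, best_fom = dict(sizing), simulate(sizing)
        for _ in range(iters):
            candidate = {k: v * random.uniform(0.8, 1.25) for k, v in best.items()}
            fom = simulate(candidate)
            if fom > best_fom:
                best, best_fom = candidate, fom
        return best, best_fom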

These methodologies collectively show that LLMs are not just code generators, but design agents capable of integrating with CAD flows, reasoning over heterogeneous inputs (text, code, specs, layout), and adapting to domain-specific constraints. As tools mature, the distinction between synthesis, verification, and optimization continues to blur—paving the way for closed-loop, autonomous hardware design.

Among these, HDL generation has emerged as one of the most deeply investigated tasks in LLM-aided EDA research, serving as a methodological testbed for broader design automation challenges. It captures the full interplay between natural language, symbolic code, feedback refinement, and tool integration. The following case study synthesizes key techniques employed in HDL generation workflows.

Methodological Classification of HDL Generation: A Case Study

The following table, drawing on recent papers, including the 2025 survey by Pan et al., summarizes the methodologies underlying LLM-aided HDL generation.

HDL Generation Methodologies Using LLMs
Project Name | Model Used | Approach Type | Summary
ChipGPT[5] | ChatGPT (GPT-3.5) | In-context learning | Zero-code RTL generation from natural-language prompts; uses a prompt manager and structured chaining.
RTLLM[9] | GPT-3.5 | Prompt engineering | Multi-step, planning-based prompt design with syntax and functional log feedback.
AutoChip[26] | GPT-3.5 Turbo | Iterative feedback | Uses compile/simulation logs to iteratively refine HDL, reducing human debugging effort.
Chip-Chat[27] | ChatGPT-4 | Conversational co-design | Full-pipeline HDL synthesis guided via interactive dialogue with GPT-4.
VeriGen[10] | CodeGen-16B | Fine-tuning | Trained on textbook and GitHub Verilog; improved synthesis-valid output and syntax robustness.
ChatEDA[20] | LLaMA-20B | QLoRA + instruction tuning | Trained on GPT-4-generated EDA instructions; interprets and executes user commands.
ChipNeMo[21] | LLaMA 7B/13B/70B | DAPT + tokenizer modification | Custom tokenizer and retrieval-augmented models trained on logs and scripts.
RTLCoder[11] | Mistral-7B | Scored SFT | Uses synthesis scores to steer SFT toward functionally valid and resource-efficient HDL.
BetterV[28] | CodeLlama + TinyLlama | Controlled generation + SFT | Bayesian discriminator modifies token probabilities to favor valid HDL output.
RTLFixer[12] | GPT-4 | RAG + agent framework | Uses ReAct prompting and error-categorization databases for debug-oriented HDL refinement.
VerilogCoder[18] | GPT-4 / LLaMA3 | Multi-agent + AST trace | Uses waveform tracing and signal planning to backtrace functional errors in RTL.


These methods reveal key trends and research frontiers:

  • Prompting + Logs: Tools like AutoChip[26] and RTLLM[9] demonstrate that prompting alone, when combined with feedback from toolchains, is sufficient for competitive HDL generation without model retraining.
  • Fine-tuning on RTL: Projects like VeriGen[10] and RTLCoder[11] show that targeted fine-tuning, especially with quality metrics (e.g., synthesis logs, functional correctness), significantly improves output robustness.
  • Controlled Generation: BetterV[28] introduces probabilistic controls in token sampling, pushing Verilog generation beyond maximum-likelihood decoding.
  • Agent Architectures: RTLFixer[12] and VerilogCoder[18] embody an emerging paradigm where LLMs serve not just as code generators, but as self-refining agents—reading logs, tracing waveforms, and performing symbolic analysis.

The table also underscores the growing importance of multi-agent collaboration, retrieval-augmented generation (RAG), and tool-in-the-loop frameworks, pushing beyond simple completion tasks toward autonomous reasoning and repair. The clear performance advantages of fine-tuned and multi-modal frameworks over naive prompting, as shown in benchmarks like VerilogEval[29] and PyHDL-Eval[30], affirm that true engineering-grade HDL generation necessitates tightly integrated model-tool co-evolution.

Hybrid pipelines combining prompt engineering, score-based training, and agent-based feedback loops are becoming the de facto standard in advanced EDA research. These methods not only enhance correctness and synthesizability but also extend the LLM's functional roles—from passive code suggestion to active circuit engineering.


Datasets and Evaluation Infrastructure

Robust datasets are foundational to the development, tuning, and evaluation of large language models in EDA. These datasets span a variety of formats—ranging from tokenized Verilog corpora and annotated tool logs to natural language specifications and performance metrics. They enable domain adaptation, supervised fine-tuning, and benchmarking for generation quality and synthesis validity.

Recent efforts have not only expanded dataset volume but also enhanced diversity and granularity. Instruction-tuned datasets like ChatEDA[20] teach LLMs how to interact with toolchains; benchmark sets such as VerilogEval[29] assess model output quality; and design-level corpora like RTLCoder[11] and MG-Verilog[31] offer structural annotations and synthesis metadata. The MG-Verilog[31] dataset pairs Verilog code with natural-language descriptions at multiple levels of granularity, supporting generation at varying levels of abstraction. The VeriGen[10] dataset supports foundational fine-tuning using Verilog problems drawn from textbooks.

Open-source releases such as OpenRTL are particularly notable for their scale and openness. OpenRTL contains over 127,000 real-world RTL modules annotated with comments and module-level metadata, making it one of the largest available corpora for training and evaluating generative models in hardware design. This dataset, combined with task-specific corpora like RTLCoder[11] (which includes synthesis success and timing annotations), provides a solid benchmark for measuring model effectiveness across functional correctness and implementation metrics.

Tooling and Infrastructure: Practical Deployments

Several practical tools now demonstrate that LLM-aided design is no longer theoretical:

  • AutoChip[26]: Automates the entire RTL generation process from high-level specifications using decoder models and iterative refinement.
  • ChatEDA[20]: Serves as a natural language interface for controlling Vivado, Quartus, or Innovus workflows. It interprets user intent and translates it into tool-specific commands.
  • RTLLM[9]-Editor: An IDE that integrates real-time HDL generation, compilation feedback, and syntax repair.
  • LLM4DV[23] and AutoSVA[22]: Specialized for formal verification, these tools generate SystemVerilog assertions and support coverage-driven testbench synthesis.

These tools reflect an operational maturity and are being integrated into prototyping, verification closure, and constraint generation workflows.

Applications

The following are a few representative applications of LLM-aided design:

Hardware and Embedded Systems

LLMs are being integrated into the workflows of embedded system designers for tasks including:

  • Register-transfer level (RTL) generation: LLMs can generate Verilog/VHDL modules from high-level specifications.
  • System-on-chip (SoC) configuration: Automating the instantiation of IP blocks and bus protocols.
  • Embedded firmware synthesis: Translating control logic into C or assembly for embedded microcontrollers.

Software Architecture and Code Generation

LLMs are used for:

  • Code synthesis: Generating boilerplate or logic-intensive code in multiple languages.
  • Test case generation and formal verification: Creating unit tests and translating specifications into formal properties for tools like Dafny or Coq.
  • Refactoring and documentation: Improving code quality and creating human-readable documentation automatically.

Electronic System-Level Design (ESL)

At higher abstraction levels, LLMs facilitate:

  • Design space exploration: Suggesting viable design alternatives under constraints.
  • Model-based systems engineering: Interpreting and generating SysML/UML models.
  • Specification translation: Converting human language requirements into machine-interpretable formats.

Challenges and Future Trajectories

While LLMs have unlocked new capabilities, several open challenges remain:

  • Verification Overhead: LLMs still rely on traditional simulation/formal methods to verify correctness.
  • Scalability: Handling designs with deep hierarchy, complex control logic, or mixed-signal blocks remains difficult.
  • Tool Version Drift: LLMs must adapt to different EDA tool versions and configuration nuances.

Looking forward, promising directions include:

  • Multi-Agent Design Copilots: Role-based LLMs (RTL engineer, verifier, P&R optimizer) collaborating on shared state.
  • Fusion Architectures: GNNs, CNNs, and LLMs unified into end-to-end reasoning systems.
  • Self-Tuning Models: Online refinement based on simulation logs and usage telemetry.
  • Compliance-Aware Generation: Models that natively generate timing-, area-, and safety-compliant designs.
  • On-Device Inference: Lightweight models assisting with synthesis and debug on edge design environments.

LLM-aided design marks a transition from rule-based to learning-based design automation. It is not merely a tool for accelerating tasks but a co-designer that understands, generates, and refines logic across the full abstraction stack, from natural language to GDSII. As foundation models become increasingly specialized and supporting infrastructure matures, LLMs are poised to transform digital system design.

References

  1. ^ a b c OpenAI et al. GPT-4 Technical Report. arXiv:2303.08774 [cs.CL], 2024. Available online
  2. ^ a b Anthropic. The Claude 3 Model Family: Opus, Sonnet, Haiku. Anthropic Model Card (PDF), 2024. Available online
  3. ^ a b Touvron, Hugo et al. LLaMA: Open and Efficient Foundation Language Models. arXiv:2302.13971 [cs.CL], 2023. Available online
  4. ^ a b Rozière, Baptiste et al. Code Llama: Open Foundation Models for Code. arXiv:2308.12950 [cs.CL], 2024. Available online
  5. ^ a b c d Chang, Kaiyan; Wang, Ying; Ren, Haimeng; Wang, Mengdi; Liang, Shengwen; Han, Yinhe; Li, Huawei; and Li, Xiaowei. ChipGPT: How Far Are We from Natural Language Hardware Design. arXiv:2305.14019 [cs.AI], 2023. Available online
  6. ^ a b Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; and Polosukhin, Illia. Attention Is All You Need. arXiv:1706.03762 [cs.CL], 2023. Available online
  7. ^ Sherstinsky, Alex. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Physica D: Nonlinear Phenomena, vol. 404, 2020, p. 132306. Available online
  8. ^ Chen, Mark et al. Evaluating Large Language Models Trained on Code. arXiv:2107.03374 [cs.LG], 2021. Available online
  9. ^ a b c d e f g Lu, Yao; Liu, Shang; Zhang, Qijun; and Xie, Zhiyao. RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model. arXiv:2308.05345 [cs.LG], 2023. Available online
  10. ^ a b c d e f Thakur, Shailja; Ahmad, Baleegh; Pearce, Hammond; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh; and Garg, Siddharth. VeriGen: A Large Language Model for Verilog Code Generation. ACM Transactions on Design Automation of Electronic Systems, vol. 29, no. 3, article 46, 2024, pp. 1–31. Available online
  11. ^ a b c d e Liu, Shang; Fang, Wenji; Lu, Yao; Wang, Jing; Zhang, Qijun; Zhang, Hongce; and Xie, Zhiyao. RTLCoder: Fully Open-Source and Efficient LLM-Assisted RTL Code Generation Technique. arXiv:2312.08617 [cs.PL], 2024. Available online
  12. ^ a b c d e f Tsai, Yun-Da; Liu, Mingjie; and Ren, Haoxing. RTLFixer: Automatically Fixing RTL Syntax Errors with Large Language Models. arXiv:2311.16543 [cs.AR], 2024. Available online
  13. ^ a b Xu, Ke; Sun, Jialin; Hu, Yuchen; Fang, Xinwei; Shan, Weiwei; Wang, Xi; and Jiang, Zhe. MEIC: Re-thinking RTL Debug Automation using LLMs. arXiv:2405.06840 [cs.AR], 2024. Available online
  14. ^ a b c Liu, Bingyang; Zhang, Haoyi; Gao, Xiaohan; Kong, Zichen; Tang, Xiyuan; Lin, Yibo; Wang, Runsheng; and Huang, Ru. LayoutCopilot: An LLM-powered Multi-agent Collaborative Framework for Interactive Analog Layout Design. arXiv:2406.18873 [cs.AR], 2025. Available online
  15. ^ a b c Chang, Chen-Chia; Shen, Yikang; Fan, Shaoze; Li, Jing; Zhang, Shun; Cao, Ningyuan; Chen, Yiran; and Zhang, Xin. LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits. arXiv:2407.18269 [cs.AR], 2024. Available online
  16. ^ a b c Lai, Yao; Lee, Sungyoung; Chen, Guojin; Poddar, Souradip; Hu, Mengkang; Pan, David Z.; and Luo, Ping. AnalogCoder: Analog Circuit Design via Training-Free Code Generation. arXiv:2405.14918 [cs.LG], 2024. Available online
  17. ^ Chang, Chen-Chia; Ho, Chia-Tung; Li, Yaguang; Chen, Yiran; and Ren, Haoxing. DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent. In: Proceedings of the 2025 International Symposium on Physical Design (ISPD ’25), ACM, 2025, pp. 143–151. Available online
  18. ^ a b c Ho, Chia-Tung; Ren, Haoxing; and Khailany, Brucek. VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool. arXiv:2408.08927 [cs.AI], 2025. Available online
  19. ^ Lai, Yao; Liu, Jinxin; Tang, Zhentao; Wang, Bin; Hao, Jianye; and Luo, Ping. ChiPFormer: Transferable Chip Placement via Offline Decision Transformer. arXiv:2306.14744 [cs.LG], 2023. Available online
  20. ^ a b c d e f Wu, Haoyuan; He, Zhuolun; Zhang, Xinyun; Yao, Xufeng; Zheng, Su; Zheng, Haisheng; and Yu, Bei. ChatEDA: A Large Language Model Powered Autonomous Agent for EDA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 10, 2024, pp. 3184–3197. Available online
  21. ^ a b Liu, Mingjie et al. ChipNeMo: Domain-Adapted LLMs for Chip Design. arXiv:2311.00176 [cs.CL], 2024. Available online
  22. ^ a b c Orenes-Vera, Marcelo; Manocha, Aninda; Wentzlaff, David; and Martonosi, Margaret. AutoSVA: Democratizing Formal Verification of RTL Module Interactions. arXiv:2104.04003 [cs.AR], 2021. Available online
  23. ^ a b c Zhang, Zixi; Szekely, Balint; Gimenes, Pedro; Chadwick, Greg; McNally, Hugo; Cheng, Jianyi; Mullins, Robert; and Zhao, Yiren. LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation. arXiv:2310.04535 [cs.LG], 2025. Available online
  24. ^ a b Amaduzzi, Andrea; Zama Ramirez, Pierluigi; Lisanti, Giuseppe; Salti, Samuele; and Di Stefano, Luigi. LLaNA: Large Language and NeRF Assistant. arXiv:2406.11840 [cs.CV], 2024. Available online
  25. ^ Fu, Yonggan; Zhang, Yongan; Yu, Zhongzhi; Li, Sixu; Ye, Zhifan; Li, Chaojian; Wan, Cheng; and Lin, Yingyan Celine. GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models. arXiv:2309.10730 [cs.LG], 2025. Available online
  26. ^ a b c Thakur, Shailja; Blocklove, Jason; Pearce, Hammond; Tan, Benjamin; Garg, Siddharth; and Karri, Ramesh. AutoChip: Automating HDL Generation Using LLM Feedback. arXiv:2311.04887 [cs.PL], 2024. Available online
  27. ^ Blocklove, Jason; Garg, Siddharth; Karri, Ramesh; and Pearce, Hammond. Chip-Chat: Challenges and Opportunities in Conversational Hardware Design. In: Proceedings of the 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD), IEEE, Sept. 2023, pp. 1–6. Available online
  28. ^ a b Pei, Zehua; Zhen, Hui-Ling; Yuan, Mingxuan; Huang, Yu; and Yu, Bei. BetterV: Controlled Verilog Generation with Discriminative Guidance. arXiv:2402.03375 [cs.AI], 2024. Available online
  29. ^ a b Liu, Mingjie; Pinckney, Nathaniel; Khailany, Brucek; and Ren, Haoxing. VerilogEval: Evaluating Large Language Models for Verilog Code Generation. arXiv:2309.07544 [cs.LG], 2023. Available online
  30. ^ Batten, Christopher; Pinckney, Nathaniel; Liu, Mingjie; Ren, Haoxing; and Khailany, Brucek. PyHDL-Eval: An LLM Evaluation Framework for Hardware Design Using Python-Embedded DSLs. In: Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD (MLCAD '24), ACM, 2024, article 10, pp. 1–17. Available online
  31. ^ a b Zhang, Yongan; Yu, Zhongzhi; Fu, Yonggan; Wan, Cheng; and Lin, Yingyan Celine. MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation. arXiv:2407.01910 [cs.LG], 2024. Available online