LLM-Powered, Expert-Refined Causal Loop Diagramming via Pipeline Algebra
Abstract
1. Introduction
2. Materials and Methods
2.1. Pipeline Algebra
2.2. Implementation
2.3. Templates
2.4. Causal Loop Diagram (CLD)
2.5. Attributed Directed Graph (ADG)
2.6. CLD Variable Naming Guidelines
- Unambiguous variable phrasing. Name each variable so that an increase (↑) or decrease (↓) is clear—e.g., Inventory Level, rather than Inventory [34] (Guideline table).
- Polarity on every link. Mark each causal arrow + or − and confirm the sign with the “if … then …” test ([7], ch. 5; see Section 2.7).
- Close the feedback loops. Convert open chains into closed loops to give the diagram dynamic meaning [36].
- Avoid “starburst” nodes. More than about four incoming or outgoing arrows often signals an aggregate that should be decomposed.
- Flag exogenous drivers. Identify variables outside the boundary so the endogenous structure is unambiguous.
- Label loops R or B and verify sign counts. Count the negative links around each loop: an odd count ⇒ balancing (B); an even count ⇒ reinforcing (R).
- Check sign consistency around each loop. A mis-signed link flips an R loop to B or vice-versa [39].
- Provide a causal narrative. A short prose explanation surfaces missing links or variables before simulation [40].
- Use neutral, quantity-like names. Avoid evaluative labels that obscure polarity (e.g., job stress, not stressed employees) [34].
- No bidirectional arrows. If causation runs both ways, insert an intermediate variable, rather than two opposing arrows [34].
- Keep each variable conceptually homogeneous. Do not bundle disparate constructs in one node; split or rename as needed [41].
- Eliminate duplicate or synonym variables. After drafting, run a uniqueness check and merge nodes that are merely different labels for the same concept [36].
- Glossary definitions. Supply a one-sentence definition for every variable because clear, localized semantics help the LLM disambiguate synonyms and anchor embeddings, improving automated vertex alignment and question-answering.
- Descriptive multi-word names. Use self-contained phrases such as capacity factor, rather than abbreviations like CF, because the richer surface forms boost embedding quality and reduce alias collisions in LLM-based matching.
- Edge-level provenance. Store a source sentence, citation, or SME note with every causal link because provenance metadata lets downstream reflection or explanation prompts trace reasoning paths and focus internal or external expert review where confidence is low.
- Loop narrative tags. Provide a plain-language, one-line description for each feedback loop because concise narratives help both humans and language models verify sign counts, detect missing links, and generate coherent summaries [40].
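The sign-count guideline in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the pipeline: the function name `classify_loop` and the example sign lists are hypothetical, and ASCII `-` stands in for the − symbol.

```python
def classify_loop(signs):
    """Label a closed loop from its link polarities.

    `signs` holds one '+' or '-' per link around the loop.
    An odd number of negative links gives a balancing loop ('B');
    an even number (including zero) gives a reinforcing loop ('R').
    """
    for s in signs:
        if s not in ("+", "-"):
            raise ValueError(f"unknown polarity: {s!r}")
    negatives = sum(1 for s in signs if s == "-")
    return "B" if negatives % 2 else "R"


# Hypothetical loops, for illustration only.
print(classify_loop(["+", "-", "+", "-"]))  # two negatives -> R
print(classify_loop(["+", "-", "+"]))       # one negative  -> B
```

Because the label depends only on the parity of the negative count, changing the sign of any single link flips the loop between R and B, which is why a mis-signed link matters so much.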
2.7. CLD Link Polarity Determination Rule
- Applicability. This rule determines the polarity (+ or −) of a direct causal link from a source, Variable X, to a destination, Variable Y (X → Y), in a causal loop diagram (CLD).
- Principle (ceteris paribus). To apply the test, consider only the direct influence of X on Y. Mentally change Variable X, and determine how Variable Y would respond directly to that change, assuming that all other variables in the system remain momentarily constant (ceteris paribus).
- Test and assignment.
  - Consider a hypothetical increase in Variable X.
    - If this increase in X causes Variable Y to increase, the link is positive (+).
    - If this increase in X causes Variable Y to decrease, the link is negative (−).
  - As a check, consider a hypothetical decrease in Variable X; the opposite change must yield the same polarity.
    - If this decrease in X causes Variable Y to decrease, the link is positive (+).
    - If this decrease in X causes Variable Y to increase, the link is negative (−).
- Conclusion. The link polarity is positive (+) if X and Y change in the same direction and negative (−) if X and Y change in opposite directions, based only on the direct, isolated impact of X on Y.
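The rule above can be expressed as a small check. The sketch below is illustrative only: `link_polarity`, `checked_polarity`, and the response functions with their slopes are hypothetical stand-ins for an analyst's ceteris paribus judgment, not part of the reported pipeline.

```python
def link_polarity(delta_x, delta_y):
    """Return '+' if X and Y move in the same direction, '-' otherwise.

    delta_x is the hypothetical change in X; delta_y is the direct
    response of Y with all other variables held constant.
    """
    if delta_x == 0:
        raise ValueError("the test requires a nonzero change in X")
    if delta_y == 0:
        raise ValueError("Y does not respond, so there is no link to sign")
    return "+" if (delta_x > 0) == (delta_y > 0) else "-"


def checked_polarity(response):
    """Apply the test for an increase and a decrease in X; both must agree.

    `response` maps a change in X to the direct change in Y.
    """
    up = link_polarity(+1.0, response(+1.0))
    down = link_polarity(-1.0, response(-1.0))
    if up != down:
        raise ValueError("increase and decrease tests disagree")
    return up


# Hypothetical direct responses (slopes are invented for illustration).
print(checked_polarity(lambda dx: 0.5 * dx))   # same direction     -> +
print(checked_polarity(lambda dx: -2.0 * dx))  # opposite direction -> -
```

A response that fails the consistency check (for example, one in which Y rises whether X rises or falls) raises an error rather than returning a sign, signalling that the link needs to be rethought or split.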
2.8. Pipeline Specification of Work Reported Herein
3. Results
3.1. AI-Generated CLD
3.2. SME Critique of AI-Generated CLD
- Global technology learning: this is too vague. What does this mean? See the comment for learning rate; it applies to this variable as well.
- Learning rate: this is too vague. Learning rates are applied to capital expenses, O&M expenses, and capacity factor. In other words, by improving the technology, we make it cheaper to make, cheaper to maintain, and we extract more power (capacity factor).
- Macroeconomic and finance conditions: this is an odd variable because it can mean anything. The “?” instead of +/− shows just that: it is unclear how financial conditions will affect the market competition.
- Projected energy demand: projected demand does not increase the market price; the market price is the “today” measure, whereas “projected demand” is the possible future measure. The projected demand does affect the willingness to invest, though.
- Profit or expected profit: two profits in the name of this variable. Odd that AI labeled it this way, just a note.
- ROI threshold: return on investment threshold is a predetermined desired value, e.g., min 10%. There is already a dependency between the expected profit and willingness to invest, so there is no reason to have another dependency on ROI threshold.
- Federal and state policies: this is too vague. Policies could push willingness to invest both ways. EPA is a policy, and stricter EPA rules will result in a greater portion of declined permit approvals and reduced willingness to invest, while tax incentives increase willingness to invest.
- Permitting rules framework: the permitting rules framework affects the project permit rate, which in turn affects the project development failure rate. The permitting rules don’t affect permit time; the number of applications does. Lastly, the permit time does not directly affect the project development failure rate. It just takes longer to get from “development” to “construction”.

The SME also noted that significant knowledge is required in order to construct a prompt and to validate the AI response.
3.3. Categorization of SME Critiques
3.4. Pipeline Updates in Response to SME Critique
4. Discussion
5. Limitations
5.1. Structural Consequences of This Framing
- Single-rater, unblinded evaluation. Quality was judged by a single SME without blinding or inter-rater reliability [44], leaving the results vulnerable to anchoring and confirmation effects.
- Post hoc rubric construction. Error categories were derived after observing the outputs, rather than being specified a priori, which increases researcher degrees of freedom and vulnerability to experimenter bias.
- Lack of quantitative, comparative metrics. We did not predefine or report task-grounded metrics (e.g., variable/edge/loop recovery, sign and delay accuracy, and agreement with a ground truth); nor did we conduct statistical comparisons against baselines or ablations.
- Insufficient control of stochasticity and drift. We did not perform replicate generations under fixed hyperparameters (e.g., temperature, seed) or formally analyze variance across runs/models/time, limiting claims about stability and sensitivity. Reproducibility [20] is therefore in question, although bit-for-bit reproducibility is not a realistic expectation for contemporary AI systems [45].
- Adaptive tuning without hold-out re-evaluation. Pipeline updates were informed by the SME critique on the same case, but the improvements were not re-assessed on held-out problems, inviting overfitting to evaluator preferences.
- Narrow scope and inputs. Evaluation centered on a small number of cases and abstract-level inputs, which restricts external validity and overlooks information present in full texts or diverse domains.
- Subjective outcome emphasis. Findings rely primarily on qualitative critique, rather than on executable tests (e.g., translation to stock–flow models and reproduction of stylized behaviors) that would tie structure to dynamics.
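To make the missing task-grounded metrics concrete, the sketch below computes edge recovery and sign accuracy against a reference diagram. It is a hypothetical illustration: no ground-truth CLD was used in this work, the function name `edge_recovery` and the toy edge sets are invented, and variable names are assumed to have been aligned already so that edges can be compared by exact match.

```python
def edge_recovery(generated, reference):
    """Compare two CLDs given as {(source, target): '+' or '-'} dicts.

    Returns edge precision and recall, plus sign accuracy computed
    over the edges that appear in both diagrams.
    """
    gen_edges, ref_edges = set(generated), set(reference)
    shared = gen_edges & ref_edges
    precision = len(shared) / len(gen_edges) if gen_edges else 0.0
    recall = len(shared) / len(ref_edges) if ref_edges else 0.0
    if shared:
        sign_accuracy = sum(generated[e] == reference[e] for e in shared) / len(shared)
    else:
        sign_accuracy = 0.0
    return {"precision": precision, "recall": recall, "sign_accuracy": sign_accuracy}


# Toy diagrams with placeholder variable names (not data from this study).
reference = {("A", "B"): "+", ("B", "C"): "-", ("C", "A"): "+", ("D", "A"): "+"}
generated = {("A", "B"): "+", ("B", "C"): "+", ("C", "A"): "+", ("E", "A"): "-"}

print(edge_recovery(generated, reference))
```

In this toy case three of four edges are shared in each direction, and two of the three shared edges carry the reference sign. Delay accuracy and loop recovery could be scored in the same style once a reference diagram exists.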
5.2. Why the Framing Matters
- The LLM operated within a human-preframed conceptual space, so evaluation emphasized conformity to that frame, rather than capacity for independent problem framing or justification.
- Outputs were judged against human-established representations, biasing assessment toward rubric-like fidelity, rather than internal coherence, novelty, or transfer.
- The overall design resembled classic “tool validation” (single case, single rater), leaving little incentive to incorporate replication, exploration of alternative framings, or cross-domain generalization.
5.3. What a Different Framing Would Enable
- Independent problem framing. Permit the model to extract and synthesize its own problem statement from one or more raw sources, with provenance logging and a priori evaluation criteria.
- Parallel, controlled replications. Run multiple independent instantiations (and, where relevant, models/personas) with hyperparameters and seeds fixed wherever possible to measure variance and convergence.
- Multi-criteria evaluation. Complement fidelity-to-human metrics with (i) internal dynamic coherence (e.g., translation to stock–flow and simulation checks), (ii) novelty/coverage relative to human CLDs, and (iii) transfer to related domains.
- Co-evolution with audit trails. Structure human–AI critique–revise cycles with full edit logs to quantify contributions, learning effects, and the marginal value of guidance artifacts.
- Baselines and ablations. Compare human-only, AI-only, and hybrid pipelines; ablate orchestration components (e.g., guidelines, session resets, HITL) to attribute effects and estimate effect sizes.
- Full reproducibility instrumentation. Release prompts, artifacts, and code; measure HITL time/cost; and pin model/version settings to mitigate drift and facilitate independent replication.
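One simple way to quantify variance across replicate generations is the mean pairwise Jaccard similarity of the edge sets produced by each run. The sketch below is an assumption about how such a measurement might look, not a procedure from this study; the function names and the three toy runs are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two edge sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over all pairs of replicate runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to measure agreement")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Toy replicate outputs with placeholder variable names (not study data).
run_1 = {("A", "B"), ("B", "C"), ("C", "A")}
run_2 = {("A", "B"), ("B", "C")}
run_3 = {("A", "B"), ("B", "C"), ("C", "A")}

print(round(mean_pairwise_jaccard([run_1, run_2, run_3]), 4))
```

A value near 1 indicates that replicates converge on the same structure; lower values indicate run-to-run instability that a single generation would hide.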
6. Conclusions
- Formal rigor and reproducibility: PA’s typed, auditable morphisms prevent data mismatches; memoization gives deterministic replay, so outputs are identical on re-execution even when stochastic components such as LLMs are involved; and the state-passing discipline yields complete provenance in the form of a transparent audit trail.
- Conciseness and epistemic clarity: PA functions as a high-density “notation as thought” that forces analysts to focus on the structure of their investigation, and it replaces verbose configuration files with a compact algebraic notation that is easier to grasp.
- Unification and flexibility: PA integrates multi-modal components, including LLM calls, symbolic math, and human-in-the-loop edits, under a single formal structure.
- Advanced capabilities for AI workflows: PA supports a “falsifiable prompting” protocol that treats LLM outputs as testable contracts to prevent hallucinations, a robust foundation for agentic AI to plan and verify its own pipelines, and meta-algorithmic control that dynamically optimizes workflows for cost, speed, or accuracy.
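The idea of deterministic replay via memoization can be illustrated with a short Python sketch. This is not the PA implementation; `memoized`, `fake_llm`, and the in-memory `cache` are hypothetical, and a random number stands in for the stochastic output of an LLM call.

```python
import hashlib
import json
import random


def memoized(stage, cache):
    """Wrap a pipeline stage so that repeated inputs replay the stored output."""
    def wrapped(payload):
        # Key on the stage name and its (JSON-serializable) input.
        key_material = json.dumps([stage.__name__, payload], sort_keys=True)
        key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
        if key not in cache:
            cache[key] = stage(payload)  # only the first call reaches the stage
        return cache[key]
    return wrapped


def fake_llm(prompt):
    """Stand-in for a stochastic LLM call (hypothetical; no real API is used)."""
    return f"{prompt} :: draft {random.random():.6f}"


cache = {}
draft_cld = memoized(fake_llm, cache)

first = draft_cld("wind energy abstract")
replay = draft_cld("wind energy abstract")
print(first == replay)  # True: the second call is served from the cache
```

The underlying stage is still stochastic; determinism comes from replaying the recorded output, so the cache would need to be persisted and shared for replay to hold across sessions or between researchers.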
7. Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Phase II CLD Results
Appendix A.1. AI-Generated Variable Glossary
THIS TABLE IS AI-GENERATED

# | Variable Name | One-Sentence Definition | Role |
---|---|---|---|
1 | Installed Wind Generation Capacity | Cumulative nameplate megawatts of grid-connected land-based wind turbines. | Stock |
2 | Annual Wind Installations | Additional wind-turbine megawatts that achieve commercial operation each year (flow into Installed Capacity). | Flow |
3 | Cumulative Operating Experience | Total turbine-years logged by the operating wind fleet, basis for learning-by-doing effects. | Stock |
4 | Wind Turbine Capital Cost | Overnight capital cost ($/kW) for a new wind project, including turbines, balance-of-station, and construction. | Intermediate |
5 | Levelized Cost of Wind Electricity | Present-value cost of delivering 1 MWh from a new wind project over its lifetime (LCOE). | Intermediate |
6 | Economic Attractiveness of Wind Projects | Composite metric (e.g., IRR gap vs. hurdle rate) indicating how appealing wind projects are to investors. | Intermediate |
7 | Federal Production Tax Credit Level | Statutory $/MWh incentive available to qualifying U.S. wind projects. | Exogenous policy lever |
8 | Available High-Quality Wind Sites | Fraction of land with adequate wind speeds, grid access, and low siting conflict that remains undeveloped. | Depletable resource |
9 | Grid Integration Margin | Remaining transmission capacity and ancillary-service headroom that can absorb additional wind generation without curtailment. | Capacity buffer |
10 | Supply-Chain Manufacturing Capacity | Annual gigawatts of wind-turbine equipment that the domestic/global supply chain can deliver. | Capacity buffer |
11 | Steel & Critical-Mineral Prices | Indexed cost of steel, copper, and rare-earth materials relevant to turbine production (exogenous driver). | Exogenous cost driver |
12 | Public Acceptance of Wind Farms | Indicator of social license—permitting success rates and community attitudes toward wind development. | Soft variable |
13 | Wind Resource Quality | Mean wind-power density at prospective sites (exogenous natural condition). | Exogenous natural |
14 | CO2 Emissions from Electricity | Annual million tonnes of CO2 emitted by the national power sector. | Stock |
15 | Climate-Change Concern Level | Salience of climate risks in public and policymaker discourse. | Soft driver |
16 | R&D Investment in Wind Technology | Annual dollars devoted to turbine, control-system, and O&M innovation for wind power. | Flow |
17 | Learning-by-Doing Rate | Empirical percentage cost decline observed for each doubling of cumulative operating experience. | Parameter |
18 | Financial Capital Availability | Volume of project-finance and tax-equity capital actively seeking wind deals. | Capacity buffer |
19 | Alternative-Technology Competitiveness | Relative attractiveness of solar PV, storage, and other low-carbon substitutes competing for investment. | Competitive pressure |
Appendix A.2. AI-Generated Link Info
THIS TABLE IS AI-GENERATED

From → To | +/− | Delay? | Provenance / note |
---|---|---|---|
Annual Wind Installations → Installed Wind Generation Capacity | + | – | definitional |
Installed Wind Generation Capacity → Cumulative Operating Experience | + | – | accounting |
Cumulative Operating Experience → Learning-by-Doing Rate | + | ✓ (years) | classic Wright curve |
Learning-by-Doing Rate → Wind Turbine Capital Cost | − | ✓ | empirically 10–20% per doubling |
Wind Turbine Capital Cost → Levelized Cost of Wind Electricity | + | – | LCOE formula |
Levelized Cost of Wind Electricity → Economic Attractiveness of Wind Projects | − | – | lower LCOE boosts IRR |
Federal Production Tax Credit Level → Economic Attractiveness of Wind Projects | + | – | policy incentive |
Economic Attractiveness of Wind Projects → Annual Wind Installations | + | ✓ (planning/construction lag) | standard investment pipeline |
Installed Wind Generation Capacity → Available High-Quality Wind Sites | − | – | site depletion |
Available High-Quality Wind Sites → Economic Attractiveness of Wind Projects | + | – | siting quality affects yield |
Installed Wind Generation Capacity → Grid Integration Margin | − | ✓ | increasing curtailment |
Grid Integration Margin → Economic Attractiveness of Wind Projects | + | – | curtailment lowers revenue |
Supply-Chain Manufacturing Capacity → Wind Turbine Capital Cost | − | ✓ | economies of scale |
Annual Wind Installations → Supply-Chain Manufacturing Capacity | + | ✓ | capacity expansion investment |
Steel & Critical-Mineral Prices → Wind Turbine Capital Cost | + | – | materials pass-through |
Public Acceptance of Wind Farms → Economic Attractiveness of Wind Projects | + | ✓ | permitting ease |
Climate-Change Concern Level → Federal Production Tax Credit Level | + | ✓ | policy feedback |
Installed Wind Generation Capacity → CO2 Emissions from Electricity | − | ✓ | displacement of fossil |
CO2 Emissions from Electricity → Climate-Change Concern Level | − | ✓ | success can erode urgency (risk) |
R&D Investment in Wind Technology → Wind Turbine Capital Cost | − | ✓ (tech lag) | innovation |
Economic Attractiveness of Wind Projects → Financial Capital Availability | + | ✓ | investors follow returns |
Financial Capital Availability → Annual Wind Installations | + | ✓ | funding constraint |
Alternative-Technology Competitiveness → Financial Capital Availability | − | ✓ | capital allocation |
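
As a hedged illustration of how a link table like the one above can be checked mechanically, the sketch below transcribes its 23 links and applies the “starburst” guideline from Section 2.6. The function name `starburst_nodes` is hypothetical, and reading “more than about four” as a strict limit of four is an assumption; ASCII `-` stands in for −.

```python
from collections import Counter

# (source, target, sign) transcribed from the AI-generated link table.
LINKS = [
    ("Annual Wind Installations", "Installed Wind Generation Capacity", "+"),
    ("Installed Wind Generation Capacity", "Cumulative Operating Experience", "+"),
    ("Cumulative Operating Experience", "Learning-by-Doing Rate", "+"),
    ("Learning-by-Doing Rate", "Wind Turbine Capital Cost", "-"),
    ("Wind Turbine Capital Cost", "Levelized Cost of Wind Electricity", "+"),
    ("Levelized Cost of Wind Electricity", "Economic Attractiveness of Wind Projects", "-"),
    ("Federal Production Tax Credit Level", "Economic Attractiveness of Wind Projects", "+"),
    ("Economic Attractiveness of Wind Projects", "Annual Wind Installations", "+"),
    ("Installed Wind Generation Capacity", "Available High-Quality Wind Sites", "-"),
    ("Available High-Quality Wind Sites", "Economic Attractiveness of Wind Projects", "+"),
    ("Installed Wind Generation Capacity", "Grid Integration Margin", "-"),
    ("Grid Integration Margin", "Economic Attractiveness of Wind Projects", "+"),
    ("Supply-Chain Manufacturing Capacity", "Wind Turbine Capital Cost", "-"),
    ("Annual Wind Installations", "Supply-Chain Manufacturing Capacity", "+"),
    ("Steel & Critical-Mineral Prices", "Wind Turbine Capital Cost", "+"),
    ("Public Acceptance of Wind Farms", "Economic Attractiveness of Wind Projects", "+"),
    ("Climate-Change Concern Level", "Federal Production Tax Credit Level", "+"),
    ("Installed Wind Generation Capacity", "CO2 Emissions from Electricity", "-"),
    ("CO2 Emissions from Electricity", "Climate-Change Concern Level", "-"),
    ("R&D Investment in Wind Technology", "Wind Turbine Capital Cost", "-"),
    ("Economic Attractiveness of Wind Projects", "Financial Capital Availability", "+"),
    ("Financial Capital Availability", "Annual Wind Installations", "+"),
    ("Alternative-Technology Competitiveness", "Financial Capital Availability", "-"),
]


def starburst_nodes(links, limit=4):
    """Return {node: (in_degree, out_degree)} for nodes exceeding `limit` either way."""
    indeg = Counter(target for _, target, _ in links)
    outdeg = Counter(source for source, _, _ in links)
    nodes = set(indeg) | set(outdeg)
    return {n: (indeg[n], outdeg[n]) for n in nodes
            if indeg[n] > limit or outdeg[n] > limit}


print(starburst_nodes(LINKS))
# {'Economic Attractiveness of Wind Projects': (5, 2)}
```

Under this reading of the guideline, only Economic Attractiveness of Wind Projects is flagged, with five incoming links. That is consistent with its glossary definition as a composite metric and suggests it is a candidate for decomposition.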
Appendix A.3. AI-Generated Standard Business Loops
THIS TABLE IS AI-GENERATED

Loop ID | Type | Core causal path | Short narrative |
---|---|---|---|
R1 Learning Curve | R | Annual Installations → Cumulative Operating Experience + → Technology Learning Rate − → Technology Capital Cost + → LCOE − → Economic Attractiveness + → Annual Installations | Experience lowers cost, cost boosts demand, and demand adds more experience. |
R2 Finance Magnet | R | Economic Attractiveness + → Financial Capital Availability + → Annual Installations → (lower costs & higher returns feed back to Attractiveness) | High returns draw bigger capital pools that fund still more projects. |
R3 Supply-Chain Scale-Up | R | Annual Installations + → Supply-Chain Capacity − → Technology Capital Cost + → LCOE − → Economic Attractiveness + → Annual Installations | Factory scale spreads overhead and lowers cost, fuelling demand that keeps factories full. |
R4 Policy Flywheel | R | Climate Concern + → Policy Incentive Level + → Economic Attractiveness + → Annual Installations − → Sector CO2 Emissions − → Climate Concern | Public worry drives subsidies; rapid build cuts emissions, sustaining concern long enough for deep penetration. |
R5 Price-Support | R | Price Support Programs + → Policy Incentive Level + → Economic Attractiveness + → Annual Installations → (success sustains Price Support) | Fiscal credits directly raise ROI and create political momentum to preserve the credits. |
R6 Manufacturing Incentive | R | Manufacturing Incentives + → Supply-Chain Capacity − → Technology Capital Cost + → Economic Attractiveness + → Annual Installations → (utilisation validates further incentives) | Production rebates seed domestic plants; lower costs boost volumes that harvest more rebates. |
R7 Infrastructure Expansion | R | Grid Funding + → Transmission Build-out Rate + → Grid Margin + → Economic Attractiveness + → Annual Installations → (stakeholders lobby for more Grid Funding) | Public infrastructure spend lifts capacity limits, enabling growth that justifies further spend. |
R8 Finance Catalyser | R | Loan Guarantees + → Financial Capital Availability + → Annual Installations → (portfolio success underwrites new guarantees) | Risk transfer lowers borrowing cost; successful builds validate and expand guarantee programmes. |
B1 Market Saturation | B | Annual Installations + → Installed Capacity − → Available High-Quality Sites + → Economic Attractiveness − → Annual Installations | Finite prime sites mean economics deteriorate as easy opportunities are used up. |
B2 Capacity Constraint | B | Annual Installations + → Installed Capacity − → Grid Margin + → Economic Attractiveness − → Annual Installations | Shared-asset congestion or curtailment risk slows further builds until capacity is expanded. |
B3 Complacency | B | Annual Installations − → Sector CO2 Emissions − → Climate Concern − → Policy Incentive Level − → Economic Attractiveness − → Annual Installations | Early success reduces perceived urgency, eroding policy support and cooling growth. |
B5 Commodity Squeeze | B | Annual Installations + → Raw Material Prices + → Technology Capital Cost + → LCOE − → Economic Attractiveness − → Annual Installations | Rapid scale-up tightens commodity supply, raising costs and damping demand unless countered by diversification programmes. |
References
1. Valerdi, R.; Rouse, W.B. When systems thinking is not a natural act. In Proceedings of the 2010 IEEE International Systems Conference, San Diego, CA, USA, 5–8 April 2010; pp. 184–189.
2. Forrester, J.W. Counterintuitive behavior of social systems. Theory Decis. 1971, 2, 109–140.
3. Kim, D.H. Introduction to Systems Thinking; Pegasus Communications: Waltham, MA, USA, 1999.
4. Meadows, D.H.; Wright, D. Thinking in Systems: A Primer; Wright, D., Ed.; Chelsea Green Pub.: White River Junction, VT, USA, 2008.
5. Monat, J.P.; Gannon, T.F. What is systems thinking? A review of selected literature plus recommendations. Am. J. Syst. Sci. 2015, 4, 11–26.
6. Yan, H.; Wang, L.; Goh, J.; Shen, W.; Richardson, J.; Yan, X. Towards Understanding the Causal Relationships in Proliferating SD Education—A System Dynamics Group Modelling Approach in China. Systems 2023, 11, 361.
7. Sterman, J. Business Dynamics: Systems Thinking and Modeling for a Complex World; Irwin/McGraw-Hill: Boston, MA, USA, 2000; p. xxvi. 982p.
8. Goodman, M.R. Elementary System Dynamics Structures. Ph.D. Thesis, MIT, Cambridge, MA, USA, 1972.
9. Sterman, J.D. Learning in and about complex systems. Syst. Dyn. Rev. 1994, 10, 291–330.
10. Taha, H.; Durham, J.; Smith, C.; Reid, S. Qualitative Causal Loop Diagram: One Health Model Conceptualizing Brucellosis in Jordan. Systems 2023, 11, 422.
11. Vaswani, A. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
12. OpenAI. GPT-4 Technical Report; OpenAI: San Francisco, CA, USA, 2023.
13. Yao, S.; Zhao, J.; Yu, D.; Du, N.; Shafran, I.; Narasimhan, K.; Cao, Y. ReAct: Synergizing Reasoning and Acting in Language Models. arXiv 2022, arXiv:2210.03629.
14. Liu, N.Y.G.; Keith, D.R. Leveraging Large Language Models for Automated Causal Loop Diagram Generation: Enhancing System Dynamics Modeling through Curated Prompting Techniques. arXiv 2025, arXiv:2503.21798.
15. Arndt, H. AI and education: An investigation into the use of ChatGPT for systems thinking. arXiv 2023, arXiv:2307.14206.
16. Gupta, A.; Zuckerman, E.; O’Connor, B.T. Harnessing Toulmin’s theory for zero-shot argument explication. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024.
17. Baylor, D.; Breck, E.; Cheng, H.T.; Fiedel, N.; Foo, C.Y.; Haque, Z.; Haykal, S.; Ispir, M.; Jain, V.; Koc, L. TFX: A TensorFlow-based production-scale machine learning platform. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 1387–1395.
18. Spivak, D.I. Category Theory for the Sciences; MIT Press: Cambridge, MA, USA, 2014.
19. Iverson, K.E. Notation as a tool of thought. Commun. ACM 1980, 23, 444–465.
20. Peng, R.D. Reproducible Research in Computational Science. Science 2011, 334, 1226–1227.
21. Hughes, J. Why Functional Programming Matters. Comput. J. 1989, 32, 98–107.
22. Moggi, E. Notions of computation and monads. Inf. Comput. 1991, 93, 55–92.
23. Whitehead, A.N. An Introduction to Mathematics; H. Holt and Company: New York, NY, USA, 1911; p. 256.
24. Simske, S.J. Meta-Algorithmics: Patterns for Robust, Low Cost, High Quality Systems; John Wiley & Sons: Hoboken, NJ, USA, 2013.
25. Lee, E.A.; Parks, T.M. Dataflow process networks. Proc. IEEE 1995, 83, 773–801.
26. Mac Lane, S. Categories for the Working Mathematician, 2nd ed.; Graduate Texts in Mathematics; Springer: New York, NY, USA, 1998; p. xii. 314p.
27. Michie, D. “Memo” Functions and Machine Learning. Nature 1968, 218, 19–22.
28. Feldman, S.I. Make—A program for maintaining computer programs. Softw. Pract. Exp. 1979, 9, 255–265.
29. Wolfram Research, Inc. Wolfram Mathematica; Version 14.2; Wolfram Research, Inc.: Champaign, IL, USA, 2023; Available online: https://www.wolfram.com/mathematica/ (accessed on 18 August 2025).
30. Goodman, M. Causal Loop Diagramming (D-1755-2); Report; MIT: Cambridge, MA, USA, 1973.
31. Richardson, G.P. Problems with causal-loop diagrams. Syst. Dyn. Rev. 1986, 2, 158–170.
32. Sweeney, L.B.; Sterman, J.D. Bathtub dynamics: Initial results of a systems thinking inventory. Syst. Dyn. Rev. 2000, 16, 249–286.
33. Kim, D.H.; Senge, P.M. Putting systems thinking into practice. Syst. Dyn. Rev. 1994, 10, 277–290.
34. Kim, D.H. Systems Thinking Tools: A User’s Reference Guide; Pegasus Communications: Waltham, MA, USA, 1995.
35. Brandes, U.; Eiglsperger, M.; Lerner, J.; Pich, C. Graph Markup Language (GraphML). In Handbook of Graph Drawing and Visualization; Chapman & Hall: London, UK, 2013.
36. Richardson, G.P.; Pugh, A.L. Introduction to System Dynamics Modeling with DYNAMO; MIT Press: Cambridge, MA, USA, 1981; Wright-Allen Series in System Dynamics; p. xi. 413p.
37. Forrester, J.W. Industrial Dynamics; MIT Press: Cambridge, MA, USA, 1961; 464p.
38. Lane, D.C. The emergence and use of diagramming in system dynamics: A critical account. Syst. Res. Behav. Sci. 2008, 25, 3–23.
39. Goodman, M.R. Study Notes in System Dynamics; Wright-Allen Press: Cambridge, MA, USA, 1974; p. xiv. 388p.
40. Vennix, J.A.M.; Akkermans, H.A.; Rouwette, E.A.J.A. Group model-building to facilitate organizational change: An exploratory study. Syst. Dyn. Rev. 1996, 12, 39–58.
41. Richardson, G.P. Problems in causal loop diagrams revisited. Syst. Dyn. Rev. 1997, 13, 247–252.
42. Lawrence, S. RE: Lana’s Paper [Personal communication]. E-mail, To: Reinholtz, K.; Colorado State University, Ft Collins, CO, USA, 1 April 2025.
43. Lawrence, S.; Herber, D.R.; Shahroudi, K.E. Leveraging System Dynamics to Predict the Commercialization Success of Emerging Energy Technologies: Lessons from Wind Energy. Energies 2025, 18, 2048.
44. Gwet, K.L. Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Raters; Advanced Analytics, LLC: Gaithersburg, MD, USA, 2014.
45. Heil, B.J.; Hoffman, M.M.; Markowetz, F.; Lee, S.I.; Greene, C.S.; Hicks, S.C. Reproducibility standards for machine learning in the life sciences. Nat. Methods 2021, 18, 1132–1135.
46. Kohavi, R. Glossary of terms (incl Confusion Matrix - Kirk). Mach. Learn. 1998, 30, 271–274.
Code | Category (Focus) | Typical Symptom |
---|---|---|
A | Ambiguous/overly broad variable | Name fails to convey a single, well-bounded concept or its sign. |
B | Duplicate/syntactically odd label | Internal redundancy or repeated wording in the variable name. |
C | Mis-specified causal linkage or polarity | Link direction or sign is inconsistent with domain logic. |
D | Redundant/unnecessary element | Adds no new information; duplicates an existing construct. |
E | Process-level limitation | The issue lies in the human-AI workflow, not in the diagram content. |

Variable | Code(s) | Rationale |
---|---|---|
Global technology learning | A | Label too vague; concept unclear. |
Learning rate | A | Same vagueness; unspecified which rates apply. |
Macroeconomic and finance conditions | A, C | Broad label; polarity unknown (“?”). |
Projected energy demand | C | Incorrect causal path to market price. |
Profit or expected profit | B | Redundant wording inside the label. |
ROI threshold | D | Adds a second dependency already implied by expected profit. |
Federal and state policies | A | Over-general; can push willingness to invest both ways. |
Permitting rules framework | C, D | Mis-states drivers of permit time and failure; adds extra link. |
Prompting/validation effort | E | Workflow limitation, rather than a diagram variable. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Reinholtz, K.; Shahroudi, K.E.; Lawrence, S. LLM-Powered, Expert-Refined Causal Loop Diagramming via Pipeline Algebra. Systems 2025, 13, 784. https://doi.org/10.3390/systems13090784