Management Science
Abstract
Against the backdrop of the mutually reinforcing "double helix" of AI4Science and Science4AI, using AI Agents to integrate massive, heterogeneous scientific knowledge and to discover and simulate how novel human ideas are generated is of significant importance. Doing so moves scientific discovery from reliance on scientists' experience and intuition toward approaches based on big data and AI algorithms, frees scientists from laborious information screening so that they can focus on high-level creative thinking, and helps break down disciplinary barriers to catalyze interdisciplinary discoveries.
This paper proposes a framework integrating graph reasoning and multi-agent collaboration (Graph-reasoning GAMP). The framework first employs prompt engineering to extract triples from paper abstracts and utilizes Neo4j for storage, forming a large-scale scientific knowledge graph as a structured knowledge base. Subsequently, multiple functionally distinct AI Agents (such as Domain Expert Agent, Path Exploration Agent, Innovativeness Assessment Agent, etc.) are designed to form a collaborative system. Each Agent is powered by large language models, endowing them with capabilities in semantic understanding, entity extraction, path finding, and more.
For instance, in the Domain Expert Agent, through knowledge bases and prompt engineering, the Agent focuses on plausibility judgments at the levels of genes, proteins, and signaling pathways; in the Path Exploration Agent, various path search methods such as breadth-first algorithms, genetic algorithms, and large model-guided search are employed, enabling the Agent to concentrate on discovering the most novel paths. These Agents collaboratively explore and reason on the graph, generating and evaluating the generation paths of new ideas by simulating the "brainstorming" and "hypothesis-validation" cycles of scientific teams.
Taking the work awarded the 2021 Nobel Prize in Physiology or Medicine as a case study, we collected literature on temperature and touch receptors published between January 1, 1995, and December 31, 2005, from the Web of Science Core Collection, Scopus, and PubMed, and constructed a three-layer "problem-solution-effect" knowledge network for empirical research.
The innovations of this article are: first, connecting symbolism and connectionism, fully leveraging the capabilities of graph-structured reasoning and the powerful semantic understanding of large models; second, designing a structured multi-agent collaboration protocol with clear division of labor, simulating real research teams. Limitations include: the need for further depth and refinement in formally representing "paths of novel ideas," deep semantic understanding, and assessing the breakthrough potential of new ideas.
Full Text
A Novel Framework for Idea Generation Path Recognition Integrating Graph Reasoning and Multi-Agent Collaboration
Liang Guoqiang, Lin Gege, Zhang Zhihao, Zhang Shuo
(School of Economics and Management, Beijing University of Technology, Beijing, 100124)
*Funded by: Beijing Natural Science Foundation General Project "Research on the Emergence Mechanism and Identification Methods of Research Fronts from a Co-evolution Perspective" (Grant No. 9232002); National Natural Science Foundation of China Youth Program "Large Model-Enabled Personalized Transaction Recommendation Method and Application Research for High-Value Patents" (Grant No. 72404020) and "Identification of Disruptive Low-Carbon Technologies and Dynamic Selection of Innovation Pathways" (Grant No. 72304023).
This version posted 2025-10-13.
Keywords: Scientific Discovery; Graph Reasoning; Large Language Models; AI Agent
China Library Classification:
1 Introduction
Scientific breakthroughs and the generation of new ideas have long relied on the intuition and experience of individual scientists, as well as collaboration among research teams.
This process is often characterized by high costs, long cycles, and serendipity. As scientific knowledge enters a phase of explosive growth and disciplinary barriers become increasingly rigid, traditional literature review and brainstorming models struggle to comprehensively capture cross-domain knowledge connections, potentially missing significant innovation opportunities. This challenge not only concerns science and technology itself but also imposes higher demands on the efficiency of research management and the optimization of resource allocation.
In recent years, the development of artificial intelligence, particularly large language models (LLMs) and knowledge graphs (KGs), has opened new avenues for addressing this challenge. AI4Science aims to leverage artificial intelligence technologies to address core problems in scientific discovery, while Science4AI focuses on how scientific practice can reciprocally inform artificial intelligence theory; together they form a double-helix relationship of synergistic development. Against this background, this paper explores a middle path: constructing a computational framework capable of simulating the cognitive collaboration processes of interdisciplinary research teams, with the goal of computationally identifying and evaluating pathways for generating novel scientific ideas.
This paper proposes a novel framework for identifying idea generation paths, named GAMP, which integrates graph reasoning with multi-agent collaboration. The core concept of the GAMP framework is to deeply fuse symbolism (represented by structured knowledge such as knowledge graphs) and connectionism (represented by semantic understanding from LLMs). Through a multi-agent system with distinct roles and orderly collaboration, it enables directed and heuristic exploration across vast scientific knowledge graphs, thereby automatically generating, evaluating, and filtering scientific hypothesis paths with potential breakthrough significance.
2 Literature Review
The GAMP framework proposed in this paper stands at the intersection of several rapidly evolving research domains. To position the contributions of this study clearly, this section reviews the state of the art in scientific knowledge graph construction and application, graph reasoning algorithms, large language model applications in scientific domains, and multi-agent systems. It then analyzes the limitations of existing work, thereby motivating the necessity and novelty of the GAMP framework.
2.1 Construction and Application of Scientific Knowledge Graphs (SKG)
Scientific knowledge graphs, as carriers of structured scientific knowledge, serve as the core infrastructure supporting the computability of scientific discovery. Traditional SKGs are primarily constructed by extracting entities (such as concepts, methods, and materials) and relations (such as "used for," "inhibits," and "causes") from large-scale scientific literature (e.g., papers and patents). In recent years, construction methods have evolved from predefined template matching to leveraging large language models for deep semantic understanding and extraction. For instance, Shi et al. utilized event knowledge graph technology and LLMs to construct a large-scale scientific experimental knowledge graph in the field of organic solar cells, containing tens of thousands of nodes and relations, effectively supporting experimental plan recommendation and evolutionary analysis. At the application level, SKGs have become essential tools for scientific and technological intelligence analysis, key technology identification, and disciplinary knowledge evolution analysis. Cao et al., by constructing a "science-technology" knowledge topic complex network, improved the PageRank algorithm to achieve fine-grained identification of key core technologies in the field of CNC machine tools.
2.2 Advances in Graph Reasoning Algorithms for Path Discovery
Graph reasoning algorithms aim to mine potentially meaningful paths from knowledge graphs and serve as the core technology for scientific breakthrough path identification. Early methods primarily relied on graph traversal algorithms based on random walks or meta-paths, which are efficient but heavily dependent on predefined path patterns, resulting in poor flexibility. Subsequently, knowledge graph embedding (KGE) methods mapped entities and relations into low-dimensional vector spaces and performed link prediction through vector operations; however, these methods suffer from poor interpretability and struggle to generate clear paths. In recent years, path-based interpretable reasoning has become a research hotspot. For example, the KGExplainer framework provides verifiable explanations for knowledge graph completion predictions by exploring multiple collaborative reasoning paths, demonstrating advantages in fields such as biomedicine. Graph reinforcement learning (GRL) combines graph neural networks with reinforcement learning, enabling agents to learn exploration strategies on graph structures to discover optimal paths, offering a new paradigm for handling scientific knowledge associations in non-Euclidean spaces.
2.3 Application and Adaptation of Large Language Models (LLMs) in Scientific Research
Large Language Models (LLMs), with their powerful natural language understanding and generation capabilities, have brought revolutionary tools to scientific knowledge processing. Domain-specific models (such as HuatuoGPT and BenTsao) have demonstrated reliability in multi-turn medical dialogues and diagnostic assistance through instruction tuning. In terms of reasoning paradigms, Chain-of-Thought (CoT) and Retrieval-Augmented Generation (RAG) are widely used to enhance the logicality and factual accuracy of LLMs. Particularly noteworthy is the emerging frontier of interactive iterative reasoning paradigms between LLMs and knowledge graphs. For instance, the Debate on Graph (DoG) framework reduces long-path interference by introducing a multi-role LLM team (such as problem simplification experts and reviewers) to conduct iterative debate and reasoning on the KG. The FiDeLiS framework, on the other hand, combines reasoning path retrieval-augmented generation (Path-RAG) and deductive verification beam search (DVBS), aiming to simultaneously improve the factuality and efficiency of question answering.
2.4 Collaboration Paradigms and Efficiency Optimization in Multi-Agent Systems
Multi-Agent systems provide a distributed approach to solving complex problems through division of labor and collaboration among multiple intelligent agents, and have gained renewed vitality in recent years through integration with LLMs. Early multi-agent systems primarily focused on designing communication protocols and collaboration mechanisms. Currently, LLM-driven agents have become a research hotspot. The Multi-Agent Debate (MAD) paradigm enables multiple LLM agents to collaborate in reasoning through a "round-table debate" approach, effectively enhancing decision quality, but suffers from high computational overhead and latency. To improve efficiency, new collaboration paradigms have been proposed. The MARS (Multi-Agent Review System) framework draws inspiration from academic review processes by designing role divisions of "author-reviewer-meta-reviewer," reducing token consumption and inference time by approximately 50% while maintaining reasoning quality through minimized frequent inter-agent communication. Further advancing this direction, the Federation of Agents (FoA) framework proposes a semantic-aware communication architecture that achieves dynamic agent capability matching and task decomposition through Versioned Capability Vectors (VCVs), laying the foundation for collaborative work in large-scale heterogeneous agent federations.
In summary, while significant progress has been made in each of these fields, each approach has its limitations. SKGs provide a structured knowledge base but lack semantic understanding and flexible reasoning capabilities. Graph reasoning algorithms excel at discovering structural patterns in knowledge graphs but offer insufficient depth of semantic comprehension. LLMs possess powerful semantic understanding and generation capabilities, yet their factual accuracy is difficult to guarantee, and they cannot perform structured exploration on their own. Multi-agent systems offer collaborative paradigms for complex problem-solving, but their general-purpose architectures are difficult to apply directly to scientific discovery scenarios that require high rigor. Research on frameworks that deeply integrate graph reasoning, LLMs, and multi-agent collaboration to simulate real research teams for identifying scientific breakthrough paths remains in its early stages.
3 GAMP Method Framework Design
3.1 Overall Framework
The GAMP framework aims to automatically identify promising scientific breakthrough pathways by simulating a virtual, highly specialized interdisciplinary research team that collaboratively explores and reasons on structured scientific knowledge graphs.
The overall architecture consists of three core layers: the data layer, the knowledge layer, and the agent collaboration layer. Each layer interacts through clearly defined interfaces, collectively completing the entire process from raw data to innovative pathway output, as shown in Figure 1.
Figure 1 Overall Framework of GAMP
Data Layer: Serving as the foundation, it is responsible for integrating multi-source heterogeneous scientific data, including academic literature databases (such as Web of Science and PubMed), patent databases, and specialized domain databases. This layer performs data collection, cleaning, and preprocessing, providing raw materials for knowledge construction.
Knowledge Layer: Serving as the cornerstone of the framework, its core is the scientific knowledge graph. This paper adopts a three-layer "Problem-Solution-Effect" semantic model for the structured representation of scientific knowledge. This layer transforms unstructured text into machine-understandable semantic networks that support machine reasoning, providing a single, trustworthy source of facts for the reasoning of the upper-layer Agents.
Agent Collaboration Layer: This serves as the "brain" and engine of the framework. It consists of a multi-agent system where each agent is powered by a large language model and assigned specific roles and tasks. The agents communicate and collaborate asynchronously through a shared workspace, simulating the cyclical process of "hypothesis proposal - peer review - revision and refinement" found in real-world scientific research teams.
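The "hypothesis proposal - peer review - revision and refinement" cycle over a shared workspace can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Workspace` and `Message` classes, the role labels attached to each step, and the stub functions standing in for LLM-backed agents are hypothetical and are not taken from the GAMP implementation. The steps also run sequentially for readability, so the asynchronous scheduling described above is not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    sender: str   # role name of the posting agent (hypothetical labels)
    kind: str     # "hypothesis", "review", or "revision"
    content: str


@dataclass
class Workspace:
    """Shared blackboard that every agent can read from and append to."""
    messages: List[Message] = field(default_factory=list)

    def post(self, sender: str, kind: str, content: str) -> None:
        self.messages.append(Message(sender, kind, content))

    def latest(self, kind: str) -> Message:
        # Most recent message of the requested kind.
        return next(m for m in reversed(self.messages) if m.kind == kind)


# Stand-ins for LLM-backed agents: each reads the workspace and returns text.
def propose(ws: Workspace) -> str:
    return "capsaicin -> activates -> TRPV1"


def review(ws: Workspace) -> str:
    return f"needs evidence for: {ws.latest('hypothesis').content}"


def revise(ws: Workspace) -> str:
    return f"revised after review ({ws.latest('review').content})"


def run_cycle(ws: Workspace) -> Workspace:
    """One pass of propose -> review -> revise, executed in order."""
    ws.post("PathExploration", "hypothesis", propose(ws))
    ws.post("DomainExpert", "review", review(ws))
    ws.post("ChiefScientist", "revision", revise(ws))
    return ws


workspace = run_cycle(Workspace())
for message in workspace.messages:
    print(f"[{message.sender}] {message.kind}: {message.content}")
```

Keeping all messages in one append-only list means any agent can inspect the full history of a hypothesis before responding, which is one simple way to realize a shared workspace.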
3.2 Construction of Three-Layer Scientific Knowledge Graph: "Problem-Solution-Effect"
To precisely characterize the intrinsic logic of scientific discoveries, we constructed a three-layer scientific knowledge graph of "Problem-Solution-Effect". This model goes beyond simple entity-relationship extraction and aims to capture the complete thought chain of "posing problems - designing solutions - verifying effects" in scientific research.
Problem Layer: Nodes represent the core scientific problems or challenges that research attempts to solve (e.g., "How to identify the molecular receptors for noxious heat stimuli?").
Solution Layer: Nodes represent specific methods, techniques, compounds, tools, or theories used to address problems (e.g., "capsaicin", "gene knockout technology", "calcium imaging technology").
Effect Layer: Nodes represent the outcomes, discoveries, biological functions, or performance metrics generated after implementing solutions (e.g., "activated TRPV1 ion channels", "caused intracellular calcium concentration elevation", "produced thermal nociceptive behavioral responses").
When a large model is used for entity extraction, the prompt is: "You are a scientific knowledge engineer. From the following paper abstract, accurately identify the 【Research Problem】, the 【Core Method or Substance Used】, and the 【Most Critical Research Finding or Effect】. Please ensure the extracted content comes directly from the text, avoiding assumptions. The output format is JSON: {"problem": "", "solution": "", "effect": ""}." The extracted entities are then normalized (eliminating naming ambiguities, such as unifying "VR1" with "TRPV1") and linked, and the cleaned triples are stored in Neo4j. Intra-layer and inter-layer connections are established through rich relationship types.
For instance, the relationship between "problem" and "solution" is "through... research", while the relationship between "solution" and "effect" includes "leads to", "inhibits", "enhances", etc. Figure 2 shows a schematic diagram of the network after extraction by the large model, where L1, L2, and L3 correspond to the problem layer, solution layer, and effect layer, respectively.
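The extraction, normalization, and storage steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the alias table contains only the VR1/TRPV1 example from the text, and the node labels (`Problem`, `Solution`, `Effect`), the relationship names (`STUDIED_THROUGH`, `LEADS_TO`, `INHIBITS`, `ENHANCES`), and all function names are hypothetical choices, not the schema actually used in GAMP. The sketch only builds parameterized Cypher `MERGE` statements; executing them against Neo4j would require a driver session, which is not shown.

```python
import json
import re
from typing import Dict, List, Tuple

# Hypothetical alias table; the paper gives only the VR1 -> TRPV1 example.
ALIASES: Dict[str, str] = {"VR1": "TRPV1"}

# Hypothetical relationship names for Solution -> Effect edges.
ALLOWED_RELATIONS = {"LEADS_TO", "INHIBITS", "ENHANCES"}


def normalize(text: str) -> str:
    """Collapse whitespace and replace known aliases with canonical names."""
    cleaned = " ".join(text.split())
    for alias, canonical in ALIASES.items():
        cleaned = re.sub(rf"\b{re.escape(alias)}\b", canonical, cleaned)
    return cleaned


def parse_extraction(raw: str) -> Dict[str, str]:
    """Parse the model's JSON output and normalize the three fields."""
    record = json.loads(raw)
    missing = {"problem", "solution", "effect"} - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return {key: normalize(record[key]) for key in ("problem", "solution", "effect")}


def to_cypher(record: Dict[str, str], relation: str) -> List[Tuple[str, Dict[str, str]]]:
    """Build (query, parameters) pairs; MERGE avoids duplicate nodes and edges."""
    if relation not in ALLOWED_RELATIONS:
        # Relationship types cannot be passed as Cypher parameters,
        # so the name is validated before being placed in the query text.
        raise ValueError(f"unknown relation: {relation}")
    problem_solution = (
        "MERGE (p:Problem {name: $problem}) "
        "MERGE (s:Solution {name: $solution}) "
        "MERGE (p)-[:STUDIED_THROUGH]->(s)"
    )
    solution_effect = (
        "MERGE (s:Solution {name: $solution}) "
        "MERGE (e:Effect {name: $effect}) "
        f"MERGE (s)-[:{relation}]->(e)"
    )
    return [
        (problem_solution, {"problem": record["problem"], "solution": record["solution"]}),
        (solution_effect, {"solution": record["solution"], "effect": record["effect"]}),
    ]


raw_output = '{"problem": "heat pain receptor", "solution": "capsaicin", "effect": "activates  VR1 "}'
for query, params in to_cypher(parse_extraction(raw_output), "LEADS_TO"):
    print(query, params)
```

Validating the keys before storage guards against malformed model output, and normalizing before `MERGE` ensures that aliases of the same entity end up as a single node.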
Figure 2 Entity Relationship Diagrams at Different Periods
3.3 Multi-Agent System Detailed Design
The core of the GAMP framework lies in its multi-agent collaboration system. We have established clear roles, responsibilities, and decision-making mechanisms that collectively simulate an efficient virtual research team.
Agent Roles and Functional Definitions:
Chief Scientist Agent: Acts as the team leader and coordinator. It receives user queries, decomposes complex questions into subtasks, and assigns them to the other agents. After synthesizing the opinions of all agents, it makes the final decisions and ranks the paths.
Domain Expert Agents (Multiple): Each Agent represents a specific discipline (e.g., Molecular Biologist Agent, Physiologist Agent, Chemist Agent). Their core responsibility is to evaluate the scientific rationality and logical coherence of each step in the pathway from their respective disciplinary perspectives. They are role-anchored through specific instructions; for instance, the Molecular Biologist Agent's instructions emphasize its deep understanding of genes, proteins, and signaling pathways.
Path Exploration Agent: Responsible for conducting active exploration on the SKG. It combines traditional graph algorithms (such as breadth-first search to discover direct associations) with semantic guidance from LLMs (for example, LLM predicting "which ion channels might functionally complement TRPV1?"), enabling it to escape local optima and discover non-obvious associations.
Innovativeness Assessment Agent: Focused on evaluating the breakthrough potential of paths. It scores the novelty and potential impact of paths based on predefined quantitative metrics (such as path topological novelty, semantic rarity) and LLM's deep semantic understanding.
Fact-Checking Agent: Serves as the "gatekeeper" of system reliability. Its task is to ensure that all generated inferences and hypotheses can find evidential support in the SKG, strictly suppressing potential "hallucinations" that may arise from LLMs, thereby enhancing the overall credibility of the system.
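Two of the roles above lend themselves to a small illustration: breadth-first path enumeration by the Path Exploration Agent, and the Fact-Checking Agent's requirement that every step of a path be backed by the SKG. The sketch below assumes a toy in-memory edge list in place of Neo4j; the edge contents, relation names, and function names are hypothetical, and the LLM-guided part of the search is not modeled.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (head, relation, tail)

# Toy stand-in for the knowledge graph; contents are illustrative only.
EDGES: List[Edge] = [
    ("thermal pain mechanism", "studied_through", "capsaicin"),
    ("capsaicin", "leads_to", "TRPV1 activation"),
    ("cold sensation mechanism", "studied_through", "menthol"),
    ("menthol", "leads_to", "TRPM8 activation"),
]


def build_adjacency(edges: List[Edge]) -> Dict[str, List[str]]:
    """Map each head node to the list of nodes it points to."""
    adjacency: Dict[str, List[str]] = {}
    for head, _relation, tail in edges:
        adjacency.setdefault(head, []).append(tail)
    return adjacency


def bfs_paths(adjacency: Dict[str, List[str]], start: str, max_hops: int = 2) -> List[List[str]]:
    """Return every simple path from start with 1..max_hops edges, in BFS order."""
    paths: List[List[str]] = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:
            continue  # hop budget exhausted for this path
        for neighbor in adjacency.get(path[-1], []):
            if neighbor in path:
                continue  # skip cycles
            extended = path + [neighbor]
            paths.append(extended)
            queue.append(extended)
    return paths


def is_supported(path: List[str], edges: List[Edge]) -> bool:
    """Fact check: every consecutive node pair must be an edge in the graph."""
    known: Set[Tuple[str, str]] = {(head, tail) for head, _relation, tail in edges}
    return all(pair in known for pair in zip(path, path[1:]))


adjacency = build_adjacency(EDGES)
for candidate in bfs_paths(adjacency, "thermal pain mechanism"):
    print(" -> ".join(candidate), "| supported:", is_supported(candidate, EDGES))
```

With `max_hops=2`, a search started at a problem node covers exactly one Problem → Solution → Effect chain, matching the three-layer structure of the graph. A path proposed by an LLM rather than by traversal can be passed through the same `is_supported` check before it is accepted.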
4 Core Algorithms and Implementation
The decision-making core of each Agent is a carefully designed prompt template. This template fixes the agent's role, tasks, knowledge background, and behavioral constraints, ensuring consistency and professionalism in its actions. Taking the Molecular Biologist Agent as an example, its decision-making prompt template is shown in Figure 3. The core algorithm of the Path Exploration Agent draws primarily on existing approaches such as breadth-first search and ant colony optimization, which are not elaborated further here.
Figure 3 Domain Expert Agent Prompt Template
Novelty is assessed primarily with the formula Novelty(P) = 1 / (1 + log(freq(P))), where freq(P) is the frequency with which path P or its subpaths occur in the historical literature; the more frequently a path occurs, the lower its novelty.
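A short numerical sketch makes the behavior of this score concrete. Two assumptions are made here that the text does not state: the logarithm is taken to be the natural logarithm, and freq(P) is treated as a count of at least 1, since the logarithm is undefined at 0 and the score would exceed 1 for values below 1. The function name `novelty` is illustrative.

```python
import math


def novelty(freq: float) -> float:
    """Novelty(P) = 1 / (1 + log(freq(P))), assuming natural log and freq >= 1."""
    if freq < 1:
        raise ValueError("freq must be at least 1 under this sketch's assumptions")
    return 1.0 / (1.0 + math.log(freq))


# The score is 1 for a path seen once and decreases as the path becomes common.
for count in (1, 2, 10, 100):
    print(count, round(novelty(count), 3))
```

Because the logarithm grows slowly, the score separates rare paths sharply but distinguishes only weakly between paths that are already common.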
5 Empirical Research and Discussion
In 2021, research on temperature and touch receptors was awarded the Nobel Prize in Physiology or Medicine for revealing the neural signaling pathways and mechanisms underlying human temperature sensation, pain, and touch. Receptors within human cells can sensitively detect heat or cold stimuli in the environment. This temperature-sensing mechanism, along with the responses triggered by mechanical stimuli in touch, is closely related to the formation of pain and provides new targets for pain treatment strategies. The breakthroughs in this field offer an excellent opportunity to validate the framework described above: the field's mature developmental trajectory provides a rich empirical basis for checking whether the framework is sound.
Considering the authority and completeness of the data, this paper selected the Web of Science Core Collection, Scopus, and PubMed as data sources. The English expressions commonly used in SCI papers for the subject terms "temperature" and "tactile receptors" served as the basic search terms; search formulas were also built from the broader and narrower terms in the MeSH thesaurus, and records of questionable relevance were retained to avoid omissions. The search covered January 1, 1995, to December 31, 2005, and was conducted on March 28, 2024, yielding 3,234 articles. After format conversion, deduplication, and other operations, only the publication year and abstract were retained, resulting in 3,107 valid abstract entries.
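The deduplication and field-reduction step can be sketched as below. The paper does not state which key was used to detect duplicates across the three databases, so matching on a normalized abstract is an assumption of this sketch, as are the record field names and the function names.

```python
import re
from typing import Dict, Iterable, List, Set


def fingerprint(abstract: str) -> str:
    """Lowercase and strip punctuation so trivial formatting differences match."""
    return re.sub(r"[^a-z0-9]+", " ", abstract.lower()).strip()


def deduplicate(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record per abstract fingerprint; retain year and abstract only."""
    seen: Set[str] = set()
    kept: List[Dict[str, str]] = []
    for record in records:
        abstract = (record.get("abstract") or "").strip()
        if not abstract:
            continue  # entries without an abstract cannot be used downstream
        key = fingerprint(abstract)
        if key in seen:
            continue
        seen.add(key)
        kept.append({"year": record["year"], "abstract": abstract})
    return kept


sample = [
    {"source": "WoS", "year": "1997", "abstract": "Capsaicin activates a heat-gated channel."},
    {"source": "PubMed", "year": "1997", "abstract": "capsaicin activates a heat gated channel"},
    {"source": "Scopus", "year": "2002", "abstract": "Menthol activates a cold receptor."},
    {"source": "Scopus", "year": "2003", "abstract": ""},
]
print(deduplicate(sample))
```

Matching on a normalized abstract catches the same article exported with different punctuation or casing by different databases, though it would miss duplicates whose abstracts were edited between sources.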
By inputting the abstract data into the GAMP framework, we identified multiple trajectories of new idea generation in this field from 1995 to 2003; the top five are shown in Table 1.
Table 1 Top 5 novel idea generation paths based on the GAMP framework
Ranking | Generation Path | Key Node Description | Novelty | Historically Real? | Practical New Thinking |
---|---|---|---|---|---|
1 | 【Problem】Thermal pain mechanism → 【Solution】Capsaicin → 【Effect】Activation | TRPV1 identified as a thermal nociceptor | 0.92 | Yes (hit) | (important) |
2 | 【Problem】Cold sensation mechanism → 【Solution】Menthol → 【Effect】Activation | TRPM8 identified as a cold receptor | 0.88 | Now) | No (with forward |
3 | 【Problem】Thermal hyperalgesia → 【Solution】Inflammatory factors → 【Effect】Enhancement | Explains how inflammation enhances TRPV1 function | 0.85 | (Prospective) | No (with preceding |
4 | 【Problem】Noxious stimulus gating → 【Solution】Capsaicin analogs → 【Effect】Discovery of TRPV1 isoforms | Predicts TRPV1 functional diversity | 0.81 | (Prospective) | |
5 | 【Problem】Thermal sensation → 【Solution】Capsaicin resistance research → 【Effect】Discovery of potential novel thermoreceptors | Suggests the existence of other heat receptors | 0.78 | | |
The experimental results demonstrate that the GAMP framework can not only effectively trace back the paths of major historical scientific breakthroughs with outstanding recognition accuracy (high hit rate and high ranking), but more importantly, it can generate highly insightful and forward-looking research hypotheses based on historical knowledge states. However, this project has not yet conducted ablation experiments, and there remain numerous detailed issues requiring further investigation. For instance, the framework's performance is highly dependent on the quality of the underlying SKG, and biases or gaps in historical data can directly affect the results. The framework is more adept at generating combinatorial innovations within existing knowledge systems, while its capability to identify completely paradigm-shifting new ideas requires further validation. Additionally, the judgment metrics for novel ideas adopted in this paper rely solely on a single formula measurement, which is relatively crude. Therefore, there remains substantial room for improvement in these detailed aspects.
In summary, this article aims to offer peers an inspirational framework for generating novel ideas that could significantly accelerate scientific discovery. Due to time constraints, only partial results have been organized and reported here for discussion.