Google researchers, working in collaboration with Peking University, have introduced a new artificial intelligence framework called PaperBanana that aims to automate the creation of publication-ready academic diagrams and statistical plots. The system is designed to reduce the time and manual effort required to produce high-quality visuals that are central to research papers across scientific and technical disciplines.
Academic research relies heavily on visual elements to explain experimental setups, workflows and data analysis. While text generation tools and coding assistants have advanced rapidly in recent years, researchers often still design figures manually using specialised software or scripting tools. This process is time-consuming and demands both technical and visual design skills. PaperBanana seeks to address this gap by converting textual descriptions directly into diagrams and charts that meet publication standards.
The framework uses an agent-based architecture in which multiple specialised AI agents collaborate to complete the task. Instead of relying on a single model to interpret text and generate visuals, PaperBanana divides the process into structured steps. Each agent is responsible for a specific function, enabling more precise control over how visuals are created and refined.
The process begins with a retrieval agent that identifies relevant reference figures from existing academic material. These references help establish conventions around layout, symbols and formatting. A planning agent then translates the user’s text input into a structured visual plan, outlining the components, relationships and hierarchy needed for the figure. This plan serves as a blueprint for the final output.
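To make the idea of a structured visual plan more concrete, the sketch below shows one way such a blueprint could be represented in Python. The class and field names (`VisualPlan`, `Component`, `Relation`, `validate`) are illustrative assumptions and do not describe PaperBanana's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One element of the figure, e.g. a box in a pipeline diagram."""
    name: str
    level: int = 0  # position in the visual hierarchy (0 = top level)


@dataclass
class Relation:
    """A directed link between two components, e.g. an arrow."""
    source: str
    target: str
    label: str = ""


@dataclass
class VisualPlan:
    """Hypothetical blueprint a planning agent might pass to later stages."""
    title: str
    components: list[Component] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return problems such as relations that name unknown components."""
        known = {c.name for c in self.components}
        problems = []
        for relation in self.relations:
            for end in (relation.source, relation.target):
                if end not in known:
                    problems.append(f"unknown component: {end}")
        return problems


# Example plan with one deliberate mistake: "Stylist" is never declared.
plan = VisualPlan(
    title="Agent pipeline",
    components=[Component("Retriever"), Component("Planner"), Component("Critic")],
    relations=[
        Relation("Retriever", "Planner", "references"),
        Relation("Planner", "Stylist", "plan"),
    ],
)
print(plan.validate())
```

Keeping components, relationships and hierarchy as explicit fields means a later stage can check the plan for gaps before any image is drawn.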
A styling agent applies academic design principles so that the visual follows norms commonly seen in peer-reviewed publications, including balanced layout, clear labels and appropriate use of shapes and colours. A visualiser agent then creates the diagram or statistical plot using a combination of vision-language models and executable code. For data-driven charts, the system can generate plotting scripts that allow users to reproduce and modify results using standard programming tools.
The final stage involves a critic agent that evaluates the output against the original text description and reference examples. This agent checks for logical accuracy, completeness and visual consistency. If issues are detected, feedback is sent back through the pipeline for iterative improvement. This feedback loop helps ensure that the resulting visuals closely match the intended methodology or data representation.
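The five stages and the critic's feedback loop can be sketched as a simple control flow. This is a minimal illustration under stated assumptions, not the framework's implementation: every function below is a hypothetical stub standing in for a model call, and routing the critic's feedback into the plan as notes is an assumption, since the description above only says feedback is sent back through the pipeline.

```python
# All stage functions are hypothetical stubs; a real system would call models.

def retrieve(description: str) -> list[str]:
    """Stand-in for the retrieval agent: return reference figure identifiers."""
    return ["reference-figure-1", "reference-figure-2"]


def make_plan(description: str, references: list[str]) -> dict:
    """Stand-in for the planning agent: build a structured plan."""
    return {"description": description, "references": references, "notes": []}


def style(plan: dict) -> dict:
    """Stand-in for the styling agent: attach styling decisions to the plan."""
    return {**plan, "style": "academic"}


def visualise(styled_plan: dict) -> str:
    """Stand-in for the visualiser agent: render the plan (here, as text)."""
    notes = "; ".join(styled_plan["notes"])
    return f"figure({styled_plan['description']}; notes={notes})"


def critique(figure: str, description: str) -> list[str]:
    """Stand-in for the critic agent: toy check that axis labels are covered."""
    return [] if "axis labels" in figure else ["add axis labels"]


def run_pipeline(description: str, max_rounds: int = 3) -> tuple[str, int]:
    """Run the stages, looping until the critic is satisfied or rounds run out."""
    references = retrieve(description)
    plan = make_plan(description, references)
    figure = ""
    for round_number in range(1, max_rounds + 1):
        figure = visualise(style(plan))
        issues = critique(figure, description)
        if not issues:
            return figure, round_number
        # Assumption: feedback is folded back into the plan for the next round.
        plan["notes"].extend(issues)
    return figure, max_rounds


figure, rounds = run_pipeline("accuracy vs epochs")
print(rounds, figure)
```

Capping the loop with `max_rounds` is a design choice in this sketch: it guarantees the pipeline terminates even if the critic never accepts the output.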
The developers of PaperBanana tested the framework using a benchmark dataset drawn from a large number of real academic papers. These test cases included a variety of diagram types and chart formats commonly used in computer science and machine learning research. According to the evaluation results, PaperBanana demonstrated higher accuracy and visual quality than baseline approaches that rely on single-step generation.
One of the notable aspects of the system is its ability to handle both conceptual diagrams and statistical plots. Methodology diagrams often require clear representation of processes, dependencies and data flow, while charts must accurately reflect numerical relationships. By supporting both categories, PaperBanana addresses a wide range of visual needs within academic publishing.
The introduction of PaperBanana reflects broader interest in agentic AI systems that can perform complex, multi-step tasks with limited human input. Rather than acting as passive tools, these systems are designed to reason, plan and refine outputs autonomously. In research environments, such capabilities could help streamline workflows that currently involve multiple tools and manual intervention.
Researchers and developers have expressed interest in the potential of automated figure generation, particularly as publication expectations continue to rise. Clear and well-designed visuals play an important role in peer review and knowledge dissemination. Automating parts of this process could help researchers focus more on experimental design, analysis and interpretation rather than presentation mechanics.
At the same time, the developers acknowledge the need for human oversight. Visuals in research papers are not merely decorative elements but often serve as key evidence supporting scientific claims. Errors or misrepresentations could have serious consequences. PaperBanana’s design includes iterative checks to reduce such risks, but final review by authors remains essential.
Beyond immediate applications in academic publishing, the framework highlights how AI could reshape creative and technical workflows. Similar approaches could be extended to slide creation, technical documentation or interactive data visualisation tools. As agent-based systems mature, they may increasingly assist with tasks that combine reasoning, structure and visual communication.
PaperBanana has been released alongside supporting resources to encourage further experimentation and development. By making the framework accessible to the research community, the developers aim to foster exploration of how multi-agent systems can be applied to other areas of scientific communication.
The launch also signals continued investment by major technology companies and academic institutions in tools that enhance research productivity. As scientific output grows globally, automation that supports clarity and consistency may become an important component of the research ecosystem.
PaperBanana illustrates how artificial intelligence is moving beyond text and code into more structured creative domains. By automating the production of publication-ready diagrams and plots, the framework addresses a practical challenge faced by researchers and offers a glimpse into how AI agents could support the future of academic work.