Yunjue Agent
Official Introduction

Yunjue Agent: Our In-Situ Self-Evolving Agent System for Open-Ended Tasks

Yunjue Agent is the official in-situ self-evolving agent system from Yunjue Technology, designed to convert runtime execution feedback into reusable tools for open-ended tasks.

Yunjue Technology · Tech Report: arXiv:2601.18226v2 · Last reviewed:

Zero-Start Tool Library · Parallel Batch Evolution · 73.5% DeepSearchQA

System Overview

We follow an in-situ self-evolving paradigm: each execution outcome becomes a supervision signal, and tools are continuously synthesized, optimized, merged, and reused in a shared capability pool.

In-Situ Learning

Sequential tasks are treated as a live experience stream. No ground-truth labels are required for capability accumulation.

Tool-First Evolution

Tools serve as verifiable primitives with binary execution feedback, reducing strategy drift and improving reliability.
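
As a rough illustration of binary execution feedback, the sketch below wraps a tool call so that every run reduces to a success or failure bit plus a payload. The helper `run_with_binary_feedback` and the toy tool `parse_price` are hypothetical names introduced here for illustration; they are not taken from the Yunjue Agent codebase.

```python
from typing import Any, Callable, Tuple


def run_with_binary_feedback(tool: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[bool, Any]:
    """Run a tool and reduce the outcome to (succeeded, payload)."""
    try:
        return True, tool(*args, **kwargs)
    except Exception as exc:  # assumption: any runtime error counts as a failed execution
        return False, exc


def parse_price(text: str) -> float:
    # Hypothetical tool used only for this example.
    return float(text.replace("$", "").replace(",", ""))


ok, value = run_with_binary_feedback(parse_price, "$1,250.50")
bad, error = run_with_binary_feedback(parse_price, "n/a")
```

Under this assumption, a tool that raises is simply marked as failed, which gives a signal that needs no ground-truth label.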

Parallel Batch Evolution

Aggregator and Merger components cluster and consolidate tools, improving evolution efficiency at scale.
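
One way to picture the cluster-then-consolidate step is the minimal sketch below. It assumes tools are compared by the word overlap (Jaccard similarity) of their descriptions and that a merged tool keeps the most successful variant. Both choices, along with the `Tool` record and the `aggregate` and `merge` functions, are illustrative assumptions rather than the report's actual Aggregator and Merger logic.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Tool:
    name: str
    description: str
    successes: int = 0


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def aggregate(tools: List[Tool], threshold: float = 0.5) -> List[List[Tool]]:
    """Greedy clustering: a tool joins the first cluster whose first member is similar enough."""
    clusters: List[List[Tool]] = []
    for tool in tools:
        for cluster in clusters:
            if jaccard(_tokens(tool.description), _tokens(cluster[0].description)) >= threshold:
                cluster.append(tool)
                break
        else:
            clusters.append([tool])
    return clusters


def merge(cluster: List[Tool]) -> Tool:
    """Keep the most successful tool in the cluster and pool the success counts."""
    best = max(cluster, key=lambda t: t.successes)
    return Tool(best.name, best.description, sum(t.successes for t in cluster))


# Made-up batch of tools produced by parallel tasks.
batch = [
    Tool("fetch_pdf", "download pdf from url", 3),
    Tool("get_pdf", "download pdf file from url", 5),
    Tool("stock_quote", "look up stock price by ticker", 2),
]
pool = [merge(c) for c in aggregate(batch)]
```

Here the two PDF downloaders collapse into one entry while the unrelated stock tool stays separate, so the shared pool grows by two tools instead of three.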

EGL Convergence

Evolutionary Generality Loss (EGL) is used to monitor stabilization, analogous to training loss in optimization.
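
This page does not give the EGL formula, so the sketch below does not compute it. It only shows how a per-batch series of EGL values, however obtained, could be checked for stabilization in the same way a training-loss curve is checked for a plateau. The function `has_stabilized`, its `window` and `tol` parameters, and the sample values are all assumptions made for illustration.

```python
from typing import Sequence


def has_stabilized(values: Sequence[float], window: int = 3, tol: float = 0.01) -> bool:
    """True when the mean of the last `window` values is within `tol` of the mean of the preceding `window` values."""
    if len(values) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(values[-window:]) / window
    previous = sum(values[-2 * window:-window]) / window
    return abs(previous - recent) < tol


# Made-up EGL readings, one per evolution batch.
egl_series = [0.90, 0.60, 0.35, 0.20, 0.12, 0.11, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10]

early = has_stabilized(egl_series[:6])   # still dropping quickly
late = has_stabilized(egl_series)        # flat over the last two windows
```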

Published Evidence and Benchmarks

The following figures and values are summarized from the public tech report and project assets provided by Yunjue Technology.

Figure: Main benchmark comparison results for Yunjue Agent, from the project materials, highlighting zero-start and transfer performance.

Selected Reported Metrics

  • HLE: 48.0%
  • DeepSearchQA: 73.5%
  • FinSearchComp: 65.0%
  • Transfer behavior: warm-start evaluations report low marginal tool growth in later domains.

Interpretation: the reported results suggest that the system both bootstraps capability from an empty tool library and reuses evolved tools across domains.

Dataset | Reported Yunjue Agent Score | Setup | Source
HLE | 48.0% | Zero-start | Tech Report (arXiv:2601.18226v2)
DeepSearchQA | 73.5% | Zero-start | Tech Report (arXiv:2601.18226v2)
FinSearchComp | 65.0% | Zero-start | Tech Report (arXiv:2601.18226v2)
xbench-ScienceQA / xbench-DeepSearch | Improved vs. backend baselines | Zero-start + transfer | Tech Report + project README

Architecture and Execution Roles

The system uses coordinated roles for planning, tool generation, execution, and integration, with an absorbing mechanism to update the global tool pool.
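
The role hand-off described above can be sketched as a single task loop: plan, look up or generate tools, execute, then absorb new tools into the global pool. Every name here (`plan`, `generate_tool`, `run_task`) is a hypothetical stand-in, and the toy planner and generator replace what would be model-driven components in the real system.

```python
from typing import Callable, Dict, List

ToolFn = Callable[[str], str]


def plan(task: str) -> List[str]:
    """Planner role: split a task into capability names (toy rule: one per word)."""
    return task.lower().split()


def generate_tool(capability: str) -> ToolFn:
    """Tool-generation role: a stand-in that builds a trivial tool for the capability."""
    return lambda payload: f"{capability}({payload})"


def run_task(task: str, payload: str, global_pool: Dict[str, ToolFn]) -> List[str]:
    local_tools: Dict[str, ToolFn] = {}
    outputs: List[str] = []
    for capability in plan(task):
        tool = global_pool.get(capability)       # reuse an absorbed tool if one exists
        if tool is None:
            tool = generate_tool(capability)     # otherwise synthesize a new one
            local_tools[capability] = tool
        outputs.append(tool(payload))            # execution role
    global_pool.update(local_tools)              # integration: absorb new tools into the pool
    return outputs


pool: Dict[str, ToolFn] = {}                     # zero-start: the pool begins empty
first = run_task("search summarize", "report.pdf", pool)
second = run_task("search translate", "notes.txt", pool)
```

In this sketch the second task reuses `search` from the pool and only adds `translate`, which mirrors the reuse behavior the page describes.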

Figure: Yunjue Agent architecture overview, a high-level diagram from official project materials.

Local Demo Snapshots

These demos illustrate tool decomposition, PDF retrieval, stock information tasks, and skill-driven automation workflows.

Web Demo: Tool Evolution in Action

Figure: Web demo showing the tool evolution process.

Example: decomposing a user request and generating task-specific tools during runtime.

Web Demo: Reuse for Financial Query

Figure: Web demo showing a stock information query.

Example: reusing evolved tools to answer US stock-related queries.

CLI Skill Demo: Knowledge to Automation

Figure: CLI skill demo for spreadsheet analysis automation.

Example: converting structured skill descriptions into executable tool workflows.

CLI Skill Demo: Report to Deliverable

Figure: CLI skill demo for report summarization and slide generation.

Example: locating local files, summarizing documents, and producing presentation artifacts.

Trust, Provenance, and Reproducibility

To align with E-E-A-T principles, we provide transparent authorship, dated updates, public artifacts, and clear source references for key claims.

Authorship and Affiliation

Tech report authors include Haotian Li, Shijun Yang, Weizhen Qi, Silei Zhao, Rui Hua, Mingzhu Song, Xiaojian Yang, and Chao Peng, affiliated with Yunjue Technology.

Correspondence: qiweizhen@yunjuetech.com

Public Update Timeline

  1. Initial open-source release.
  2. Trace dataset and reproduction workflow published.
  3. Tech report updated with additional performance and EGL analysis.
  4. Reproduction branch and demo updates.

FAQ

Short answers for common implementation and evaluation questions.

What differentiates Yunjue Agent from static agents?

Yunjue Agent evolves tools during inference and keeps successful tools for future tasks, rather than relying only on a fixed toolset defined before deployment.

What does zero-start mean in this context?

Zero-start means the system begins with an empty tool library and builds capability through runtime tool synthesis, validation, and reuse.

How can results be verified independently?

Use the public repository, its reproduction branch, and the released trace datasets to inspect execution logs and rerun the benchmark workflows.

What is the role of tool evolution in the system?

Tool evolution is the primary capability engine: tools are synthesized, executed, optimized, and absorbed into a shared pool when they prove useful.

What is Parallel Batch Evolution?

Parallel Batch Evolution processes multiple tasks in batches and uses aggregation and merging steps to reduce redundant tools and improve evolution efficiency.

What does EGL indicate?

Evolutionary Generality Loss (EGL) is used as a convergence signal to track whether the evolved tool library is becoming more stable and reusable.

Is there evidence of cross-domain transfer?

Yes. Warm-start evaluations reported in the project materials indicate that previously evolved tools can transfer to new domains with reduced marginal tool growth.
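
To make "marginal tool growth" concrete, the sketch below counts how many new tools each domain adds when the library is carried over from earlier domains. The function `marginal_tool_growth`, the domain names, and the tool sets are invented for illustration and are not data from the tech report.

```python
from typing import Dict, List, Set


def marginal_tool_growth(domains: Dict[str, List[Set[str]]]) -> Dict[str, int]:
    """Count the new tools each domain adds to a library carried over from earlier domains."""
    library: Set[str] = set()
    growth: Dict[str, int] = {}
    for domain, tasks in domains.items():   # dicts preserve insertion order
        before = len(library)
        for needed in tasks:
            library |= needed                # tools already present are reused, not re-added
        growth[domain] = len(library) - before
    return growth


# Made-up task stream: each task lists the tools it needs.
stream = {
    "web_search": [{"search", "fetch_page"}, {"search", "extract_table"}],
    "finance": [{"search", "stock_quote"}, {"fetch_page", "stock_quote"}],
}
growth = marginal_tool_growth(stream)
```

In this toy stream the first domain adds three tools and the second adds only one, which is the shape of result that a low marginal growth claim describes.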

What are the practical deployment options?

The project provides both a local web demo and a CLI skill demo. The published quick-start notes indicate the current tested environment is macOS.

What license is used for the open-source project?

The repository is released under the Apache License 2.0, which permits use, modification, and redistribution under the conditions stated in the license.