OpenTrain AI
Maintained implementation available | none

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

January 1, 2025 | arXiv: 2501.04575
2 repos | 73 stars | ~a few days to reproduce

Abstract

Results & Benchmarks

Task | Dataset | Metric | Value
Agentic tool use | GPT-4o (gpt-4o-2024-08-06 version) | Accuracy | 30.5

Hardware Requirements

  • Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Best Implementation

InfiXAI/InfiGUIAgent

Stars: 73 | Forks: 3 | Last push: May 2025
License
CI
Deps
Docker
  • Selected reallm-labs/infiguiagent as the strongest maintained implementation for new work.
  • Repository activity is within the last 24 months.

Reproduction Path

  1. Start with reallm-labs/infiguiagent and validate the setup instructions in its README.

  2. Reproduce the baseline result with the provided defaults before modifying hyperparameters.

  3. Log exact dependency versions and the runtime environment for reproducibility.
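
The logging step above can be sketched with the Python standard library alone. This is a minimal sketch, not part of the repository: the function name `capture_environment` and the output file name used in the usage note are hypothetical, and it assumes the reproduction runs in a Python environment where installed packages are visible to `importlib.metadata`.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment() -> dict:
    """Collect interpreter, OS, and installed-package versions (hypothetical helper)."""
    packages = sorted(
        (
            f"{dist.metadata.get('Name')}=={dist.version}"
            for dist in metadata.distributions()
            if dist.metadata.get("Name")  # skip entries with broken metadata
        ),
        key=str.lower,
    )
    return {
        "python": sys.version.split()[0],
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "packages": packages,
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be redirected into a file.
    print(json.dumps(capture_environment(), indent=2))
```

Saved under a name of your choosing and run once before the baseline, for example with its output redirected to `repro_env.json`, this gives a snapshot to compare against later runs. GPU driver and accelerator library versions are not covered and would need to be recorded separately.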

Time to first repro: a few days
  • License metadata missing
  • No CI workflows detected
  • Dependency manifest is missing

Additional Implementations

Official

No additional official repositories detected.

Community

  • InfiXAI/InfiGUI-R1 (Confidence: low)

    Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners"

    Stars: 64 | Forks: 5 | Last push: Dec 2025 | License: Apache-2.0

Hugging Face Artifacts

No trustworthy direct or curated related Hugging Face artifacts were found yet.

Research Context