LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis

Q: What is the best open-source implementation of "LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis"?

The best-maintained implementation is ictnlp/llama-omni2, with 268 stars on GitHub. Confidence: high. Reproducibility: limited.

Q: How reproducible is "LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis"?

Estimated time to first reproduction: a few hours. Risk flags: license metadata missing, no CI workflows detected. Start with ictnlp/llama-omni2 and validate the setup instructions in its README.

Q: Are there pretrained models available for "LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis"?

Yes, 2 Hugging Face models were found. The top-ranked result is ICTNLP/LLaMA-Omni2-0.5B with 38 downloads; the other is ICTNLP/LLaMA-Omni2-7B with 50 downloads.

Q: What framework is used to implement "LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis"?

The primary implementation uses PyTorch.

Published: May 1, 2025

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 268

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: PyTorch

Time to first repro: a few hours

2 risk flags

Technical details

Canonical key: arxiv-2505.02625

Cache status: Fresh

Generated at: Apr 29, 2026, 9:04 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: hot

Direct + Inferred Evidence

Benchmark evidence drill-down

6 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Model / setting	Metric	Value	Source	Evidence refs
Representation learning	LLaMA-Omni2-7B	ASR-WER	3.26	paper-derived	No explicit refs
Representation learning	w/o Gate Fusion	ASR-WER	4.89	paper-derived	No explicit refs
Representation learning	w/o Text Embedding	ASR-WER	6.83	paper-derived	No explicit refs
Language modeling	Streaming TTS	ASR-WER	3.26	paper-derived	No explicit refs
Language modeling	Offline TTS	ASR-WER	3.51	paper-derived	No explicit refs
Language modeling	Text Pretrained	ASR-WER	10.34	paper-derived	No explicit refs
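
The ablation rows above can be compared against the full model with simple arithmetic. The following is a minimal Python sketch that uses only the ASR-WER values listed in the table. The names `asr_wer` and `delta_vs_baseline` are illustrative, and treating the two "w/o" rows as ablations of LLaMA-Omni2-7B is an assumption drawn from their labels, not something the table states.

```python
# ASR-WER values copied from the benchmark table (lower WER is better).
# Assumption: the two "w/o" rows are ablations of the LLaMA-Omni2-7B row.
asr_wer = {
    "LLaMA-Omni2-7B": 3.26,
    "w/o Gate Fusion": 4.89,
    "w/o Text Embedding": 6.83,
}


def delta_vs_baseline(scores, baseline_key):
    """Return {name: (absolute_delta, relative_delta_percent)} against the baseline row."""
    base = scores[baseline_key]
    return {
        name: (round(value - base, 2), round(100.0 * (value - base) / base, 1))
        for name, value in scores.items()
        if name != baseline_key
    }


deltas = delta_vs_baseline(asr_wer, "LLaMA-Omni2-7B")
for name, (abs_d, rel_d) in deltas.items():
    print(f"{name}: +{abs_d:.2f} WER ({rel_d:+.1f}% relative)")
```

Reading the gap both ways, as an absolute difference and as a percentage of the baseline, helps when auditing how much each removed component appears to matter.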

LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis focuses on instruction tuning.

Use This Implementation Because…

Confidence: high

ictnlp/llama-omni2 is the strongest maintained implementation based on ranking signals. Dependency/environment manifests are present.

Reproduction Risks

License metadata missing
No CI workflows detected

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 95/100, grounding 95/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

ictnlp/llama-omni2

best maintained

Maintenance: Stale risk

Confidence: High

Reproducibility: Limited

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 268
Last push: May 19, 2025 (346d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

rainbowluocs/openomni

alternative

Maintenance: Recently updated

Confidence: Low

Reproducibility: Limited

Strong overlap with paper title keywords · Community adoption signal (137 stars)

Stars: 137
Last push: Nov 8, 2025 (173d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

eblessings/LLaMA-Omni2-v1.1

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search

Stars: 0
Last push: Jul 3, 2025 (301d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup
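
The "days ago" figures in the comparison above can be sanity-checked from the listed dates. This is a small Python sketch that assumes the reference date is the page's "Generated at" date (Apr 29, 2026); the variable and function names are illustrative. A plain calendar-date difference comes out one day lower than each figure shown on the page, which presumably reflects time-of-day or time-zone handling that is not described here.

```python
from datetime import date

# Dates copied from this page. Assumption: ages are measured from the
# "Generated at" date, ignoring time of day and time zone.
reference_date = date(2026, 4, 29)
last_push = {
    "ictnlp/llama-omni2": date(2025, 5, 19),
    "rainbowluocs/openomni": date(2025, 11, 8),
    "eblessings/LLaMA-Omni2-v1.1": date(2025, 7, 3),
}


def days_since(push, today):
    """Whole calendar days between a push date and the reference date."""
    return (today - push).days


ages = {repo: days_since(pushed, reference_date) for repo, pushed in last_push.items()}
for repo, age in ages.items():
    print(f"{repo}: {age} days since last push")
```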

Best implementation now

ictnlp/llama-omni2

Confidence: High

Reproducibility: Limited

ictnlp/LLaMA-Omni2

Stars: 268

Forks: 27

Last push: May 19, 2025

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Community adoption signal (268 stars)

License: not detected

CI: not detected

Dependencies: manifest present

Docker: not detected

Selected ictnlp/llama-omni2 as the strongest maintained implementation for new work.
Includes dependency/environment manifest signals.
Repository activity is within the last 24 months.

Reproduction readiness

Setup Required

Time to first repro: a few hours

Last checked: Apr 29, 2026

Dependencies pinned, manual setup needed

· ictnlp/llama-omni2 has pyproject.toml but requires manual environment setup.
· Last push was 346 days ago — expect possible dependency version conflicts.
· No Dockerfile — you will set up the environment manually.
· No CI pipeline — test coverage is unknown.
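
The four readiness signals listed above (license, CI, dependency manifest, Docker) can be re-checked on a local checkout. The sketch below is a hypothetical helper, not the method this page uses to compute its flags; the file names it looks for (`LICENSE*`, `.github/workflows`, `pyproject.toml`, `requirements.txt`, `environment.yml`, `Dockerfile`) are common conventions assumed for illustration.

```python
from pathlib import Path


def readiness_signals(repo_dir):
    """Look for common license, CI, dependency, and Docker files in a checkout.

    Hypothetical helper: the file names checked are conventions assumed
    for illustration, not the page's own detection rules.
    """
    root = Path(repo_dir)
    workflows = root / ".github" / "workflows"
    manifests = ("pyproject.toml", "requirements.txt", "environment.yml")
    return {
        "license": any(root.glob("LICENSE*")),
        "ci": workflows.is_dir() and any(workflows.iterdir()),
        "deps": any((root / name).is_file() for name in manifests),
        "docker": (root / "Dockerfile").is_file(),
    }


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, present in readiness_signals(target).items():
        print(f"{name}: {'found' if present else 'not found'}")
```

Run against a fresh clone, a result with only `deps` found would match the flags reported on this page.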

Quick start

git clone https://github.com/ictnlp/llama-omni2.git
cd llama-omni2
pip install -e .

Additional implementations

No additional verified repositories beyond the primary recommendation.

Possible but unverified matches (2)

These repositories had low-confidence matching signals and are hidden by default.

rainbowluocs/openomni

Confidence: Low

Stars: 137

eblessings/LLaMA-Omni2-v1.1

Confidence: Low

Stars: 0

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

ICTNLP/LLaMA-Omni2-0.5B

Curated Related

Downloads: 38

Likes: 8

ICTNLP/LLaMA-Omni2-7B

Curated Related

Downloads: 50

Likes: 4

Datasets

No trustworthy dataset matches right now.

Spaces

No trustworthy demo spaces right now.

Research context

Tasks

Instruction tuning

Methods

Transformer

Domains

Large Language Models
