What is the best open-source implementation of "TrustLLM: Trustworthiness in Large Language Models"?

The best-maintained implementation is HowieHwong/TrustLLM, with 627 stars on GitHub. Confidence: high. Reproducibility: moderate.

How reproducible is "TrustLLM: Trustworthiness in Large Language Models"?

Estimated time to first reproduction: a few days. Risk flag: the dependency manifest is missing. Start with HowieHwong/TrustLLM and validate the setup instructions in its README.

What framework is used to implement "TrustLLM: Trustworthiness in Large Language Models"?

No primary framework was detected for the implementation.

TrustLLM: Trustworthiness in Large Language Models

Published: Jan 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 627

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: none

Time to first repro: a few days

1 risk flag

Technical details

Canonical key: arxiv-2401.05561

Cache status: Stale (SWR served)

Generated at: Jun 15, 2026, 10:09 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Task: Natural language processing

Model: Mistral-7b

Dataset: Social Chemistry 101

Score: 0.647

Source: paper fulltext

Benchmark evidence drill-down

1 finding

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Model	Dataset	Score	Source	Evidence refs
Natural language processing	Mistral-7b	Social Chemistry 101	0.647	paper-derived	No explicit refs

TrustLLM: Trustworthiness in Large Language Models is the primary contribution described in this paper.

Use This Implementation Because…

Confidence: high

HowieHwong/TrustLLM is the strongest maintained implementation based on ranking signals. CI workflows are present. License is declared (MIT).

Open HowieHwong/TrustLLM

Reproduction Risks

Dependency manifest is missing

Hardware Notes

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 95/100, grounding 85/100, status high.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

HowieHwong/TrustLLM

best maintained

Maintenance: Stale risk

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 627
Last push: Jun 24, 2025 (361d ago)

CI · Releases

Risk flags

No Docker setup
Dependency manifest missing

VyetGokyra/awaresome_LLM_eval_benchmark

alternative

Maintenance: Stale risk

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search

Stars: 8
Last push: Aug 12, 2025 (312d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

anshug/ai-security-evals

alternative

Maintenance: Recently updated

Confidence: Low

Reproducibility: Limited

Matched via arXiv identifier search

Stars: 2
Last push: May 17, 2026 (34d ago)

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

Best implementation now

HowieHwong/TrustLLM

Confidence: High

Reproducibility: Moderate

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Stars: 627

Forks: 67

Last push: Jun 24, 2025

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (627 stars)

License ✓

CI ✓

Deps –

Docker –

Selected HowieHwong/TrustLLM as the strongest maintained implementation for new work.
Includes CI workflow signals.
Repository activity is within the last 24 months.

Reproduction readiness

Major Work

Time to first repro: a few days

Last checked: Jun 15, 2026

Hardware requirements

Expect multi-day setup/compute for meaningful reproduction based on current guidance.

No dependency manifest — manual reconstruction required

· HowieHwong/TrustLLM has no requirements.txt, environment.yml, pyproject.toml, or Dockerfile.
· You will need to reverse-engineer dependencies from import statements in the source code.
· Last push was 361 days ago.
