implementation starting point
Benchmarks: missing
Time to repro: a few hours
1 risk flag
pytorch

Results & Benchmarks

Freshness tier: cold
Direct + Inferred Evidence

No concrete benchmark grounding is available yet. Treat the page as context or an implementation starting point only.

This page covers the paper "Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities". Its primary contribution is Kernel Language Entropy, a method for quantifying LLM uncertainty from semantic similarities.
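For orientation, the core quantity can be sketched as the von Neumann entropy of a unit-trace, positive semidefinite kernel built over sampled answers. The snippet below is a minimal illustration under stated assumptions, not the repository's API. The function names, the use of a heat kernel on a combinatorial graph Laplacian, the diffusion time `t`, and the toy similarity matrices are all hypothetical choices made for this example.

```python
import numpy as np


def heat_kernel(weights: np.ndarray, t: float = 0.3) -> np.ndarray:
    """Unit-trace heat kernel exp(-t * L) of a weighted semantic graph.

    `weights[i, j]` is an assumed pairwise semantic similarity between
    sampled answers i and j (for example, an entailment score).
    """
    w = (weights + weights.T) / 2.0          # symmetrize the similarities
    np.fill_diagonal(w, 0.0)                 # drop self-loops
    laplacian = np.diag(w.sum(axis=1)) - w   # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    kernel = (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T
    return kernel / np.trace(kernel)         # normalize so that Tr[K] = 1


def von_neumann_entropy(kernel: np.ndarray) -> float:
    """VNE(K) = -Tr[K log K], computed from the eigenvalues of K."""
    eigvals = np.linalg.eigvalsh(kernel)
    eigvals = eigvals[eigvals > 1e-12]       # treat 0 * log(0) as 0
    return float(-np.sum(eigvals * np.log(eigvals)))


# Toy inputs: three answers that all agree vs. three unrelated answers.
agree = np.ones((3, 3))
unrelated = np.zeros((3, 3))

print("agreeing answers :", von_neumann_entropy(heat_kernel(agree)))
print("unrelated answers:", von_neumann_entropy(heat_kernel(unrelated)))
```

In this sketch, answers with no semantic overlap give the maximum entropy log(n), and strongly connected answers give a lower value, which matches the intuition that agreement among samples signals lower uncertainty.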

Use This Implementation Because…

Confidence: high

alexandervnikitin/kernel-language-entropy is the top-ranked implementation based on ranking signals, although its maintenance status is Stale (last push Dec 17, 2024). A license is declared (BSD-3-Clause-Clear), and dependency/environment manifests are present.

Reproduction Risks

  • No CI workflows detected

Evidence disclosure

Evidence graph: 3 refs, 3 links.

Utility signals: depth 55/100, grounding 75/100, status medium.

Implementation Comparison

Top 3 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

alexandervnikitin/kernel-language-entropy
Maintenance: Stale
Confidence: High
Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 35
Last push: Dec 17, 2024 (550d ago)
Detected: Dependencies

Risk flags

  • No push in 12+ months
  • No CI pipeline detected
  • No tagged releases

iinemo/lm-polygraph
alternative
Maintenance: Recently updated
Confidence: Low
Reproducibility: Strong

Community adoption signal (477 stars)

Stars: 477
Last push: May 18, 2026 (33d ago)
Detected: CI, Dockerfile, Releases, Dependencies

Risk flags

  • Low confidence match

alexandervnikitin/kernel-language-entropy
Maintenance: Stale
Confidence: Medium
Reproducibility: Moderate

Matched via arXiv identifier search · Strong overlap with paper title keywords

Stars: 35
Last push: Dec 17, 2024 (550d ago)
Detected: Dependencies

Risk flags

  • No push in 12+ months
  • No CI pipeline detected
  • No tagged releases

Best implementation now

alexandervnikitin/kernel-language-entropy
Confidence: High
Reproducibility: Moderate

Code for Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities (NeurIPS'24)

Stars: 35
Forks: 4
Last push: Dec 17, 2024
License: BSD-3-Clause-Clear
Official implementation from Papers with Code
Repository link is mentioned in the paper metadata
Strong overlap with paper title keywords
Community adoption signal (35 stars)
License ✓
CI –
Deps ✓
Docker –
  • Selected alexandervnikitin/kernel-language-entropy as the top-ranked implementation for new work, despite its Stale maintenance status.
  • Includes dependency/environment manifest signals.
  • Repository activity is within the last 24 months.

Reproduction readiness

Setup Required
Time to first repro: hours
Last checked: Jun 19, 2026

Dependencies pinned, manual setup needed

  • alexandervnikitin/kernel-language-entropy has environment.yml but requires manual environment setup.
  • Last push was 550 days ago; expect possible dependency version conflicts.
  • No Dockerfile; you will set up the environment manually.
  • No CI pipeline; test coverage is unknown.
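The staleness figures above come down to simple date arithmetic. The following is a minimal sketch of that check, assuming a 365-day cutoff for the "No push in 12+ months" flag. The cutoff constant and helper names are hypothetical; they are not taken from the page's actual ranking logic. Counting from the Jun 19, 2026 check date gives values within a day of the "550d ago" and "33d ago" figures shown on this page, since the exact count depends on the day the page was rendered.

```python
from datetime import date

# Assumed cutoff for the "No push in 12+ months" risk flag.
STALE_AFTER_DAYS = 365


def days_since_push(last_push: date, checked: date) -> int:
    """Whole days between the last push and the check date."""
    return (checked - last_push).days


def is_stale(last_push: date, checked: date) -> bool:
    """True when the last push is at least STALE_AFTER_DAYS old."""
    return days_since_push(last_push, checked) >= STALE_AFTER_DAYS


checked = date(2026, 6, 19)  # "Last checked" date from this page
repos = {
    "alexandervnikitin/kernel-language-entropy": date(2024, 12, 17),
    "iinemo/lm-polygraph": date(2026, 5, 18),
}

for name, pushed in repos.items():
    age = days_since_push(pushed, checked)
    print(f"{name}: {age} days since last push, stale={is_stale(pushed, checked)}")
```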

Quick start

git clone https://github.com/alexandervnikitin/kernel-language-entropy.git
cd kernel-language-entropy
conda env create -f environment.yml && conda activate <env-name>

No benchmark numbers could be verified, so you will not be able to validate a reproduction against published results.

Additional implementations

Official

No additional official repositories detected.

Community

  • Code for Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities (NeurIPS'24)

    Stars: 35
    Last push: Dec 17, 2024
    License: BSD-3-Clause-Clear

These repositories had low-confidence matching signals and are hidden by default.

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

No trustworthy model matches right now.


Research context

Evaluation & Human Feedback Data

Open this paper in HFEPX to review benchmark signals, evaluation modes, and human-feedback protocol context.