What is the best open-source implementation of "Get my drift? Catching LLM Task Drift with Activation Deltas"?

The best maintained implementation is microsoft/TaskTracker with 89 stars on GitHub. Confidence: high. Reproducibility: Moderate.

How reproducible is "Get my drift? Catching LLM Task Drift with Activation Deltas"?

Estimated time to first reproduction: a few hours. Risk flag: no CI workflows detected. Start with microsoft/TaskTracker and validate the setup instructions in its README.

Are there pretrained models available for "Get my drift? Catching LLM Task Drift with Activation Deltas"?

No models linked directly to the paper were found. Three related Hugging Face models are listed instead; the top result is deepseek-ai/deepseek-llm-7b-chat with 44,835 downloads.

What framework is used to implement "Get my drift? Catching LLM Task Drift with Activation Deltas"?

The primary implementation uses PyTorch.

Get my drift? Catching LLM Task Drift with Activation Deltas

Published: Jun 1, 2024

Best maintained implementation now

Evidence: Direct

Domain fit: AI-core

Verified repos: 1

Top repo stars: 89

Core AI workload signals detected from paper context and implementation/artifact evidence.

Framework: PyTorch

Time to first repro: a few hours

1 risk flag

Technical details

Canonical key: arxiv-2406.00799

Cache status: Fresh

Generated at: Jun 6, 2026, 5:52 PM

Artifact coverage: direct

HF provider: ok (token)

PWC source used: Yes

LLM status: not_generated

LLM model: n/a

LLM generated: Unknown

LLM content type: n/a

HF policy: hf-relevance-v27

implementation starting point

Benchmarks: thin evidence

Results & Benchmarks

Freshness tier: cold

Direct + Inferred Evidence

Natural language processing · TaskTracker · ROC AUC: 0.996 (source: paper fulltext)

Natural language processing · Prompt Guard · ROC AUC: 0.974 (source: paper fulltext)

Natural language processing · Prompt Shields · ROC AUC: 0.988 (source: paper fulltext)

Natural language processing · Mistral 7B · Layer 15: 0.990 (source: paper fulltext)

Benchmark evidence drill-down

4 findings

Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.

Task	Dataset	Metric	Value	Source	Evidence refs
Natural language processing	TaskTracker	ROC AUC	0.996	paper-derived	No explicit refs
Natural language processing	Prompt Guard	ROC AUC	0.974	paper-derived	No explicit refs
Natural language processing	Prompt Shields	ROC AUC	0.988	paper-derived	No explicit refs
Natural language processing	Mistral 7B	Layer 15	0.990	paper-derived	No explicit refs
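
When auditing the ROC AUC figures above, it can help to recall what the metric measures: the probability that a randomly chosen positive example receives a higher detector score than a randomly chosen negative one. The following is a minimal sketch of that definition in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the paper or from the TaskTracker repository.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values (1 = drifted / attack).
    scores: iterable of detector scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (not results from the paper).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of examples, so it suits small sanity checks; a rank-based implementation from an established library is the usual choice for full evaluation sets.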

The paper's primary contribution is TaskTracker, a method that detects task drift in LLMs from their internal activations.

Use This Implementation Because…

Confidence: high

microsoft/TaskTracker is the strongest maintained implementation based on ranking signals. License is declared (MIT). Dependency/environment manifests are present.

Reproduction Risks

No CI workflows detected

Evidence disclosure

Evidence graph: 4 refs, 4 links.

Utility signals: depth 90/100, grounding 95/100, status high.

Implementation Comparison

Top 2 paths

Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.

microsoft/TaskTracker

best maintained

Maintenance: Stale risk

Confidence: High

Reproducibility: Moderate

Official implementation from Papers with Code · Repository link is mentioned in the paper metadata

Stars: 89
Last push: Sep 1, 2025 (278d ago)

Dependencies

Risk flags

No CI pipeline detected
No tagged releases
No Docker setup

microsoft/llmail-inject-challenge

alternative

Maintenance: Recently updated

Confidence: Low

Reproducibility: Strong

Community adoption signal (25 stars)

Stars: 25
Last push: Apr 9, 2026 (59d ago)

CI · Dependencies

Risk flags

No tagged releases
No Docker setup
Low confidence match

Best implementation now

microsoft/TaskTracker

Confidence: High

Reproducibility: Moderate

TaskTracker is an approach to detecting task drift in Large Language Models (LLMs) by analysing their internal activations. It provides a simple linear probe-based method and a more sophisticated metric learning method to achieve this. The project also releases the computationally expensive activation data to stimulate further AI safety research.
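
The linear-probe idea described above can be sketched on synthetic data. The example assumes an "activation delta" is the element-wise difference between a hidden-state vector captured before and after the model reads external text, and it trains a small logistic-regression probe on those deltas. Everything here is hypothetical: the function names, the 8-dimensional toy vectors, and the injected shift are illustrative assumptions, and the repository's actual probes, layers, and data pipeline may differ.

```python
import math
import random


def activation_delta(before, after):
    """Element-wise difference between two activation vectors."""
    return [a - b for a, b in zip(after, before)]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(deltas, labels, lr=0.5, epochs=200):
    """Fit logistic regression with full-batch gradient descent."""
    dim, n = len(deltas[0]), len(deltas)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(deltas, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def drift_score(w, b, delta):
    """Probe output in [0, 1]; higher means more likely drifted."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, delta)) + b)


def make_toy_example(rng, drifted, dim=8):
    """Synthetic stand-in for real activations (assumption, not real data)."""
    before = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    shift = 1.5 if drifted else 0.0  # drift moves the first two dimensions
    after = [
        v + rng.gauss(0.0, 0.3) + (shift if i < 2 else 0.0)
        for i, v in enumerate(before)
    ]
    return before, after


rng = random.Random(0)
labels = [i % 2 for i in range(200)]
deltas = [activation_delta(*make_toy_example(rng, bool(y))) for y in labels]

# Train on the first 150 examples, evaluate on the remaining 50.
w, b = train_linear_probe(deltas[:150], labels[:150])
correct = [
    (drift_score(w, b, d) > 0.5) == bool(y)
    for d, y in zip(deltas[150:], labels[150:])
]
accuracy = sum(correct) / len(correct)
```

Working on deltas rather than raw activations removes the per-example baseline (`before`), so in this toy setup the probe only has to separate the shift from the noise.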

Stars: 89

Forks: 21

Last push: Sep 1, 2025

License: MIT

Official implementation from Papers with Code

Repository link is mentioned in the paper metadata

Strong overlap with paper title keywords

Community adoption signal (89 stars)

License ✓

CI –

Deps ✓

Docker –

Selected microsoft/TaskTracker as the strongest maintained implementation for new work.
Includes dependency/environment manifest signals.
Repository activity is within the last 24 months.

Reproduction readiness

Setup Required

Time to first repro: a few hours

Last checked: Jun 6, 2026

Dependencies pinned, manual setup needed

· microsoft/TaskTracker has environment.yml but requires manual environment setup.
· Last push was 278 days ago — expect possible dependency version conflicts.
· No Dockerfile — you will set up the environment manually.
· No CI pipeline — test coverage is unknown.

Quick start

git clone https://github.com/microsoft/TaskTracker.git
cd TaskTracker
conda env create -f environment.yml
conda activate <env-name>
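
The `<env-name>` placeholder in the quick start is normally the value of the top-level `name:` key in `environment.yml`. As a small sketch, assuming the file follows that common conda layout, the helper below extracts the name from the file's text. The function `conda_env_name` and the sample YAML are hypothetical and do not reproduce the repository's actual file.

```python
import re


def conda_env_name(environment_yml_text):
    """Return the top-level `name:` value from environment.yml text, or None."""
    for line in environment_yml_text.splitlines():
        # Match only an unindented `name:` key; ignore a trailing comment.
        match = re.match(r"^name:\s*(.+?)\s*(?:#.*)?$", line)
        if match:
            return match.group(1).strip("'\"")
    return None


# Hypothetical file contents for illustration only.
sample_yml = "name: example-env\nchannels:\n  - defaults\ndependencies:\n  - python\n"
env_name = conda_env_name(sample_yml)
```

In practice the text would come from reading `environment.yml` inside the cloned repository, and the returned value is what `conda activate` expects.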

Additional implementations

No additional verified repositories beyond the primary recommendation.

Possible but unverified matches (1)

These repositories had low-confidence matching signals and are hidden by default.

microsoft/llmail-inject-challenge

Confidence: Low

Stars: 25

Hugging Face artifacts

No direct paper-linked artifacts were found. Showing strongest curated related artifacts for faster exploration.

Models

deepseek-ai/deepseek-llm-7b-chat

Curated Related

Downloads: 44,835

Likes: 222

deepseek-ai/deepseek-llm-7b-base

Curated Related

Downloads: 32,741

Likes: 145

BAAI/llm-embedder

Curated Related

Downloads: 22,358

Likes: 128