
Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy

Bianca Raimondi, Maurizio Gabbrielli · Feb 19, 2026 · Citations: 0

Abstract

The black-box nature of Large Language Models necessitates novel evaluation frameworks that transcend surface-level performance metrics. This study investigates the internal neural representations of cognitive complexity using Bloom's Taxonomy as a hierarchical lens. By analyzing high-dimensional activation vectors from several LLMs, we probe whether different cognitive levels, ranging from basic recall (Remember) to abstract synthesis (Create), are linearly separable within the model's residual streams. Our results demonstrate that linear classifiers achieve approximately 95% mean accuracy across all Bloom levels, providing strong evidence that cognitive level is encoded in a linearly accessible subspace of the model's representations. These findings further suggest that the model resolves the cognitive difficulty of a prompt early in the forward pass, with representations becoming increasingly separable across layers.
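
The probing setup described in the abstract lends itself to a compact sketch: extract last-token residual-stream activations at every layer, then fit one linear classifier per layer against Bloom-level labels. The following is a minimal illustration, assuming a HuggingFace causal LM and scikit-learn; the model name, example prompts, and labels are placeholders, not the paper's actual models or data.

```python
# Minimal layer-wise linear-probing sketch. Model, prompts, and labels
# are illustrative placeholders, not the authors' exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "gpt2"  # placeholder; the paper probes several LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Hypothetical labeled prompts: Bloom levels 0 (Remember) .. 5 (Create).
prompts = [
    "List the planets of the solar system.",
    "Design an experiment to test plant growth under UV light.",
]
labels = [0, 5]

def residual_stream_features(prompt: str) -> list[torch.Tensor]:
    """Return the last-token residual-stream vector at every layer."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # (n_layers + 1) tensors
    return [h[0, -1].float() for h in hidden_states]

# Group features by layer, then fit one linear probe per layer.
per_layer = list(zip(*(residual_stream_features(p) for p in prompts)))
for layer_idx, feats in enumerate(per_layer):
    X = torch.stack(feats).numpy()
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    # With a real dataset one would report held-out accuracy per layer,
    # which is how increasing separability with depth would show up.
    print(f"layer {layer_idx}: train acc = {probe.score(X, labels):.2f}")
```

On a realistic dataset, plotting held-out probe accuracy against layer index is the standard way to visualize the paper's claim that cognitive level becomes increasingly linearly decodable as the forward pass proceeds.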

Human Data Lens

  • Uses human feedback: No
  • Feedback types: None
  • Rater population: Unknown
  • Unit of annotation: Unknown
  • Expertise required: Coding

Evaluation Lens

  • Evaluation modes: Automatic Metrics
  • Agentic eval: None
  • Quality controls: Not reported
  • Confidence: 0.35
  • Flags: low_signal, possible_false_positive

Research Summary

Contribution Summary

  • The black-box nature of Large Language Models necessitates novel evaluation frameworks that transcend surface-level performance metrics.
  • This study investigates the internal neural representations of cognitive complexity using Bloom's Taxonomy as a hierarchical lens.
  • By analyzing high-dimensional activation vectors from different LLMs, we probe whether different cognitive levels, ranging from basic recall (Remember) to abstract synthesis (Create), are linearly separable within the model's residual streams.

Why It Matters For Eval

  • Because LLMs are black boxes, evaluation frameworks that transcend surface-level performance metrics are needed; linear probes over residual-stream activations offer one such internal signal, decoding a prompt's Bloom level with roughly 95% mean accuracy.
