Each field below shows whether the signal looked explicit, partial, or missing in the available metadata. Use this to judge what is safe to trust directly and what still needs full-paper validation.
Human Feedback Types
Status: Partial (Demonstrations)
Confidence: Low (direct evidence)
Note: Directly usable for protocol triage.
Evidence snippet: "We present AITutor-EvalKit, an application that uses language technology to evaluate the pedagogical quality of AI tutors, provides software for demonstration and evaluation, as well as model inspection and data visualization."
Evaluation Modes
Status: Missing (none explicit)
Confidence: Low (not found)
Note: Validate the evaluation design against the full paper text.
Evidence snippet: same abstract excerpt as above (no field-specific evidence found).
Quality Controls
Status: Missing (not reported)
Confidence: Low (not found)
Note: No explicit quality controls found.
Evidence snippet: same abstract excerpt as above (no field-specific evidence found).
Benchmarks / Datasets
Status: Missing (not extracted)
Confidence: Low (not found)
Note: No benchmark anchors detected.
Evidence snippet: same abstract excerpt as above (no field-specific evidence found).
Reported Metrics
Status: Missing (not extracted)
Confidence: Low (not found)
Note: No metric anchors detected.
Evidence snippet: same abstract excerpt as above (no field-specific evidence found).
Rater Population
Status: Missing (unknown)
Confidence: Low (not found)
Note: Rater source not explicitly reported.
Evidence snippet: same abstract excerpt as above (no field-specific evidence found).
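As a hedged illustration only, the sketch below shows one way the per-field signals listed in this section could be captured as structured records for downstream triage. The FieldSignal class, its attribute names, and the triage rule at the end are assumptions made for this example; they are not part of AITutor-EvalKit or any published schema.

```python
# Illustrative sketch (assumed schema, not an AITutor-EvalKit interface):
# one record per extracted field, mirroring the status / confidence /
# note / evidence structure used in this section.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldSignal:
    field: str                      # e.g. "Human Feedback Types"
    status: str                     # "explicit" | "partial" | "missing"
    value: str                      # extracted value or placeholder
    confidence: str                 # "Low" | "Medium" | "High"
    basis: str                      # e.g. "Direct evidence", "Not found"
    note: str                       # triage guidance
    evidence: Optional[str] = None  # supporting snippet, if any


ABSTRACT_SNIPPET = (
    "We present AITutor-EvalKit, an application that uses language technology "
    "to evaluate the pedagogical quality of AI tutors, provides software for "
    "demonstration and evaluation, as well as model inspection and data "
    "visualization."
)

records = [
    FieldSignal("Human Feedback Types", "partial", "Demonstrations",
                "Low", "Direct evidence",
                "Directly usable for protocol triage.", ABSTRACT_SNIPPET),
    FieldSignal("Evaluation Modes", "missing", "None explicit",
                "Low", "Not found",
                "Validate the evaluation design against the full paper text.",
                ABSTRACT_SNIPPET),
    FieldSignal("Quality Controls", "missing", "Not reported",
                "Low", "Not found",
                "No explicit quality controls found.", ABSTRACT_SNIPPET),
    FieldSignal("Benchmarks / Datasets", "missing", "Not extracted",
                "Low", "Not found",
                "No benchmark anchors detected.", ABSTRACT_SNIPPET),
    FieldSignal("Reported Metrics", "missing", "Not extracted",
                "Low", "Not found",
                "No metric anchors detected.", ABSTRACT_SNIPPET),
    FieldSignal("Rater Population", "missing", "Unknown",
                "Low", "Not found",
                "Rater source not explicitly reported.", ABSTRACT_SNIPPET),
]

# Simple triage rule (assumption): any field that is missing, or whose basis
# is "Not found", should be re-checked against the full paper rather than
# trusted from the metadata alone.
needs_full_paper = [r.field for r in records
                    if r.status == "missing" or r.basis == "Not found"]
print(needs_full_paper)
```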