🔗
Pearson Correlation
Linear agreement between predicted and expected scores
✓
Accuracy
% of predictions within ±0.5 points of the expected score
📏
Mean Absolute Error
Average prediction deviation
📊
R-Squared (R²)
Fraction of variance explained
🎓
Grade Agreement
% of predictions in the same letter grade as expected
📈
RMSE
Root mean squared error
📖 Metric Interpretations
🔗 Pearson Correlation Coefficient (-1 to 1)
What it measures: Linear relationship between
predicted and expected scores.
Interpretation: Values > 0.7 indicate strong
agreement with expected grading patterns.
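A minimal sketch of how this coefficient could be computed, using only the standard library. The function name `pearson` and the list-based inputs are assumptions for illustration; the dashboard's actual implementation is not shown in the source.

```python
import math


def pearson(predicted, expected):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(predicted) != len(expected) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(expected) / n
    # Sum of co-deviations and sums of squared deviations from each mean
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, expected))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in expected)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined when either list is constant")
    return cov / math.sqrt(var_p * var_e)
```

Note that a grader that is consistently 2 points too high, e.g. `pearson([3, 5, 7], [1, 3, 5])`, still yields a correlation of 1, so this metric is best read alongside an error metric such as MAE.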
✓ Accuracy (%)
What it measures: Percentage of predictions within
acceptable threshold (±0.5 points).
Calculation: (Correct Predictions / Total) × 100
Target: >85% accuracy for production deployment.
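The calculation above can be sketched as follows. The name `accuracy_pct` is hypothetical, and treating a difference of exactly 0.5 as correct (an inclusive threshold) is an assumption; the source does not say how the boundary is handled.

```python
def accuracy_pct(predicted, expected, threshold=0.5):
    """Percentage of predictions within `threshold` points of the expected score."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    # Assumption: the threshold is inclusive, so |error| == 0.5 counts as correct
    correct = sum(1 for p, e in zip(predicted, expected) if abs(p - e) <= threshold)
    return 100.0 * correct / len(predicted)
```

For example, `accuracy_pct([7.0, 5.2, 9.0, 3.0], [7.5, 5.0, 7.0, 3.1])` counts three of the four predictions as correct (only the 2-point miss fails), giving 75%.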
📏 Mean Absolute Error (MAE)
What it measures: Average absolute difference
between predicted and expected scores.
Unit: Points (0-10 scale)
Target: <0.5 points for high-quality grading.
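As an illustration, MAE could be computed like this. The function name `mean_absolute_error` is an assumption chosen for the sketch, not taken from the source.

```python
def mean_absolute_error(predicted, expected):
    """Average absolute difference between predicted and expected scores, in points."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    return sum(abs(p - e) for p, e in zip(predicted, expected)) / len(predicted)
```

With predictions `[8, 6, 4]` against expected scores `[7, 6, 6]`, the absolute errors are 1, 0, and 2, so the MAE is 1.0 point.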
📊 R-Squared (0 to 1)
What it measures: Proportion of variance in
expected scores explained by predictions.
Values: 0.5+ is acceptable, 0.7+ is strong, 0.9+ is
excellent.
Interpretation: R² = 0.85 means predictions explain
85% of the variance in expected scores.
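A minimal sketch of one common way to compute R², assuming the coefficient-of-determination form 1 - SS_res / SS_tot. The source does not state which formula the dashboard uses, and the name `r_squared` is hypothetical. Under this definition the value can fall below 0 when predictions are worse than always guessing the mean expected score.

```python
def r_squared(predicted, expected):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    mean_e = sum(expected) / len(expected)
    # Residual sum of squares: variation the predictions fail to explain
    ss_res = sum((e - p) ** 2 for p, e in zip(predicted, expected))
    # Total sum of squares: variation of expected scores around their mean
    ss_tot = sum((e - mean_e) ** 2 for e in expected)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all expected scores are equal")
    return 1.0 - ss_res / ss_tot
```

Perfect predictions give 1.0, and predicting the mean expected score for every item gives 0.0.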
🎓 Grade Agreement (%)
What it measures: Percentage of predictions that fall in
the same letter grade (A/B/C/D/F) as the expected score.
Grades: F(0-2), D(2-4), C(4-6), B(6-8), A(8-10)
Target: >90% for consistent grade assignment.
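The grade bands above share their boundary values (2, 4, 6, and 8 each appear in two bands), and the source does not say which grade a boundary score receives. The sketch below assumes the lower bound is inclusive, so 8.0 maps to A and 2.0 to D; the names `letter_grade` and `grade_agreement_pct` are hypothetical.

```python
def letter_grade(score):
    """Map a 0-10 score to a letter grade.

    Assumption: each band includes its lower bound (e.g. 8.0 is an A).
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be on the 0-10 scale")
    if score < 2:
        return "F"
    if score < 4:
        return "D"
    if score < 6:
        return "C"
    if score < 8:
        return "B"
    return "A"


def grade_agreement_pct(predicted, expected):
    """Percentage of predictions whose letter grade matches the expected grade."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    matches = sum(
        1 for p, e in zip(predicted, expected) if letter_grade(p) == letter_grade(e)
    )
    return 100.0 * matches / len(predicted)
```

Under this assumption, a prediction of 7.9 against an expected 8.0 is only 0.1 points off yet counts as a grade mismatch (B vs A), so grade agreement can be lower than accuracy when scores cluster near a boundary.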
📈 RMSE (Root Mean Squared Error)
What it measures: Square root of the average squared
difference between predicted and expected scores.
vs MAE: Errors are squared before averaging, so large
errors weigh more; RMSE is always at least as large as MAE.
Target: <0.6 points for robust grading.
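To make the outlier sensitivity concrete, here is a small sketch that computes RMSE and MAE side by side. Both function names are assumptions for illustration only.

```python
import math


def rmse(predicted, expected):
    """Root mean squared error between predicted and expected scores."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    mean_sq = sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(predicted)
    return math.sqrt(mean_sq)


def mae(predicted, expected):
    """Mean absolute error, included here only for comparison with RMSE."""
    if len(predicted) != len(expected) or not predicted:
        raise ValueError("need two non-empty, equal-length lists")
    return sum(abs(p - e) for p, e in zip(predicted, expected)) / len(predicted)
```

With three exact predictions and one 4-point miss (`[5, 5, 5, 5]` against `[5, 5, 5, 9]`), MAE is 1.0 while RMSE is 2.0: the single large error dominates the squared average.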