Jul 11, 2025

Latest Update, July 10, 2025

Legit or Just Lucky?

Confusion Matrix

This matrix is encouraging. Most key predictions land cleanly on the diagonal, meaning the model is getting most keys right. There’s some confusion between C and Eb, and a bit around Ab, but these keys are harmonically related and share several pitches. That kind of blur is expected in real-world audio and is not a red flag.
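To put a number on "share several pitches", here is a minimal sketch that counts the pitch classes two keys have in common and measures how far apart they sit on the circle of fifths. It assumes all three keys are major, which the labels above do not actually state, and the helper names are made up for illustration.

```python
# Assumption: keys are treated as major scales; the labels do not specify mode.
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)
PITCH_CLASS = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}


def major_scale(tonic):
    """Return the set of pitch classes (0-11) in the major scale on `tonic`."""
    root = PITCH_CLASS[tonic]
    return {(root + step) % 12 for step in MAJOR_STEPS}


def shared_pitches(key_a, key_b):
    """Count the pitch classes that two major keys have in common."""
    return len(major_scale(key_a) & major_scale(key_b))


def fifths_distance(key_a, key_b):
    """Number of steps between two tonics on the circle of fifths (0-6)."""
    # Multiplying a pitch class by 7 (mod 12) gives its position on the circle.
    pos_a = PITCH_CLASS[key_a] * 7 % 12
    pos_b = PITCH_CLASS[key_b] * 7 % 12
    diff = abs(pos_a - pos_b)
    return min(diff, 12 - diff)


for a, b in [("C", "Eb"), ("Eb", "Ab"), ("C", "Ab")]:
    print(f"{a}-{b}: {shared_pitches(a, b)} shared pitches, "
          f"{fifths_distance(a, b)} step(s) apart on the circle of fifths")
```

Under that major-key assumption, Eb and Ab are direct neighbours on the circle and share six of seven pitches, while C and Eb are three steps apart and share four.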

Importantly, we don’t see scattered misclassifications or random misfires. The errors are sensible and musically interpretable, which suggests the model has learned tonal structure rather than memorized noise.
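The same reading can be done without the plot. The sketch below lists the most frequent off-diagonal (true, predicted) pairs and the per-key recall. The labels are toy values invented for illustration, not results from this model, and the function names are hypothetical.

```python
from collections import Counter


def top_confusions(y_true, y_pred, k=3):
    """Return the k most frequent off-diagonal (true, predicted) pairs."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return pairs.most_common(k)


def per_class_recall(y_true, y_pred):
    """Fraction of each true key that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels, invented for illustration only (not this project's data).
y_true = ["C", "C", "C", "Eb", "Eb", "Ab", "Ab", "G"]
y_pred = ["C", "C", "Eb", "Eb", "C", "Ab", "Eb", "G"]

confusions = top_confusions(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)

for (true_key, pred_key), count in confusions:
    print(f"true {true_key} predicted as {pred_key}: {count}")
for key, value in recall.items():
    print(f"recall[{key}] = {value:.2f}")
```

If the errors are musically sensible, a handful of related-key pairs should account for most of the off-diagonal mass. Random misfires would instead spread the counts thinly across many pairs.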

Feature Importance

This bar chart shows the top 20 features driving XGBoost’s decisions. As expected, individual CQT bins contribute strongly, likely those centered on musically active registers. The presence of tonnetz_4 (a harmonic dimension) is a good sign that tonal structure is helping.

Even better: the new register-level summary features don’t dominate, but they make it into the top 20, which suggests they’re useful without overwhelming the model.
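One way to check the "useful without dominating" reading is to rank the importance scores and add them up per feature family. The sketch below assumes the scores are available as a name-to-value dict, the shape returned by XGBoost's `Booster.get_score`. The feature names (other than tonnetz_4) and all the numbers are invented for illustration.

```python
from collections import defaultdict


def top_features(scores, k=20):
    """Return the k highest-scoring (feature, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def share_by_family(scores):
    """Sum each family's share of total importance.

    Assumption: the family is the part of the name before the first
    underscore, e.g. "cqt_42" -> "cqt".
    """
    total = sum(scores.values())
    shares = defaultdict(float)
    for name, value in scores.items():
        family = name.split("_", 1)[0]
        shares[family] += value / total
    return dict(shares)


# Hypothetical importance scores, not values from the actual model.
scores = {
    "cqt_30": 4.0,
    "cqt_42": 3.0,
    "tonnetz_4": 2.0,
    "register_low_mean": 1.0,
}

ranked = top_features(scores, k=3)
shares = share_by_family(scores)

print(ranked)
print(shares)
```

A family that appears in the top 20 but holds only a modest share of the total is contributing without crowding out the others, which is the pattern described above.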

Conclusion

These plots don’t look random or noisy, and they show no obvious signs of overfitting. They look like what I’d hope to see from a frequency- and harmony-aware key recognition system, one grounded in both pitch content and musical logic.