Best Metrics for ERP-Based BCI Spelling Rate Accuracy

Okba Bekhelifi, Naoual El Djouher Mebtouche · July 2, 2026

Summary

This research identifies the most suitable metrics for evaluating spelling rate accuracy in Event-Related Potential (ERP)-based Brain-Computer Interfaces (BCIs), which often have imbalanced data. The study, using two datasets, favors the Brier score, MCC, ROC AUC, PR AUC, Average Precision, and partial AUC as best reflecting user spelling performance.

In Brain-Computer Interface (BCI) systems based on Event-Related Potentials (ERPs), the spelling rate (the number of correctly selected characters) is a more informative performance indicator than the loss or accuracy usually reported for predictive models, because it directly influences the information transfer rate (ITR) and overall spelling performance. ERP-based BCIs also typically produce imbalanced class distributions, which calls for metrics that handle such imbalance robustly, such as the area under the receiver operating characteristic curve (ROC AUC).

The study systematically investigates the correlation between the spelling rate and 13 performance metrics to determine which ones best reflect user spelling performance and how they are affected by trial repetition. It uses two datasets: the private LARESI ERP dataset and the public OpenBMI ERP dataset. The findings strongly suggest that the Brier score, the Matthews Correlation Coefficient (MCC), and several metrics designed for imbalanced binary classification, namely ROC AUC, area under the Precision-Recall curve (PR AUC), Average Precision (AP), and partial AUC (pAUC), are the most reliable indicators. Reporting these metrics is encouraged in future ERP-based BCI experiments.
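To make the recommended metrics concrete, here is a minimal pure-Python sketch of four of them (Brier score, MCC, ROC AUC, Average Precision) on made-up data. The single trial of 12 flashes with 2 targets, the scores, and the 0.5 decision threshold are illustrative assumptions, not values from the study, and the helper functions are hand-written stand-ins for what a metrics library would normally provide.

```python
import math

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient from the binary confusion matrix."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0 when the denominator is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, y_score):
    """Probability that a random target outscores a random non-target (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, y_score):
    """Mean precision at each rank where a target appears (assumes no tied scores)."""
    ranked = sorted(zip(y_score, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def accuracy(y_true, y_pred):
    """Fraction of flashes labelled correctly."""
    return sum(y == p for y, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical trial: 12 flashes, 2 of them targets (label 1).
y_true = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
# Made-up classifier probabilities of "target", one per flash, no ties.
y_prob = [0.90, 0.20, 0.10, 0.30, 0.60, 0.15, 0.40, 0.05, 0.35, 0.25, 0.12, 0.08]

y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold
all_negative = [0] * len(y_true)                 # trivial "never a target" baseline

print("accuracy (classifier):", accuracy(y_true, y_pred))
print("accuracy (baseline):  ", accuracy(y_true, all_negative))
print("MCC (classifier):     ", mcc(y_true, y_pred))
print("MCC (baseline):       ", mcc(y_true, all_negative))
print("Brier score:          ", brier_score(y_true, y_prob))
print("ROC AUC:              ", roc_auc(y_true, y_prob))
print("Average Precision:    ", average_precision(y_true, y_prob))
```

In this toy case the thresholded classifier and the trivial baseline reach the same accuracy, while MCC separates them, which illustrates why accuracy alone can mislead on imbalanced ERP data. For real experiments, scikit-learn offers `brier_score_loss`, `matthews_corrcoef`, `roc_auc_score` (with `max_fpr` for a partial AUC), and `average_precision_score`.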

Why it matters

For professionals developing or researching Brain-Computer Interfaces, selecting the correct performance metrics is crucial for accurately assessing system effectiveness, especially in applications like communication where spelling rate is paramount and data is often imbalanced.

How to implement this in your domain

1. Adopt the Brier score, Matthews Correlation Coefficient (MCC), ROC AUC, PR AUC, Average Precision, and partial AUC as primary evaluation metrics for ERP-based BCI systems.
2. Re-evaluate existing BCI models using these recommended metrics to gain a more accurate understanding of their true spelling performance.
3. Incorporate these metrics into the design and optimization phases of new ERP-based BCI algorithms, particularly when dealing with imbalanced datasets.
4. Standardize reporting of these metrics in research and development to facilitate better comparison and progress in the BCI field.
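Since the recommendation rests on how well a metric tracks the spelling rate, it helps to be explicit about how that rate can be computed and how it changes with trial repetition. The sketch below uses one simple, assumed scheme: each candidate character gets one classifier score per repetition, scores are summed over repetitions, and the highest total is selected. The data, the per-candidate scoring (real spellers often score row and column flashes instead), and the normalization of the spelling rate to a fraction of trials are all illustrative assumptions, not the study's procedure.

```python
def select_character(trial_scores, n_repetitions):
    """Sum each candidate's scores over the first n repetitions; return the best index."""
    n_candidates = len(trial_scores[0])
    totals = [
        sum(rep[c] for rep in trial_scores[:n_repetitions])
        for c in range(n_candidates)
    ]
    return max(range(n_candidates), key=totals.__getitem__)

def spelling_rate(all_scores, true_chars, n_repetitions):
    """Fraction of trials whose selected character matches the intended one."""
    correct = sum(
        select_character(scores, n_repetitions) == target
        for scores, target in zip(all_scores, true_chars)
    )
    return correct / len(true_chars)

# Made-up scores: 3 trials x 3 repetitions x 4 candidate characters.
all_scores = [
    [[0.2, 0.5, 0.4, 0.1], [0.1, 0.2, 0.6, 0.3], [0.3, 0.1, 0.7, 0.2]],
    [[0.8, 0.1, 0.2, 0.3], [0.6, 0.2, 0.1, 0.4], [0.7, 0.3, 0.2, 0.1]],
    [[0.4, 0.3, 0.2, 0.35], [0.5, 0.2, 0.1, 0.3], [0.1, 0.2, 0.3, 0.9]],
]
true_chars = [2, 0, 3]  # index of the intended character in each trial

rates = [spelling_rate(all_scores, true_chars, n) for n in (1, 2, 3)]
for n, rate in zip((1, 2, 3), rates):
    print(f"{n} repetition(s): spelling rate = {rate:.2f}")
```

With these toy scores the spelling rate rises as repetitions are accumulated. Computing a candidate metric at each repetition count alongside this rate is one way to check how closely the two move together.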

Who benefits

Healthcare, Medical Devices, Assistive Technology, Neuroscience Research, AI/ML Development

Key takeaways

Spelling rate is the most critical metric for ERP-based BCI performance.
Imbalanced data in BCIs requires specific metrics like ROC AUC and PR AUC.
Brier score, MCC, ROC AUC, PR AUC, AP, and pAUC are recommended for BCI evaluation.
These metrics provide a more accurate reflection of user spelling performance.

Original post by Okba Bekhelifi, Naoual El Djouher Mebtouche

"arXiv:2607.00794v1 Announce Type: new Abstract: For predictive models, the often-reported performance metrics are the loss and accuracy. In synchronous Brain-Computer Interface (BCI) systems, these metrics are informative for most BCI paradigms; however, for Event-Related Potent…"

Originally posted by Okba Bekhelifi, Naoual El Djouher Mebtouche on X
