Following approval by the U.S. Food and Drug Administration (FDA), the first wave of wearable digital health monitors is now on the market, integrated into consumer products such as smartwatches. Continuous, rapid advances in medical sensor technology have made it possible to build small, inexpensive, and increasingly accurate physiological sensors into existing wearable devices.
Cutting-edge machine learning and artificial intelligence algorithms are among the driving forces behind this transformation. They can extract and interpret valuable information from massive amounts of data. Such data is often noisy and imperfect (ECG data from smartwatches, for example) and is corrupted by various artifacts. Traditional algorithms are typically rule-based and deterministic, so they struggle to process this type of data.
Until recently, it was very difficult, and sometimes impossible, to unravel the physiological signals coming from these sensors and make decisions accurate enough to be accepted by regulatory agencies. Advances in machine learning and artificial intelligence algorithms are enabling engineers and scientists to overcome many of these challenges.
In this article, we take a closer look at the overall architecture of physiological signal processing algorithms and demystify the computation behind them, showing that it is practical engineering built on decades of research.
The development of machine learning algorithms mainly includes two steps (Figure 1).
The first step is feature engineering, which extracts specific numerical/mathematical features from the corresponding data set.
The second step is to feed the extracted features into a well-known statistical classification or regression algorithm, such as a support vector machine or a suitably configured traditional neural network. The model is trained iteratively on a properly labeled data set. Once it reaches satisfactory accuracy, the trained model can be used as a prediction engine on new data in a production environment.
Figure 1. A typical machine learning workflow includes training and testing phases.
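The two steps described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the recordings are synthetic beat-to-beat (RR) intervals rather than real ECG data, the labels "normal" and "irregular" are hypothetical, the three features are just common heart-rate-variability statistics, and a simple k-nearest-neighbor classifier stands in for the support vector machine or neural network mentioned in the text.

```python
import math
import random
from collections import Counter

def extract_features(rr_intervals):
    """Step 1 (feature engineering): summarize RR intervals (seconds) as numbers."""
    n = len(rr_intervals)
    mean_rr = sum(rr_intervals) / n
    std_rr = math.sqrt(sum((r - mean_rr) ** 2 for r in rr_intervals) / n)
    diffs = [b - a for a, b in zip(rr_intervals, rr_intervals[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return (mean_rr, std_rr, rmssd)

def knn_predict(train_features, train_labels, query, k=3):
    """Step 2 (classification): majority vote among the k nearest training points."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def synthetic_recording(label, rng, beats=30):
    """Hypothetical data generator: regular vs. irregular beat timing."""
    jitter = 0.02 if label == "normal" else 0.20
    return [max(0.3, rng.gauss(0.8, jitter)) for _ in range(beats)]

# Build a small labeled data set (synthetic, for illustration only).
rng = random.Random(0)
labels = ["normal", "irregular"] * 20
recordings = [synthetic_recording(lab, rng) for lab in labels]
features = [extract_features(r) for r in recordings]

# Training phase uses the first 30 recordings; testing uses the held-out 10.
train_x, train_y = features[:30], labels[:30]
test_x, test_y = features[30:], labels[30:]
predictions = [knn_predict(train_x, train_y, x) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the separation of concerns: the classifier never sees raw samples, only the feature vectors, so the two steps can be improved independently.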
So, how is this workflow implemented for the classification of ECG signals?
In this example, we use the 2017 PhysioNet Challenge dataset, which consists of real single-lead ECG recordings. The goal is to classify each patient’s ECG signal into one of four categories: normal, atrial fibrillation, other rhythm, and too noisy to classify.
Figure 2 shows the complete MATLAB workflow for tackling this problem.
Figure 2. MATLAB’s workflow for developing machine learning algorithms for ECG signal classification.
Preprocessing and feature engineering
Feature engineering is arguably the hardest part of developing a robust machine learning algorithm. This kind of problem cannot be treated as a pure “data science” problem: exploring solutions requires domain knowledge in biomedical engineering and an understanding of the different types of physiological signals and data.
Tools such as MATLAB put data analysis and advanced machine learning capabilities in the hands of domain experts, making it easier for them to apply “data science” techniques to the problems they are solving so that they can concentrate on feature engineering. In this example, we use advanced wavelet techniques to remove noise and slow-moving trends, such as breathing artifacts, from the signals and to extract the features of interest.
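As a rough illustration of wavelet-based trend removal, the Python sketch below decomposes a signal with a multi-level Haar wavelet transform, zeroes the coarsest approximation coefficients (where slow drift lives), and reconstructs the rest. This is an assumption-laden toy, not the article's MATLAB method: the Haar wavelet, the function names, the number of levels, and the synthetic test signal are all chosen here for simplicity, and a real ECG pipeline would likely use a smoother wavelet and a level matched to the sampling rate.

```python
import math

SQRT2 = math.sqrt(2)

def haar_decompose(signal, levels):
    """Multi-level Haar DWT: returns (approximation, details with finest first)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2 != 0:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details

def haar_reconstruct(approx, details):
    """Inverse of haar_decompose: rebuild from coarsest level to finest."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        current = rebuilt
    return current

def remove_baseline(signal, levels):
    """Drop the coarse approximation so slow trends vanish; keep all details."""
    approx, details = haar_decompose(signal, levels)
    return haar_reconstruct([0.0] * len(approx), details)

# Synthetic example (assumed): a fast alternating component riding on a slow drift.
n = 64
fast = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
drift = [5.0 + 0.05 * i for i in range(n)]
noisy = [f + d for f, d in zip(fast, drift)]
clean = remove_baseline(noisy, levels=4)
print(f"mean before: {sum(noisy) / n:.3f}, mean after: {sum(clean) / n:.3f}")
```

With four levels, the mean of every 16-sample block is removed, which is why the offset and most of the drift disappear while the fast component survives.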
Developing the classification model
The Classification Learner app in Statistics and Machine Learning Toolbox is a particularly effective starting point for engineers and scientists who are new to machine learning.
Once enough useful and relevant features have been extracted from the signals, we can use this app to quickly explore a range of classifiers and their performance, narrowing down the candidate models for further optimization. These classifiers include decision trees, random forests, support vector machines, and k-nearest neighbors (KNN). You can try each one and choose the approach that gives the best classification performance for the feature set, usually evaluated with metrics such as the confusion matrix or the area under the ROC curve (AUC). In this example, that approach alone quickly yielded an overall accuracy of about 80% across all categories (the winning entry in the competition scored about 83%). Note that we did not spend much time on feature engineering or classifier tuning, because the focus was on validating the approach.
In general, time spent on feature engineering and classifier tuning can significantly improve classification accuracy. More advanced techniques such as deep learning can also be applied to this kind of problem. With deep learning, the feature engineering, feature extraction, and classification steps are combined into a single training step. Compared with traditional machine learning techniques, however, this approach usually requires far more training data to achieve the desired results.
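To make the evaluation metrics mentioned above concrete, here is a small Python sketch that builds a confusion matrix and derives overall accuracy and per-class recall from it. The ten label pairs are invented purely for illustration and are not results from the article or the challenge; the four class names are placeholders for the four categories.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_recall(matrix):
    """For each true class, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Made-up labels for illustration only (placeholder class names).
classes = ["normal", "af", "other", "noisy"]
y_true = ["normal", "normal", "af", "af", "other",
          "other", "noisy", "noisy", "normal", "other"]
y_pred = ["normal", "normal", "af", "other", "other",
          "normal", "noisy", "noisy", "normal", "other"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:>6}: {row}")
print("overall accuracy:", overall_accuracy(cm))
print("per-class recall:", per_class_recall(cm))
```

Looking at per-class recall alongside overall accuracy matters when the classes are imbalanced, since a single accuracy figure can hide poor performance on a rare class.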
Challenges, regulations and commitment to the future
Although many common wearable devices cannot yet fully replace their FDA-approved, clinically validated counterparts, technology and consumer trends are clearly pointing in that direction. The FDA has begun to play an active role on several fronts, such as simplifying regulations, encouraging the development of regulatory science, and promoting modeling and simulation in device development through initiatives such as the “Digital Health Software Pre-certification Program”.
The hope is that physiological signals collected from everyday wearable devices will become a new kind of digital biomarker that gives a full picture of our health. Today this vision is closer to reality than ever, thanks in large part to advances in signal processing, machine learning, and deep learning algorithms. Workflows supported by tools such as MATLAB enable medical device experts to adopt data science techniques such as machine learning without having to become data scientists.