Comprehensive clinical application analysis of artificial intelligence-enabled electrocardiograms for screening multiple valvular heart diseases

Background: Valvular heart disease (VHD) is increasingly important to detect early in order to manage the risk of future complications. Electrocardiographic (ECG) changes may be related to multiple VHDs, and artificial intelligence (AI)-enabled ECG has been able to detect some VHDs. We aimed to develop five deep learning models (DLMs) to identify aortic stenosis, aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation. Methods: Between 2010 and 2021, 77,047 patients with echocardiography and 12-lead ECG performed within 7 days were identified from an academic medical center for DLM development (122,728 ECGs) and internal validation (7,637 ECGs). An additional 11,800 patients from a community hospital were identified for external validation. The ECGs were classified as with or without moderate-to-severe VHDs according to transthoracic echocardiography (TTE) records, and we also collected other echocardiographic data and follow-up TTE records to identify new-onset valvular heart diseases. Results: AI-ECG adjusted for age and sex achieved areas under the curve (AUCs) of >0.84, >0.80, >0.77, >0.83, and >0.81 for detecting aortic stenosis, aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation, respectively. Since the predictions of each DLM shared similar components of ECG rhythms, the positive findings of each DLM were highly correlated with other valvular heart diseases. Of note, a total of 37.5–51.7% of false-positive predictions had at least one significant echocardiographic finding, which may lead to a significantly higher risk of future moderate-to-severe VHDs in patients with initially minimal-to-mild VHDs. Conclusion: AI-ECG may be used as a large-scale screening tool for detecting VHDs and as a basis for referral to echocardiography.


INTRODUCTION
The direction of cardiac blood flow from one area to another is governed by the heart valves, including the aortic valve, pulmonary valve, tricuspid valve, and mitral valve. In the United States, 2.5% of patients suffered from moderate or severe valvular heart disease [1], and more than half were asymptomatic. Moreover, aortic stenosis was also a significant valvular heart disease and was present in 0.4% of patients [1], and valvular heart diseases were more common in elderly individuals. Severe valvular heart disease may lead to heart failure and sudden death [2], and immediate intervention is necessary to manage the risk of complications [3]. Currently, most asymptomatic patients with valvular heart disease are identified by advanced health examinations, including echocardiography. Because echocardiography is expensive and requires specialists, it cannot be used as a wide screening tool, and a universally available alternative to screen for potential valvular diseases is needed.
Since valvular heart diseases are related to ventricular hypertrophy, atrial enlargement, atrial fibrillation, atrial premature complex, and ventricular premature complex, changes in electrocardiography (ECG) have been observed in patients with those conditions. With the revolution of deep learning models (DLMs), artificial intelligence (AI)-enabled ECG may extract subtle rhythm abnormalities beyond those extracted by human experts to identify diverse cardiac diseases [4]. Previous studies have already developed DLMs to identify left ventricular hypertrophy [5], left atrial enlargement [6], and arrhythmia [7] using large annotated databases. We hypothesized that AI-ECG would allow for the detection of valvular diseases in individuals with at least some cardiac structural or rhythm changes.
Previous studies have developed DLMs for detecting aortic stenosis with an AUC >0.86 using 12-lead ECG and demography [8,9]; aortic regurgitation with an AUC >0.80 using 12-lead ECG and demography [10]; and mitral regurgitation with an AUC >0.81 using 12-lead ECG [11]. However, the low positive predictive value, which may cause anxiety and inconvenience for patients, was the major concern for direct application of these DLMs in clinical practice. Previous research showed that false-positive AI-ECG predictions of left ventricular dysfunction (LVD) carried a more than 4-fold risk of new-onset LVD [4], and false-positive cases might similarly be considered at high risk of new-onset aortic stenosis [8] and mitral regurgitation [11]. However, the mechanism of this phenomenon is still unclear, leading to a lack of strategies for intervention. Moreover, a growing number of DLMs for detecting more valvular diseases in one AI-ECG report may lead to confusion. A comprehensive clinical application analysis that simultaneously considers multiple valvular heart diseases should be conducted before using AI-ECG in real-world clinical practice.
This study has three objectives: (i) to extensively explore the ability of AI-ECG to detect more valvular heart diseases; (ii) to develop a strategy to interpret the AI-ECG results with multiple predictions for further intervention recommendations; and (iii) to assess the prognostic performance of AI-ECG in individuals without significant valvular heart diseases.

RESULTS
Table 1 shows the patient characteristics in the development set, tuning set, internal validation set, and external validation set. The prevalence rates of moderate-to-severe aortic stenosis, aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation in the internal/external validation sets were 0.6%/0.9%, 4.8%/5.5%, 1.9%/2.1%, 12.6%/13.6%, and 10.8%/11.1%, respectively. In summary, the patients in the external validation set were older and had more comorbidities than those in the internal validation set.
The algorithms performed well in identifying each valvular disease in the validation datasets (Table 2). The DLM using ECG alone achieved AUCs of 0.768-0.847 in internal validation and 0.763-0.827 in external validation. The AUCs were 0.002 to 0.034 higher when age and sex were added. Using the operating point with equal sensitivity and specificity for the integration of age, sex, and ECG, Figure 1 shows sensitivities of 63.1-71.4% and specificities of 84.3-85.5% for detecting moderate-to-severe aortic stenosis; sensitivities of 58.1-63.5% and specificities of 79.1-82.4% for detecting aortic regurgitation; sensitivities of 54.6-58.2% and specificities of 82.0-82.9% for detecting pulmonary regurgitation; sensitivities of 62.2-63.3% and specificities of 85.3-86.6% for detecting tricuspid regurgitation; and sensitivities of 59.5-63.5% and specificities of 84.1-85.5% for detecting mitral regurgitation. Compared with the near-perfect negative predictive values of more than 94.3% in each analysis, the relatively low positive predictive values, ranging from 3.1% to 40.6%, may be the major concern in AI-ECG for screening valvular diseases. Table 3 presents the analysis results for different levels of disease severity. We found that AI-ECG exhibited higher sensitivities in detecting severe than moderate cases of the 5 valvular diseases, with differences ranging from 5.2% to 31.4%. After incorporating sex and age information, the differences in sensitivity ranged from −5.6% to 47.4%. All AUCs for detecting severe cases were higher than those for detecting moderate cases, indicating that AI-ECG is less likely to miss severe cases. The results of the stratified analysis based on demography and disease history are presented in Table 4. We found that AI-ECG exhibited a reduction of over 10% in AUC for detecting aortic stenosis in individuals with a history of chronic kidney disease (CKD) and atrial fibrillation (Afib). Additionally, there was a 10% decrease in AUC for aortic regurgitation and mitral regurgitation in patients over 65 years old. This stratified analysis reveals limitations in the application of AI-ECG for elderly patients with those histories.
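The gap between the high negative predictive values and the low positive predictive values follows largely from the low prevalence of each disease. The sketch below is a minimal illustration, not the study's analysis code: it applies Bayes' rule to round illustrative values (sensitivity 0.70, specificity 0.85, prevalence 0.006, chosen to sit near the aortic stenosis figures reported above). The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    # Expected fractions of the screened population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative round numbers (assumptions), not values copied from Table 2.
ppv, npv = predictive_values(sensitivity=0.70, specificity=0.85, prevalence=0.006)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With a prevalence below 1%, false positives from the large disease-free group outnumber true positives even at 85% specificity, so the PPV stays in the low single digits while the NPV remains close to 100%.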
Figure 2A shows the relationship between ECG-screened valvular diseases and ECG rhythms. Positive ECGs had a lower prevalence of sinus rhythm and a higher prevalence of atrial fibrillation/flutter, atrioventricular block, left bundle branch block, right bundle branch block, left atrial enlargement, left ventricular hypertrophy, prolonged QT interval, atrial premature complex, and ventricular premature complex than those classified as negative by each DLM. High consistency in the rhythm difference in the DLM for detecting each valvular disease revealed similar components to identify positive ECGs, which implies that these positive predictions may be related to other valvular diseases. Figure 2B validated the hypothesis that predictions between DLMs for detecting each valvular disease were highly correlated (ranging from 0.584 to 0.836), although the correlation was low in actual valvular diseases (ranging from 0.057 to 0.486). Moreover, more abnormal ECG rhythms in positive cases may also be related to other cardiac comorbidities and complications.
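The correlations in Figure 2B are Spearman coefficients, that is, Pearson correlations computed on ranks. A minimal pure-Python sketch of that statistic is given below for two hypothetical vectors of DLM outputs; the helper names `average_ranks` and `spearman` and the toy scores are assumptions for illustration only.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical predictions from two DLMs on the same four ECGs.
scores_model_a = [0.10, 0.40, 0.35, 0.80]
scores_model_b = [0.20, 0.30, 0.50, 0.90]
rho = spearman(scores_model_a, scores_model_b)
print(round(rho, 3))
```

Because the statistic depends only on ranks, it is insensitive to how each DLM's sigmoid output is scaled, which makes it a reasonable choice for comparing predictions across models.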
We further analyzed echocardiographic abnormalities in false-positive cases compared to the true-negative cases in Figure 3. For the patients without aortic stenosis in the internal validation set, the false-positive cases had a 2.9- to 3.9-fold risk of presenting other valvular diseases, and 41.5% of false-positive cases had at least 1 other valvular disease. Moreover, these false-positive cases also presented worse cardiac function and more anomalies: an 8.7-fold risk of low ejection fraction, a 4.7-fold risk of high pulmonary artery systolic pressure, a 4.4-fold risk of left atrial enlargement, a 5.5-fold risk of larger left ventricular end-diastolic diameter, and a 3.2-fold risk of significant pericardial effusion. In summary, more than 50% of false-positive cases presented at least 1 significant echocardiographic abnormality, and this phenomenon was also confirmed in external validation. Similar trends were shown in all false-positive ECGs by the DLMs for detecting aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation. False-positive cases had a higher risk of every kind of echocardiographic abnormality, and more than 37.5% of them had at least 1 significant echocardiographic abnormality, which revealed the importance of conducting echocardiography for AI-identified positive cases.
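The fold risks above are relative risks, i.e. the ratio of the prevalence of an abnormality in AI-positive cases to that in AI-negative cases. The sketch below shows one common way to compute such a ratio with a 95% confidence interval, using the log-normal (Katz) approximation; the study does not state which interval method was used, so this choice is an assumption, and the counts are hypothetical.

```python
import math


def relative_risk(events_pos, n_pos, events_neg, n_neg, z=1.96):
    """Relative risk p_pos / p_neg with a log-normal (Katz) confidence interval."""
    p_pos = events_pos / n_pos
    p_neg = events_neg / n_neg
    rr = p_pos / p_neg
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / events_pos - 1 / n_pos + 1 / events_neg - 1 / n_neg)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 40 of 100 AI-positive vs 10 of 100 AI-negative patients
# show a given echocardiographic abnormality.
rr, lower, upper = relative_risk(40, 100, 10, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval whose lower bound stays above 1 indicates that the abnormality is more prevalent in AI-positive cases than in AI-negative cases at the chosen confidence level.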
We followed more than 3,300/4,300 initially echo-negative patients with ≥2 ECG-TTE pairs in the internal/external validation sets (Figure 4). For patients in the internal validation set without corresponding valvular diseases initially and more than 9 years of follow-up, the cumulative incidence rates in positive/negative cases for new-onset aortic stenosis, aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation were 4.6%/

DISCUSSION
This was the first study to simultaneously develop DLMs for detecting multiple valvular diseases, and we found that these DLMs shared similar components to establish predictions. Therefore, they should be considered for integration into a single report for AI-ECG analysis. The sum of the positive predictive value and the proportion of cases with at least 1 significant echocardiographic finding was more than 50% in each DLM, which revealed the importance of an echocardiographic examination in those with any positive prediction by AI-ECG. Moreover, the significantly high risk of new-onset valvular diseases also reminds physicians to intensively monitor progression. These results emphasize the importance of AI-ECG as an initial screening test for managing valvular diseases.
This study achieved state-of-the-art performance in detecting aortic stenosis [8,9], aortic regurgitation [10], and mitral regurgitation [11] compared to recently published retrospective studies using the same conditions. Moreover, we also demonstrated the feasibility of using AI-ECG for detecting pulmonary regurgitation (AUC >0.77) and tricuspid regurgitation (AUC >0.83), and none of the DLMs performed worse than screening tests already implemented on a large scale, such as breast cancer screening (AUC = 0.78) [12] and fecal occult blood tests (AUC = 0.71) [13]. Since ECG is inexpensive, ubiquitous, and commonly used, asymptomatic valvular diseases may be detected early by AI-ECG with acceptable accuracy. In current clinical practice, patients usually come under management for valvular heart disease only when they become symptomatic. However, the symptoms are subjective, and a lack of symptoms is not benign. For example, sudden death without preceding symptoms occurred in 4.1% of patients with aortic stenosis [14]. Since the long-term results of prompt intervention in the asymptomatic stage are excellent [15], AI-ECG may become popular for application in the early diagnosis of potential valvular diseases.
Our results emphasized that false-positive predictions of valvular diseases by AI-ECG may be related to other echocardiographic abnormalities, and the predictions of each DLM were highly correlated. This correlation may not stem from the original relationship between the valvular diseases themselves, but rather from the similar ECG presentation of each valvular disease. The ECG findings may be nonspecific in valvular diseases. Due to the chronic pressure overload of the left ventricle, left ventricular hypertrophy secondary to aortic stenosis may be present on ECG [16]. Moreover, signs of left ventricular hypertrophy were also observed in aortic regurgitation and mitral regurgitation [17]. Right ventricular hypertrophy was also demonstrated on ECG in patients with tricuspid regurgitation and pulmonic regurgitation [18]. Moreover, because chronic valvular diseases are usually accompanied by cardiomyopathy, low ejection fraction, or heart conduction abnormalities [19], AI-ECG might use this information to construct valvular disease predictions. AI-ECG has already been validated to accurately detect echocardiographic abnormalities, such as left atrial enlargement [6], low ejection fraction [20], high left ventricle diameter [21], and pulmonary hypertension [22]. Since the appropriate features may be learned by a DLM on the basis of data rather than manual engineering, our AI-ECG maximally discovered the indirect relationship between ECG and valvular diseases. We can utilize this indirect relationship to identify patients with worse cardiac function and abnormal heart structure, and an echocardiography examination for positive cases may be cost-effective.
Previous studies have shown that AI-ECG has the ability to identify disease predictors [4]. This phenomenon was also observed in DLMs for detecting aortic stenosis [8] and mitral regurgitation [11], and this study validated those findings and expanded the use of DLMs to more valvular diseases. We also further recognized that the positive predictions of our AI-ECG were based on a series of ECG changes, and patients with abnormal ECGs tended to have a higher risk of future cardiovascular events. A previous study mentioned that false-positive predictions of dyskalemia were formed by the combination of abnormal ECG rhythms, which led to a higher risk of mortality and hospitalization [23]. Therefore, even without considering the underlying heart structural changes implied by the abnormal ECG, these ECG rhythms might also need further intervention. For example, atrial fibrillation was associated with an increased risk of stroke and should be treated [24]. Moreover, left atrial enlargement has been found to be an independent risk factor for new-onset mitral regurgitation [25], and left atrial enlargement-related rhythms were also recognized to be related to stroke and prehypertension. Considering that screen-detected atrial fibrillation might have the same results as incidentally detected atrial fibrillation with regard to reducing the risk of stroke and death [26], physicians should attach importance to positive AI-ECG predictions to provide active management and not limit the use of AI-ECG to only identifying new-onset valvular diseases.
Figure 3 (legend, continued). The plots display the abnormal prevalence in the two groups, including positive and negative findings based on ECG. The ≥1 of valvular diseases was defined as at least 1 moderate-to-severe valvular disease, and the ≥1 of significant findings was defined as at least 1 abnormal echocardiographic finding. The relative risk (RR) was calculated as (pAI-positive/pAI-negative) and is presented with the associated 95% confidence interval.
One of the most potentially impactful applications of AI-ECG in valvular heart diseases is opportunistic screening, a concept that primarily originates from radiology. This refers to instances where patients occasionally benefit from radiologic imaging tests conducted for other reasons, thereby discovering potential signs of illness [27]. A previous cost-effectiveness study found advantages of using AI-ECG for opportunistic screening of asymptomatic left ventricular dysfunction [28], and it is plausible that valvular heart diseases could also be suitable for opportunistic screening.
Considering the daily performance of up to three million ECG examinations worldwide [29], reanalyzing these already conducted tests with an AI model could potentially reduce the cost of screening for valvular heart diseases. Therefore, hospitals or clinics that routinely perform many ECG examinations should consider implementing AI-ECG to identify patients with valvular heart diseases, which may also contribute to delivering a higher standard of patient care.
Several limitations in this study should be mentioned. First, moderate and severe mitral stenosis was also present in more than 0.1% of patients. However, our echocardiographic database did not record it in a structured format. Second, echocardiography was only conducted in patients with evidence of cardiovascular diseases in current clinical practice, and the prevalence of echocardiographic abnormalities in this study might be overestimated compared to the prevalence in asymptomatic people. However, this may not matter because positive ECGs still presented a higher risk of cardiovascular diseases and progression than negative ECGs. Third, some important clinical information, such as cardiac murmur, is lacking in this large-scale study due to the difficulty of reviewing all medical records. The performance of AI-ECG in asymptomatic people was unclear. Finally, there was still no clinical impact analysis of how many people may benefit from this screening system. An additional randomized controlled trial of our AI-ECG is currently planned.

CONCLUSION
The AI-enabled 12-lead ECG may become a powerful screening tool for the detection of patients with moderate to severe valvular diseases. According to the existing evidence, an additional echocardiography examination for patients with a positive prediction by AI-ECG may be important. An appropriate treatment should be initiated once the true positive finding is validated, and unexpected significant echocardiographic findings may also remind physicians to manage the potential risk of cardiovascular diseases. The higher risk of new-onset valvular diseases should also be emphasized in patients with positive AI-ECG results to manage the related adverse events.

Data source and population
This research was a retrospective study with ethics approval by the institutional review board without individual consent in the Tri-Service General Hospital, Taipei, Taiwan (IRB No. C202105049). Two separate institutions in the Tri-Service General Hospital system provided research data from Jan 2010 to Sep 2021. An academic medical center (Nei-Hu General Hospital) was named hospital A, and a community general hospital (Ting-Zhou Branch Hospital) was named hospital B in this study. Patients who had at least one ECG and echocardiography examination within 7 days were included.
Figure 5 shows the generation process of the development, tuning, and validation sets. There were 77,047 patients with ECGs and corresponding transthoracic echocardiography (TTE) annotations in this study period from hospital A. The 77,047 patients were divided into three groups: 102,085 ECG records from 61,734 patients as the development set, 20,643 ECG records from 7,676 patients as the tuning set, and 7,637 ECG records from 7,637 patients as the internal validation set. Importantly, we only used the first ECG in the validation sets to avoid patient dependency. The external validation set included 11,800 ECGs from 11,800 patients from hospital B. These validation sets were used to validate the performance of the DLM for predicting valvular diseases.
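The two properties described above, assignment of each patient to exactly one set and use of only the first ECG per patient in validation, can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(patient_id, ecg_time, ecg_id)`, the split fractions, the random seed, and the function name `split_by_patient` are all hypothetical and do not reproduce the study's actual allocation.

```python
import random
from collections import defaultdict


def split_by_patient(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split ECG records at the patient level so no patient spans two sets."""
    by_patient = defaultdict(list)
    for record in records:
        by_patient[record[0]].append(record)

    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)  # fixed seed for repeatability
    n_dev = int(fractions[0] * len(patient_ids))
    n_tune = int(fractions[1] * len(patient_ids))
    groups = {
        "development": patient_ids[:n_dev],
        "tuning": patient_ids[n_dev:n_dev + n_tune],
        "validation": patient_ids[n_dev + n_tune:],
    }

    sets = {}
    for name, ids in groups.items():
        rows = []
        for pid in ids:
            ecgs = sorted(by_patient[pid], key=lambda r: r[1])
            # Validation keeps only the earliest ECG of each patient.
            rows.extend(ecgs[:1] if name == "validation" else ecgs)
        sets[name] = rows
    return sets


# Toy data: 20 hypothetical patients with 1 to 3 ECGs each.
records = [
    (f"P{p:02d}", visit, f"P{p:02d}-E{visit}")
    for p in range(20)
    for visit in range(1 + p % 3)
]
sets = split_by_patient(records)
print({name: len(rows) for name, rows in sets.items()})
```

Splitting on patient identifiers rather than on ECG records is what prevents repeated recordings of the same person from appearing in both training and evaluation data.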

Electrocardiographic signal
All 12-lead ECGs were recorded at the time of the acquisition in a Philips system®. There were 5,000 voltage-time trace signals for each lead (500 Hz sampling frequency for 10 seconds) to establish a 12 by 5,000 matrix as the DLM input. The Philips system® also provided an automatic analysis for each ECG, and the statements were extracted on the basis of the following key phrases: sinus rhythm, atrial fibrillation/flutter, atrioventricular block, left bundle branch block, right bundle branch block, left atrial enlargement, left ventricular hypertrophy, prolonged QT interval, atrial premature complex, and ventricular premature complex.
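The input dimensions follow directly from the acquisition settings: 500 samples per second for 10 seconds gives 5,000 samples per lead, stacked over 12 leads. The sketch below assembles and checks such a matrix in plain Python; the lead ordering, the dictionary input format, and the function name `build_input_matrix` are assumptions, since the text does not specify them.

```python
SAMPLING_RATE_HZ = 500
DURATION_S = 10
SAMPLES_PER_LEAD = SAMPLING_RATE_HZ * DURATION_S  # 500 Hz x 10 s = 5,000

# Standard 12-lead names; this particular row order is an assumption.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def build_input_matrix(traces):
    """Stack per-lead voltage traces into a 12 x 5,000 list-of-lists matrix."""
    matrix = []
    for lead in LEADS:
        samples = traces[lead]
        if len(samples) != SAMPLES_PER_LEAD:
            raise ValueError(
                f"lead {lead}: expected {SAMPLES_PER_LEAD} samples, got {len(samples)}"
            )
        matrix.append(list(samples))
    return matrix


# Placeholder flat-line recording used only to exercise the shape check.
flat_recording = {lead: [0.0] * SAMPLES_PER_LEAD for lead in LEADS}
matrix = build_input_matrix(flat_recording)
print(len(matrix), len(matrix[0]))
```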

Present and new-onset valvular heart diseases
Comprehensive 2D echocardiograms were recorded at the date of the acquisition in a Philips image system®. TTE data were used to grade patients with minimal, mild, moderate, and severe valvular diseases using published guidelines [30]. The definition of moderate aortic stenosis was a jet velocity of 3.0-4.0 m/s, a mean gradient of 20-49 mmHg, or an aortic valve area (AVA) of 1.1-1.5 cm², and severe aortic stenosis was defined as a jet velocity ≥4.0 m/s, a mean gradient ≥40 mmHg, a DVI ≤0.
We followed patients with 2 or more TTE examinations for AI-ECG previvor analysis. For each analysis, only patients with an initially minimal-to-mild corresponding valvular disease were used. This did not ensure that the included patients were completely normal; they may still have presented certain echocardiographic abnormalities, including other valvular diseases. The follow-up periods started from the index TTE date and ended at the corresponding events or the end of this study. Moreover, the follow-up data were censored at the last known TTE examination to limit bias from incomplete records.

Deep learning and integration model
The DLMs for detecting each valvular disease shared the same architecture, which was developed previously [32]. In summary, it is a convolutional neural network using the 12 by 5,000 matrix of raw ECG signals as the input, and the output was a sigmoid output ranging from 0 to 1 to describe the binary outcomes. The training details were also described previously. A batch size of 32, a weight decay of 10⁻⁴, and an initial learning rate of 0.001 with the Adam optimizer and standard hyperparameters were used; the learning rate decayed by a factor of ten each time the loss on the tuning set plateaued after an epoch. Early stopping was performed by saving the network after every epoch and selecting the saved network with the lowest loss on the tuning set. In this study, we trained 5 DLMs for detecting moderate-to-severe aortic stenosis, aortic regurgitation, pulmonary regurgitation, tricuspid regurgitation, and mitral regurgitation. To add age and sex to enhance the DLM performance, we used an XGB model to integrate the DLM prediction with these variables, following the training process described previously [33]. We only performed prediction once using the best model based on the tuning set in the internal and external validation sets.
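The learning-rate decay and checkpoint selection described above can be sketched without any deep learning framework by replaying a sequence of tuning-set losses. This is a simplified illustration, not the study's training code: a plateau is taken here to mean "no improvement over the best tuning loss so far", which is an assumption, and the function name `replay_training` and the loss values are hypothetical.

```python
def replay_training(tuning_losses, initial_lr=0.001, decay_factor=0.1):
    """Replay per-epoch tuning losses; return (best_epoch, lr used in each epoch)."""
    learning_rate = initial_lr
    best_loss = float("inf")
    best_epoch = None
    lr_history = []
    for epoch, loss in enumerate(tuning_losses):
        lr_history.append(learning_rate)  # rate in effect during this epoch
        if loss < best_loss:
            # A new best model: this is the checkpoint early stopping would keep.
            best_loss, best_epoch = loss, epoch
        else:
            # Assumed plateau rule: no improvement, so shrink the rate tenfold.
            learning_rate *= decay_factor
    return best_epoch, lr_history


# Hypothetical tuning losses after each of five epochs.
best_epoch, lr_history = replay_training([0.60, 0.50, 0.52, 0.45, 0.46])
print(best_epoch, lr_history)
```

In this toy run the checkpoint from the fourth epoch (index 3) has the lowest tuning loss and would be the one evaluated on the validation sets.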

Statistical analysis
Patient characteristics were expressed as numbers of patients, percentages, means, and standard deviations where appropriate. DLM performance for detecting each valvular disease was tested by receiver operating characteristic (ROC) curve analysis, and the outcome was defined as the moderate-to-severe group compared to the minimal-to-mild group. Indicators including the area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) with 95% confidence interval (95% CI) were used to express the DLM accuracy. We also added age and sex as additional input features of the DLM to compare with previous studies.
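The AUC can be read as the probability that a randomly chosen moderate-to-severe case receives a higher score than a randomly chosen minimal-to-mild case. The sketch below computes it from that pairwise definition in plain Python; it is an illustrative implementation (the study used R), and the function name `roc_auc` and the toy labels and scores are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half
    return concordant / (len(positives) * len(negatives))


# Toy example: 1 = moderate-to-severe, 0 = minimal-to-mild.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a demonstration; rank-based formulas give the same value more efficiently on large validation sets.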
We explored differences in ECG features of AI-identified positive and negative ECGs. The differences in echocardiographic characteristics were also compared in those two groups. We performed Kaplan-Meier survival analysis with the follow-up data for each valvular disease stratified by the DLM prediction. The data were censored on the basis of the most recent echocardiography. Cox proportional hazards models were also fit, and hazard ratios (HRs) with 95% CIs were used to compare the prognostic performances. All statistical analyses were carried out using the R language (version 3.4.4).
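For readers unfamiliar with the Kaplan-Meier estimator, the sketch below shows the product-limit calculation with right censoring in plain Python; the cumulative incidence plotted in such analyses is one minus the estimated event-free probability. It is an illustration only (the study used R), and the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate; returns [(time, event_free_probability), ...].

    durations: follow-up time per patient; events: 1 = event, 0 = censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at this time.
        while i < len(data) and data[i][0] == time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((time, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in years; 0 marks censoring at the last known TTE.
curve = kaplan_meier(durations=[1, 2, 2, 3, 4], events=[1, 0, 1, 1, 0])
cumulative_incidence = [(t, 1.0 - s) for t, s in curve]
print(cumulative_incidence)
```

Censored patients contribute to the risk set until their last examination and then drop out without counting as events, which is how censoring at the most recent echocardiography enters the estimate.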

Figure 1. ROC curve analysis for VHD from a DLM based on age, sex, and ECG voltage-time traces. The receiver operating characteristic (ROC) curve (x-axis = specificity and y-axis = sensitivity) and area under the ROC curve (AUC) were calculated using the internal validation set (A) and external validation set (B). The operating point was selected based on the maximum Youden's index in the tuning set, which was used for calculating the corresponding sensitivities and specificities in the two validation sets.
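Youden's index is J = sensitivity + specificity − 1, and the operating point is the score threshold that maximizes J. The sketch below selects such a threshold on a toy "tuning" sample and then applies it unchanged to a toy "validation" sample; the function names, the rule that scores at or above the threshold are positive, and all data are assumptions for illustration.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing Youden's J over the observed scores."""
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        j = sens + spec - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy tuning data: choose the operating point here only.
tuning_labels = [0, 0, 0, 0, 1, 1, 1]
tuning_scores = [0.1, 0.2, 0.3, 0.5, 0.4, 0.7, 0.9]
threshold, j_index = youden_threshold(tuning_labels, tuning_scores)

# Toy validation data: reuse the tuning threshold without re-fitting it.
val_sens, val_spec = sensitivity_specificity([0, 0, 1, 1], [0.2, 0.6, 0.3, 0.8], threshold)
print(threshold, j_index, val_sens, val_spec)
```

Fixing the threshold on the tuning set before touching the validation sets keeps the reported sensitivities and specificities free of optimistic bias from threshold selection.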

Figure 2. The components of AI predictions for detecting each valvular disease. (A) Relationship between ECG-screened valvular diseases and ECG rhythms. The plots display two groups, positive (AI-positive) and negative (AI-negative) findings, by the ECG networks using ECG alone. Sinus rhythm is associated with AI-negative (green bar), and other abnormal rhythms are associated with AI-positive (red bar). Abbreviations: * p < 0.05; ** p < 0.01; *** p < 0.001. The +/− demonstrates the positive/negative relationship. (B) The relationship between each valvular disease in actual status and prediction. The values in each cell are the Spearman correlation coefficients.

Figure 3. Prevalence (p) of echocardiographic abnormalities in patients stratified by each AI classification using ECG alone.

Figure 4. Long-term incidence of developing each moderate-to-severe valvular disease in patients with initially minimal-to-mild valvular diseases, stratified by AI classification using ECG alone. Long-term outcome of patients with echocardiographic minimal-to-mild valvular diseases at the time of initial classification, stratified by the initial network classification. The ordinate shows the cumulative incidence of developing moderate-to-severe valvular diseases, and the abscissa indicates years from the time of index ECG-TTE evaluation. A significantly higher risk of future moderate-to-severe valvular diseases was present when the AI algorithm defined the ECG as positive compared with patients with minimal-to-mild valvular diseases who were classified as having a negative finding by the ECG network. The analyses were conducted in both internal and external validation sets. The table shows the at-risk population and cumulative risk for the given time intervals in each risk stratification.

Figure 5. Development, tuning, internal validation, and external validation set generation and ECG labeling of VHD.Schematic of the dataset creation and analysis strategy, which was devised to assure a robust and reliable dataset for training, validating, and testing of the network.Once a patient's data were placed in one of the datasets, that individual's data were used only in that set, avoiding 'cross-contamination' among the training, validation, and test datasets.The details of the flow chart and how each of the datasets was used are described in the Methods.

Table 3. Stratified analysis of disease severity for detecting each valvular disease.
All analyses shared the same control group (minimal to mild) for each valvular disease. It is important to emphasize that due to the exclusion of certain severity cases, the positive predictive value and negative predictive value hold no significance in this analysis. Abbreviations: AUC: area under receiver operating characteristic curve.

Table 4. Stratified analysis of demography and disease history for detecting each valvular disease using ECG alone.
All results were presented using area under receiver operating characteristic curve (AUC). Abbreviations: BMI: body mass index; DM: diabetes mellitus; HTN: hypertension; HLP: hyperlipidemia; CKD: chronic kidney disease; CAD: coronary artery disease; HF: heart failure; Afib: atrial fibrillation; COPD: chronic obstructive pulmonary disease.