Basic characteristics of studies
As shown in Fig. 1, 350 references were identified after duplicates and non-experimental studies were excluded. After non-relevant references were removed, 72 original articles were eligible for full-text review. Of these, 36 articles were excluded because they provided insufficient information to construct a 2 × 2 table. The remaining 36 articles were included in the meta-analysis.
The main features of the included studies are listed in Tables 1 and 2. Overall, 7,362 participants were included. Among the 36 included articles, 29 articles studied the diagnostic accuracy of WFA+-M2BP for liver fibrosis and 8 articles addressed HCC. Among the fibrosis studies, 3 articles enrolled both a training group and a validation group24,25,26, and 2 articles recruited patients with 2 different etiologies27,28. We therefore treated each group as an individual study when calculating diagnostic accuracy. Overall, 7 etiologies of liver fibrosis were covered: HBV (n = 12), HCV (n = 10), NAFLD (n = 3), NASH (n = 3), AIH (n = 1), BA (n = 2), and PBC (n = 2), along with mixed etiologies (n = 1). In the HCC studies, the disease was mainly attributable to 3 etiologies: HBV, HCV and NAFLD. All studies used a retrospective design and a lectin-Ab sandwich immunoassay to measure serum WFA+-M2BP levels.
On the basis of the QUADAS-2 assessment, the overall quality of the included studies was moderate. As shown in Supplementary Figs. 1 and 2, 13 studies had a high risk of bias in patient selection because of inappropriate exclusions or case–control designs. A total of 29 studies had a high risk of bias in the index test because the reference standard result was known before the index test was conducted. Five studies did not state whether the reference standard results were interpreted blind to the index test results. Regarding flow and timing, 25 studies had a high or unclear risk of bias, either because not all patients received the same reference standard or because the interval between the index test and the reference standard was unclear. Moreover, we had significant concerns about the applicability of patient selection in 7 studies.
Pooled predictive accuracy of WFA+-M2BP in liver fibrosis
Here, we summarize the predictive accuracy of WFA+-M2BP at each stage of liver fibrosis. For mild fibrosis, 6 studies with 1,235 patients were evaluated. The pooled sensitivity and specificity were 0.70 (95% CI 0.62–0.77) and 0.68 (95% CI 0.57–0.78), respectively (Fig. 2A), and the pooled AUSROC was 0.75 (95% CI 0.71–0.78). For significant fibrosis, 20 studies with 3,602 patients were included. The pooled sensitivity, specificity and AUSROC were 0.71 (95% CI 0.65–0.76), 0.75 (95% CI 0.69–0.81), and 0.79 (95% CI 0.75–0.82), respectively (Fig. 2B). For advanced fibrosis, 28 studies involving 4,427 patients were assessed. The pooled sensitivity, specificity and AUSROC were 0.75 (95% CI 0.69–0.79), 0.76 (95% CI 0.72–0.80), and 0.82 (95% CI 0.78–0.85), respectively (Fig. 3A). For cirrhosis, 21 studies with 3,449 patients were identified. As displayed in Fig. 3B, the pooled sensitivity and specificity were 0.77 (95% CI 0.69–0.84) and 0.86 (95% CI 0.79–0.90), respectively, and the pooled AUSROC was 0.88 (95% CI 0.85–0.91). These pooled results show that the predictive accuracy of WFA+-M2BP increased with the progression of liver fibrosis, with the AUSROC rising from 0.75 for mild fibrosis to 0.88 for cirrhosis. WFA+-M2BP levels therefore reflect late-stage fibrosis, especially cirrhosis, more accurately than early-stage fibrosis. The high AUSROC for cirrhosis indicates that WFA+-M2BP could serve as an alternative to biopsy when diagnosing cirrhosis.
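The trend across stages can also be read off the likelihood ratios implied by the pooled point estimates. The following is a minimal sketch, assuming the standard definitions PLR = sensitivity / (1 − specificity), NLR = (1 − sensitivity) / specificity and DOR = PLR / NLR. The function name `likelihood_ratios` is illustrative, and the derived values are approximations from the point estimates quoted above, not the pooled PLR, NLR or DOR estimated by the meta-analytic model.

```python
# Pooled point estimates copied from the text: (sensitivity, specificity).
pooled = {
    "mild fibrosis": (0.70, 0.68),
    "significant fibrosis": (0.71, 0.75),
    "advanced fibrosis": (0.75, 0.76),
    "cirrhosis": (0.77, 0.86),
}


def likelihood_ratios(sens, spec):
    """Return (PLR, NLR, DOR) implied by one sensitivity/specificity pair."""
    plr = sens / (1.0 - spec)   # positive likelihood ratio
    nlr = (1.0 - sens) / spec   # negative likelihood ratio
    dor = plr / nlr             # diagnostic odds ratio
    return plr, nlr, dor


for stage, (sens, spec) in pooled.items():
    plr, nlr, dor = likelihood_ratios(sens, spec)
    print(f"{stage:22s} PLR={plr:.2f}  NLR={nlr:.2f}  DOR={dor:.1f}")
```

Under these assumptions the implied DOR rises from roughly 5 for mild fibrosis to roughly 21 for cirrhosis, which matches the direction of the AUSROC trend.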
Heterogeneity analysis, threshold effect and meta-regression
To investigate heterogeneity among the included studies, we analyzed both the threshold effect and the overall heterogeneity. Significant heterogeneity existed at each stage of liver fibrosis (Q = 23.11, I² = 91%, P < 0.001; Q = 94.75, I² = 98%, P < 0.001; Q = 50.32, I² = 96%, P < 0.001; Q = 64.79, I² = 97%, P < 0.001). However, no significant threshold effect was found. Across the four stages from mild fibrosis to cirrhosis, the Spearman correlation coefficients between sensitivities and specificities were −0.94 (P = 0.88), −0.01 (P < 0.01), −0.02 (P < 0.01), and −0.16 (P = 0.03), respectively.
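The relation between the Q statistics and the I² values above can be checked with the usual formula I² = max(0, (Q − df) / Q). The sketch below assumes df = 2, which is what a joint test on sensitivity and specificity in a bivariate model would use. The text does not state the degrees of freedom, so this is an assumption, although it reproduces the reported percentages.

```python
def i_squared(q, df=2):
    """I^2 in percent: share of total variation beyond chance, floored at 0.

    df=2 is an ASSUMPTION (joint test on sensitivity and specificity);
    the source does not report the degrees of freedom.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# (Q, reported I^2 in %) pairs copied from the text.
reported = [(23.11, 91), (94.75, 98), (50.32, 96), (64.79, 97)]

for q, i2_reported in reported:
    print(f"Q = {q:6.2f}  I2 = {i_squared(q):5.1f}%  (reported: {i2_reported}%)")
```

The same function applied to a Q of 3.11 gives a value that rounds to 36%, in line with the NAFLD or NASH cirrhosis subgroup reported later in the text.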
Meta-regression analysis, which required at least 10 studies, was performed to explore the causes of heterogeneity in the studies on significant fibrosis, advanced fibrosis, and cirrhosis. We investigated 10 potential sources of heterogeneity: year of publication, region, median age, male proportion, number of patients, etiology, histological system, liver biopsy length, interval between biopsy and blood test, and blinding. For significant fibrosis, the accuracy of WFA+-M2BP could be influenced by age, male proportion, etiology, and blinding (P < 0.01, P = 0.07, P = 0.05, and P < 0.01, respectively). For advanced fibrosis, the performance of WFA+-M2BP could be affected by age, male proportion, etiology, and region (P < 0.01, P < 0.01, P = 0.01, and P < 0.01, respectively). For cirrhosis, the heterogeneity in the accuracy of WFA+-M2BP might be attributable to differences in age, male proportion, region, etiology, and blinding (P < 0.01, P < 0.01, P = 0.02, P = 0.07, and P = 0.08, respectively).
Predictive accuracy of WFA+-M2BP in liver fibrosis stratified by etiology
Because the meta-regression identified etiology as one of the main sources of heterogeneity, we further analyzed the predictive accuracy of WFA+-M2BP in liver fibrosis by etiology. Because of the limited number of references, we combined the NAFLD and NASH studies, and grouped the studies on AIH, BA, PBC and mixed etiologies into an “other etiologies” category. WFA+-M2BP showed different diagnostic accuracies across etiology groups (Table 3). For the prediction of significant fibrosis, advanced fibrosis, and cirrhosis, WFA+-M2BP was most accurate among patients with AIH, BA, PBC or mixed etiologies, where it reached the highest pooled sensitivity, specificity, PLR, DOR and AUSROC and the lowest NLR of all etiology groups. For advanced fibrosis, heterogeneity dropped markedly in all etiology groups except HBV and HCV. For cirrhosis, no significant heterogeneity was found in the NAFLD or NASH subgroup (Q = 3.11, I² = 36%, P = 0.106), indicating that the accuracy of WFA+-M2BP was influenced by the etiology of disease. In Table 3, the weighted mean WFA+-M2BP values differed between etiologies, suggesting that etiology-specific cutoff values of WFA+-M2BP should be applied to grade liver fibrosis. In addition, regardless of etiology, WFA+-M2BP had a higher AUSROC for cirrhosis than for significant or advanced fibrosis.
Predictive accuracy of WFA+-M2BP versus non-invasive indicators for grading liver fibrosis
Because few studies reported other non-invasive indicators for mild fibrosis, we compared WFA+-M2BP with other non-invasive indicators only for significant fibrosis, advanced fibrosis and cirrhosis. As shown in Table 4, for significant fibrosis, the AUSROC of WFA+-M2BP (0.79) was significantly greater only than that of AST/ALT (0.74, P = 0.048). For advanced fibrosis, WFA+-M2BP yielded an AUSROC (0.82) similar to those of APRI (0.78, P = 0.113), FIB-4 (0.79, P = 0.235), HA (0.82, P = 1.0), and FibroScan (0.81, P = 0.831), and a significantly greater AUSROC only than AST/ALT (0.67, P < 0.001) and PLT (0.69, P < 0.001). For cirrhosis, however, WFA+-M2BP (AUSROC = 0.88) outperformed 4 indicators (APRI = 0.79, P < 0.001; FIB-4 = 0.83, P = 0.034; AST/ALT = 0.79, P < 0.001; PLT = 0.83, P = 0.021) but not HA or FibroScan (HA = 0.88, P = 1.0; FibroScan = 0.87, P = 0.644). These results indicate that WFA+-M2BP performed best in diagnosing cirrhosis, where it exceeded most of the widely used indicators.
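One common way to obtain a P value for the difference between two summary AUSROCs is a two-sided z-test on the difference. The sketch below is an illustration only, not necessarily the method used for Table 4. It assumes the two estimates are independent and approximately normal, and it recovers a standard error from a 95% CI. The function names and the comparator's standard error are hypothetical; only the WFA+-M2BP estimate and its CI come from the text.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z_crit)


def auc_z_test(auc_a, se_a, auc_b, se_b):
    """Two-sided z-test for the difference of two independent AUC estimates."""
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# WFA+-M2BP for cirrhosis: AUSROC 0.88 (95% CI 0.85-0.91), from the text.
se_wfa = se_from_ci(0.85, 0.91)

# HYPOTHETICAL standard error for the comparator; the text gives no CI for it.
se_comparator = 0.02

z, p = auc_z_test(0.88, se_wfa, 0.79, se_comparator)
print(f"z = {z:.2f}, two-sided P = {p:.4f}")
```

When two AUSROCs are identical, as for WFA+-M2BP and HA, the difference is zero and the test returns P = 1.0 regardless of the standard errors. If the indicators were measured in the same patients, the independence assumption does not hold and a paired comparison would be more appropriate.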
Diagnostic accuracy of WFA+-M2BP for the prediction of HCC
For the prediction of HCC, 8 studies with 2,240 participants were selected (Table 2). Among them, 4 studies reported the occurrence of HCC after antiviral treatment or HBeAg seroconversion49,50,55,56, one study discussed the recurrence of HCC after curative resection54, and 3 studies focused on the development of HCC51,52,53. The WFA+-M2BP levels analyzed here were pretreatment or baseline levels. Because several studies also reported diagnostic information for APRI, FIB-4, and AFP, we compared the diagnostic accuracy of WFA+-M2BP for HCC with that of these three indicators. There was no significant threshold effect in the included studies (r = −0.7, P = 0.49), but significant heterogeneity was observed (Table 5). Among all the indicators, WFA+-M2BP yielded the highest pooled sensitivity (0.77, 95% CI 0.60–0.89), surpassing APRI, FIB-4 and AFP. Although AFP had the highest pooled specificity (0.94, 95% CI 0.82–0.98), the AUSROCs of WFA+-M2BP and AFP did not differ significantly (P = 0.671).
Publication bias and sensitivity analysis
As displayed in Supplementary Fig. 3, Deeks' funnel plots were almost symmetric for the studies on mild fibrosis, significant fibrosis, and advanced fibrosis (P = 0.1, 0.33, and 0.09, respectively), suggesting no evidence of publication bias. However, significant publication bias was observed in the studies on cirrhosis (P = 0.03). For the studies on the prediction of HCC, there was no evidence of publication bias (P = 0.83) (Supplementary Fig. 4).
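For readers unfamiliar with the funnel plot asymmetry test, it regresses each study's log diagnostic odds ratio on 1/√ESS, weighted by the effective sample size ESS = 4·n₁·n₂ / (n₁ + n₂), where n₁ and n₂ are the numbers of diseased and non-diseased participants. A slope that differs significantly from zero suggests asymmetry. The following minimal sketch computes only the weighted slope and intercept. The 2 × 2 tables are hypothetical, the 0.5 continuity correction is an assumption, and the P value for the slope is omitted because it needs a t-distribution that the standard library does not provide.

```python
import math


def effective_sample_size(n_diseased, n_healthy):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio with a 0.5 continuity correction (assumption)."""
    return math.log(((tp + 0.5) * (tn + 0.5)) / ((fp + 0.5) * (fn + 0.5)))


def weighted_line(xs, ys, ws):
    """Weighted least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# HYPOTHETICAL 2 x 2 tables as (TP, FP, FN, TN); not taken from the included studies.
studies = [(40, 10, 10, 40), (25, 8, 5, 30), (90, 30, 20, 110), (15, 3, 5, 20)]

xs, ys, ws = [], [], []
for tp, fp, fn, tn in studies:
    ess = effective_sample_size(tp + fn, fp + tn)
    xs.append(1.0 / math.sqrt(ess))       # predictor: inverse root of ESS
    ys.append(log_dor(tp, fp, fn, tn))    # outcome: log diagnostic odds ratio
    ws.append(ess)                        # regression weight

slope, intercept = weighted_line(xs, ys, ws)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```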
Sensitivity analysis identified outlier studies at each stage of liver fibrosis (Supplementary Fig. 5). After these outliers were removed, the heterogeneity in mild fibrosis disappeared (Supplementary Table 1) and the publication bias in the studies on cirrhosis diminished (Supplementary Fig. 6). Nevertheless, Supplementary Table 1 indicated that the summary results were not significantly affected by any individual study. As displayed in Supplementary Fig. 7, no outlier study was found among the HCC studies.