Keep It or Leave It? Comparison of Preoperative Scoring as Mortality Predictor Post-Pancreaticoduodenectomy

Background: Pancreaticoduodenectomy (PD) is a common procedure for resectable periampullary malignancies. However, the postoperative mortality rate for PD is relatively high. Mortality scoring systems help surgeons decide on patients’ eligibility for surgery and minimize mortality risk. This study aimed to compare four scoring systems for mortality prediction after PD in the Indonesian population. Methods: In this cross-sectional study, data were retrospectively collected from the medical records of patients who underwent PD for periampullary malignancy between January 2010 and January 2022. We assessed the accuracy, cut-off, sensitivity, specificity, negative predictive value, positive predictive value, and area under the curve (AUC) of the Naples prognostic score (NPS), Whipple-ABACUS (WA), modified Pitt score (MPS), and Pitt score. Results: Among the 116 patients who met the criteria, the mortality rate was 12.1%. The mean age was 51.64 ± 10.22 years; 75.9% of patients were <60 years old and 24.1% were ≥60 years old, and 46.6% were male and 53.4% female. The AUCs, from highest to lowest, were 0.890 for the Pitt score (p<0.001), 0.775 for MPS (p=0.001), 0.627 for WA (p=0.123), and 0.505 for NPS (p=0.949). Accuracy was 67.2% for both the Pitt score and MPS, 50.0% for WA, and 59.5% for NPS. Conclusion: The Pitt score and MPS have the highest accuracy of all the scoring systems in this study. MPS has the advantage of having fewer components, making it easy to implement. MPS can replace the Pitt score in predicting post-PD mortality in Indonesia. Further studies that include intraoperative factors are needed to increase scoring accuracy.


Introduction
Pancreaticoduodenectomy (PD) is a preferred procedure for treating resectable periampullary malignancy. Both patient risk factors and the complexity of PD can increase mortality and morbidity (Parikh et al., 2010; Lalisang, 2012; Sung et al., 2021). The incidence of mortality after pancreatoduodenectomy reported in several publications is about 1-5% (Cameron et al., 2006; House et al., 2008; Assifi et al., 2012; Diener et al., 2017; Hackert et al., 2018). The 30-day mortality after pancreatoduodenectomy in the United States was 2.7% (Gleeson et al., 2016). Meanwhile, the figure at Dr. Cipto Mangunkusumo Hospital (CMH) reached 17.5%. This high mortality after pancreatoduodenectomy in CMH, which exceeds that reported in other countries, is the reason for seeking the most accurate and specific scoring system to predict mortality and morbidity after pancreatoduodenectomy in Indonesia (Kim et al., 2013; Nugroho and Lalisang, 2014; Gleeson et al., 2016; Aoki et al., 2017).
Jin et al. (2021) validated the Naples prognostic score (NPS) in patients with periampullary malignancy after pancreatoduodenectomy in China, using routine preoperative blood examinations. However, the primary outcomes of this scoring system are overall survival (OS) and recurrence-free survival (RFS) rather than mortality. Gleeson et al. (2016) published a specific scoring system to predict mortality after pancreatoduodenectomy, based on the United States population, called Whipple-ABACUS (WA). Nevertheless, some of the parameters in the WA score are subjective (surgeon-dependent) and require advanced blood examinations. All variables of the WA score also come from the surgical outcomes database of the American College of Surgeons (ACS), which is not intended exclusively for pancreatoduodenectomy procedures (Gleeson et al., 2016). Nugroho and Lalisang (2014) validated the modified Pitt's score (MPS), the first scoring system used in Indonesia to predict mortality after pancreatoduodenectomy in the Indonesian population. The MPS uses only five of the eight parameters in the original Pitt's score, but it has better specificity and sensitivity than the original (Nugroho and Lalisang, 2014). However, no study has compared its accuracy with that of other current scoring systems in the Indonesian patient population.
This study compared the accuracy of four scoring systems for predicting mortality among patients who underwent pancreatoduodenectomy: the Naples prognostic score, the Whipple-ABACUS, the modified Pitt's score, and the original Pitt's score. It aimed to find the best predictive scoring system for patients with periampullary malignancy in Indonesia.

Study design and patients
This cross-sectional study included all patients who underwent pancreatoduodenectomy for periampullary malignancies from January 2010 to January 2022 in Dr. Cipto Mangunkusumo Hospital (CMH). We collected all data from electronic and paper-based medical records. We excluded patients with incomplete medical records. This article is reported according to the Strengthening the Reporting of Cohort, Cross-sectional and Case-control Studies in Surgery (STROCSS) criteria (Agha et al., 2019). The estimated minimum sample size for this study was 73 subjects, calculated with Zα=1.96, P=5%, Q=95%, and d=5%.
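The reported minimum sample size can be checked with a short calculation. The sketch below assumes the standard formula for estimating a single proportion, n = Zα² × P × Q / d², which is not stated explicitly in the text but reproduces the reported figure of 73 from the listed parameters. The function name `min_sample_size` is illustrative only.

```python
import math


def min_sample_size(z_alpha: float, p: float, d: float) -> int:
    """Minimum n for estimating one proportion (assumed formula).

    n = z_alpha^2 * p * q / d^2, with q = 1 - p, rounded up to a whole subject.
    """
    q = 1.0 - p
    n = (z_alpha ** 2) * p * q / (d ** 2)
    return math.ceil(n)


# Parameters as listed in the text: Z_alpha = 1.96, P = 5%, d = 5%.
print(min_sample_size(1.96, 0.05, 0.05))  # 73
```

The unrounded value is about 72.99, so rounding up gives the 73 subjects stated above.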

Data collection
The collected variables were based on the parameters (including patient characteristics) of the Naples prognostic score, the Whipple-ABACUS, Pitt's score, and the modified Pitt's score. These were age, gender, malignancy status, ASA score, medical history (hypertension, cardiac surgery, bleeding disorder, steroid use, preoperative systemic inflammatory response syndrome), hematocrit, serum albumin, serum creatinine, total serum bilirubin, total cholesterol, alkaline phosphatase, lymphocyte-to-monocyte ratio (LMR), neutrophil-to-lymphocyte ratio (NLR), blood loss volume, operation time, and postoperative/in-hospital mortality. These parameters were collected and calculated according to each scoring system, as shown in Figure 1.

Surgical procedures
The surgical procedures were performed by two senior digestive surgeons with more than 20 years of experience in hepatopancreatobiliary surgery. All pancreatoduodenectomy procedures were performed in the standard manner. An upper abdominal midline incision was followed by exploration of the abdominal cavity for any sign of metastasis. After that, right-to-left medial visceral rotation with dissection of the right mesocolon was followed by a wide Kocher maneuver to the level of the left renal vein. Dissection of the right gastroepiploic vein and superior mesenteric vein (SMV) allowed further cephalad dissection of the SMV, leading to the portal vein (PV) and the inferior end of the retro-pancreatic tunnel. The operators entered the lesser sac and dissected the station 8 (hepatic artery) lymph node to identify the common hepatic artery (CHA), then dissected caudal to it to identify the PV between the CHA and the superior border of the pancreas. Identification of the gastroduodenal and proper hepatic arteries allowed further exposure of the PV and access to the superior end of the retro-pancreatic tunnel. Transection of the pancreas, bile duct, stomach, and jejunum was followed by distal transection of the uncinate process and skeletonization of the superior mesenteric artery. Finally, we performed reconstruction with pancreaticojejunostomy, hepaticojejunostomy, and gastrojejunostomy.

Statistical analysis
Data were analyzed using IBM® SPSS® version 26.0 (IBM Corp., Armonk, N.Y., USA). We performed bivariate analysis using the Chi-square test or Fisher's exact test. Results were considered significant at a p-value <0.05. We used receiver operating characteristic (ROC) curves to determine each scoring system's area under the curve (AUC), cut-off point, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The best cut-off point was the one giving the best combination of sensitivity and specificity. The DeLong test was then used to compare the AUCs of all four scoring systems, with results reported as z-values and p-values.
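The ROC quantities described above can be illustrated with a small calculation. The sketch below is not the SPSS procedure used in the study; it assumes the AUC is computed as the Mann-Whitney probability that a patient who died has a higher score than a survivor (ties counted as one half), and it assumes the "best" cut-off is the one maximizing Youden's J = sensitivity + specificity - 1, which the text does not specify. The data are invented for illustration and are not study data.

```python
from typing import List, Tuple


def auc_mann_whitney(scores: List[float], died: List[int]) -> float:
    """AUC as P(score of a death > score of a survivor), ties counted as 0.5."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_cutoff_youden(scores: List[float], died: List[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A patient is predicted to die when score >= cutoff (assumed convention).
    """
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, died) if s >= c and d == 1)
        tn = sum(1 for s, d in zip(scores, died) if s < c and d == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical scores and outcomes (1 = died), for illustration only.
example_scores = [0, 1, 1, 2, 2, 3, 3, 4]
example_died = [0, 0, 0, 0, 1, 0, 1, 1]
print(auc_mann_whitney(example_scores, example_died))
print(best_cutoff_youden(example_scores, example_died))
```

With integer scores such as these, ties between patients who died and survivors are common, which is why the half-credit for ties matters for the AUC.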

Subject's characteristics
From 2010 to 2022, 120 pancreatoduodenectomy procedures were performed in Dr. Cipto Mangunkusumo Hospital. A search of our hospital's medical record database yielded 116 patients; the other four were excluded because of incomplete or unavailable data. The characteristics of our research subjects are shown in Table 1.
The average age was 51.64 ± 10.22 years, with 24.1% of the patients aged 60 or above, and more than half (53.4%) of the subjects were female. The mortality rate over the whole period was 12.1%, meaning that 14 of our 116 subjects did not survive. The incidences of postoperative complications over the 12 years, namely fistula, surgical site infection (SSI), and re-laparotomy, were 8.6%, 9.5%, and 11.2%, respectively. No postoperative hemorrhage was reported. Some variables were significantly associated with post-pancreatoduodenectomy mortality: serum alkaline phosphatase ≥200 ng/mL (p=0.003), serum creatinine >1.3 mg/dL (p=0.009), and hematocrit <30% (p=0.011).
Based on body mass index (BMI), the highest mortality rate was in those with normal BMI, and those with grade II obesity had the lowest mortality rate. One hundred seven subjects with ASA 2 underwent pancreatoduodenectomy, with a 13.1% mortality rate. The number of procedures performed increased by eight after 2015, and mortality was less than half of that reported in the previous period (2010-2015).

Mortality and score distribution
The distribution of WA, NPS, MPS, and Pitt scores is shown in Figure 2. The minimum WA score in this study was 0 and the maximum was 3; no subject who underwent PD had a WA score of more than 3. Mortality rates for WA scores of 0 to 3 were 7.7%, 11.6%, 26.3%, and 0%, respectively. For the NPS, the highest score among patients who underwent PD was 4. Mortality rates for NPS scores of 0 to 4 were 11.1%, 17.2%, 3.2%, 15.2%, and 0%, respectively. Mortality rates for MPS scores of 0 to 3 were 1.5%, 26.5%, 33.3%, and 0%, respectively. For the last score measured, the Pitt score, the maximum score in this study was 6 (with no mortality reported). Patients with a score of 5 had a 20% mortality rate, and patients with a score of 4 had the highest mortality rate, 62.5%. Table 2 and Figure 3 show the ROC curve analysis with the AUC, cut-off value, sensitivity, specificity, NPV, PPV, and accuracy of each score. The Pitt score had the highest AUC, followed by the MPS, WA, and NPS (whose curve crossed the 50% reference line). Each scoring system generally showed moderate specificity and PPV but higher sensitivity and NPV. NPS had the lowest sensitivity, and WA had the lowest specificity.
Judged by AUC, the MPS and Pitt scores remained the best performers among the four scores used in this study.
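For readers less familiar with the measures reported in Table 2, the following sketch shows how sensitivity, specificity, PPV, NPV, and accuracy follow from a 2x2 table at a chosen cut-off. The counts are hypothetical and are not taken from Table 2; the function name `diagnostic_metrics` is illustrative.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard 2x2 measures, with death as the positive outcome.

    tp: predicted death, died      fp: predicted death, survived
    fn: predicted survival, died   tn: predicted survival, survived
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for 100 patients with 10 deaths (not study data).
for name, value in diagnostic_metrics(tp=8, fp=30, fn=2, tn=60).items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical example the PPV is low and the NPV is high even though sensitivity is good, which is the expected pattern when deaths are uncommon in the sample.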

Comparison of scoring system to predict mortality
Pairwise comparisons were performed to assess the relative performance of each scoring system against the others, and the results are shown in Table 3. Only the comparison of the Pitt score with NPS was statistically significant; the difference in AUC between the Pitt score and NPS was larger than that between the Pitt score and WA. No other comparison was statistically significant.
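A pairwise AUC comparison of this kind can be sketched as follows. This is a minimal pure-Python illustration of the DeLong approach for two scores measured on the same patients, not the software routine used in the study; the helper names and the data are hypothetical. It returns both AUCs, the z-value, and a two-sided p-value from the normal distribution.

```python
import math
from typing import List, Sequence, Tuple


def _psi(x: float, y: float) -> float:
    """Kernel: 1 if the death scores higher than the survivor, 0.5 if tied."""
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(scores: Sequence[float], died: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Structural components V10 (one per death) and V01 (one per survivor)."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance with an n - 1 denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a: Sequence[float], scores_b: Sequence[float],
                died: Sequence[int]) -> Tuple[float, float, float, float]:
    """Return (auc_a, auc_b, z, two-sided p) for two correlated ROC curves."""
    v10a, v01a = _components(scores_a, died)
    v10b, v01b = _components(scores_b, died)
    auc_a = sum(v10a) / len(v10a)
    auc_b = sum(v10b) / len(v10b)
    m, n = len(v10a), len(v01a)  # number of deaths, number of survivors
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / m
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / n)
    if var <= 0:
        raise ValueError("zero variance: the two scores rank patients identically")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Hypothetical data: two scores for the same eight patients (1 = died).
died = [0, 0, 0, 0, 0, 1, 1, 1]
score_a = [0, 1, 1, 2, 3, 2, 3, 4]
score_b = [2, 0, 3, 1, 2, 1, 2, 0]
print(delong_test(score_a, score_b, died))
```

Because both scores come from the same patients, the covariance terms matter: ignoring them would treat the two AUCs as independent and misstate the variance of their difference.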

Discussion
Scoring systems are designed to help surgeons predict the outcome of a complex procedure with a high mortality rate, such as pancreatoduodenectomy, before it is performed. NPS and WA scores had an AUC under 0.65, meaning that both are inaccurate in predicting post-pancreatoduodenectomy mortality. According to the ROC curves in Figure 3, the Pitt score had the best performance among all scores measured in this study. MPS performance was the closest to that of the Pitt score, with an AUC comparable to the Pitt score and higher than the Naples or ABACUS scores. Meanwhile, the Naples score had the worst performance, shown by curve inconsistency (Pitt et al., 1981). This study is an external validation of four scoring systems, intended to determine which score is the most viable for the Indonesian population.
In this study, only two scores (Pitt score and MPS) reached an AUC of >65% (p<0.05). WA and NPS scores had an AUC of <65%, meaning that these scores failed to predict post-pancreatoduodenectomy mortality. Pitt's score had the highest AUC, 89.0% (p<0.001). This score is routinely used in Dr. Cipto Mangunkusumo Hospital (CMH) before a pancreatoduodenectomy procedure or other management, such as handling infection, administering nutritional support, and determining whether biliary drainage is needed. The Pitt score calculated in this study reflects our subjects' condition after interventions intended to establish surgical feasibility. This is why only a small number of patients with a high Pitt score ended up undergoing pancreatoduodenectomy. Interventions to improve the patient's condition (marked by a lower Pitt score) helped to reduce the mortality rate in CMH, as shown by our in-hospital mortality rate (12.1%), which is lower than in the previous study but still about two times higher than the world average. This may reflect the learning curve, which reduced mortality over the last decade, and the different inclusion periods of the previous study (1995-2012) and the present study (2010-2022).
Most postoperative deaths were due to circulatory failure, followed by sepsis (anastomotic leakage, pneumonia, deep surgical site infection). The low recorded incidence of postoperative pancreatic fistula (POPF) may be explained by low-grade POPF: biochemical leakage was not reported, and such fistulas closed spontaneously. Another possibility is that some subjects died before a fistula was identified. Several factors, such as the operator's skills and experience, the quality of postoperative care, and standard operating procedures, are potential confounders of each score's validity in the Indonesian population. However, in this study, all procedures were carried out by the same operators and managed at the same institution, so comparisons of validity between scores remain sound.
Some patients had low initial Pitt scores and were admitted to surgery immediately. However, most patients initially presented with cholangitis and malnutrition, which caused a high Pitt score and required conservative treatment before surgery. Therefore, a thorough evaluation is necessary to determine whether outcomes differ between patients who reach identical Pitt scores under these two circumstances before surgery. Hypothetically, the two circumstances could affect the prediction of post-pancreatoduodenectomy mortality. Such an evaluation could identify the ideal time to calculate the Pitt score so that it accurately depicts the patient's condition before pancreatoduodenectomy. It is also needed to explain our results: why some patients with low Pitt scores died, why some patients with high Pitt scores survived, and whether other components (intrinsic or extrinsic) affect the mortality rate in pancreatoduodenectomy.
In this study, the sensitivity, specificity, NPV, PPV, and accuracy of the Pitt score did not differ significantly from those of the MPS. Although it had the highest AUC, the Pitt score was no more accurate than the MPS (67.2% for both). The MPS was created by Nugroho and Lalisang (2014) at CMH and is designed as a simplified predictive score. It shares four points with the Pitt score (bilirubin ≥10 mg/dL, Hct ≤30%, serum creatinine ≥1.3 mg/dL, and albumin 3 g/dL) and has one additional parameter (ASA ≥3). The MPS had an AUC of 97.4% (p<0.001) when it was created, higher than the Pitt score's 94.9% (p<0.001). Here, however, the AUC of the MPS decreased to 77.5%, lower than that of the Pitt score in the same hospital. Pancreatoduodenectomy is rarely performed in patients aged >60 years; perhaps this is why the Pitt score is still preferred over the MPS in CMH, even though this study found no significant relationship between age and post-pancreatoduodenectomy mortality.
The Pitt score and MPS distributions showed that higher scores do not necessarily mean higher mortality. Mortality in patients with a score <5 was as high as in those with higher scores. In patients with the highest Pitt score of 6, the mortality rate is 67%. Meanwhile, no deaths were found in patients with the highest MPS score of 3. This result is consistent with both scores' moderate sensitivity, specificity, and accuracy, which means they poorly predict which patients will die (Nugroho and Lalisang, 2014).
The Whipple-ABACUS score (the first available score for predicting post-pancreatoduodenectomy mortality) had a lower AUC, sensitivity, specificity, NPV, PPV, and accuracy than the Pitt score and MPS. Among patients undergoing pancreatoduodenectomy, the highest score was 3 out of 12. Many WA score components, particularly preoperative steroid use, cancer dissemination, and bleeding disorder, could not be measured in CMH. If the clinicopathological conference confirms spread, pancreatoduodenectomy is not performed. Likewise, pancreatoduodenectomy is not continued when hepatic or peritoneal spread is found during the operation. Therefore, this component scores 0 for every patient. The same applied to the other nine points in the WA score, which were not present in patients undergoing pancreatoduodenectomy in CMH. This finding could be attributed to patient selection for pancreatoduodenectomy, which prioritizes patients with good performance status and few or no comorbidities. This preference is also reflected in the ASA score. The WA score components are taken from the ACS-NSQIP database, which is not a set of risk factors specific to pancreatoduodenectomy, so its AUC, sensitivity, specificity, NPV, PPV, and accuracy are not sufficient for predicting post-pancreatoduodenectomy mortality in our subjects (Gleeson et al., 2016).
The Naples prognostic score had the lowest AUC, 50.5%. However, its specificity and accuracy were better than those of the WA score in subjects undergoing pancreatoduodenectomy. It is the only score that includes systemic inflammation factors. It was initially developed as a prognostic score to evaluate the short- and long-term outcomes of post-pancreatoduodenectomy patients, such as overall and recurrence-free survival (Jin et al., 2021). Consistent with this, our results show that the NPS is not a good score for predicting post-pancreatoduodenectomy mortality.
The mortality rate of pancreatoduodenectomy performed in CMH from 2010 to 2022 was 12.1%, lower than the 17.5% of the previous study. This post-pancreatoduodenectomy mortality rate is almost two times higher than the world average (Kim et al., 2013; Gleeson et al., 2016; Aoki et al., 2017; Lidsky et al., 2017). According to other studies, post-pancreatoduodenectomy mortality in high-volume centers ranges between 1% and 5%, especially with new operating techniques and improved surgeon capability.
According to the procedure-volume definition for centers performing pancreatoduodenectomy, CMH is considered low-volume, with about 10-18 operations annually. Before COVID-19 (Red, 2019), CMH performed an average of 13-14 pancreatoduodenectomies per year. In the pandemic era, the number of pancreatoduodenectomies performed decreased to only 5-8 per year. Nevertheless, as a national referral center, CMH could be compared to high-volume centers performing pancreatoduodenectomy in Indonesia.
With only 120 procedures over the last 12 years, and with enrolment limited to subjects with good performance status, fewer comorbidities, and good laboratory results before pancreatoduodenectomy, this study had an uneven score distribution, which weakened statistical significance. The few subjects who underwent surgery with extreme scores but survived suggest that factors other than preoperative ones are involved in predicting postoperative mortality. Intraoperative factors such as the duration of surgery, the amount of bleeding, fluid resuscitation, and blood transfusion should be included in the scoring system, as they may increase the accuracy of postoperative mortality prediction (Romano et al., 2015; Lalisang et al., 2019; Mirrielees et al., 2020). Other factors in patients with similar scores but different outcomes also need to be explored.
This study is limited by the low number of subjects with high scores, which complicates mortality prediction for patients with such scores. Data were also collected retrospectively from a single center. Thus, we suggest that future studies be prospective and multicenter. In a prospective investigation, interventions targeting MPS parameters such as albumin and bilirubin could help improve patient outcomes. Furthermore, the preoperative evaluation used for score calculation was not performed within a uniform timeframe, e.g., during the admission process or 24 hours before the operation. We also did not have sufficient data on delayed gastric emptying, the most common complication after PD. However, this study is the first to compare the accuracy of scoring systems in predicting post-pancreatoduodenectomy mortality in Indonesia, with a post-hoc power of 84.6%.
In conclusion, the Pitt score and MPS have the highest accuracy of all the scoring systems in this study. Both can be applied to predict post-PD mortality in Indonesia. However, MPS has the advantage of having fewer components, making it easy to implement. MPS can replace the Pitt score when the malignancy status is unknown and serum alkaline phosphatase measurement is not available. Further studies of risk scoring that include intraoperative factors are needed to increase scoring accuracy. Moreover, more studies are needed on the Pitt score as a PD mortality predictor in patients with and without prior conservative treatment.

Author Contribution Statement
All authors contributed equally to this study.