Laparoscopic radical prostatectomy outcome data: how should surgeon’s performance be reported? A retrospective learning curve analysis of two surgeons

Objective: To document the learning curve for the laparoscopic radical prostatectomy (LRP) procedure and discuss the optimal usage of prospectively documented outcome data for reporting a surgeon’s performance.
Materials and methods: Using prospectively collected data from the first series of patients to undergo LRP by two surgeons in the same institution, linear and logistic regression multivariate analyses per 25 patients were carried out to graphically represent the surgical learning curve for operative time, blood loss, complications, length of stay (LOS), and positive margins. Surgeon A carried out 275 operations between 2003 and 2009; Surgeon B carried out 225 between 2008 and 2012.
Results: Learning curves showing continuous improvement of each of the above outcomes were demonstrated for both cohorts. For surgeon A, a plateau was observed for LOS and T2 positive margins after 100 and 150 surgeries respectively. No such plateau was observed for surgeon B.
Conclusion: Having documented these learning curves and discussed the reporting methods used, we concluded that the most informative outcome measure, with the least potential observer bias, was T2 positive margins. Whether as a single measure or in combination with others, this has potential for use as an objective outcome representative of improvement in a surgeon’s skill over time.

Background
Radical prostatectomy (RP) surgical outcome data is increasingly being reported in terms of a learning curve, as influenced by surgeon-specific factors [1]. There is no agreed definition of a learning curve, though learning curves are generally expected to plateau as a surgeon masters a technique. Because they have the potential to alter guideline recommendations, they should be interpreted and documented with caution and appropriate statistical analysis [2].
To date, learning curves in urology have demonstrated changes in outcomes (such as operative time, blood loss, LOS, positive surgical margin (PSM) rate, complications, and cancer recurrence) to be predictive of a surgeon's experience [3-7]. Specifically for laparoscopic RP (LRP), there is variation in the number of cases estimated to be required for competency: 51 cases according to complication rates; 110 cases according to blood loss, operative time, and PSM rate [3]; and 250 according to recurrence and PSM [7,8].
With a recent push for public reporting of a surgeon's performance within urology [9], it is hoped that increased transparency and drive for improvement will follow. The influence of the learning curve on these reports, and the extent to which this is influenced by case mix, will be of paramount consideration in documenting and interpreting a surgeon's data with application to surgical training and clinical practice. Most surgeons do not have facilities for collecting patient-reported outcomes, though this is expected to change as the demand for published outcomes increases. The current British radical prostatectomy data set relies on data entered by the surgeon, much of which is incomplete and difficult to verify [10,11].
Therefore, this study involved evaluation of the optimal usage of learning curves in review of a surgeon's performance. We aimed to use LRP outcome data from two surgeons and a discussion of the outcomes used, to determine how best to report changes in a surgeon's performance over time. Ultimately we wanted to identify whether there are surrogate measurements available that are independent of the resource (e.g. histopathology instead of surgeon-reported data).

Study cohort
We used data from the first cohort of cases of extraperitoneal LRP carried out by two surgeons at King's Health Partners. Both surgeons performed standard laparoscopic antegrade radical prostatectomies [32]. Patient data for surgeon A (275 patients) and surgeon B (225 patients) were retrieved from electronic patient records that had been recorded prospectively. Patient consent was not required as data collection was part of the clinical audit. Surgeon A operated between April 2003 and June 2009 and surgeon B between October 2008 and October 2012. The observation period and frequency of surgeries differed between the two surgeons, reflecting the uptake of radical prostatectomy in the UK over time. Both surgeons were new consultants with no independent experience, so our observations reflect a true learning curve. They were both UK trainees and completed their training with dedicated laparoscopic fellowships. Pre-operative variables included age, prostate-specific antigen (PSA), tumour stage, and Gleason score.

Outcomes
Surgeon-dependent variables of interest were classified as operative or postoperative. Operative variables included operative time, blood loss, and complications. Complications were reported by the operating surgeon and subsequently classified according to the 2004 Clavien-Dindo Classification [12]. Postoperative variables included complications, LOS, and surgical margin status. The latter was assessed by three consultant histopathologists with a specialist interest in urological pathology. Prior to 2011, partial embedding of the prostate in small cassettes was the routine practice. From 2011, the laboratory was progressively able to support complete embedding of the prostate in megablocks [33].
Finally, we included information on surgical stage and Gleason score. Gleason scores were divided into three categories (≤ 6, 7, and ≥ 8) as has been done in similar studies [13]. Clinical and surgical stages were classed as T0, T1a/b, T1c, T2, T3, or T4 to allow clear distinctions in analyses between patients with organ-confined and non-organ confined disease.
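The categorisation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and treating T0 to T2 as organ-confined and T3/T4 as non-organ-confined is an assumption based on the stage groupings used later in the text.

```python
def gleason_category(score: int) -> str:
    """Map a Gleason sum score to the three categories used in the text."""
    if score <= 6:
        return "<=6"
    if score == 7:
        return "7"
    return ">=8"


# Assumption: T0, T1a/b, T1c and T2 are treated as organ-confined;
# T3 and T4 are treated as non-organ-confined.
ORGAN_CONFINED_STAGES = {"T0", "T1A/B", "T1C", "T2"}
NON_ORGAN_CONFINED_STAGES = {"T3", "T4"}


def is_organ_confined(stage: str) -> bool:
    """Return True for organ-confined stages, False for T3/T4."""
    key = stage.strip().upper()
    if key in ORGAN_CONFINED_STAGES:
        return True
    if key in NON_ORGAN_CONFINED_STAGES:
        return False
    raise ValueError(f"Unrecognised stage: {stage!r}")
```

For example, `gleason_category(7)` gives `"7"` and `is_organ_confined("T3")` gives `False`, which is the split needed to analyse T2 and T3/T4 positive margins separately.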

Statistical analysis
For each cohort of patients the following analyses were carried out. Patients were assigned a number according to ascending date of surgery as a representative value for a surgeon's experience. For initial analysis of the effects of a surgeon's experience on surgical outcomes, comparison was made between consecutive groups of 25 patients. Age, pre-operative PSA, Gleason category, and clinical stage were included in the multivariable regression model. All p-values were two-sided, and p < 0.05 was considered statistically significant. Linear regression models were used to determine the association between consecutive group number and mean operative time, blood loss, and LOS. These associations were graphically represented using cubic splines. Using logistic regression, we determined the association between consecutive group number and presence or absence of complications (operative and postoperative), positive margins, T2 positive margins, and T3/T4 positive margins, all graphically represented.
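The numbering and grouping step can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name `assign_experience_groups` is hypothetical, and ties in surgery date are broken by input order, which the text does not specify.

```python
from datetime import date

GROUP_SIZE = 25  # consecutive patients per group, as described in the text


def assign_experience_groups(surgery_dates, group_size=GROUP_SIZE):
    """Return a (case_number, group_number) pair for each surgery.

    Case numbers start at 1 and follow ascending date of surgery.
    Group numbers start at 1; group 1 holds cases 1..group_size, and so on.
    Results are returned in the same order as the input dates.
    """
    # Indices of the surgeries sorted by date (stable for equal dates).
    order = sorted(range(len(surgery_dates)), key=lambda i: surgery_dates[i])
    result = [None] * len(surgery_dates)
    for rank, idx in enumerate(order):
        case_number = rank + 1
        group_number = rank // group_size + 1
        result[idx] = (case_number, group_number)
    return result
```

With this scheme a series of 275 cases falls into 11 groups and a series of 225 cases into 9 groups, so the group number can serve as the experience variable in the regression models.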
Since there was no change in practice over the study period, stage migration and date of surgery did not need to be accounted for in these multivariate models.
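As a rough illustration of the linear part of this analysis, the sketch below computes the mean of an outcome per consecutive group and fits an ordinary least-squares line of group mean against group number. It is deliberately simplified: it is unadjusted (the study's models also included age, PSA, Gleason category, and clinical stage), it omits the cubic splines, and the function names are hypothetical.

```python
def group_means(groups, values):
    """Mean outcome (e.g. operative time) per group number, in ascending group order."""
    totals = {}
    for g, v in zip(groups, values):
        running_sum, count = totals.get(g, (0.0, 0))
        totals[g] = (running_sum + v, count + 1)
    return {g: s / c for g, (s, c) in sorted(totals.items())}


def ols_slope_intercept(x, y):
    """Unadjusted ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A negative slope for operative time, blood loss, or LOS corresponds to improvement per additional group of 25 cases. A full reproduction of the study's models would need a multivariable regression package rather than this single-predictor fit.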

Cohort details
Baseline characteristics of the patients in cohorts A and B, together with their intraoperative and postoperative data, are shown in Table 1. Median age was 61 years (IQR 56-65) and 62 years (57-66) respectively, and median presenting PSA was 7 (5.6-10) and 8.3 (6.1-12.5). In both cohorts, T2 was the most common clinical stage (45% and 60% respectively), and this increased to 65% and 68% at surgical staging. Mean operative time was 159 mins for surgeon A and 225 mins for surgeon B; mean estimated blood loss was 268 mL and 339 mL respectively. For both surgeons, mean length of hospital stay was two days. For each outcome, there was a downward trend of continuous improvement as the number of surgeries increased.
[...] was 9% for surgeon B, though these associations were not statistically significant. Amongst the patients with T2 disease in surgeon A's cohort, a statistically significant association was found between PSM rate and number of surgeries performed after adjustment for Gleason score, age, PSA, and clinical stage: a 29% decrease per 25 surgeries (95% CI: 0.56-0.90). A decrease of 12% (95% CI: 0.80-1.05) was reported for surgeon B. Surgeon A's T2 positive margin rate reached a plateau after 150 cases (Figure 2a), whereas PSM rates were more consistent for surgeon B (Figure 2b). Such a pattern was less clear for the subgroup with T3/T4 disease.
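The percentage decreases and confidence intervals above can be related with a short calculation. The sketch below assumes that the reported intervals are on the odds-ratio scale from the logistic regression (so a 29% decrease corresponds to an odds ratio of about 0.71 per 25 surgeries); the text does not state this explicitly, and the function names are hypothetical.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a percentage change in odds (negative = decrease)."""
    return (odds_ratio - 1.0) * 100.0


def rescale_odds_ratio(odds_ratio, from_cases, to_cases):
    """Re-express an odds ratio per `from_cases` surgeries as one per `to_cases`.

    Assumes the log-odds change linearly with case number, as in a
    logistic regression with experience entered as a linear term.
    """
    return odds_ratio ** (to_cases / from_cases)


def ci_excludes_no_effect(lower, upper):
    """True if a confidence interval for an odds ratio excludes 1 (no effect)."""
    return upper < 1.0 or lower > 1.0
```

Under these assumptions, the interval 0.56-0.90 for surgeon A excludes 1, consistent with the statistically significant association reported, whereas 0.80-1.05 for surgeon B includes 1.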

Discussion
Our study shows that operative time, blood loss, complications, hospital stay, and T2 margin status illustrate continuous improvement as a surgeon's experience increases. For surgeon A, we observed a plateau in LOS and in the rate of T2 PSMs.
The trends in operative time, blood loss, LOS, and complications described here parallel those which have previously been published [14,15]. Despite some conflicting results, a recent review concluded that PSM rate for open, perineal, and laparoscopic prostatectomy procedures improved with a surgeon's skill across the learning curve [1]. Furthermore, PSM rates have been shown to be lower amongst more experienced, higher-volume surgeons [16,17].
Given the paucity of data on how a surgical learning curve should be documented, and the forthcoming publication of surgeons' performance data [18], it is necessary to consider how surgical outcomes are reported in order to evaluate the optimal reporting method. For example, surgeon-documented blood loss relies on volume estimations in fluids mixed with urine, saline, and wash, and bleeding complications involve a method- and cost-dependent balance between tissue sparing and bleeding. Furthermore, as acknowledged in the British Association of Urological Surgeons nephrectomy database, outcomes documented by operating surgeons may be subject to bias, relying on honest, complete, and consistent reporting [19]. While Hospital Episode Statistics data has advantages and can be used to verify data such as LOS, it too may be subject to certain inaccuracies when documented [20]. Within our own cohort, there was notable case-to-case variation in the blood loss data reported by each individual surgeon, as well as significant variation between the surgeons in reporting operative time (data not shown). Given this variation, the clinical relevance and utility of reporting changes per patient is questionable. Aside from variations in surgeon reporting, LOS may also be affected by other factors, such as the day of the week on which a surgeon operates or the effects of other health care professionals involved in a patient's care following surgery [15,21].
In contrast to the above, PSM status is an outcome acquired independently of surgeon self-reporting and is sensitive to continuous monitoring of a surgeon's skill [22]. There is little evidence of significant inter-observer variability between pathologists [23]. A T2 positive margin is likely to result from iatrogenic capsular incision and is much less stage-dependent than a positive margin in non-organ-confined disease. T3 prostate cancers are a very heterogeneous group, and their positive margins are more tumour-dependent than surgeon-dependent; the opposite is the case for T2 positive margins. Regardless of the volume of cancer within the prostate, if the gland is not incised and the cancer is organ-confined, there will be a negative margin. Though there is conflicting evidence as to whether surgical margin status is a useful indicator of prognosis [8,13,24-27], we propose T2 margin rate to be the most useful outcome as described here. It is a marker of a surgeon's performance, improving with a surgeon's experience alongside various other surgical parameters.
www.ecancer.org ecancer 2016, 10:651

[Figure caption, beginning lost in extraction] (b) blood loss (mL) and (c) LOS (days) per 25 consecutive patients operated on by surgeon A (blue) and surgeon B (red).
One of the strengths of this study is the division of cases into groups of 25 operations, in a similar manner to the plotting of case number [28] and as opposed to groups of at least 50 [1], thereby decreasing the risk of drawing misrepresentative conclusions regarding the number of cases required for competency [29]. Multivariate analysis accounted for the positive association found in these cohorts between a surgeon's experience and Gleason score and clinical stage. Heterogeneity between surgeons is highlighted by learning curve studies such as this, demonstrating the need for individual analysis of a surgeon's performance with implications for both practice and training [30]. While a follow-up study might be proposed which includes the five-year biochemical recurrence (BCR) rates and continence data, it is beneficial that this study has used outcomes immediately available for a continuous record of performance. This would also be facilitated by the use of new early markers such as ultrasensitive PSA [31].
Lack of standardised measures of continence and potency impedes analysis of functional outcomes [3], and retrospectively derived data can prove unreliable. Although the surgeons in this study underwent training in a different environment to current trainees, the same outcome measures are useful and relevant for current practice. Clearly documented learning curves are likely to have future implications for the training and equipping of inexperienced surgeons in order to minimise the learning curve in surgical practice. Finally, it is a limitation that we only compared two surgeons; future studies comparing more surgeons will provide further insight into the use of T2 positive margin status as a marker for a surgeon's experience.

Conclusion
This study was undertaken in response to the need for clarity and transparency in reporting surgeon-specific outcomes and learning curves. It shows that T2 positive margin status is the best objectively acquired parameter representative of a surgeon's proficiency, which improves with experience. All outcomes have their importance, especially in terms of long-term disease-free survival and functional well-being. Until outcome analysis becomes a routine part of clinical practice, which is dependent on staffing and funding, it would be helpful to have a surrogate for gauging performance that is least subject to reporting bias.