Eric Harvey, PhD, Health Sciences, Constella Group, 2605 Meridian Parkway, Durham, NC 27713, 919-313-7725, eharvey@constellagroup.com, Patrick Crockett, PhD, Constella Health Sciences, Constella Group, 2605 Meridian Parkway, Durham, NC 27713, and Linda Goodwin, RN, BC, PhD, School of Nursing, Duke University, Box 3322 DUMC, Durham, NC 27710.
The Classification and Regression Tree (CART) algorithm is a hierarchical method for partitioning data into increasingly homogeneous groups. At each node in a tree, CART splits the data using a rule selected to maximize the homogeneity of the two resulting groups. The order in which these rules are selected can offer valuable information about their relative importance, which can lead to better decision making. In standard CART analyses, the data alone drive the rules selected at each splitting node. In some cases, information about the relative importance of the rules may be available from other sources. In a Bayesian setting, this is known as prior information. We propose a simple method based on weighted sums of squares that allows informative prior information to be included in CART analyses. We demonstrate the method’s application with an example of data-assisted diagnosis, and we compare the results of the weighted CART algorithm with those of the standard CART algorithm. Including the prior information has a noticeable impact on the rules selected. One must be careful not to overweight the prior, as doing so could cause the analysis to simply recreate the prior rule hierarchy.
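The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one way the idea could work. It assumes that each candidate rule's reduction in the within-group sum of squares is multiplied by a prior weight for the variable involved, and that the rule with the largest weighted reduction is chosen. The function names (`sse`, `best_split`), the `prior_weights` argument, and the toy data are all hypothetical and are not taken from the authors' work.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, y, prior_weights=None):
    """Pick the single split with the largest prior-weighted SSE reduction.

    rows          : list of dicts mapping variable name -> numeric value
    y             : list of outcomes, one per row
    prior_weights : optional dict mapping variable name -> weight
                    (assumed form of the prior; missing names default to 1.0)
    Returns (variable, threshold, weighted_gain).
    """
    total = sse(y)
    best = None
    for name in rows[0].keys():
        weight = 1.0 if prior_weights is None else prior_weights.get(name, 1.0)
        # Candidate thresholds: every observed value except the largest.
        for threshold in sorted({r[name] for r in rows})[:-1]:
            left = [yi for r, yi in zip(rows, y) if r[name] <= threshold]
            right = [yi for r, yi in zip(rows, y) if r[name] > threshold]
            gain = weight * (total - sse(left) - sse(right))
            if best is None or gain > best[2]:
                best = (name, threshold, gain)
    return best


# Toy data: variable "a" separates the outcome perfectly, "b" only partly.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 0, 1, 0, 1, 1, 1]
y = [0, 0, 0, 0, 1, 1, 1, 1]
rows = [{"a": ai, "b": bi} for ai, bi in zip(a, b)]

unweighted = best_split(rows, y)                  # data alone
mild_prior = best_split(rows, y, {"b": 2.0})      # modest prior favoring "b"
strong_prior = best_split(rows, y, {"b": 5.0})    # heavy prior favoring "b"
```

In this toy example the data alone select "a", a modest weight on "b" leaves that choice unchanged, and a heavy weight on "b" overrides the data. The last case illustrates the overweighting caution above: a sufficiently strong prior reproduces the prior ordering regardless of what the data show.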
Learning Objectives:
Keywords: Statistics, Disease Management
Presenting author's disclosure statement:
I do not have any significant financial interest/arrangement or affiliation with any organization/institution whose products or services are being discussed in this session.