Bump hunting, proposed by Friedman and Fisher, has become important
in many fields. Suppose that we are interested in finding regions
where target points are denser than in other regions. Such dense regions
of target points are called bumps, and finding them is called
bump hunting. By prespecifying a pureness rate, the maximum
capture rate can be obtained, and a tradeoff curve between
the two can then be constructed. Thus, finding the bump regions is equivalent
to constructing the tradeoff curve. When we adopt simpler boundary
shapes for the bumps, such as a union of boxes aligned parallel
to some of the explanatory variable axes, it is convenient to adopt
a binary decision tree. Since the conventional binary decision
tree does not provide the maximum capture rates, we use a genetic
algorithm (GA) specialized to the tree structure, the treeGA. Assessing
the accuracy of the tradeoff curve in typical fundamental cases, we find that
the proposed treeGA constructs an effective tradeoff curve
that is close to the optimal one compared with the curve obtained
by PRIM (the Patient Rule Induction Method) proposed by Friedman
and Fisher.
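
As a minimal sketch of the two quantities traded off above, the following Python fragment evaluates a candidate bump region given as a union of axis-parallel boxes. It assumes the usual definitions, which are not spelled out in this abstract: the pureness rate is the fraction of points inside the region that are target points, and the capture rate is the fraction of all target points that fall inside the region. The names in_box and pureness_and_capture and the toy data are hypothetical and serve illustration only; this is not the treeGA itself.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Box = Sequence[Tuple[float, float]]  # one (low, high) interval per variable


def in_box(x: Point, box: Box) -> bool:
    """Return True if x lies inside the axis-parallel box (bounds inclusive)."""
    return all(lo <= xi <= hi for xi, (lo, hi) in zip(x, box))


def pureness_and_capture(
    points: Sequence[Point], labels: Sequence[int], boxes: Sequence[Box]
) -> Tuple[float, float]:
    """Assumed definitions: label 1 marks a target point.

    pureness = targets inside region / all points inside region
    capture  = targets inside region / all targets
    """
    # A point belongs to the region if it falls in at least one box.
    inside: List[bool] = [any(in_box(x, b) for b in boxes) for x in points]
    n_inside = sum(inside)
    n_target = sum(1 for y in labels if y == 1)
    hits = sum(1 for flag, y in zip(inside, labels) if flag and y == 1)
    pureness = hits / n_inside if n_inside else 0.0
    capture = hits / n_target if n_target else 0.0
    return pureness, capture


# Toy data (hypothetical): six points in two dimensions, three of them targets.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.8, 0.9), (0.9, 0.1), (0.3, 0.8)]
labels = [1, 1, 0, 1, 0, 0]
boxes = [[(0.0, 0.5), (0.0, 0.5)]]  # a single box covering the lower-left corner

pureness, capture = pureness_and_capture(points, labels, boxes)
print(pureness, capture)
```

Under these assumptions, one point of the tradeoff curve would correspond to the largest capture rate found among regions whose pureness rate reaches the prespecified value; repeating the search over a range of pureness values traces out the curve.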
To apply this method in the medical field, we have to overcome the difficulty
of the N < P problem, i.e., the number of samples, N, is less than the number of feature
variables, P. In higher dimensions, the tradeoff curve may be biased. In this
presentation, we demonstrate this kind of bias in typical cases and propose
a method to remove it. We have found that the proposed method works as
long as the number of feature variables is not too large.





