Tuesday, March 20, 2012

cluster discrimination scores

Dear all here:

I have a question about the cluster mining result.

The Discrimination tab shows the difference between group 1 and group 2.

I just wonder how the discrimination scores are calculated.

Can any expert answer this question?

Regards!

Jerry

This is calculated with a statistical method. We assume each cluster follows a Gaussian distribution on a continuous attribute, or a multinomial distribution on a discrete attribute. The parameters of these distributions are then calculated for each cluster. For each specific attribute value, we can then evaluate its probability by combining the cluster distributions with the given attribute value (all other attributes are set to unknown). The discrimination score for each attribute value is calculated from that probability. The math formulas used to calculate the final scores are very complicated and not yet documented, so I won’t go into every detail here.

Thanks,


Yimin Wu, SQL Server Data Mining
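
To make the description above concrete, here is a minimal Python sketch for a single continuous attribute. It is not the formula SQL Server uses, since that formula is undocumented. The function names, the cluster parameters, and the scoring rule (a signed, normalized difference of cluster posteriors scaled to the range -100 to 100) are all assumptions made for illustration only.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def cluster_posteriors(x, clusters):
    """P(cluster | attribute = x), treating every other attribute as unknown.

    clusters maps a cluster name to (prior, mean, std). These parameters are
    hypothetical; in practice they would be estimated from the cases in
    each cluster.
    """
    joint = {
        name: prior * gaussian_pdf(x, mean, std)
        for name, (prior, mean, std) in clusters.items()
    }
    total = sum(joint.values())
    return {name: p / total for name, p in joint.items()}


def illustrative_score(x, clusters, a, b):
    """Hypothetical stand-in score, NOT the product's formula.

    Positive values mean x favors cluster a, negative values favor cluster b.
    """
    post = cluster_posteriors(x, clusters)
    return 100.0 * (post[a] - post[b]) / (post[a] + post[b])


# Made-up parameters for one continuous attribute: (prior, mean, std).
clusters = {
    "group 1": (0.5, 30.0, 5.0),
    "group 2": (0.5, 50.0, 5.0),
}

for value in (30.0, 40.0, 50.0):
    score = illustrative_score(value, clusters, "group 1", "group 2")
    print(value, round(score, 1))
```

With these made-up parameters, a value near the mean of group 1 gets a positive score, a value near the mean of group 2 gets a negative score, and a value halfway between the two means scores about zero. A discrete attribute could be handled the same way by replacing gaussian_pdf with a per-cluster table of state frequencies.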
