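I only recently came across a substantial article by Tom Khabaza setting out the nine laws of data mining. He closes by summarizing them as haiku, and every word counts. The original is long, so I reproduce only the haiku here, along with a one-line summary of each law.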
First the business goal // Is all in data mining // This defines the field
Second is knowledge // Of business at every stage // This is the centre
Third prepare data // The form permits the question // Shape the problem space
Fourth is no free lunch // By searching find the model // And the goal may change
Fifth is Watkins' law // There will always be patterns // Traces in data
Sixth is perception // Data mining amplifies // Thus we find insight
Seventh prediction // Generalised from data // New facts locally
Eighth is the value // All value comes from business // Not technical things
Ninth and last is change // No pattern lasts forever // We do not stand still
Miners know this truth // How can business be better? // If we see what is
Can the process change? // Technology can change it // None has done so yet
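1. Business goals are the starting point of data mining.
2. Business knowledge runs through every stage of data mining.
3. Most of the work in data mining is data preparation.
4. For a given problem, the right model can only be found by repeated experimentation.
5. There are always patterns.
6. Data mining amplifies understanding of the business.
7. Through generalisation, predictive models increase information.
8. The value of data mining is not determined by the accuracy or stability of the model.
9. All patterns are subject to change.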