SPSS K-Means analysis: problem with initial cluster centers

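I have two tables: Table 1 holds the applicants' scores, and Table 2 holds the scores the hiring departments expect.
Table 1: 6 fields
Applicant (string), written test score (numeric), breadth of knowledge (numeric), comprehension (numeric), adaptability (numeric), communication (numeric)
Table 2: 6 fields
Hiring department (string), written test score (numeric), breadth of knowledge (numeric), comprehension (numeric), adaptability (numeric), communication (numeric)
I want to run a K-Means analysis on Table 1, using Table 2 as the initial cluster centers.
(Note: all fields are labeled in English.)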

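However, the analysis fails with an error.
The steps were:
Analyze→Classify→K-Means Cluster,
Variables: the five variables written test score, breadth of knowledge, comprehension, adaptability, and communication;
Label Cases by: applicant;
Number of Clusters: 4;
Method: Iterate and classify;
Cluster Centers: Read initial from→file: select the location of Table 2, then click Continue.
With all of that set, clicking OK to run the K-Means analysis produces the following error: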
The file referenced on the FILE subcommand does not have the proper format for QUICK CLUSTER initial cluster centers.
This command is not executed.

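Answer 1  2013-01-29
I ran into the same problem and found the answer through Google. In the initial centers file, the first variable must be cluster_, with values 1 to k (k being the number of clusters). I set it up that way and it worked!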
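As a rough illustration of that fix, here is a minimal Python sketch that takes the department rows and puts a cluster_ column with values 1 to k in front of the analysis variables. The variable names and the scores are made up for the example (the post only says the fields are labeled in English), and the sketch only builds the table layout; the result would still have to be saved as an SPSS data file by some other means.

```python
# Hypothetical English variable names; the original post does not list them.
ANALYSIS_VARS = ["written", "knowledge", "comprehension", "adaptability", "communication"]


def add_cluster_ids(department_rows):
    """Return center rows whose first column is cluster_ = 1..k.

    Only the analysis variables are kept after cluster_; the department
    name is dropped because it is not part of the cluster analysis.
    """
    centers = []
    for k, row in enumerate(department_rows, start=1):
        center = {"cluster_": k}  # inserted first, so it stays the first column
        for name in ANALYSIS_VARS:
            center[name] = row[name]
        centers.append(center)
    return centers


# Made-up expected scores for four departments, for illustration only.
departments = [
    {"department": "Dept A", "written": 85, "knowledge": 80, "comprehension": 75, "adaptability": 70, "communication": 80},
    {"department": "Dept B", "written": 70, "knowledge": 85, "comprehension": 80, "adaptability": 75, "communication": 70},
    {"department": "Dept C", "written": 75, "knowledge": 70, "comprehension": 85, "adaptability": 80, "communication": 75},
    {"department": "Dept D", "written": 80, "knowledge": 75, "comprehension": 70, "adaptability": 85, "communication": 85},
]

centers = add_cluster_ids(departments)
for center in centers:
    print(center)
```

With four department rows this gives four center rows numbered 1 to 4, matching Number of Clusters: 4 in the dialog.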

Q:
I ran the SPSS Quick Cluster procedure for K Means cluster
analysis, specifying an SPSS file with the initial cluster
centers. I received error message # 14024 which states:
"The file referenced in the FILE subcommand does not have
the proper format for QUICK CLUSTER initial cluster centers."

My cluster center file includes all the variables that are
used in the Quick Cluster command and there is one case for
each of the centers. Am I missing something?

A:
The first variable in your cluster center file must be
named cluster_ . The values for cluster_ in the K rows will
be 1, 2, ... K, where K is the number of clusters. The absence
of cluster_ will trigger the improper format error message.
Other essential properties of the centers file include:
1. You must have at least as many cases in the center file
as the number of clusters specified in the QUICK CLUSTER command.
If there are K cases in the centers file and J (J< K) clusters
specified in the QUICK CLUSTER command, only the first J cases
from the centers file will be used.
2. All the variables that are included in the QUICK CLUSTER
command must be included in the centers file. Variable order
need not be identical in the QUICK CLUSTER command and the
centers file (provided cluster_ comes first in the centers file).
Variables in the centers file that are not in the QUICK CLUSTER
variable list will be ignored in the analysis.
(This answer was accepted by the asker and other users.)
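The rules in this answer can be turned into a quick pre-check before running the procedure. The following Python sketch is only one reading of those rules, not anything SPSS provides: the function name check_centers and its arguments are hypothetical, and it assumes the column names and the cluster_ values of the centers file have already been read into plain lists.

```python
def check_centers(columns, cluster_ids, analysis_vars, n_clusters):
    """Return a list of layout problems; an empty list means none were found.

    columns       -- variable names of the centers file, in file order
    cluster_ids   -- values of cluster_ in the centers file, one per case
    analysis_vars -- variables named in the QUICK CLUSTER command
    n_clusters    -- number of clusters requested
    """
    problems = []

    # cluster_ must be the first variable in the centers file.
    if not columns or columns[0] != "cluster_":
        problems.append("first variable must be named cluster_")

    # At least as many cases as clusters; only the first n_clusters are used.
    if len(cluster_ids) < n_clusters:
        problems.append("centers file has fewer cases than clusters")
    elif list(cluster_ids[:n_clusters]) != list(range(1, n_clusters + 1)):
        problems.append("cluster_ values should run 1, 2, ... in order")

    # Every analysis variable must be present; order and extras do not matter.
    missing = [v for v in analysis_vars if v not in columns]
    if missing:
        problems.append("missing analysis variables: " + ", ".join(missing))

    return problems


# A file laid out like Table 2 in the question: no cluster_ column.
print(check_centers(["department", "a", "b"], [], ["a", "b"], 4))

# A layout that follows the rules, with the variables in a different order.
print(check_centers(["cluster_", "b", "a", "extra"], [1, 2, 3, 4], ["a", "b"], 4))
```

The first call reports both the missing cluster_ variable and the missing cases, which is the situation described in the question; the second call returns an empty list.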