FCM Clustering numeric data and csv/excel file

好久不见. 提交于 2019-12-18 09:29:17

问题


Hi I asked a previous question that gave a reasonable answer and I thought I was back on track, Fuzzy c-means tcp dump clustering in matlab the problem is the preprocessing stage of the below tcp/udp data that I would like to run through matlabs fcm clustering algorithm.My question:

1) how do i or what would be the best method to convert the text data in the cells to a numeric value? what should the numeric value be?

Edit: My data in excel looks like this now:

0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal.

回答1:


Here is an example how I would read the data into MATLAB. You need two things: the data itself which is in comma-separated format, as well as the list of features along with their types (numeric,nominal).

%# read the list of features
fid = fopen('kddcup.names','rt');
C = textscan(fid, '%s %s', 'Delimiter',':', 'HeaderLines',1);
fclose(fid);

%# determine type of features
C{2} = regexprep(C{2}, '.$','');              %# remove "." at the end
attribNom = [ismember(C{2},'symbolic');true]; %# nominal features

%# build format string used to read/parse the actual data
frmt = cell(1,numel(C{1}));
frmt( ismember(C{2},'continuous') ) = {'%f'}; %# numeric features: read as number
frmt( ismember(C{2},'symbolic') ) = {'%s'};   %# nominal features: read as string
frmt = [frmt{:}];
frmt = [frmt '%s'];                           %# add the class attribute

%# read dataset
fid = fopen('kddcup.data','rt');
C = textscan(fid, frmt, 'Delimiter',',');
fclose(fid);

%# convert nominal attributes to numeric
ind = find(attribNom);
G = cell(numel(ind),1);
for i=1:numel(ind)
    [C{ind(i)},G{i}] = grp2idx( C{ind(i)} );
end

%# all numeric dataset
M = cell2mat(C);

You could also look into the DATASET class from the Statistics Toolbox.



来源:https://stackoverflow.com/questions/7712748/fcm-clustering-numeric-data-and-csv-excel-file

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!