Join Matrices in MATLAB

Posted by 安稳与你 on 2020-01-09 07:38:05

Question


I have two matrices like the following ones:

'01/01/2010'          1
'02/01/2010'          2
'03/01/2010'          3
'05/01/2010'         11
'06/01/2010'         17

'01/01/2010'          4
'02/01/2010'          5
'04/01/2010'          6
'05/01/2010'          7

After doing a few tricky things in MATLAB, I want to create the following two matrices:

'01/01/2010'          1          4
'02/01/2010'          2          5
'03/01/2010'          3        NaN
'04/01/2010'        NaN          6
'05/01/2010'         11          7
'06/01/2010'         17        NaN


'01/01/2010'          1          4
'02/01/2010'          2          5
'05/01/2010'         11          7

Any idea on how to join these tables? Cheers.

EDIT: Really sorry for my typos, guys. I updated both the question and the input/output data. Please, feel free to provide suggestions.


Answer 1:


I believe what you are trying to achieve is what the database world calls an inner join and a full outer join.

First we start with the two datasets:

d1 = {
 '01/01/2010'          1
 '02/01/2010'          2
 '03/01/2010'          3
 '05/01/2010'         11
 '06/01/2010'         17
};
d2 = {
 '01/01/2010'          4
 '02/01/2010'          5
 '04/01/2010'          6
 '05/01/2010'          7
};

Here is the code to perform the two types of join:

%# get all possible dates, and convert them to indices starting at 1
[keys,~,ind] = unique( [d1(:,1);d2(:,1)] );

%# full outer join
ind1 = ind(1:size(d1,1));
ind2 = ind(size(d1,1)+1:end);

fullOuterJoin = cell(numel(keys),3);
fullOuterJoin(:) = {NaN};           %# fill with NaNs
fullOuterJoin(:,1) = keys;          %# union of dates
fullOuterJoin(ind1,2) = d1(:,2);    %# insert 1st dataset values
fullOuterJoin(ind2,3) = d2(:,2);    %# insert 2nd dataset values

%# inner join
loc1 = ismember(ind1, ind2);
loc2 = ismember(ind2, ind1);

innerJoin = cell(sum(loc1),3);
innerJoin(:,1) = d1(loc1,1);        %# intersection of dates
innerJoin(:,2) = d1(loc1,2);        %# insert 1st dataset values
innerJoin(:,3) = d2(loc2,2);        %# insert 2nd dataset values

Alternatively, we could have extracted the inner join from the outer join dataset by simply removing rows with any NaN values:

idx = all(~isnan(cell2mat(fullOuterJoin(:,2:end))), 2);
innerJoin = fullOuterJoin(idx,:);

Either way, the result:

>> fullOuterJoin
fullOuterJoin = 
    '01/01/2010'    [  1]    [  4]
    '02/01/2010'    [  2]    [  5]
    '03/01/2010'    [  3]    [NaN]
    '04/01/2010'    [NaN]    [  6]
    '05/01/2010'    [ 11]    [  7]
    '06/01/2010'    [ 17]    [NaN]

>> innerJoin
innerJoin = 
    '01/01/2010'    [ 1]    [4]
    '02/01/2010'    [ 2]    [5]
    '05/01/2010'    [11]    [7]
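
One caveat on the inner join above: the logical masks loc1 and loc2 keep each dataset's own row order, so the value columns only line up when both datasets list their shared dates in the same relative order (as they do here) and no date is repeated. A minimal sketch of an order-independent variant, assuming unique dates within each dataset, uses the index outputs of intersect. The shuffled d2 below is made up for illustration:

```matlab
% First dataset, as in the question
d1 = {
 '01/01/2010'          1
 '02/01/2010'          2
 '03/01/2010'          3
 '05/01/2010'         11
 '06/01/2010'         17
};
% Second dataset, deliberately shuffled (illustrative assumption)
d2 = {
 '05/01/2010'          7
 '01/01/2010'          4
 '04/01/2010'          6
 '02/01/2010'          5
};

% i1 and i2 are the row positions of the common dates in d1 and d2
[commonKeys, i1, i2] = intersect(d1(:,1), d2(:,1));

% Assemble the inner join row by row from the matched positions
innerJoin = [commonKeys(:), d1(i1,2), d2(i2,2)];
```

Because intersect returns matching positions for both inputs, the values are paired by date rather than by row order.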



Answer 2:


In MATLAB, a numeric matrix cannot hold strings as elements; for that you need a cell array. Here is a solution using cell arrays and containers.Map.

FirstCellArray = {
'01/01/2010', 1;
'02/01/2010', 2;
'03/01/2010', 3;
'05/01/2010', 11;
'06/01/2010', 17
};

SecondCellArray = {
'01/01/2010', 4;
'02/01/2010', 5;
'04/01/2010', 6;
'05/01/2010', 7;
};

AllDatesCellArray = union(FirstCellArray(:,1), SecondCellArray(:,1));

% Create containers.Maps for both cell arrays. containers.Maps are hash tables.

DateToFirstNumberMap = containers.Map(FirstCellArray(:,1), FirstCellArray(:,2));
DateToSecondNumberMap = containers.Map(SecondCellArray(:,1), SecondCellArray(:,2));

WithNaNsCellArray = AllDatesCellArray;

for Index = 1:size(WithNaNsCellArray, 1)
    Key = AllDatesCellArray{Index, 1};
    % Look up the date in the first map; use NaN when it is missing
    if isKey(DateToFirstNumberMap, Key)
        NumberOne = DateToFirstNumberMap(Key);
    else
        NumberOne = NaN;
    end
    WithNaNsCellArray{Index, 2} = NumberOne;
    % Same lookup against the second map
    if isKey(DateToSecondNumberMap, Key)
        NumberTwo = DateToSecondNumberMap(Key);
    else
        NumberTwo = NaN;
    end
    WithNaNsCellArray{Index, 3} = NumberTwo;
end

WithoutNaNsCellArray = WithNaNsCellArray;
NaNIndicesVector = (isnan([WithNaNsCellArray{:,2}]) | isnan([WithNaNsCellArray{:,3}]));
WithoutNaNsCellArray(NaNIndicesVector == 1, :) = [];

Then WithNaNsCellArray contains the result with NaN rows and WithoutNaNsCellArray contains the result without NaN rows.

WithNaNsCellArray = 
'01/01/2010'    [  1]    [  4]
'02/01/2010'    [  2]    [  5]
'03/01/2010'    [  3]    [NaN]
'04/01/2010'    [NaN]    [  6]
'05/01/2010'    [ 11]    [  7]
'06/01/2010'    [ 17]    [NaN]

WithoutNaNsCellArray = 
'01/01/2010'    [ 1]    [4]
'02/01/2010'    [ 2]    [5]
'05/01/2010'    [11]    [7]
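
A side note that applies to both answers so far: unique and union sort the date strings as text, not as dates. That happens to give chronological order for the sample data, but it will not in general once the dates span several months. A small sketch of sorting chronologically instead, assuming the strings are in dd/mm/yyyy form (the question does not say which format is meant) and using made-up dates:

```matlab
% Illustrative dates only; the 'dd/mm/yyyy' format is an assumption
dates = {'15/01/2010'; '01/02/2010'; '02/01/2010'};

% Convert to serial date numbers and sort on those
[~, order] = sort(datenum(dates, 'dd/mm/yyyy'));
sortedDates = dates(order);

% Plain text sorting, for comparison
textSorted = sort(dates);
```

Here sortedDates puts 2 January before 15 January before 1 February, while textSorted starts with '01/02/2010'.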



Answer 3:


The Statistics Toolbox contains a function called JOIN, which operates on dataset arrays and basically does what you want.

http://www.mathworks.de/de/help/stats/dataset.join.html

Unfortunately, it probably can't handle strings and polytyped matrices. But you might be able to use JOIN to shorten the solutions proposed by the other answers.
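
For what it's worth, newer MATLAB releases (R2013b and later, as far as I know) have a table type in base MATLAB with innerjoin and outerjoin functions, and tables can mix a text key column with numeric columns. A hedged sketch, where the variable names Date, A and B are made up for illustration:

```matlab
% Build one table per dataset; Date is the shared key column
t1 = table({'01/01/2010'; '02/01/2010'; '03/01/2010'; '05/01/2010'; '06/01/2010'}, ...
           [1; 2; 3; 11; 17], 'VariableNames', {'Date', 'A'});
t2 = table({'01/01/2010'; '02/01/2010'; '04/01/2010'; '05/01/2010'}, ...
           [4; 5; 6; 7], 'VariableNames', {'Date', 'B'});

% Full outer join: unmatched numeric entries are filled with NaN;
% 'MergeKeys' collapses the two key columns into a single Date column
fullOuter = outerjoin(t1, t2, 'Keys', 'Date', 'MergeKeys', true);

% Inner join: keeps only the dates present in both tables
inner = innerjoin(t1, t2, 'Keys', 'Date');
```

This requires no toolbox, but it does change the data structure from cell arrays to tables; table2cell can convert back if the rest of the code expects cells.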



Source: https://stackoverflow.com/questions/10981570/join-matrices-in-matlab
