Presence Absence Matrix

Given a cluster file, one can create a presence-absence matrix (PA map): a table that records, for each cluster, whether each species or gene from a reference list occurs in it (1) or not (0). The short MATLAB script below builds one.

Input Format:

ClusterFile.xlsx (sheet 'Sheet1', one cluster per row):
(1) A  B  C  D
(2) A  A  A
(3) B C
(4) D D A

List.xlsx (sheet 'Header', the unique species/gene list):
A
B
C
D

Output (written to the sheet 'MatOut' of ClusterFile.xlsx; the sheet contains only the numbers, the row and column labels are shown here for readability):

     A  B  C  D
(1)  1  1  1  1
(2)  1  0  0  0
(3)  0  1  1  0
(4)  1  0  0  1



%Author = Arun Prasanna
%Create a presence-absence matrix (PA map) from cluster information
clear; clc;
% Only the text cells are needed; the numeric output of xlsread is discarded
[~, mat] = xlsread('ClusterFile.xlsx','Sheet1');  % one cluster per row
[~, head] = xlsread('List.xlsx','Header');        % unique sp/gene list
% Column of the list sheet that holds the unique sp/gene names
% (1 fits the single-column example; use 2 if the names are in the second column)
col_val = 1;
new_head = head(:,col_val);
[rmat,cmat] = size(mat);
rhead = size(new_head,1);
% Rows = clusters, columns = list entries, as in the example output
out = zeros(rmat,rhead);
for i = 1:rhead
    for j = 1:rmat
        for k = 1:cmat
            if strcmp(new_head{i},mat{j,k})
                out(j,i) = 1;
                break;  % one match is enough to mark presence
            end
        end
    end
end
OF = xlswrite('ClusterFile.xlsx',out,'MatOut');
if OF
    disp('Program ends...output written')
else
    disp('Unable to write output (the matrix may be too large for the sheet)')
end
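
The triple loop in the script can probably be shortened with MATLAB's built-in ismember, which tests every list entry against a whole cluster row in one call. The following is only a minimal sketch, not the original method: it assumes the example data from the Input Format section is typed in directly as cell arrays instead of being read from Excel, and the variable names clusters, list and pa are made up for illustration.

```matlab
% Example data held in memory (hypothetical stand-in for the xlsread output).
% Short clusters are padded with '' so that every row has the same width.
clusters = {'A','B','C','D';
            'A','A','A','';
            'B','C','','';
            'D','D','A',''};
list = {'A';'B';'C';'D'};

nClusters = size(clusters,1);
pa = zeros(nClusters, numel(list));   % rows = clusters, columns = list entries
for j = 1:nClusters
    % ismember flags each list entry that occurs anywhere in cluster j
    pa(j,:) = ismember(list, clusters(j,:))';
end
disp(pa)
```

For the four example clusters this should give the same 4 x 4 matrix as in the Output section. The empty padding cells never match a list entry, so clusters of different lengths need no special handling.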
