Gene Copy Number Matrix
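Given a cluster file, one can create a gene copy number (GCN) matrix: each row is a cluster, each column is a species/gene name taken from a list file, and each entry counts how many times that name occurs in the cluster. The simple MATLAB script below creates one.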
Input Format:
Cluster file (ClusterFile.xlsx, Sheet1), one cluster per row:
(1) A B C D
(2) A A A
(3) B C
(4) D D A
List file (Organism_list.xlsx, Sheet1), one name per row:
A
B
C
D
Output (written to the MatOut sheet of ClusterFile.xlsx; the sheet itself contains only the numbers, the row and column labels are shown here for readability):
A B C D
(1) 1 1 1 1
(2) 3 0 0 0
(3) 0 1 1 0
(4) 1 0 0 2
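As a quick check of the example above, here is a minimal sketch that rebuilds the matrix from the toy data held in memory, with no Excel files involved. It is not the author's script but a compact variant of the same counting rule. It assumes blank cells are represented by empty strings, and the variable names (clusters, names, gcn) are made up for illustration.

```matlab
% Toy data mirroring the example; '' stands for a blank spreadsheet cell.
clusters = {'A','B','C','D'; ...
            'A','A','A','';  ...
            'B','C','','';   ...
            'D','D','A',''};
names = {'A','B','C','D'};

% One row per cluster, one column per name.
gcn = zeros(size(clusters,1), numel(names));
for i = 1:numel(names)
    % strcmp of a cell array against one name gives a logical array of the
    % same size; summing along dimension 2 counts the matches in each row.
    gcn(:,i) = sum(strcmp(clusters, names{i}), 2);
end
disp(gcn)
```

Blank cells never match a non-empty name, so they need no separate check here.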
%Author = Arun Prasanna
%Create a gene copy number matrix from cluster information
clear; clc; tic
[mat1, mat] = xlsread('ClusterFile.xlsx','Sheet1'); clear mat1
[mat2, head] = xlsread('Organism_list.xlsx','Sheet1'); clear mat2 %species/gene name
new_head = head(:,1)'; %transpose to make it as header
[rmat,cmat] = size(mat);
[rhead,chead] = size(new_head);
out = zeros(rmat,chead);
counter = 0;
for i = 1:chead
    i %print value of i to track progress
    for j = 1:rmat
        for k = 1:cmat
            cmp = strcmp(new_head(1,i),mat(j,k));
            chk = strcmp(mat(j,k),''); %Check for empty field;
            if (cmp == 1 && chk ~= 1)
                counter = counter + 1;
            else
                out(j,i) = out(j,i);
            end
        end
        out(j,i) = counter;
        counter = 0;
    end
end
OF = xlswrite('ClusterFile.xlsx',out,'MatOut')
disp('Program ends...output written')
toc
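The script writes only the numeric matrix. If the labelled layout from the example is wanted in the spreadsheet as well, one possible approach is to wrap the numbers in a cell array with a header row and a label column before writing. The sketch below uses the toy values from the example. The variable names and the sheet name 'MatOutLabelled' are hypothetical and not part of the original script.

```matlab
% Toy values taken from the example; in the script these would be
% new_head (the names) and out (the count matrix).
names = {'A','B','C','D'};
gcn   = [1 1 1 1; 3 0 0 0; 0 1 1 0; 1 0 0 2];

% Row labels '(1)', '(2)', ... as a column cell array.
rowLabels = arrayfun(@(r) sprintf('(%d)', r), (1:size(gcn,1))', ...
                     'UniformOutput', false);

% Header row on top, label column on the left, counts converted to cells.
labelled = [{''}, names; rowLabels, num2cell(gcn)];
disp(labelled)

% xlswrite accepts a cell array, so the table could be written with, e.g.:
% xlswrite('ClusterFile.xlsx', labelled, 'MatOutLabelled');
```

Keeping the plain numeric sheet alongside the labelled one keeps the matrix easy to load back with xlsread.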