Posts

Showing posts from May, 2016

Fasta_Header_Rename

A simple MATLAB script to rename the headers in a FASTA file. The variable names are self-explanatory.

```matlab
%Author = Arun Prasanna
%Rename the headers in fasta file to desired choice
%For example: The input fasta file used here had header in >num_name format
%strtok is used to strip and extract the required format

clear; clc; tic;
Path = 'Drive\Path\ToReadFile';%
FileList = dir(Path);
[rFL, cFL] = size(FileList);

for i = 3:rFL %i of 1 & 2 are . & .. respectively
    Fas_Fname{i-2,1} = FileList(i).name; %FileList is a structure
end
[rFas,cFas] = size(Fas_Fname);

for i = 1:rFas
    clear Header Seq ProtID Sp new_Header
    OpenFile = cell2mat(strcat(Path,Fas_Fname(i)));
    [Header, Seq] = fastaread(OpenFile);
    [rH,cH] = size(Header);
    for j = 1:cH
        [ProtID, Sp] = strtok(Header(1,j),'_'); %First Sp = _name
        [Sp, rm] = strtok(Sp,'_'); %Second SP = name ! which we ne…
```
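The two `strtok` calls are the core of the renaming step, so here is a minimal sketch of what they do to a single header. The header value `'12_Homo'` is a made-up example of the `>num_name` format, not data from the original files, and the variable names are only illustrative.

```matlab
% Hypothetical header in the num_name format (assumed example value)
header = '12_Homo';

% First call: text before the first '_' and the remainder, which keeps the '_'
[protID, rest] = strtok(header, '_');   % protID = '12', rest = '_Homo'

% Second call: strtok skips the leading '_' and returns the name part
[sp, leftover] = strtok(rest, '_');     % sp = 'Homo', leftover is empty

fprintf('%s -> %s\n', header, sp);
```

Because `strtok` ignores leading delimiters, the second call is enough to drop the underscore left over from the first one.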

Fasta_Duplicate_Header

A simple, self-explanatory MATLAB script to identify duplicate headers in FASTA files.

```matlab
%Simple matlab code to check for the duplicate header in fasta files
%Store the size of header -> unique(header) ->size of new header
%Copy the Table data in excel and compare the two values

clear; clc; tic;
Path = 'Drive\Path\FileName';%
FileList = dir(Path);
[rFL, cFL] = size(FileList);

for i = 3:rFL %i of 1 & 2 are . & .. respectively
    Fas_Fname{i-2,1} = FileList(i).name; %FileList is a structure
end
[rFas,cFas] = size(Fas_Fname);

for i = 1:rFas
    clear Header Seq Old_Header Unik_header
    OpenFile = cell2mat(strcat(Path,Fas_Fname(i)));
    [Header, Seq] = fastaread(OpenFile);
    [rH,cH] = size(Header);
    Old_Header = length(Header);
    Unik_header = length(unique(Header));
    Table{i,1} = Fas_Fname(i);
    Table{i,2} = num2str(Old_Header);
    Table{i,3} = num2str(Unik_header);
    fprintf('…
```
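Comparing the header count before and after `unique` shows that duplicates exist, but not which headers they are. As a possible extension (not part of the original script), the sketch below uses the third output of `unique` together with `accumarray` to list the repeated headers. The `headers` cell array is made-up data standing in for the `Header` output of `fastaread`.

```matlab
% Hypothetical headers (assumed example data, not from a real FASTA file)
headers = {'1_A', '2_B', '1_A', '3_C', '2_B', '1_A'};

% uniq: distinct headers; idx: position in uniq of each original header
[uniq, ~, idx] = unique(headers);

% Count how many times each distinct header occurs
counts = accumarray(idx(:), 1);

% Keep only the headers that occur more than once
dups = uniq(counts > 1);
dupCounts = counts(counts > 1);

for k = 1:numel(dups)
    fprintf('%s occurs %d times\n', dups{k}, dupCounts(k));
end
```

If `dups` is empty, the file has no duplicate headers, which matches the case where the two counts in the table are equal.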