Posts

Showing posts from May, 2016

Fasta_Header_Rename

A simple MATLAB script to rename the headers in a FASTA file. The variable names are self-explanatory.

%Author = Arun Prasanna
%Rename the headers in fasta file to desired choice
%For example: The input fasta file used here had header in >num_name format
%strtok is used to strip and extract the required format
clear; clc; tic;
Path = 'Drive\Path\ToReadFile' ; %
FileList = dir(Path);
[rFL, cFL] = size(FileList);
for i = 3:rFL %i of 1 & 2 are . & .. respectively
    Fas_Fname{i-2,1} = FileList(i).name; %FileList is a structure
end
[rFas,cFas] = size(Fas_Fname);
for i = 1:rFas
    clear Header Seq ProtID Sp new_Header
    OpenFile = cell2mat(strcat(Path,Fas_Fname(i)));
    [Header, Seq] = fastaread(OpenFile);[rH,cH] = size(Header);
    for j = 1:cH
        [ProtID, Sp] = strtok(Header(1,j), '_' ); %First Sp = _name
        [Sp, rm] = strtok(Sp, '_' ); %Se
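The two strtok calls in the excerpt peel one underscore-delimited token off the header at a time. A minimal sketch of that step on a single header follows. The header text and the names Rest and Rest2 are made up for illustration and are not from the original script, which is cut off before it shows how the new header is assembled.

```matlab
% Hypothetical header in num_name format. fastaread returns headers
% without the leading '>', so it is omitted here as well.
Header = '12345_Homo_sapiens';

% First call: everything before the first '_' is the token;
% the remainder still starts with the delimiter.
[ProtID, Rest] = strtok(Header, '_');   % ProtID = '12345', Rest = '_Homo_sapiens'

% Second call: strtok skips the leading '_' and returns the next token.
[Sp, Rest2] = strtok(Rest, '_');        % Sp = 'Homo', Rest2 = '_sapiens'

fprintf('ID: %s, first name token: %s, remainder: %s\n', ProtID, Sp, Rest2);
```

Because strtok ignores leading delimiters, the second call can take the remainder of the first call directly, which is what the excerpt does when it reuses Sp.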

Fasta_Duplicate_Header

A simple, self-explanatory MATLAB script to identify duplicate headers in FASTA files.

%Simple matlab code to check for the duplicate header in fasta files
%Store the size of header -> unique(header) ->size of new header
%Copy the Table data in excel and compare the two values
clear; clc; tic;
Path = 'Drive\Path\FileName' ; %
FileList = dir(Path);
[rFL, cFL] = size(FileList);
for i = 3:rFL %i of 1 & 2 are . & .. respectively
    Fas_Fname{i-2,1} = FileList(i).name; %FileList is a structure
end
[rFas,cFas] = size(Fas_Fname);
for i = 1:rFas
    clear Header Seq Old_Header Unik_header
    OpenFile = cell2mat(strcat(Path,Fas_Fname(i)));
    [Header, Seq] = fastaread(OpenFile);[rH,cH] = size(Header);
    Old_Header = length(Header);
    Unik_header = length(unique(Header));
    Table{i,1} = Fas_Fname(i);
    Table{i,2} = num2str(Old_Header);
    Table{i,3} = num2str(Unik_heade
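The excerpt compares the number of headers with the number of unique headers, which shows that duplicates exist but not which ones. As a possible extension, the sketch below also lists the repeated headers. The header values and the names idx, counts and Duplicates are assumptions for illustration, not part of the original script.

```matlab
% Hypothetical header list, standing in for the cell array that
% fastaread returns for a multi-sequence file.
Header = {'1_a', '2_b', '1_a', '3_c'};

% unique returns the distinct headers; idx maps each original
% header to its position in Unik_header.
[Unik_header, ~, idx] = unique(Header);

% Count how many times each distinct header occurs.
counts = accumarray(idx(:), 1);

% Headers that occur more than once are the duplicates.
Duplicates = Unik_header(counts > 1);

fprintf('%d headers, %d unique\n', numel(Header), numel(Unik_header));
disp(Duplicates);
```

If the two counts printed are equal, Duplicates is empty and the file has no repeated headers, which matches the comparison the original script leaves to Excel.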