split_dat.Rd
Split a data frame of sequences into groups according to a reference table.
split_dat(dat, ref_table)
dat | data frame generated by |
---|---|
ref_table | data frame with first column for the name of the sequence, second column for the group the sequence belongs to. |
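To illustrate the layout that ref_table is described as having, here is a minimal sketch in base R. The column names `sequence_name` and `sequence_group` are illustrative only; the argument description specifies the columns by position (name first, group second), not by name.

```r
# Illustrative reference table: column 1 holds the sequence name,
# column 2 holds the group that sequence belongs to.
# The column names themselves are assumptions made for this sketch.
ref_table <- data.frame(
  sequence_name  = c("seq_1", "seq_2", "seq_10"),
  sequence_group = c("group1", "group1", "group2"),
  stringsAsFactors = FALSE
)

str(ref_table)
```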
Each group of sequences will be saved to its own FASTA file. Sequences not included in ref_table will be saved in "Ungrouped.fasta".
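The behaviour described above can be approximated in base R. The following is a rough sketch, not the package's actual implementation: it assumes the sequence data frame has columns named `seq.name` and `seq.text`, uses made-up data, and writes to a temporary directory rather than the working directory.

```r
# Stand-in for the sequence data frame; the column names seq.name and
# seq.text are assumptions made for this sketch.
dat <- data.frame(
  seq.name = c("seq_1", "seq_2", "seq_3", "seq_4"),
  seq.text = c("--TTACAAATTG", "GATTACAAATTG", "GATTACAAATTG", "---TACAAATTG"),
  stringsAsFactors = FALSE
)

# Reference table: first column is the sequence name, second is the group.
# seq_4 is deliberately left out so that it ends up ungrouped.
ref_table <- data.frame(
  name  = c("seq_1", "seq_2", "seq_3"),
  group = c("group1", "group1", "group2"),
  stringsAsFactors = FALSE
)

# Look up the group of every sequence; names missing from ref_table
# yield NA and are relabelled "Ungrouped".
grp <- ref_table[[2]][match(dat$seq.name, ref_table[[1]])]
grp[is.na(grp)] <- "Ungrouped"

# Write one FASTA file per group into a temporary directory.
out_dir <- tempdir()
parts <- split(dat, grp)
for (g in names(parts)) {
  # Interleave ">name" header lines with the corresponding sequence lines.
  fasta_lines <- as.vector(rbind(paste0(">", parts[[g]]$seq.name),
                                 parts[[g]]$seq.text))
  writeLines(fasta_lines, file.path(out_dir, paste0(g, ".fasta")))
}
```

Matching on names with `match()` keeps the original order of sequences within each group, and any sequence without an entry in the reference table is collected into a single catch-all file.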
This is a subroutine called for its side effect of writing files; there is no return value.
http://www.genomatix.de/online_help/help/sequence_formats.html
Jinlong Zhang <jinlongzhang01@gmail.com>
cat(
  ">seq_1",  "--TTACAAATTGACTTATTATA",
  ">seq_2",  "GATTACAAATTGACTTATTATA",
  ">seq_3",  "GATTACAAATTGACTTATTATA",
  ">seq_5",  "GATTACAAATTGACTTATTATA",
  ">seq_8",  "GATTACAAATTGACTTATTATA",
  ">seq_10", "---TACAAATTGAATTATTATA",
  ">seq_11", "--TTACAAATTGACTTATTATA",
  ">seq_12", "GATTACAAATTGACTTATTATA",
  ">seq_13", "GATTACAAATTGACTTATTATA",
  ">seq_15", "GATTACAAATTGACTTATTATA",
  ">seq_16", "GATTACAAATTGACTTATTATA",
  ">seq_17", "---TACAAATTGAATTATTATA",
  file = "trnh.fasta", sep = "\n")

sequence_name <- get.fasta.name("trnh.fasta")
sequence_group <- c("group1", "group1", "group1", "group1", "group1",
                    "group2", "group2", "group2",
                    "group3", "group3", "group3", "group3")
group <- data.frame(sequence_name, sequence_group)

fasta <- read.fasta("trnh.fasta")
split_dat(fasta, group)
#> ungrouped.fasta has been saved to /Users/jinlong/Documents/github/phylotools/docs/reference
#> group1.fasta has been saved to /Users/jinlong/Documents/github/phylotools/docs/reference
#> group2.fasta has been saved to /Users/jinlong/Documents/github/phylotools/docs/reference
#> group3.fasta has been saved to /Users/jinlong/Documents/github/phylotools/docs/reference
#> splitted fasta files have been saved to:
#> /Users/jinlong/Documents/github/phylotools/docs/reference