get.fasta.name.Rd
Get the names of all the sequences in a fasta file, and optionally clean the sequence names.
get.fasta.name(infile, clean_name = FALSE)
infile | character string representing the name of the fasta file. |
---|---|
clean_name | logical, indicating whether the sequence names should be cleaned. |
a character vector containing the names of the sequences
http://www.genomatix.de/online_help/help/sequence_formats.html
Jinlong Zhang <jinlongzhang01@gmail.com>
When clean_name = TRUE, punctuation characters and white space are replaced by "_". The definition of punctuation characters can be found in the regex help page.
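The replacement rule described above can be approximated in base R. This is a minimal sketch, not the package's actual implementation: `clean_names_sketch` is a hypothetical helper, and it assumes the cleaning amounts to a single `gsub()` over the POSIX classes `[:punct:]` and `[:space:]`.

```r
# Hypothetical helper (not part of the package): replaces every punctuation
# or white space character in a sequence name with "_".
clean_names_sketch <- function(x) {
  gsub("[[:punct:][:space:]]", "_", x)
}

# The space, "|" and "-" are each replaced; "_" is itself punctuation,
# so "seq_3" comes back unchanged.
clean_names_sketch(c("seq 2|trnL-F", "seq_3"))
#> [1] "seq_2_trnL_F" "seq_3"
```

Because "_" belongs to `[:punct:]`, names that are already clean pass through untouched under this assumption.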
cat(
  ">seq_2", "GTCTTATAAGAAAGAATAAGAAAG--AAATACAAA-------AAAAAAGA",
  ">seq_3", "GTCTTATAAGAAAGAAATAGAAAAGTAAAAAAAAA-------AAAAAAAG",
  ">seq_5", "GACATAAGACATAAAATAGAATACTCAATCAGAAACCAACCCATAAAAAC",
  ">seq_8", "ATTCCAAAATAAAATACAAAAAGAAAAAACTAGAAAGTTTTTTTTCTTTG",
  ">seq_9", "ATTCTTTGTTCTTTTTTTTCTTTAATCTTTAAATAAACCTTTTTTTTTTA",
  file = "trn1.fasta", sep = "\n")
get.fasta.name("trn1.fasta")
#> [1] "seq_2" "seq_3" "seq_5" "seq_8" "seq_9"