Clean the sequence names in a fasta file. Punctuation characters and white space in the names are replaced with "_".

clean.fasta.name(infile = NULL, outfile = "name_cleaned.fasta")

Arguments

infile

Character string giving the name of the input fasta file.

outfile

Character string giving the name of the file to be generated. Defaults to "name_cleaned.fasta".

Details

Punctuation characters and white space in the sequence names are replaced by "_". For details on how these characters are matched, see the regex help topic.
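The replacement rule described above can be illustrated with a minimal sketch in base R. It assumes the rule is equivalent to matching the POSIX classes [[:punct:]] and [[:space:]] described in the regex help topic; the helper name clean_names_sketch is hypothetical and is not part of phylotools, and the package's actual implementation may differ.

```r
# Hypothetical helper (not part of phylotools): replace every punctuation
# or white-space character with "_" using POSIX character classes.
clean_names_sketch <- function(x) {
  gsub("[[:punct:][:space:]]", "_", x)
}

seq_names <- c("seq_1*66", "seq_2()r", "seq_3:test", "seq 588")
cleaned <- clean_names_sketch(seq_names)
print(cleaned)
```

Note that "_" is itself a punctuation character, so existing underscores are replaced by "_" and the name appears unchanged. Each matched character is replaced one for one, so "()" becomes "__".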

Value

This function is called for its side effect and has no return value. A fasta file with all sequence names cleaned is saved to the working directory.
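To make the side effect concrete, the following sketch shows one way such a function could be written in base R: it reads the file, rewrites only the header lines (those starting with ">"), and writes the result to a new file without returning a value. The function clean_fasta_headers_sketch is a hypothetical illustration under the assumption that only header lines are changed; it is not the phylotools source code.

```r
# Hypothetical re-implementation for illustration only (not phylotools code).
clean_fasta_headers_sketch <- function(infile, outfile) {
  lines <- readLines(infile)
  is_header <- startsWith(lines, ">")
  # Strip the leading ">" so it is not replaced, clean the name, then restore it.
  header_names <- substring(lines[is_header], 2)
  lines[is_header] <- paste0(">", gsub("[[:punct:][:space:]]", "_", header_names))
  writeLines(lines, outfile)
  invisible(NULL)  # called for its side effect only
}

# Use temporary files so nothing is left in the working directory.
in_path <- tempfile(fileext = ".fasta")
out_path <- tempfile(fileext = ".fasta")
writeLines(c(">seq_1*66", "--TTACAAATTGACTTATTATA",
             ">seq_3:test", "GATTACAAATTGACTTATTATA"), in_path)
clean_fasta_headers_sketch(in_path, out_path)
result <- readLines(out_path)
print(result)
```

Sequence lines are left untouched in this sketch, so alignment gap characters such as "-" are preserved.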

References

http://www.genomatix.de/online_help/help/sequence_formats.html

Author

Jinlong Zhang <jinlongzhang01@gmail.com>

See also

Examples

cat(
  ">seq_1*66", "--TTACAAATTGACTTATTATA",
  ">seq_2()r", "GATTACAAATTGACTTATTATA",
  ">seq_3:test", "GATTACAAATTGACTTATTATA",
  ">seq_588", "GATTACAAATTGACTTATTATA",
  ">seq_8$$yu", "GATTACAAATTGACTTATTATA",
  ">seq_10", "---TACAAATTGAATTATTATA",
  file = "matk.fasta", sep = "\n"
)
clean.fasta.name(infile = "matk.fasta")
#> name_cleaned.fasta has been saved to /Users/jinlong/Documents/github/phylotools/docs/reference
get.fasta.name("name_cleaned.fasta")
#> [1] "seq_1_66"   "seq_2__r"   "seq_3_test" "seq_588"    "seq_8__yu"
#> [6] "seq_10"
# Delete file
unlink("matk.fasta")
unlink("name_cleaned.fasta")