supermat.Rd
Build PHYLIP supermatrix and create RAxML partition file using aligned fasta or phylip files.
supermat(infiles, outfile = "supermat.out.phy", partition.file = "gene_partition.txt")
infiles | a character string vector for phylip or aligned fasta file. |
---|---|
outfile | the name of the PHYLIP supermatrix |
partition.file | partition data summary describing the genes. |
Supermatrix here means a phylip file with combined aligned sequences. The missing sequences should be replaced with either "?" or "-".
A list containing: (1)supermat.dat:a list containing all the data frames read by read.phylip or read.fasta (2)res.super.dat: a data frame containing the sequences and the names (3)partition.dat: summary for all the fasta or phylip files (4)partition.dat.vector: character string vector for the partition file for RAxML
Kress, W. J., Erickson, D. L., Jones, F. A., Swenson, N. G., Perez, R., Sanjur, O., & Bermingham, E. (2009). Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proceedings of the National Academy of Sciences, 106(44), 18621-18626.
de Queiroz, A.and Gatesy, J. (2007). The supermatrix approach to systematics. Trends in Ecology & Evolution, 22(1), 34-41.
https://github.com/stamatak/standard-RAxML
Jinlong Zhang <jinlongzhang01@gmail.com>
Punctuation characters and white space in the names of the sequences will be replaced by "_". More information can be found at regex
.
Type of the sequence in the RAxML partition file should be changed manually according to the manual of RAxML.
cat("6 22", "seq_1 --TTACAAATTGACTTATTATA", "seq_2 GATTACAAATTGACTTATTATA", "seq_3 GATTACAAATTGACTTATTATA", "seq_5 GATTACAAATTGACTTATTATA", "seq_8 GATTACAAATTGACTTATTATA", "seq_10 ---TACAAATTGAATTATTATA", file = "matk.phy", sep = "\n") cat("5 15", "seq_1 GATTACAAATTGACT", "seq_3 GATTACAAATTGACT", "seq_4 GATTACAAATTGACT", "seq_5 GATTACAAATTGACT", "seq_8 GATTACAAATTGACT", file = "rbcla.phy", sep = "\n") cat("5 50", "seq_2 GTCTTATAAGAAAGAATAAGAAAG--AAATACAAA-------AAAAAAGA", "seq_3 GTCTTATAAGAAAGAAATAGAAAAGTAAAAAAAAA-------AAAAAAAG", "seq_5 GACATAAGACATAAAATAGAATACTCAATCAGAAACCAACCCATAAAAAC", "seq_8 ATTCCAAAATAAAATACAAAAAGAAAAAACTAGAAAGTTTTTTTTCTTTG", "seq_9 ATTCTTTGTTCTTTTTTTTCTTTAATCTTTAAATAAACCTTTTTTTTTTA", file = "trn1.phy", sep = "\n") supermat(infiles = c("matk.phy", "rbcla.phy", "trn1.phy"))#> Supermatrix "supermat.out.phy" and RAxML partition file "gene_partition.txt" have been saved to: #> /Users/jinlong/Documents/github/phylotools/docs/referenceunlink(c("matk.phy", "rbcla.phy", "trn1.phy")) unlink(c("supermat.out.phy","gene_partition.txt")) cat( ">seq_1", "--TTACAAATTGACTTATTATA", ">seq_2", "GATTACAAATTGACTTATTATA", ">seq_3", "GATTACAAATTGACTTATTATA", ">seq_5", "GATTACAAATTGACTTATTATA", ">seq_8", "GATTACAAATTGACTTATTATA", ">seq_10", "---TACAAATTGAATTATTATA", file = "matk.fasta", sep = "\n") cat( ">seq_1", "GATTACAAATTGACT", ">seq_3", "GATTACAAATTGACT", ">seq_4", "GATTACAAATTGACT", ">seq_5", "GATTACAAATTGACT", ">seq_8", "GATTACAAATTGACT", file = "rbcla.fasta", sep = "\n") cat( ">seq_2", "GTCTTATAAGAAAGAATAAGAAAG--AAATACAAA-------AAAAAAGA", ">seq_3", "GTCTTATAAGAAAGAAATAGAAAAGTAAAAAAAAA-------AAAAAAAG", ">seq_5", "GACATAAGACATAAAATAGAATACTCAATCAGAAACCAACCCATAAAAAC", ">seq_8", "ATTCCAAAATAAAATACAAAAAGAAAAAACTAGAAAGTTTTTTTTCTTTG", ">seq_9", "ATTCTTTGTTCTTTTTTTTCTTTAATCTTTAAATAAACCTTTTTTTTTTA", file = "trn1.fasta", sep = "\n") supermat(infiles = c("matk.fasta", "rbcla.fasta", "trn1.fasta"))#> Supermatrix "supermat.out.phy" and RAxML partition file "gene_partition.txt" have been saved to: #> /Users/jinlong/Documents/github/phylotools/docs/reference