-
Dear IQ-TREE Developers,

This method of partitioned phylogenetic analysis of morphological data and the command below don't work well. See the replies below.

I am reanalyzing morphological datasets from published studies using maximum likelihood (ML), Bayesian, and maximum parsimony methods. For the ML analysis, I am using IQ-TREE v2.3.6 and trying to partition the data by the number of states of each character, as is done in MrBayes (see MrBayes Manual, p. 142), since this may increase model fit. This approach seems to have been applied in IQ-TREE before, for example in Černý & Simanoff (2023).

Characters recognized as invariant by IQ-TREE were removed from the dataset using Mesquite and my custom R script so that an ascertainment bias correction could be applied. The partition file for the dataset was generated by my custom R script. I ran IQ-TREE with the following command:

The execution appeared to finish without any apparent issues, but my question is whether IQ-TREE automatically applies the number of states of each partition to each Mk model. I scrutinized the log file and the .iqtree file, but I could not find any related information.

This is a zip file of the folder for my reanalysis of Wang et al. (2023). I would appreciate it if you could answer this question.
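For readers who want to reproduce the grouping step described in this post, here is a minimal Python sketch (not the R script mentioned above). It assumes the matrix is held as a dictionary of equal-length strings with one symbol per character, that `?` and `-` mark missing or inapplicable data, and that polymorphic codings are absent. The charset names (`states2`, `states3`) are made up for illustration, and the output is plain NEXUS `charset` syntax with no model assignment.

```python
from collections import defaultdict

MISSING = {"?", "-"}  # assumed symbols for missing/inapplicable data


def group_by_state_count(matrix):
    """Map number of observed states -> 1-based character indices."""
    n_chars = len(next(iter(matrix.values())))
    groups = defaultdict(list)
    for i in range(n_chars):
        states = {seq[i] for seq in matrix.values()} - MISSING
        groups[len(states)].append(i + 1)
    return dict(groups)


def charset_lines(groups):
    """Format one NEXUS charset per state count, skipping invariant characters."""
    lines = []
    for k in sorted(groups):
        if k < 2:
            continue  # invariant (or entirely missing) characters are left out
        sites = " ".join(str(i) for i in groups[k])
        lines.append(f"charset states{k} = {sites};")
    return lines


matrix = {  # toy data, not taken from any of the cited studies
    "taxonA": "01020",
    "taxonB": "10110",
    "taxonC": "0?000",
    "taxonD": "11-1?",
}
groups = group_by_state_count(matrix)
print("\n".join(charset_lines(groups)))
```

Note that this groups characters by the number of states actually observed, which can differ from the number of states defined for a character in the original study.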
-
That's a good question. For the MORPH data type, IQ-TREE doesn't print the number of states in the model. Internally, however, the states are always encoded in the order 0, 1, 2, ..., 9, A, B, ..., Z (http://www.iqtree.org/doc/Substitution-Models#binary-and-morphological-models). The number of states is determined by the largest state (in this order) in a partition. For example, if a partition has states 0, 1, 2, the number of states will be 3. If some states are missing, they are technically removed from the model. For example, if a partition has states 0, 1, 3, 4, 6 (i.e., 2 and 5 do not occur in the alignment), it will be treated as a five-state model that includes only the states occurring at least once in the alignment. Does that answer your question?
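The counting rule described in this reply can be illustrated with a short Python sketch. This is not IQ-TREE's code; it only mirrors the description above, assuming a toy matrix stored as a dictionary of strings, 0-based site indices per partition, and `?` and `-` as missing-data symbols.

```python
SYMBOLS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # state order quoted in the reply


def observed_states(matrix, sites):
    """Return the states seen at the given 0-based sites, in SYMBOLS order."""
    seen = {seq[i].upper() for seq in matrix.values() for i in sites}
    # Symbols outside SYMBOLS (e.g. '?' and '-') are dropped by this filter.
    return [s for s in SYMBOLS if s in seen]


matrix = {"t1": "0102", "t2": "1311", "t3": "?460", "t4": "0-?1"}  # toy data

part_a = observed_states(matrix, [0, 1, 2])  # hypothetical partition A
part_b = observed_states(matrix, [3])        # hypothetical partition B
print(part_a, len(part_a))
print(part_b, len(part_b))
```

Under the rule as described, partition A (states 0, 1, 3, 4, 6) would get a five-state model and partition B (states 0, 1, 2) a three-state model.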
-
Hmm, that's a good point – for 2-state characters,
was always true. To test this, I've conducted an experiment using the dataset of Fonseca et al. (2024), which I'm currently using for an unrelated project. First, I split it by character type and state space size into separate Phylip files called
and conducted an "Analysis 1" as follows:
Then, I combined these files back together to obtain a single Phylip file called
and conducted an "Analysis 2" as follows:
As it turns out, Analysis 1 found ML trees with log likelihoods of around −25,900, while Analysis 2 only found trees with log likelihoods closer to −38,500. The huge difference is clearly due to the fact that rate matrices of different dimensions were being applied to the data. Unfortunately, this demonstrates that my original claim was correct: IQ-TREE doesn't determine rate matrix dimensions based on the largest number of states in a given partition, but rather the largest number of states in a given file. The only way to force IQ-TREE to employ rate matrices of the right dimension is to split the dataset into multiple files – partitioning it by other means is not enough.

P.S. I'd be interested in your experience applying ModelFinder to morphological data. To me, it never seemed super useful, except maybe for choosing the rate heterogeneity model. My impression is that in morphology, many aspects of model choice are either determined by the background knowledge of the researcher (e.g., when to use
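As a rough illustration of the file-splitting workaround, the following Python sketch separates a matrix into one sub-matrix per state count and formats each as relaxed PHYLIP text. It is a sketch under stated assumptions, not the procedure used for the analyses above: characters are grouped by the number of states observed (character type, such as ordered versus unordered, is ignored), `?` and `-` are treated as missing data, and writing to disk is left out because the file names used in the experiment are not given here.

```python
from collections import defaultdict

MISSING = {"?", "-"}  # assumed symbols for missing/inapplicable data


def split_by_state_count(matrix):
    """Return {state count: sub-matrix} using the states observed per character."""
    n_chars = len(next(iter(matrix.values())))
    columns = defaultdict(list)
    for i in range(n_chars):
        states = {seq[i] for seq in matrix.values()} - MISSING
        columns[len(states)].append(i)
    result = {}
    for k, idx in columns.items():
        result[k] = {taxon: "".join(seq[i] for i in idx)
                     for taxon, seq in matrix.items()}
    return result


def to_phylip(sub):
    """Format a sub-matrix as relaxed sequential PHYLIP text."""
    n_chars = len(next(iter(sub.values())))
    lines = [f"{len(sub)} {n_chars}"]
    lines += [f"{taxon} {seq}" for taxon, seq in sub.items()]
    return "\n".join(lines) + "\n"


matrix = {  # toy data, not from Fonseca et al. (2024)
    "taxonA": "0102",
    "taxonB": "1011",
    "taxonC": "0?00",
    "taxonD": "11-1",
}
split = split_by_state_count(matrix)
for k, sub in sorted(split.items()):
    print(f"--- {k}-state characters ---")
    print(to_phylip(sub), end="")
```

Each block of text would then be saved as its own file so that every file contains characters with a single state-space size.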