Tuesday, October 25, 2011

Cladistics good, Aurorazhdarcho bad

Today I'm reporting on two papers, one good, one bad.  Both involve cladistics, but besides that are basically unrelated.

Tom Holtz notified the DML of a new paper by Brazeau (2011).  I highly recommend anyone making or examining a cladistic analysis read this work.  He basically outlines many of the problems I describe in the Evaluating Phylogenetic Analyses page of my website.
- Don't make "pseudo-ordered" characters of the form "bone x absent (0); bone x lacks feature A (1); bone x has feature A (2)", because if it's unordered PAUP has no reason to know to group all taxa with bone x together.  If it's ordered, it solves that problem, but has the probably undesired effect of assuming feature A is related to the loss of the bone.
- Don't have multiple characters implicitly coding for the same thing, with absence of that thing a state in addition to states coding for the presence/absence of a feature on the thing.  So "bone x absent (0); bone x present (1)" and "bone x absent (0); bone x present and without feature A (1); bone x present and with feature A (2)" should not both exist.  Have one character for the bone's absence/presence, and another character for each feature of the bone.  Just code taxa without the bone as inapplicable for characters about that bone's features.  But be sure to set PAUP to collapse zero-length branches if you use inapplicable characters (TNT and NONA collapse them automatically).
- Don't make compound characters.  Each character should code for only one variable.
- Remember that "0" does not mean "primitive".  0 has to be a distinct state just like 1, 2 or any other number.  So don't make a character like "deltopectoral crest shape not described by any of the other states (0); crest round (1); crest triangular (2)", because there are lots of other shapes besides round and triangular, but PAUP could easily make state 0 synapomorphic for some clade.  That could end up grouping taxa with rectangular, pentagonal, etc. crests together as having the same condition, which is clearly not justified.
- As a consequence of this, making ordered multistate characters is better than making a series of less inclusive bistate characters.
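
To make the recoding in the first two points concrete, here's a minimal Python sketch.  The taxon names, the scorings and the split_compound helper are all made up for illustration (none of it comes from Brazeau's paper or from any real matrix), and I'm assuming the common convention of "-" for inapplicable and "?" for unknown.

```python
# Hypothetical scorings, invented purely for illustration.
# Compound ("pseudo-ordered") coding of a single character:
#   0 = bone x absent
#   1 = bone x present, without feature A
#   2 = bone x present, with feature A
compound = {"taxon_a": 0, "taxon_b": 1, "taxon_c": 2, "taxon_d": "?"}

INAPPLICABLE = "-"  # assumed symbol for "character does not apply"
MISSING = "?"       # assumed symbol for "state unknown"


def split_compound(state):
    """Split one compound state into (bone x presence, feature A).

    Taxa lacking the bone are scored as inapplicable for feature A,
    rather than being given a state of their own in that character.
    """
    if state == MISSING:
        return (MISSING, MISSING)
    if state == 0:
        return (0, INAPPLICABLE)
    return (1, state - 1)


recoded = {taxon: split_compound(state) for taxon, state in compound.items()}

for taxon, (bone_present, feature_a) in recoded.items():
    print(taxon, bone_present, feature_a)
```

After the split, every taxon with bone x shares state 1 of the presence character regardless of feature A, which is the grouping the unordered compound character couldn't express.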

The second paper was announced today- the description of a new taxon of pterosaur.   Frey et al. (2011) described Aurorazhdarcho, which is a damned cool name.  Unfortunately, the paper goes downhill from there.

First, they assign Aurorazhdarcho to the new family Protazhdarchidae.  Are there really people who still think you can make up a family-group name that's not eponymous with an existing genus?  Without a Protazhdarcho (which doesn't exist), there can be no Protazhdarchidae.  And Frey et al. can't use the excuse that Protazhdarchidae is "just a clade" since they explicitly say "nov. fam." and "we propose to erect a new family, the Protazhdarchidae..."  Tim Williams brought up the possibility on the DML that maybe the genus was originally named Protazhdarcho and later changed, but the family name wasn't caught in time (though barring a VERY last minute change or editorial messiness I would hope the peer reviewers would still catch it), and if that's the case I apologize to the authors for this insulting paragraph.  Regardless, my insults in the next two paragraphs still apply. ;)

Second, Protazhdarchidae is monotypic, so is useless anyway.  Maybe I was too hasty in dismissing Jaime's suggestion for purely monotypic theropod families in the year 2100, since apparently it's not just Ji and other Chinese workers who are stuck in the archaic typological mindset.  The taxonomic world has moved beyond subjective difference being a reason to name a new clade/grade, please join the rest of us in the 21st century.

Third, Frey et al. include the highly flawed section "Problems with cladistic analysis".  Note they don't actually include Aurorazhdarcho in an analysis.  Why not?  "The main reason is that the low wing attachment is reason enough to align the specimen with the azhdachoid construction, which separates the group from all other Pterosauria."  I suppose Halloween IS a good time for Huene's ghost to rear its head, insisting on the importance of key characters.  We then get this lovely gem-

"If the low position of the glenoid fossa is regarded as original tetrapod, the azhdarchoid pterosaur construction has retained the low articulation of the front limbs and thus must have separated in the early history of the Pterosauria, possibly during the Triassic. Then, the high wing articulation could have evolved several times independently within the Pterosauria. If the low wing articulation is regarded as derived, the re-development of the primitive position of the glenoid fossa has to be explained. To resolve this question, a reinvestigation of the shoulder girdle of early Pterosauria would be necessary. For now, this problem remains unresolved pending an engineering approach concerning the consequences of low wing attachment, too. Hence, the character should be dismissed because of its evident functional impetus and unclear origin (Frey et al. 2003a)."

Did anyone else hear a distinguished gentleman in a sepia photograph read the above statements?  A single primitive character does not mean an entire clade is basal- we must examine the entire set of characters to determine which are more likely to be reversals or convergences.  We don't have to explain why any character evolved, nor should our ability to hypothesize why one state could evolve from another affect our choice in character polarity.  I'm very interested in what exactly all the characters we use were actually good for, but the analysis comes first THEN the evolutionary scenario.  Frey et al. are guilty of the same thing BADists are- wanting to know the scenario first and basing the phylogeny off that.  As for their last sentence, since every(?) character that's not the result of genetic drift has some functional importance (and how would we ever test that in extinct taxa?), that's not a reason to exclude them from analyses.  And since origins are only made clear once you run an analysis, excluding a character due to its 'unclear origin' is just nonsensical.

The rest of their "problems" are basically of the form "character x influences character y since both are parts of some functional whole, and until we know how these influences work, we shouldn't include either character in cladistic analyses."  So glenoid position influences deltopectoral crest shape and so on.  Frey et al. are fundamentally wrong in their demand to know function before phylogeny, and in their claim that anatomy alone isn't enough to know when characters are strictly correlated.  All you need to do is check the matrix to see if every taxon with character x also has character y, and if every taxon without x also lacks y.  Now if you do find exact correlation and it's logically impossible to have a condition with x and without y and vice versa, THEN you can delete the character.  Otherwise you might have a character complex like the paravian sickle claw where claw hyperextendability, size and curvature are certainly all functionally related, but should still be coded as separate characters since they're independent (e.g. Archaeopteryx lacks large size, Borogovia lacks strong curvature).  Now I suppose some characters might be correlated due to combinations of osteology that are only logically impossible once soft tissues are taken into account, and this could involve not just simple muscular biomechanics as Frey et al. suggest, but even details such as expression of the same gene at the same time.  Yet we'll never know most soft tissue anatomy for most fossil taxa (and even living taxa are poorly studied in this regard), so to rule out such correlation in our matrices is basically impossible.  We can either try to determine phylogeny now while excluding the logically correlated characters, or wait forever until we have fully examined a complete living growing example of each taxon to eliminate the possibility of correlation for each character.  I vote for the former.
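
The matrix check described above is easy to sketch in Python.  Everything here is hypothetical: the taxa, the scorings for characters x, y and z, and the perfectly_correlated helper are invented for illustration, and I'm assuming binary characters with "?" for unknown and "-" for inapplicable.

```python
# Hypothetical binary matrix, invented purely for illustration.
matrix = {
    "taxon_1": {"x": 1, "y": 1, "z": 1},
    "taxon_2": {"x": 1, "y": 1, "z": 0},
    "taxon_3": {"x": 0, "y": 0, "z": 1},
    "taxon_4": {"x": 0, "y": 0, "z": 0},
    "taxon_5": {"x": "?", "y": 1, "z": 1},
}


def perfectly_correlated(matrix, char_x, char_y, unscored=("?", "-")):
    """Return True if every taxon scored for both characters has x iff y.

    Taxa that are unknown or inapplicable for either character are
    skipped; if no taxon can be compared at all, return False.
    """
    compared = 0
    for scores in matrix.values():
        x, y = scores[char_x], scores[char_y]
        if x in unscored or y in unscored:
            continue
        compared += 1
        if x != y:
            return False
    return compared > 0


print(perfectly_correlated(matrix, "x", "y"))
print(perfectly_correlated(matrix, "x", "z"))
```

A True result only reports the pattern among the sampled taxa.  Whether it's logically impossible to have x without y is a separate, anatomical question, and that's the one that decides whether a character should be deleted.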

Incidentally, given Frey et al.'s lack of a modern phylogenetic perspective, I don't trust their placement of Aurorazhdarcho in Azhdarchoidea.  Maybe it is, I'm not qualified to say, but I await the results of someone using a modern approach.

References- Brazeau, 2011. Problematic character coding methods in morphology and their effects. Biological Journal of the Linnean Society. 104, 489-498.

Frey, Meyer and Tischlinger, 2011. The oldest azhdarchoid pterosaur from the Late Jurassic Solnhofen Limestone (Early Tithonian) of Southern Germany. Swiss Journal of Geosciences. DOI: 10.1007/s00015-011-0073-1


  1. I don't think it's fair to mention the name of Frey in this respect because the article seems to be largely written by Meyer. Apart from the problems you mentioned there is the fact that the so-called diagnosis only contains symplesiomorphies. Also the limited description suggests that Mr Meyer is perhaps not intimately enough acquainted with pterosaur anatomy to be able to fully code this taxon, in which case he wisely abstained from it.

  2. If Frey didn't contribute, why is his name on the paper? If he did, then he is responsible for what's in it.

  3. Héctor Gómez de Silva, October 25, 2011 at 12:09 PM

    A further flaw in the reasoning of the Aurorazhdarcho paper: if character x and character y are perfectly correlated, there is no reason to discard both characters ("shouldn't include either character in cladistic analysis"), just reduce them to a single character.

  4. It's not obvious to me why perfect agreement between two characters should necessarily mean that they should be merged (or one deleted). Correlation is not causation.

  5. Anonymous (comment 1) - that seems beyond absurd. Not the part about a co-author being on a paper and not contributing a section - that itself happens often these days, and regardless, all authors take responsibility for the final product, as Mike said.

    The part that's absurd is that Frey would be the first author and not have written it in the main.

  6. "A further flaw in reasoning of the Aurorazhdarcho paper, if character x and character y are perfectly correlated, there is no reason to discard both characters ("shouldn´t include either character in cladistic analysis"), just reduce them to a single character. "

    I disagree. Correlation is not causation. Characters may have identical distribution but that does not mean they are part of the same functional complex.

  7. In a cladistic sense, there is no reason to separate two characters when they have identical distributions among the included taxa. Increasing the number of characters (just as increasing gene pairs) increases "distance" in the metric post-analysis. In character-scoring analyses, while I would generally favor differentiation and partitioning character states for the sake of refinement, I would also like to resolve the issue that in many cases, the character at hand can be part of a transitory sequence.

    Also, to the commenter who claimed Christian Meyer wrote the paper: Eberhard Frey is not just a coauthor, he is the LEAD author. When we cite within literature, it is his name that is indexed in the bibliography, it is his name that gets to be followed by "et al., ####," and it is his name we must refer to in conversation as to what paper is what. If Frey is not the principal author, his name should not be primary.

    My thoughts are basic: I suspect the authors are realistically attempting to create names in the flavor of Family level taxa without strict adherence to the ICZN. I do not think, but may be wrong, that the name is based on an earlier coined name; rather, it is a "natural" clade name (coined on the fly) with the "family" ending -idae tagged onto it. Thus, it -- like other non-ranked suprageneric taxa -- is a phylogenetic hypothesis, and nothing else. Still, I'd have preferred the authors followed the rules on this instead of ignoring them, especially with using the phrase "fam. nov.".

  8. I agree with others against "Anonymous #1". If you get first authorship on a paper despite not being primarily responsible for it (which is deplorable, as Jay said), you deserve to get the blame if it's bad, just like you'd get referenced if it was good (as Jaime said).

    Also, if authors aren't knowledgeable enough about a group to code a taxon, then I'd argue they aren't knowledgeable enough to describe a specimen either. I know I've turned down coauthorship on pterosaur descriptions before because I know they're outside my area of expertise.

    I disagree with Jaime's idea the authors were trying to make a non-ranked clade that just looks like a Linnaean family (which I would also complain about). They refer to the taxon as a new family no less than four times, and seem to have an overall archaic/traditional concept of phylogenetics. All instances of authors creating unranked pseudofamilies I can recall were done in a modern setting of phylogenetic nomenclature, apomorphy-based diagnoses, cladistics, non-redundant taxonomy and such.

    As for characters with identical distributions, it's somewhat complicated. First of all, it could just be due to the taxon sample. All theropods in Holtz's 1994 analysis that lack premaxillary teeth also lack maxillary teeth, but that's just because it didn't include Caudipteryx, Hesperornis and such. So if you used "premaxillary and maxillary teeth absent" as a character, it's actually a composite character that couldn't be accurately coded for all taxa. And as Brazeau noted, this is bad.

    But even if you have two perfectly correlated character distributions considering all taxa, they can easily be so anatomically distant that no plausible relationship exists. Like a narrow notch between the basal tubera and an enlarged metatarsal IV, both uniquely troodontid in most TWG matrices. And the anatomical distance between characters varies gradually from that to examples which involve two similar aspects of the same structure, like "coronoid process strongly projected dorsally" and "coronoid process strongly projected medially". So while Jaime would probably combine the latter into one coronoid projection character, there's no objective line he could draw for when not to do so. As the sickle claw character complex exemplifies, there are characters which are certainly functionally related, but which can still vary independently. The only exceptions I make are when two characters logically can not vary independently (humerus longer than femur vs. humerus longer than half femur length) or when they're both expressions of the same variable (humerus longer than scapula vs. humerus longer than femur; in which case it would be better to choose one standard metric which is roughly correlated to total size).
