Investigating rbcL gene duplication in grasses

Project summary

A commonly sequenced gene region for molecular identification and phylogenetic comparison is rbcL, which codes for the large subunit of the primary photosynthetic enzyme Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase), thought to be the most abundant protein on Earth. Duplication of rbcL has been noted in grasses such as maize (Zea) and rice (Oryza), and in some stipoid specimens sequenced at the Royal Botanic Gardens Melbourne, but the taxonomic extent of this phenomenon is presently unknown.

We are now investigating this phenomenon in stipoid grasses and across Poaceae. The work includes isolating discrete rbcL sequences and using different genomic isolation methods to determine the genomic location of the copies. The gene is usually located in the plastid genome (found in the chloroplast), but copies have been found in the mitochondrial genome in other grasses. These copies could be non-functional (pseudogenes), or they may retain their original function or develop new ones.

It is not known how many of the rbcL sequences currently in databases such as GenBank are true genes and how many are pseudogenes, as submitting authors may not have been aware of the potential for duplication at this locus. This research will make it possible to reinterpret the current GenBank accessions and to maximise the utility of rbcL for barcoding and phylogenetics.
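One simple first-pass screen for a possible pseudogene is to check whether a coding sequence still has an intact reading frame. The sketch below is only an illustration of that idea, not the project's method: the function names and the toy sequences are hypothetical, and it assumes the input is an unaligned coding sequence starting at the first codon. An intact frame does not prove that a copy is functional, and a broken frame can also reflect sequencing error, so a result like this would only flag sequences for closer inspection.

```python
# Stop codons shared by the standard and bacterial/plastid genetic codes.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds):
    """Return 0-based codon indices of in-frame stops before the last codon."""
    seq = cds.upper()
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The final codon is allowed to be the normal terminating stop.
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def looks_intact(cds):
    """Crude check: length divisible by 3, starts with ATG, no internal stop."""
    seq = cds.upper()
    return len(seq) % 3 == 0 and seq.startswith("ATG") and not premature_stops(seq)


# Toy sequences invented for illustration (not real rbcL data).
intact = "ATGAAAGGGTGA"
interrupted = "ATGAAATAAGGGTGA"   # in-frame TAA at codon index 2
frameshifted = "ATGAAAGGTGA"      # one base deleted, length not divisible by 3

print(looks_intact(intact))
print(premature_stops(interrupted))
print(looks_intact(frameshifted))
```

A screen like this could be run over downloaded accessions as a coarse filter, with flagged sequences then examined individually.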

Project team

  • Anna Syme (Royal Botanic Gardens Melbourne, 2010–2012)
  • Daniel Murphy (Royal Botanic Gardens Melbourne)
  • Stuart Gardner (Royal Botanic Gardens Melbourne, 2010–2012)


Funding

  • Helen McLellan Research Grant