Draft genome assembly of Piper divaricatum reveals genomic basis of eugenol biosynthesis and evolutionary position

HAI THI HONG TRUONG
HAN NGOC HO
DAT TIEN NGUYEN
NHI THI HOANG HO
CUONG NGUYEN
CHUONG VAN HUYNH

Abstract

Abstract. Truong HTH, Ho HN, Nguyen DT, Ho NTH, Nguyen C, Huynh CV. 2026. Draft genome assembly of Piper divaricatum reveals genomic basis of eugenol biosynthesis and evolutionary position. Biodiversitas 27 (4): d270434. https://doi.org/10.13057/biodiv/d270434. Piper divaricatum is an aromatic member of the Piperaceae valued for its bioactive secondary metabolites, particularly eugenol, and for its resistance to major plant pathogens. Despite this importance, genomic resources for this non-model species remain limited. In this study, we present a draft genome assembly generated from Illumina paired-end short reads and assembled with SPAdes. The assembly spans approximately 743 Mb, equivalent to 99.97% of the genome size estimated with reference to Piper nigrum, but it is highly fragmented (N50 = 6 kb). Repetitive elements account for 78.46% of the genome and contribute substantially to this fragmentation. BUSCO analysis recovered 79.3% of conserved genes as complete and 18.1% as fragmented, indicating that a large proportion of conserved gene content is captured despite the assembly's limitations. A total of 117,252 gene models were predicted, although this number is likely inflated by fragmentation and repeat-induced gene splitting. Functional annotation assigned 2,026 genes to KEGG pathways, reflecting conserved metabolic and regulatory networks. The Ks distribution of paralogous gene pairs showed a peak around 0.5, suggesting ancient large-scale duplication events, although confirming a whole-genome duplication requires chromosome-level assemblies and synteny analysis. Phylogenomic reconstruction based on single-copy orthologs places P. divaricatum within Piperales and supports a sister relationship with Cinnamomum kanehirae, with divergence estimated at ~121.7 Mya. Additionally, candidate genes associated with the phenylpropanoid pathway, including partial EGS1-like fragments, were identified; these provide preliminary insights that warrant further transcriptomic and biochemical validation. Overall, this draft genome provides a foundational resource for future functional and comparative genomic studies in the genus Piper.
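
For readers unfamiliar with the contiguity metric quoted in the abstract, N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The sketch below illustrates that definition only; the function name and the toy contig lengths are hypothetical and are not taken from the P. divaricatum assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of an iterable of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bp), NOT data from the paper.
toy_contigs = [10_000, 8_000, 6_000, 6_000, 5_000, 3_000, 2_000]
toy_n50 = n50(toy_contigs)
print(f"N50 = {toy_n50} bp")  # prints "N50 = 6000 bp"
```

In this toy case the total length is 40 kb, and the three longest contigs (10 + 8 + 6 = 24 kb) are the first to reach the 20 kb halfway mark, so N50 is 6 kb. A low N50 relative to typical gene lengths is one reason gene models can be split across contigs, which fits the abstract's caution about the inflated gene count.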
