The genome was sequenced at 90X coverage using PacBio technology and was assembled with Canu. Contigs were error-corrected with Quiver and Pilon and scaffolded with Hi-C. Gaps were filled with PacBio sequence using PBJelly.
contig total length: 1,254,763,366 bp
number of contigs: 17,507
longest contig: 3,446,189 bp
contig N50: 139,475 bp
final assembly total length: 1,269,715,057 bp
number of sequences: 3,096
final assembly N50: 93,894,821 bp
longest sequence: 133,820,975 bp
89.7% of the assembly is found in 12 scaffolds
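For readers who want to recompute these summary statistics from a list of sequence lengths, here is a minimal Python sketch. It assumes the conventional definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length). The function names `n50` and `top_k_fraction` are illustrative and not part of any tool named above, and the toy lengths are made up, not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("lengths must be non-empty")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length


def top_k_fraction(lengths, k):
    """Return the fraction of total length held by the k longest sequences."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    return sum(ordered[:k]) / sum(ordered)


# Toy example with made-up lengths (not data from this assembly).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
toy_top2 = top_k_fraction(toy_lengths, 2)
```

Applied to the real scaffold lengths (for example, read from a FASTA index), `n50` would correspond to the "final assembly N50" line and `top_k_fraction(lengths, 12)` to the 12-scaffold figure, assuming those 12 scaffolds are the 12 longest sequences.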
Augustus and SNAP were trained and used as gene predictors in the MAKER pipeline. Input sequence data included PacBio Iso-Seq reads; RNA-seq from leaves, fruit, and flowers; and proteins from S. lycopersicum, S. pennellii, and RefSeq. A total of 37,938 genes were predicted, of which 34,240 are located on pseudomolecules.
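As a quick check on the gene counts, the share of predicted genes placed on pseudomolecules can be worked out directly from the two numbers given here. The variable names in this short Python sketch are illustrative only.

```python
# Gene counts as stated in the annotation summary.
total_genes = 37_938
genes_on_pseudomolecules = 34_240

# Genes predicted on sequences that are not part of a pseudomolecule.
unplaced_genes = total_genes - genes_on_pseudomolecules

# Fraction of all predicted genes located on pseudomolecules.
placed_fraction = genes_on_pseudomolecules / total_genes

print(f"unplaced genes: {unplaced_genes:,}")      # unplaced genes: 3,698
print(f"placed fraction: {placed_fraction:.1%}")  # placed fraction: 90.3%
```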