Data from: Has gene expression neofunctionalization in the fire ant antennae contributed to queen discrimination behavior?

  • Viet Dai Dang (Contributor)
  • Silvia Fontana (Contributor)
  • Amir B. Cohanim (Contributor)
  • John Wang (Contributor)
  • Eyal Privman (Contributor)

Dataset

Description

Queen discrimination behavior in the fire ant Solenopsis invicta maintains its two types of societies: colonies with one (monogyne) or many (polygyne) queens, yet the underlying genetic mechanism is poorly understood. This behavior is controlled by two supergene alleles, SB and Sb, with ~600 genes. Polygyne workers, having either the SB/SB or SB/Sb genotype, accept additional SB/Sb queens into their colonies but kill SB/SB queens. In contrast, monogyne workers, all SB/SB, reject all additional queens regardless of genotype. Because the SB and Sb alleles have suppressed recombination, determining which genes within the supergene mediate this differential worker behavior is difficult. We hypothesized that the alternate worker genotypes sense queens differently because of the evolution of differential expression of key genes in their main sensory organ, the antennae. To identify such genes, we sequenced RNA from four replicates of pooled antennae from three classes of workers: monogyne SB/SB, polygyne SB/SB, and polygyne SB/Sb. We identified 81 differentially expressed protein-coding genes with 13 encoding potential chemical metabolism or perception proteins. We focused on the two odorant perception genes: an odorant receptor SiOR463 and an odorant binding protein SiOBP12. We found that SiOR463 has been lost in the Sb-genome. In contrast, SiOBP12 has an Sb-specific duplication, SiOBP12b’, which is expressed in the SB/Sb worker antennae, while both paralogs are expressed in the body. Comparisons with another fire ant species revealed that SiOBP12b’ antennal expression is specific to S. invicta and suggests that queen discrimination may have evolved, in part, through expression neofunctionalization.

Tux-gnH_genes.gtf
A gtf file containing genes assembled by mapping 12 RNA-seq libraries onto the fire ant genome version H (Si_gnH), following the Tuxedo Suite method.
We first controlled the quality of the sequence reads of the 12 antennal cDNA libraries (Nugen amplified) by retaining reads passing a base-quality threshold of 20 and minimum length of 30 bases using cutadapt 1.16. Next, we used TopHat v2.0.11 to map the quality-controlled reads from all 12 libraries onto Si_gnH, guided by the official gene set (OGS; TopHat option -G). Subsequently, we ran Cufflinks v2.2.1 to assemble a gene set for each library, followed by Cuffmerge v1.0.0 to generate Tux-gnH, resulting in 44,521 putative genes.

OGS-plus_genes.fasta
We added only potential protein-coding transcripts, defined here as those encoding peptides with an open reading frame of at least 50 amino acids and having similarity to genes in the NCBI non-redundant reference (NCBI nr), from the Tux-gnH_genes to the OGS. This added 4,114 genes to the OGS, for a total of 19,230 putative protein-coding genes (“OGS-plus”). Among these, 628 genes are located in the supergene, which is defined following the genetic map (Wang et al. 2013), the right supergene boundary (Huang et al. 2018), and five additional scaffolds (Pracana et al. 2017).

Tux-genes_BWA_HTseqCountTable.txt
A summary table of the Tux-gnH gene expression estimated by HTSeq-count. We first controlled the quality of the sequence reads of the 12 antennal cDNA libraries (Nugen amplified) by retaining reads passing a base-quality threshold of 20 and minimum length of 30 bases using cutadapt 1.16. Next, we used BWA mem v0.7.12 to map the quality-controlled reads from all 12 libraries onto Si_gnH. We used HTSeq-count v0.6.1p1 to estimate gene expression over the predicted genic regions in Tux-gnH_genes.gtf. This analysis was compared with the transcriptome-mapping analysis to choose the appropriate analysis method.

JH1to12-vs-sinvgnHB-Oksana-and-Dai-cDNA.genes.results.matrix
OGS-plus gene expression matrix generated by RSEM based on Bowtie 2 mapping results.
We first controlled the quality of the sequence reads of the 12 antennal cDNA libraries (Nugen amplified) by retaining reads passing a base-quality threshold of 20 and minimum length of 30 bases using cutadapt 1.16. We mapped the quality-controlled reads from all 12 libraries onto OGS-plus using Bowtie 2 v2.2.6. Next, we used RSEM v1.3.0 to estimate gene expression levels from the mapped reads. We identified differentially expressed genes (DEGs) using the EBSeq package v1.2.0.

Tux-gnH_BWA_HTSeq_edgeR.tar.gz
Test of differentially expressed genes of the Tux-gnH gene set using the edgeR package, based on BWA mem mapping results and HTSeq-count gene expression estimation. We first controlled the quality of the sequence reads of the 12 antennal cDNA libraries (Nugen amplified) by retaining reads passing a base-quality threshold of 20 and minimum length of 30 bases using cutadapt 1.16. Next, we used BWA mem to map the quality-controlled reads from all 12 libraries onto Si_gnH. We used HTSeq-count to estimate gene expression over the predicted genic regions in Tux-gnH_genes.gtf. We identified DEGs using the edgeR package.

JH1to12-vs-sinvgnHB-Oksana-and-Dai-cDNA.genes.results.matrix.ebseq-results
Summary results of differentially expressed genes in the OGS-plus gene set using the EBSeq package. We first controlled the quality of the sequence reads of the 12 antennal cDNA libraries (Nugen amplified) by retaining reads passing a base-quality threshold of 20 and minimum length of 30 bases using cutadapt 1.16. We mapped the quality-controlled reads from all 12 libraries onto OGS-plus using Bowtie 2 v2.2.6. Next, we used RSEM to estimate gene expression levels from the mapped reads. We identified DEGs using the EBSeq package.

SiOBP12bprime.fasta
Full-length cDNA sequence of SiOBP12b' obtained by combining 5' and 3' RACE results.

blastn_17k3OBPcluster_gnH_sgemt3p_fmt11
Blastn output showing that the 17.3 kb OBP gene cluster containing Gp-9 (Gp-9, OBP4, OBP13, OBP12) is conserved in S. invicta (Si_gnH.scaffold00008:424,900-442,200) and S. geminata (draft PacBio genome, version t2p, unpublished).

littleb_t2p_000102F.fa
The Sb sequence used in the Mauve genome alignment. Figure 3 was constructed using the SiOBP12b' region only (from nucleotide 621,784 to 824,165). This sequence is from a draft PacBio Sb genome version t2p.

BigB_100kb_around_OBP12bprime
The SB sequence, which is homologous to the SiOBP12b' region, used in the Mauve genome alignment (figure 3). This sequence was manually constructed (figure S8) using PacBio SB whole genome sequencing reads (table S7).

Sgem_000075F
The Solenopsis geminata sequence used in the Mauve genome alignment. Figure 3 was constructed using the region homologous to SiOBP12b' only (from nucleotide 950,000 to 1,122,875). This sequence is from a draft PacBio S. geminata genome version t3p.

blastn_OR163_in_BigBt4p_fmt11
Blastn output showing that SiOR463 is absent in the draft PacBio Sb genome, version t2p. It also shows that SiOR462 has an insertion of an RTEX-1_BPa transposon in the 6th exon.

blastn_OR163_in_littlet2p_fmt11
Blastn output showing high similarity between SgOR463, SgOR462 and SgOR163, compared to SiOR163 (NCBI RefSeq XM_026133769) in the draft PacBio S. geminata genome, version t3p (unpublished).

blastn_OR163_in_Sgemt3p_fmt6
Blastn output showing high similarity between SgOR463, SgOR462 and SgOR163, compared to SiOR163 (NCBI RefSeq XM_026133769) in the draft PacBio S. geminata genome, version t3p (unpublished).

blastn_SiOR163_sgemt3p_fmt6

Sgem_mono_t3p_045F.fa
The S. geminata sequence used in the Mauve genome alignment. Figure S4 was constructed using the region homologous to SiOR163 only (from nucleotide 1,194,856 to 1,234,698). This sequence is from a draft PacBio S. geminata genome version t3p.

Sinv_BigBt4p_892F.fa
The S. invicta sequence used in the Mauve genome alignment. Figure S4 was constructed using the SiOR163 region only (from nucleotide 61,791 to 101,792).
This sequence is from a draft PacBio SB genome version t4p.

littleb_t2p.000200F
The Sb sequence used in the Mauve genome alignment (figure S4). This sequence is from a draft PacBio Sb genome version t2p.

OBP12_Sanger_rep1_RJ1.tar.gz
Sanger sequencing trace profiles of the expressed cDNA of SiOBP12 in different SB/Sb worker body parts: antennae (RJ1A), head (RJ1H) and body (RJ1B). The body parts are from the same individuals of polygyne colony RJ1. The SiOBP12 cDNAs (SiOBP12B and SiOBP12b') were amplified using the OBP12Bb*-3'UTR and the OBP12Bb*-5'UTR primers and sequenced using the OBP12-F227R primer without purification.

OBP12_Sanger_rep2_RJ6.tar.gz
Sanger sequencing trace profiles of the expressed cDNA of SiOBP12 in different SB/Sb worker body parts: antennae (RJ6A), head (RJ6H) and body (RJ6B). The body parts are from the same individuals of polygyne colony RJ6. The SiOBP12 cDNAs (SiOBP12B and SiOBP12b') were amplified using the OBP12Bb*-3'UTR and the OBP12Bb*-5'UTR primers and sequenced using the OBP12-F227R primer without purification. These results were used to construct figure S10.

OBP12_Sanger_rep3_RJ8.tar.gz
Sanger sequencing trace profiles of the expressed cDNA of SiOBP12 in different SB/Sb worker body parts: antennae (RJ8A), head (RJ8H) and body (RJ8B). The body parts are from the same individuals of polygyne colony RJ8. The SiOBP12 cDNAs (SiOBP12B and SiOBP12b') were amplified using the OBP12Bb*-3'UTR and the OBP12Bb*-5'UTR primers and sequenced using the OBP12-F227R primer without purification.

OBP12_Sanger_ref10B.1b.tar.gz
SiOBP12B and SiOBP12b' cDNAs were cloned into plasmids and diluted to the same stock concentration. We sequenced a mixture of 10X SiOBP12B to 1X SiOBP12b'. The sequence trace profile shows that SNPs representing SiOBP12B were more prevalent (higher trace peaks).

OBP12_Sanger_ref1B.10b.tar.gz
SiOBP12B and SiOBP12b' cDNAs were cloned into plasmids and diluted to the same stock concentration.
We sequenced a mixture of 1X SiOBP12B to 10X SiOBP12b', using the OBP12-F227R primer. The sequence trace profile shows that SNPs representing SiOBP12b' were more prevalent (higher trace peaks).

OBP12_Sanger_ref1B.4b.tar.gz
SiOBP12B and SiOBP12b' cDNAs were cloned into plasmids and diluted to the same stock concentration. We sequenced a mixture of 1X SiOBP12B to 4X SiOBP12b', using the OBP12-F227R primer. The sequence trace profile shows that SNPs representing SiOBP12b' were more prevalent (higher trace peaks), but peak heights did not reflect the ratio of the two plasmids (compared to the trace profiles of other dilution ratios).

OBP12_Sanger_ref1B.1b.tar.gz
SiOBP12B and SiOBP12b' cDNAs were cloned into plasmids and diluted to the same stock concentration. We sequenced a mixture of 1X SiOBP12B to 1X SiOBP12b', using the OBP12-F227R primer. The sequence trace profile showed that SNPs representing SiOBP12B and SiOBP12b' were at similar levels. This result was used to construct figure S10 (as reference).
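
Several of the FASTA files above are described together with a nucleotide range (for example, nucleotides 621,784 to 824,165 of littleb_t2p_000102F.fa for figure 3). The following is a minimal Python sketch of how such a range might be cut out of a FASTA record. It is not part of the deposited analysis: the function names read_fasta, extract_region and region_length are hypothetical, and it assumes that the quoted coordinates are 1-based and inclusive, which the description does not state.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping.

    The record ID is taken as the first whitespace-delimited token
    after '>' (an assumption about how the headers are written).
    """
    records: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def region_length(start: int, end: int) -> int:
    """Length of a region given 1-based inclusive coordinates (assumed)."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return the subsequence from start to end.

    ASSUMPTION: coordinates are 1-based and inclusive; the dataset
    description does not say which convention was used.
    """
    if start < 1 or end < start or end > len(sequence):
        raise ValueError(f"invalid region {start}-{end} for length {len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Toy data only; the real files are much larger.
    toy = read_fasta([">toy_scaffold example", "ACGTAC", "GT"])
    print(extract_region(toy["toy_scaffold"], 2, 4))
```

For a real file, the lines could be supplied by passing an open file handle to read_fasta. The record ID inside each deposited file is not given in the description, so it would need to be read from the file's header line. Under the 1-based inclusive assumption, region_length(621784, 824165) gives 202,382 nucleotides for the SiOBP12b' region.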
Date made available: 2019 Nov 1
Publisher: Unknown Publisher
