(A) GC content variance around CO breakpoints (blue dots and line). Window 0 on the x-axis is the GC content at the breakpoints, and the negative and positive values represent the distance away from the breakpoints. Each window is defined as 2 kb of sequence, and the GC content is calculated for each window. The red dots and line show one of the random GC content samples, simulated with the same number of windows as there are CO breakpoints (blue dots and line). After 10,000 repeats, not one of the random samples is as extreme as the observed (blue line) (P < 0.0001). (B) Relationship between recombination and GC content. When the chromosomes are dissected into 10 kb non-overlapping regions, recombination rate (cM/Mb) and GC content can be obtained for each of them. After the windows are sorted by GC content, they are divided into 31 groups based on GC content (approximately 20% to 51%, 1% interval), and the average (and s.e.m.) recombination rate is reported for each group.
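The random-sampling test in panel (A) can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the statistic is the mean GC content of the breakpoint windows (window 0), that random windows are drawn uniformly without replacement from a genome-wide list, and that the test is one-sided. The names `genome_gc` and `breakpoint_gc` are hypothetical.

```python
import random


def empirical_p_value(genome_gc, breakpoint_gc, n_repeats=10_000, seed=0):
    """Count random samples whose mean GC is at least the observed mean.

    genome_gc     -- GC content of every candidate window (hypothetical input)
    breakpoint_gc -- GC content of the windows containing CO breakpoints
    Returns (hits, hits / n_repeats).
    """
    rng = random.Random(seed)
    k = len(breakpoint_gc)
    observed = sum(breakpoint_gc) / k
    hits = 0
    for _ in range(n_repeats):
        # Draw as many random windows as there are breakpoint windows.
        sample = rng.sample(genome_gc, k)
        if sum(sample) / k >= observed:
            hits += 1
    return hits, hits / n_repeats
```

When `hits` is 0, the result is conventionally reported as P < 1/`n_repeats`, which for 10,000 repeats gives the P < 0.0001 quoted in the caption.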
In both we dissect the genome into 10 kb non-overlapping windows, of which there are 19,297. First, we ask about the raw correlation between GC% and cM/Mb for these windows, which as expected is positive and significant (Spearman's rho = 0.192; P < 10^-15). Second, we wish to know the average effect of a one-unit increase in either parameter on the other. Given the noise in the data (and given that current recombination rate need not imply the ancestral recombination rate), we approach this issue using a smoothing approach. We start by rank ordering all windows by GC content and then dividing them into blocks of 1% GC range, after excluding windows with more than 10% 'N'. The resulting plot is highly skewed by bins with very high GC (55% to 58%), as these have very few data points (Additional file 1: Figure S10E) (the same outliers likely affect the raw correlation too). Removing these three results in a more consistent trend (Additional file 1: Figure S10F). This also suggests that below circa 20% GC the recombination rate is zero (Additional file 1: Figure S10F). Removing those with GC < 20% and, more generally, any bins with fewer than 100 windows (all bins with GC < 20% have fewer than 100 windows) leaves 18,680 (96.8%) of the windows, which have a GC content between approximately 20% and 51%.
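The binning step described above can be sketched as follows, under stated assumptions: each window is a hypothetical `(gc_percent, n_fraction, rate_cM_per_Mb)` tuple, and a window with GC of 20.0% up to but not including 21.0% falls in bin 20. The exact bin edges are not given in the text, so that choice is a guess.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def bin_by_gc(windows, max_n_frac=0.10, min_windows=100):
    """Group windows into 1% GC bins and summarise recombination rate.

    windows -- iterable of (gc_percent, n_fraction, rate_cM_per_Mb) tuples
    Returns {gc_bin: (n_windows, mean_rate, sem_rate)} for retained bins.
    """
    bins = defaultdict(list)
    for gc, n_frac, rate in windows:
        if n_frac > max_n_frac:
            continue  # exclude windows with more than 10% 'N'
        bins[math.floor(gc)].append(rate)

    summary = {}
    for gc_bin in sorted(bins):
        rates = bins[gc_bin]
        if len(rates) < min_windows:
            continue  # drop sparsely populated bins
        # s.e.m. = sample standard deviation / sqrt(n); undefined for n = 1
        sem = stdev(rates) / math.sqrt(len(rates)) if len(rates) > 1 else 0.0
        summary[gc_bin] = (len(rates), mean(rates), sem)
    return summary
```

Plotting `mean_rate` with `sem_rate` error bars against `gc_bin` would give a figure of the kind described for the binned relationship.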
Relationship between recombination and GC content
From this observation, we estimate that on average a 1 cM/Mb increase in recombination rate is associated with an increase in GC content of around 0.5%. Conversely, a 1% increase in GC content corresponds to an approximately 2 cM/Mb increase in recombination rate. We conclude that, given the apparent rarity of NCO gene conversion, at least in the bee genome, extrapolation from GC content to average crossing-over rate appears to be justifiable, at least for GC content over 20%. We note too that at high GC content the recombination rate may be over- or underestimated. This could reflect a discordance between current and past recombination rates.
The retained windows are used to construct Figure 4B, which presents a relatively noise-free (after smoothing) monotonic relationship between the two variables.
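The text does not say how the per-unit estimates (about 0.5% GC per 1 cM/Mb, and about 2 cM/Mb per 1% GC) were derived. One plausible route, shown here purely as an assumption, is an ordinary least-squares fit to the binned means, once in each direction. The function name and inputs are hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx
```

With `gc_bins` as the bin centres and `mean_rates` as the per-bin mean recombination rates, `ols_slope(gc_bins, mean_rates)` estimates cM/Mb per 1% GC, and `ols_slope(mean_rates, gc_bins)` estimates % GC per 1 cM/Mb. The product of the two slopes equals the squared Pearson correlation, so near-reciprocal values such as 2 and 0.5 are what one would expect from a tightly linear smoothed relationship.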
Crossing-over rate is also associated with nucleotide diversity, gene density, and copy number variation regions (Figures S11 to S13 in Additional file 1). Given our removal of hetSNPs from the analysis, the latter result is not trivially a CNV-associated artifact. The fine-scale analyses reveal a positive correlation between nucleotide diversity and recombination rate at all scales examined, namely 10, 100, 200, and 500 kb sequence windows (Figure S11 in Additional file 1). This bolsters prior analyses, one of which reported the trend but found it to be non-significant, while another reported a trend between population genetic estimates of recombination and genetic diversity. The trend accords with the notion that recombination reduces Hill-Robertson interference, thereby lowering rates of hitchhiking and background selection and so permitting higher diversity. We also find a strong negative correlation between recombination and gene density (Figure S12 in Additional file 1) and a strong positive correlation between recombination and the length of multi-copy regions within various window sizes (Figure S13 in Additional file 1). The correlation with CNVs is consistent with a role for non-allelic recombination generating duplications and deletions via unequal crossing-over.
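A multi-scale comparison of this kind can be sketched by coarsening 10 kb windows into larger ones and computing a rank correlation at each scale. The statistic used for these particular analyses is not stated in this passage. Spearman's rho is assumed here because it is the statistic reported for the GC comparison. The sketch also assumes equal-sized, contiguous windows from a single chromosome, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def coarsen(values, factor):
    """Average consecutive groups of `factor` windows; drop a trailing partial group."""
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values) - factor + 1, factor)
    ]
```

For example, with per-10 kb lists `rate` and `diversity`, calling `spearman_rho(coarsen(rate, f), coarsen(diversity, f))` for `f` in `(1, 10, 20, 50)` would correspond to the 10, 100, 200, and 500 kb scales.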