Hey Daniel,
I tried out your code examples below. I did make some
surprising observations, but they are different from what
you were reporting. I replaced some of your functionality.
I adapted the select_event function to simply return the
complement of a given base. I removed the random selection
of the index and simply converted every index. I loaded the
chr22 sequence of the human genome (~50 Mb) and measured the
time of 50 runs of a) the replicate function and b) the
inner loop with the assignment. I ran the experiments with
seqan::String<Dna5>, std::vector<Dna5>,
std::basic_string<Dna5> and std::string. I also implemented
a replicate3 function, which performs best because it
reduces the number of times whole Strings are copied.
I iterated over the indices both with a C++11 range-based
for loop and with a standard for loop.
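To make the two loop styles concrete, here is a minimal sketch of what I mean. It is not the code from the benchmark file: it uses a plain std::string instead of seqan::String<Dna5>, and the names complement, complementRange and complementIndexed are made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the adapted select_event: returns the
// complement of a base and leaves anything else (e.g. 'N') unchanged.
inline char complement(char base)
{
    switch (base)
    {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return base;
    }
}

// C++11 style: range-based for loop over every position.
inline void complementRange(std::string & seq)
{
    for (char & base : seq)
        base = complement(base);
}

// C++98 style: standard index-based for loop over every position.
inline void complementIndexed(std::string & seq)
{
    for (std::size_t i = 0; i < seq.size(); ++i)
        seq[i] = complement(seq[i]);
}
```

Both functions modify the sequence in place and should produce identical results; only the way the positions are visited differs.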
Here are my results, built in release mode and run on a
2.3 GHz Core i7.
All times are the sum of 50 experiments.
C++11 style:
Seqan String                Time: 11.18 s.    Inner Loop: 2.58064 s.
STL Vector                  Time: 10.9798 s.  Inner Loop: 2.53835 s.
STL Basic String Dna5       Time: 10.6501 s.  Inner Loop: 3.94554 s.
STL Basic String Char       Time: 11.4799 s.  Inner Loop: 4.85506 s.
replicate3                  Time: 8.67172 s.  Inner Loop: 2.52474 s.
C++98 style:
Seqan String                Time: 11.0828 s.  Inner Loop: 2.49667 s.
STL Vector                  Time: 10.9178 s.  Inner Loop: 2.54614 s.
STL Basic String Dna5       Time: 10.9048 s.  Inner Loop: 4.20024 s.
STL Basic String Char       Time: 12.3184 s.  Inner Loop: 5.61231 s.
replicate3                  Time: 9.55719 s.  Inner Loop: 3.30052 s.
As you can see, the replicate3 function outperforms the
other versions. However, its inner loop gets slower when
using the standard for loop, and I am not sure that I
completely understand why, because I can't observe the same
performance drop in the replicate2 function.
In the C++11 results, however, the assignment to the
seqan::String is about as fast as to the std::vector and
faster than to the std::string versions.
Can you please give us some information about the
dimensions of your problem? How many sequences are you
replicating? How long are the sequences?
Please consider the following performance boosters.
Always prefer passing parameters by const reference over
passing them by copy (unless you are sure they are just
simple types). Copying a big container with many values
is slower than copying a 4- or 8-byte reference :).
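As a small illustration of the difference, here is a sketch with two otherwise identical functions. The names countBaseByValue and countBaseByRef are invented for this example, and I use std::string so that it compiles without SeqAn.

```cpp
#include <cstddef>
#include <string>

// Pass by copy: when called with a named string, the whole
// sequence is copied before the function body runs.
inline std::size_t countBaseByValue(std::string seq, char base)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < seq.size(); ++i)
        if (seq[i] == base)
            ++count;
    return count;
}

// Pass by const reference: only a reference is handed over,
// and the function cannot modify the caller's sequence.
inline std::size_t countBaseByRef(std::string const & seq, char base)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < seq.size(); ++i)
        if (seq[i] == base)
            ++count;
    return count;
}
```

Both return the same count. For a sequence the size of chr22, the by-value version would additionally copy roughly 50 MB on every call with a named string, which is the kind of overhead I would expect to show up in your replicate timings.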
I have also attached the benchmark file, so maybe you can
run the tests on your machine and report your experience.