04-24-2012 09:03 PM
Maybe you can increase the length of the string subset by (m-1) while keeping the index positions the same (where "m" is the length of the matched string). 😉
04-25-2012 03:43 AM
Maybe you can increase the length of the string subset by (m-1)
while keeping the index positions the same (where "m" is the length of the matched string)
I do not understand what you mean.
"increase the length of the string subset by (m-1), while keeping the index positions the same"
Where would I do that? In the code of "Count or Parallel Count"?
In any case, this will not prevent the main string from being split in the middle of a substring.
Say I search for and count the substring "ABCD":
main string : zzzzzzzzzzABCDzzzzzzzzzzzz
split : zzzzzzzzzzAB | CDzzzzzzzzzzzz (ABCD will not be found)
Sorry, I don't understand your comment ("Maybe you can increase...").
04-25-2012 04:18 AM - edited 04-25-2012 04:18 AM
You simply make each subset longer so there is a small overlap. If the search string is 4 characters, make the subset 3 characters longer (in the examples below, these are the 3 characters after the | on the first line of each pair). Now you get exactly one match, no matter where the cut is: the match lands either in the first or in the second substring, never in both and never in neither.
Here are some examples.
zzzzzzzzzzzz|ABC
|ABCDzzzzzzzzzzzzzz
zzzzzzzzzzzzA|BCD
|BCDzzzzzzzzzzzzzz
zzzzzzzzzzzzAB|CDz
|CDzzzzzzzzzzzzzz
zzzzzzzzzzzzABC|Dzz
|Dzzzzzzzzzzzzzz
zzzzzzzzzzzzABCD|zzz
|zzzzzzzzzzzzzzz
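For readers who prefer text code, the overlap idea can be sketched in Python. This is only an illustration, not the LabVIEW code discussed in this thread: the names `count_occurrences`, `count_in_chunks` and `n_chunks` are made up for the example, the chunks are processed sequentially rather than in a parallel loop, and occurrences are counted by start position (so overlapping occurrences count separately).

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of a non-empty pattern by start position."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        # Resume one character later so overlapping occurrences are counted too.
        pos = text.find(pattern, pos + 1)
    return count


def count_in_chunks(text: str, pattern: str, n_chunks: int) -> int:
    """Split text into about n_chunks pieces, each extended by (m-1) characters.

    Hypothetical helper for illustration: every chunk keeps its original
    start index and is only made longer at the end, so a match that
    straddles a cut is found in exactly one chunk.
    """
    m = len(pattern)
    chunk_len = max(1, -(-len(text) // n_chunks))  # ceiling division, at least 1
    total = 0
    for start in range(0, len(text), chunk_len):
        end = start + chunk_len
        # The (m-1) extra characters are the overlap; slicing past the end is safe.
        total += count_occurrences(text[start:end + m - 1], pattern)
    return total


main_string = "z" * 10 + "ABCD" + "z" * 12
print(count_in_chunks(main_string, "ABCD", 2))
```

A match starting inside the overlap itself cannot fit in the extended chunk (it would need m characters but at most m-1 remain), which is why nothing is counted twice. In this sketch each call to `count_occurrences` is independent, so the loop body is the part that could be handed to parallel workers.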
04-25-2012 04:38 AM
OK, understood.
An overlap of (m-1)... yes, of course, obviously... smart idea!
Thank you, altenbach.
04-25-2012 02:22 PM
Q6600 - 4 cores - 2.6 GHz
main string : 50e6 characters
count the number of "abc"
4 cores => N/C = 4

N     C         time    CPU    delta (ms)
4     disabled  64 ms   67%     0
4     1         64 ms   67%    -0
8     2         62 ms   64%    -2
16    4         60 ms   64%    -2
32    8         53 ms   60%    -7   max delta
64    16        48 ms   60%    -5
128   32        44 ms   55%    -4
256   64        43 ms   53%    -1   max speed
512   128       43 ms   53%    -0
04-25-2012 03:02 PM
On the other hand, with your "Count_Parallel" code, the speed does not change
if I change "Reshape Array / dimension size" and "For Loop / C"... it's curious.
04-25-2012 03:28 PM
@ouadji wrote:
On the other hand, with your "Count_Parallel" code, the speed does not change
if I change "Reshape Array / dimension size" and "For Loop / C"... it's curious.
This post does not provide sufficient information to be useful.