This setting allows to set the GPU intensity value directly without any modifiers, it does not
get any more raw than this! Look at the xintensity description raw for examples of regular
intensity values. You can also set this value through the ncurses interface by pressing:
G -> A -> select device id -> enter value.
Minor xintensity code cleanup as well.
Conflicts:
driver-opencl.c
miner.h
sgminer.c
alexkarnew: (for driver 13.4 and newer, and cgminer 3.3.1)
alexkarold: (for older drivers than 13.4, and cgminer 3.3.1)
https://litecointalk.org/index.php?topic=4082.0
> I was able to optimize the code of cgminer's scrypt.cl.
> It gives 0-3% increase, depending on the drivers and hardware.
> 1. Without optimization, when "CO" is used, every time
> z+x*zSIZE+y*xSIZE*zSIZE is calculated.
> I have created "CO" variable, and made so that x*SIZE is calculated only
> once. Now, when "CO" is used, every time z+y*xSIZE*zSIZE is calculated.
> In one case, variable y is incremented by 1 after 8 "CO" calculations.
> I have created "CO_tmp" variable, where contains result of xSIZE*zSIZE.
> And after 8 "CO" calculations I add "CO_tmp" to "CO".
> Now, when "CO" is used, every time only z is calculated. It is faster as
> z+x*zSIZE+y*xSIZE*zSIZE :)
> In other case when "CO" is used, every time z+y*xSIZE*zSIZE is
> calculated, but it faster than z+x*zSIZE+y*xSIZE*zSIZE too.
> 2. I have replaced multiplication by 2 with bit rotation - it is faster.
> For 7xxx cards you can try to set --thread-concurrency equal to (2^n + 1).
> It may give a little more mining speed.
> For example: 16385 (it is 2^14 + 1), 8193 (2^13 + 1), or 4097 (2^12 + 1).
> I have almost no information, how it works on other series.
> LMqRcHdwnZtTMH6c2kWoxSoKM5KySfaP5C
Changed encoding to UTF-8.
Will not build with sgminer (fix in next commit).
http://www.reddit.com/r/dogecoin/comments/1ui3bx/increase_such_hashrate_1_to_5_scrypt_tweaking/ceir5na
> It is pretty much stock, except that I have removed all the #pragma
> unrolls, and optimized the inner scrypt_core loop. #pragma unroll does
> not give any speedup here.
> The idea is to move the "if (j&1)" comparison to outside of the lookup
> loops. Then, if j&1 happens to be zero, the V[z] and X[z] loops can be
> combined to a single loop, which gives the speedup!
> This loop and the salsa function are the most important places in the
> entire source, it probably spends over 90% of time in here.. There's
> very little to be gained outside of these, I think.
> Donations: DQj4t2DFMQtXofhstouyZw1sYUKWUJn4wv
https://github.com/veox/sgminer/issues/4#issuecomment-32753290
> Most of these optimized kernels (including mine), have fixed
> lookup-gap=2. However, I have never seen anyone use any other value, for
> any GPU, so I think you could just remove the configurable value.
> Or with some #if LOOKUP_GAP==2 magic it is of course possible to make
> such source that allows any value.
> Some users have reported slightly slower hashrate with my kernel as
> well, but this could be some misconfiguration also.. If scrypt kernel
> becomes faster, you may need to lower the GPU engine clock to get full
> speed. Same as if you increase GPU clock too high, you will get a drop
> in hash rate.
> My source is free to use in sgminer. And if you diff to original you
> will see that the changes are not very big.
> Removing of #pragma unrolls helps in any GPU, in my opinion.. Current
> compilers know better when unrolling helps.
...and ensure we don't change do something weird with the fan when initially setting user-defined speed flag.
Fixes crash on R9 cards. Thanks to tkg for hints on what was wrong!
RPM preferred over percent since writing RPM back is always supported (while percent is not in some cases).
Conflicts:
adl.c
Changing `intensity` and to it from `xintensity` works fine.
Changing `xintensity` sometimes fails to enqueue kernel.
For example, starting with --xintensity=128 (on a 5850) and
then changing to 64, 42, or 100 is reliable. However, changing to 127 is
not, and produces
[04:13:01] Error -54: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)
[04:13:01] GPU 0 failure, disabling!
Manually enabling the disabled GPU is successful, but the GPU no longer
submits shares.
This might be a hardware limitation.
All of this is credited to ArGee of RGMiner, he did the initial ground work for this setting.
This new setting allows for a much finer grained intensity setting and also opens up for dual gpu threads on devices not previously able to. Note: make sure to use lower thread-concurrency values when you increase cpu threads.
Intensity is currently used to spawn GPU threads as a simple 2^value setting.
I:13 = 8192 threads
I:15 = 32768 threads
I:17 = 131072 threads
I:18 = 262144 threads
I:19 = 524288 threads
I:20 = 1048576 threads
Notice how the higher settings increase thread count tremendously.
Now enter the xintensity setting (Yes, I am a genius with my naming convention!).
It is simply a shader multiplier, obviously based on the amount of shaders you got on a card, this should allow the same value to scale with different card models.
6970 with 1536 shaders: xI:64 = 98304 threads
R9 280X with 2048 shaders: xI:64 = 131072 threads
R9 290 with 2560 shaders: xI:64 = 180224 threads
R9 290X with 2816 shaders: xI:64 = 163840 threads
6970 with 1536 shaders: xI:300 = 460800 threads
R9 280X with 2048 shaders: xI:300 = 614400 threads
R9 290 with 2560 shaders: xI:300 = 768000 threads
R9 290X with 2816 shaders: xI:300 = 844800 threads
It's now much easier to control thread intensity and it potentially allows for a uniform way of setting the intensity on your system. I'm very interested in constructive feedback, as I do not have access to a lot of different card models.
This change has been tested on 6970, R9 290, R9 290X - all with equal or a little better speeds than regular intensity setting after a little tuning, but your mileage may vary. Don't fret it, if this doesn't work for you, the regular intensity setting is still available.
Conflicts:
driver-opencl.c
sgminer.c
The names will be used throughout the display in the program, when not set "Pool 1" will
simply be used instead. The names are not exposed through API yet, it's on my TODO list.
Use "poolname" like this:
{
"poolname" : "Example pool",
"url" : "stratum+tcp://example.com:8080",
"user" : "y",
"pass" : "x"
},
Conflicts:
sgminer.c
util.c