Con Kolivas
|
db873ba202
|
Add worksize and vector attribute hints to the poclbm kernel.
|
13 years ago |
Con Kolivas
|
325246d309
|
Spaces for non-aligned variables in poclbm.
|
13 years ago |
Con Kolivas
|
ab3e63ff42
|
More tidying of poclbm.
|
13 years ago |
Con Kolivas
|
0af7bbd74f
|
Swap Vals and W variables where they can overlap in poclbm.
|
13 years ago |
Con Kolivas
|
00796251e8
|
More tidying of poclbm.
|
13 years ago |
Con Kolivas
|
e1d580be70
|
Tidy up first half of poclbm.
|
13 years ago |
Con Kolivas
|
338f6d5788
|
Clean up use of any() by diablo and poclbm kernels.
|
13 years ago |
Con Kolivas
|
a7a9dbcf90
|
Minor variable symmetry changes in poclbm.
|
13 years ago |
Con Kolivas
|
c297f63e5e
|
Put additions on separate lines for consistency in poclbm.
|
13 years ago |
Con Kolivas
|
3fa8613557
|
Consolidate last use of W11 into Vals4 in poclbm.
|
13 years ago |
Con Kolivas
|
4885b02e32
|
Add last value in vectors in diablo and poclbm kernel for consistency with original code.
|
13 years ago |
Con Kolivas
|
40b18d5d01
|
Use the unrolled option for no vectors return code.
|
13 years ago |
Con Kolivas
|
d8f14fd666
|
Cluster Vals7 for use on output.
|
13 years ago |
Con Kolivas
|
76d0554d76
|
Get rid of extra char which is just truncated in poclbm kernel.
|
13 years ago |
Con Kolivas
|
d32cd583ac
|
Reinstate the old output mechanism setting output[FOUND] per vector.
|
13 years ago |
Con Kolivas
|
e9889a384d
|
Revert kernels that are designed for newer hardware and SDKs to 2.3.0 release style.
|
13 years ago |
Con Kolivas
|
fb077c6d59
|
Pass vectors * worksize to kernel to avoid one op.
|
13 years ago |
Con Kolivas
|
70e8ade54f
|
Revert behaviour to old nonce init code.
|
13 years ago |
Con Kolivas
|
bce47064b6
|
Revert use of any() in output code in poclbm kernel. Slower.
|
13 years ago |
Con Kolivas
|
b0a01be319
|
Revert use of any() in output code in poclbm kernel. Slower.
|
13 years ago |
Con Kolivas
|
df58517626
|
Extra byte was unused, leading to failure on some platforms.
|
13 years ago |
Con Kolivas
|
93459839c8
|
Explicitly type the constants in poclbm kernel as uint.
|
13 years ago |
Con Kolivas
|
0bde957912
|
Update all kernel version names.
|
13 years ago |
Con Kolivas
|
8f08a775ad
|
Use any() in kernel output code and revert breakage of diakgcn kernel.
|
13 years ago |
Con Kolivas
|
145f3c0b1d
|
Put the nonce for each vector offset in advance, avoiding one extra addition in the kernel.
|
13 years ago |
Con Kolivas
|
5e31785e7b
|
Increase poclbm version number.
|
13 years ago |
Con Kolivas
|
49c28b3929
|
Use PreVal4addT1 instead of PreVal4 in poclbm kernel.
|
13 years ago |
Con Kolivas
|
5c4df1309a
|
Import PreVal4 and PreVal0 into poclbm kernel.
|
13 years ago |
Con Kolivas
|
f5c296785f
|
Import more prepared constants into poclbm kernel.
Conflicts:
poclbm120213.cl
|
13 years ago |
Con Kolivas
|
734dfecec5
|
Keep variables in one array but use Vals[] name for consistency with other kernel designs.
|
13 years ago |
Con Kolivas
|
3f9e34a53c
|
Replace constants that are mandatorily added in poclbm kernel with one value.
|
13 years ago |
Con Kolivas
|
b941146c29
|
Remove addition of final constant before testing for result in poclbm kernel.
|
13 years ago |
Con Kolivas
|
81cb584586
|
Hand optimise variable addition order.
|
13 years ago |
Con Kolivas
|
dc2d553d5b
|
Hand optimise first variable declaration order in poclbm kernel.
|
13 years ago |
Con Kolivas
|
f39fac9e4d
|
Third pass reorder.
|
13 years ago |
Con Kolivas
|
b754fb8f4e
|
2nd pass radical reorder.
|
13 years ago |
ckolivas
|
e2b3c85d59
|
Radical reordering machine based first pass to change variables as late as possible, bringing their usage close together.
|
13 years ago |
Con Kolivas
|
57dad38d04
|
Unroll all additions to enable further optimisations.
|
13 years ago |
Con Kolivas
|
64acb9dae7
|
Increase version numbers of modified kernels.
|
13 years ago |
Con Kolivas
|
210fe9d5b9
|
Constify nonce in poclbm.
|
13 years ago |
Con Kolivas
|
60f8ccb313
|
Use local and group id on poclbm kernel as well.
|
13 years ago |
Con Kolivas
|
8be9d13ff2
|
Further generic microoptimisations to poclbm kernel.
|
13 years ago |
Con Kolivas
|
cad84c6f2c
|
Change poclbm version number.
|
13 years ago |
Con Kolivas
|
4f1676f67f
|
One array is faster than 2 separate arrays so change to that in poclbm kernel.
|
13 years ago |
Con Kolivas
|
f5903e609d
|
Microoptimisations to poclbm kernel which increase throughput slightly.
|
13 years ago |
Con Kolivas
|
2fa142d1ce
|
One array is faster than 2 separate arrays so change to that in poclbm kernel.
|
13 years ago |
Con Kolivas
|
1355859742
|
Microoptimisations to poclbm kernel which increase throughput slightly.
|
13 years ago |
Con Kolivas
|
ebaa2be1df
|
Update poclbm kernel for better performance on GCN and new SDKs with bitalign support when not BFI INT patching.
Update phatk kernel to work properly for non BFI INT patched kernels, providing support for phatk to run on GCN and non-ATI cards.
|
13 years ago |
Con Kolivas
|
3567b69e5e
|
Remove fragile source patching for bitalign, vectors et al. and simply pass them with the compiler options.
|
13 years ago |
Con Kolivas
|
6d10ef2f6e
|
Bump version numbers of kernels to indicate slightly different versions.
|
13 years ago |