polygon offset complicate. so i looked for a solution to remove this offset.
i changed the shader, also the filtering (default: on) use now a correct 3x3 filter:
1 0 1
0 2 0
1 0 1
div: 6
of course a better one would be
1 2 1
2 4 2
1 2 1
div: 16
but this isn't as performant as the simple filter above is. because we need only 5 texture lookups instead of 9, and the result is still good, if you wish we can add a enum to change the pcf filter type once, if there is a need.
testet on NVidia Quatro 570M and on ATI Radeon X1600
"
unfort. i couldn't test it on any ATI system. may there will be still another problem there. if there are still some artefacts. we should try out better fZOffSet value
"
if(
(!win || win->getVisibilityMode() == Window::VM_PARTIAL) &&
!win->isPointerXYWithinVisible(x, y)
) continue;
But it probably should be
if (!win || (win->getVisibilityMode() == Window::VM_PARTIAL) && !win->isPointerXYWithinVisible(x, y)))
continue;
The effect of the bug is to segfault if a non-osgWidgets::Window node hasn't been excluded from picking via NodeMask."
set.
The optimization is based on the observation that matrix matrix multiplication
with a dense matrix 4x4 is 4^3 Operations whereas multiplication with a
transform, or scale matrix is only 4^2 operations. Which is a gain of a
*FACTOR*4* for these special cases.
The change implements these special cases, provides a unit test for these
implementation and converts uses of the expensiver dense matrix matrix
routine with the specialized versions.
Depending on the transform nodes in the scenegraph this change gives a
noticable improovement.
For example the osgforest code using the MatrixTransform is about 20% slower
than the same codepath using the PositionAttitudeTransform instead of the
MatrixTransform with this patch applied.
If I remember right, the sse type optimizations did *not* provide a factor 4
improovement. Also these changes are totally independent of any cpu or
instruction set architecture. So I would prefer to have this current kind of
change instead of some hand coded and cpu dependent assembly stuff. If we
need that hand tuned stuff, these can go on top of this changes which must
provide than hand optimized additional variants for the specialized versions
to give a even better result in the end.
An other change included here is a change to rotation matrix from quaterion
code. There is a sqrt call which couold be optimized away. Since we divide in
effect by sqrt(length)*sqrt(length) which is just length ...
"
they are written to disk, either inline or as an external file. Added support for
this in the .ive plugin. Default of WriteHint is NO_PREFERNCE, in which case it's
up to the reader/writer to decide.
The vertical separation not actually displayed as it is set. So some
display the up and down stereo images style will not be correct.
Someone may forget to change the "Horizontal" to "Vertical" after
copying and pasting the code from above HORIZONTAL_SPLIT code segment.
I've attached the file. By replacing the incorrect "Horizontal" to
"Vertical", the bug is gone.
"
occlusion query result is not ready for retrieval when the app tries to
retrieve it. This fix adds an application-level wait loop to ensure the
result is ready for retrieval. This code is not compiled by default; add "-D
FORCE_QUERY_RESULT_AVAILABLE_BEFORE_RETRIEVAL" to get this code.
Full, gory details, to the best of my recollection:
The conditions under which we encountered this issue are as follows: 64-bit
processor, Mac/Linux OS, multiple NVIDIA GPUs, multiple concurrent draw
threads, VRJuggler/SceneView-based viewer, and a scene graph containing
OcclusionQueryNodes. Todd wrote a small test program that produces an almost
instant crash in this environment. We verified the crash does not occur in a
similar environment with a 32-bit processor, but we have not yet tested on
Windows and have not yet tested with osgViewer.
The OpenGL spec states clearly that, if an occlusion query result is not yet
ready, an app can go ahead and attempt to retrieve it, and OpenGL will
simply block until the result is ready. Indeed, this is how
OcclusionQueryNode is written, and this has worked fine on several platforms
and configurations until Todd's test program.
By trial and error and dumb luck, we were able to workaround the crash by
inserting a wait loop that forces the app to only retrieve the query after
OpenGL says it is available. As this should not be required (OpenGL should
do this implicitly, and more efficiently), the wait loop code is not
compiled by default. Developers requiring this work around must explicitly
add "-D FORCE_QUERY_RESULT_AVAILABLE_BEFORE_RETRIEVAL" to the compile
options to include the wait loop."