Update: UltraVNC 1.4.3.6 and UltraVNC SC 1.4.3.6: https://forum.uvnc.com/viewtopic.php?t=37885
Important: Please update to the latest version before creating a reply, a topic or an issue: https://forum.uvnc.com/viewtopic.php?t=37864
Join us on social networks and share our announcements:
- Website: https://uvnc.com/
- GitHub: https://github.com/ultravnc
- Mastodon: https://mastodon.social/@ultravnc
- Facebook: https://www.facebook.com/ultravnc1
- X/Twitter: https://x.com/ultravnc1
- Reddit community: https://www.reddit.com/r/ultravnc
- OpenHub: https://openhub.net/p/ultravnc
Scrolling Performance Improvements?
Hi,
How much does scrolling performance improve between RC18 and RC20? I have the VNC Server installed (RC18) and it is indeed much faster. I am debating upgrading to RC20, but only if it improves the scrolling performance.
I notice that VNC doesn't seem to accelerate scrolling as a bit-block transfer, the way it does when dragging windows around. It would be nice to see vertical scrolling accelerated in the same manner. Or does RC18 already support this with the driver?
If this is not a feature that exists in RC18, which version of VNC has scroll acceleration (bitblock-transfer)?
Re: Scrolling Performance Improvements?
Hello
Mark Rejhon wrote: If this is not a feature that exists in RC18, which version of VNC has scroll acceleration (bitblock-transfer)?
It must be one of Ultr@VNC's features.
I guess you're missing the video hook driver?
Install it, and no other WinVNC product would even be a comparable choice.
Cheers
Lizard
Got it, oh yes, the DDI driver is indeed much faster! I should have done that first.
I was thinking of a driverless method of scroll zone detection that used relatively little CPU, but yes, driver hooking is much more efficient. A driverless method would still be useful for systems where installing a DDI driver isn't feasible for security reasons, etc.
As a computer programmer, the way I thought of doing driverless, operating-system-independent scroll zone detection would be to quickly scan the screen every 32 pixels horizontally and vertically and compare each sampled pixel against, say, about 100 possible offsets (total 3,840,000 pixel comparisons, done maybe 5 or 10 times a second; this wouldn't use much CPU on a modern fast system in a well-optimized C routine). If potential scroll zones are detected, finer compares would automatically be done at 16-pixel steps, 8-pixel steps and 4-pixel steps, in an attempt to rule out scroll zones more quickly. If the potential scroll zone still exists at the 4-pixel step, a full pixel-by-pixel compare using an optimized row-by-row memcmp() routine would be executed, and a bit blit would then be issued if a perfect or near-perfect vertical scroll was detected. This would be kinda brute force, but the quick scan (sampling only every 32 pixels) would avoid most of the CPU penalty of the brute force method, and would allow driverless scroll detection with a reasonable CPU penalty...
Obviously, the driver method would be more elegant, but it would be nice if the VNC server could support efficient operating-system-independent driverless scroll detection, like the above.
(This suggestion is hereby placed in GNU copyleft. Just mention my name if this ever makes a VNC release.)
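The coarse pass described above could look roughly like the following C sketch. It is only an illustration, not UltraVNC code: the function name `guess_vertical_scroll`, the 32-bit pixel layout and the "every changed sample must be explained" rule are all assumptions made for the example. It skips samples that did not change between frames, and accepts an offset only when every changed sample matches the previous frame at that offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of UltraVNC.
 * Samples one pixel every `step` pixels in both directions and tests each
 * vertical offset dy in [-max_dy, +max_dy] (dy != 0).
 * Returns the offset that explains the most changed samples (positive means
 * the content moved down), or 0 if no offset explains all of them. */
int guess_vertical_scroll(const uint32_t *prev, const uint32_t *cur,
                          int width, int height, int step, int max_dy)
{
    assert(prev && cur && width > 0 && height > 0 && step > 0 && max_dy > 0);
    int best_dy = 0;
    long best_hits = 0;

    for (int dy = -max_dy; dy <= max_dy; dy++) {
        if (dy == 0)
            continue;
        long changed = 0, hits = 0;
        for (int y = 0; y < height; y += step) {
            int src_y = y - dy;             /* row this content would come from */
            if (src_y < 0 || src_y >= height)
                continue;
            for (int x = 0; x < width; x += step) {
                uint32_t now = cur[(size_t)y * width + x];
                if (now == prev[(size_t)y * width + x])
                    continue;               /* unchanged sample: no information */
                changed++;
                if (now == prev[(size_t)src_y * width + x])
                    hits++;
            }
        }
        /* Strict rule (an assumption): every changed sample must match. */
        if (changed > 0 && hits == changed && hits > best_hits) {
            best_hits = hits;
            best_dy = dy;
        }
    }
    return best_dy;
}
```

A non-zero result is only a candidate. It would still need the finer passes and a full row-by-row compare before any bit blit is sent.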
Actually, my 3,840,000 number is way too high. I redid the calculations. It's actually ONLY about 128,000 pixel comparisons. That'd make scroll zone detection viable even on slower computers, even 500 MHz computers. My numbers are based on the following:
Resolution: 1280 x 1024
Horizontal/vertical pixel step: 32
Number of pixels at 32 pixel step: 40 x 32 = 1280 pixels
Number of pixel comparisons per pixel: 100
Total number of pixel comparisons: 1280 x 100 = 128,000
The 100 comparisons would cover all possible vertical scroll steps from -50 through +50 pixels. Basically, a grid of pixels spaced 32 pixels apart vertically and horizontally, at 1280x1024, would be a total of 1280 pixels to compare at 100 different offsets, for a grand total of 128,000 pixel compares.
A compare of the screen could be quickly done at the 32 step, and if a potential scroll is detected, a somewhat more CPU-intensive re-compare would occur at the 16 step. If a potential scroll zone is not ruled out, execute an 8 or 4 step compare (only at the scroll offsets that survived the 16 step compare). Finally, do a pixel-by-pixel compare (only at the offsets that survived the 8 or 4 step compare).
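The estimate can be checked with a few lines of C. This is just the arithmetic from the post written out; the helper names `grid_samples` and `coarse_pass_comparisons` are made up for the example. Since 1024 / 32 is 32 rows, a 1280 x 1024 screen at a 32-pixel step has 40 x 32 = 1,280 sample points, or 128,000 comparisons at 100 offsets.

```c
#include <assert.h>

/* Number of sample points when taking one pixel every `step` pixels. */
long grid_samples(int width, int height, int step)
{
    assert(width > 0 && height > 0 && step > 0);
    long cols = (width + step - 1) / step;    /* ceiling division */
    long rows = (height + step - 1) / step;
    return cols * rows;
}

/* Comparisons in one pass: every sample is tested against every offset. */
long coarse_pass_comparisons(int width, int height, int step, int offsets)
{
    assert(offsets > 0);
    return grid_samples(width, height, step) * offsets;
}
```

Halving the step quadruples the sample count (5,120 samples at a 16-pixel step), which is why the finer passes should only test the offsets that survived the coarser pass.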
Several optimizations can be achieved:
- Do an initial pixel compare between the previous and current screen states, and don't bother scanning for scroll zones in areas where the graphics did not change. Use the pattern of changed pixels to narrow the area of concentration for scroll zone detection: calculate the outer bounds of the graphics changes and concentrate the scroll zone scanning in those regions. This avoids uselessly scanning for scrolls in blank areas (such as empty pages or margins, where the pixels are going to be the same anyway and do not need to be transmitted).
- When narrowing scroll detection scanning to smaller steps (16 pixels -> 8 pixels, for example), only scan the area, and only the scroll offsets, that weren't ruled out by the previous pass. This will more quickly eliminate unnecessary scroll zone scanning.
- Use one optimized C routine for the stepped scanning (16->8->4) and a separate one for single-pixel scanning (which can instead be done using a row-by-row memcmp(), which is much more efficient).
- Horizontal scroll detection would be ignored for now, but it could later be added as an option at a mere doubling in CPU impact, which may be fairly negligible on gigahertz-league systems.
- Diagonal scroll detection is not going to be efficient on today's computers (but thankfully, diagonal scrolling rarely ever happens).
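Two of the optimizations above, bounding the changed area first and doing the final check as a row-by-row memcmp(), might be sketched as follows. The function names and the 32-bit pixel layout are assumptions for illustration only, and the bounds here are computed on whole rows rather than on a rectangle to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the first and last rows that differ between the two frames.
 * Returns 1 and fills *top / *bottom, or 0 if the frames are identical. */
int changed_row_bounds(const uint32_t *prev, const uint32_t *cur,
                       int width, int height, int *top, int *bottom)
{
    assert(prev && cur && top && bottom && width > 0 && height > 0);
    size_t row_bytes = (size_t)width * sizeof(uint32_t);
    int first = -1, last = -1;
    for (int y = 0; y < height; y++) {
        if (memcmp(prev + (size_t)y * width,
                   cur + (size_t)y * width, row_bytes) != 0) {
            if (first < 0)
                first = y;
            last = y;
        }
    }
    if (first < 0)
        return 0;
    *top = first;
    *bottom = last;
    return 1;
}

/* Final pixel-exact check: do rows top..bottom of the current frame equal
 * the rows dy pixels higher in the previous frame? Returns 1 if so. */
int verify_vertical_scroll(const uint32_t *prev, const uint32_t *cur,
                           int width, int height, int top, int bottom, int dy)
{
    assert(prev && cur && width > 0);
    assert(0 <= top && top <= bottom && bottom < height);
    size_t row_bytes = (size_t)width * sizeof(uint32_t);
    for (int y = top; y <= bottom; y++) {
        int src_y = y - dy;
        if (src_y < 0 || src_y >= height)
            return 0;                       /* source row is off screen */
        if (memcmp(cur + (size_t)y * width,
                   prev + (size_t)src_y * width, row_bytes) != 0)
            return 0;
    }
    return 1;
}
```

Only offsets that survived the stepped passes would be handed to `verify_vertical_scroll`, so the expensive full compare runs on a handful of candidates at most.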
Hi Mark,
The process you're describing is basically what is done in the FastDetectChanges() function that I implemented to speed up the fullscreen polling, which was really too slow (in all Win32 VNCs) without the video driver.
The difference is that I don't use it to detect scrolling zones, but only to detect "general" changes.
I'm using a "shifting-cycling" 32x32 pixel grid (one pixel every 32 pixels, shifted by 4 pixels at each cycle, but these values can easily be changed). Once I've detected a changed pixel, I get the smallest window containing this pixel, then apply the optimized CheckRect() function (which basically checks for changes in 16*16 rects, but still in an optimized manner).
Maybe you could use the already existing FastDetectChanges() function and then, instead of searching for the smallest surrounding window, implement your "scrolled area" detection routine.
Just my 2 cents... we are open to all interesting ideas and optimizations anyway
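For readers unfamiliar with the idea, a shifting grid of this kind might be sketched as below. This is not the actual FastDetectChanges() code. The names `grid_phase` and `poll_shifting_grid` are invented, and the way the phase advances (x first, then y once x wraps) is an assumption, since the post only says the grid shifts by 4 pixels per cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical polling state: current offset of the sampling grid. */
typedef struct {
    int phase_x;
    int phase_y;
} grid_phase;

/* One polling cycle: test one pixel per grid x grid cell at the current
 * phase, then advance the phase by `shift` for the next cycle.
 * Returns 1 and stores the first changed sample in *hit_x / *hit_y,
 * or returns 0 if no sampled pixel changed. */
int poll_shifting_grid(const uint32_t *prev, const uint32_t *cur,
                       int width, int height, int grid, int shift,
                       grid_phase *phase, int *hit_x, int *hit_y)
{
    assert(prev && cur && phase && hit_x && hit_y);
    assert(width > 0 && height > 0 && grid > 0 && shift > 0);
    assert(phase->phase_x >= 0 && phase->phase_x < grid);
    assert(phase->phase_y >= 0 && phase->phase_y < grid);

    int found = 0;
    for (int y = phase->phase_y; y < height && !found; y += grid) {
        for (int x = phase->phase_x; x < width; x += grid) {
            size_t i = (size_t)y * width + x;
            if (prev[i] != cur[i]) {
                *hit_x = x;
                *hit_y = y;
                found = 1;
                break;
            }
        }
    }

    /* Assumed cycling order: shift in x, and shift in y when x wraps. */
    phase->phase_x += shift;
    if (phase->phase_x >= grid) {
        phase->phase_x = 0;
        phase->phase_y += shift;
        if (phase->phase_y >= grid)
            phase->phase_y = 0;
    }
    return found;
}
```

A hit would then be followed by the finer work described in the post: finding the window that contains the pixel and checking it in detail, or, for scroll detection, testing candidate offsets around it.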
UltraSam
I used this scroll zone optimization in a remote control program I designed in 1992 that could multitask remotely under Desqview.... I detected scroll zones in text buffers without BIOS hooks. (So it worked even if the software wrote to the screen buffer directly from assembly language.) I used VT100 scroll region codes to optimize vertical scrolls, in both directions. I wrote that scroll zone detector in assembly language, even though I was only scanning for scroll regions on an 80x25 text display, rather than a 1280x1024 pixel display. But today's computers should be fast enough for driverless OS-independent scroll zone detection, thankfully... (This could be an option that is turned off by default, if only fast computers can handle it, but I am sure that this would still be efficient even on older 400 MHz Pentium II systems.)
Very impressive that you're already using some kind of optimizing algorithm like that. It sounds like your algorithm could be adapted to detect scrolls with fewer changes than expected.
Maybe I'll tackle scroll region detection (since I've already done scroll region detection in remote control software before), if it will take me only one night to add it. It might not be that quick, but now I am curious... Lemme check the SF site to see what the compiler requirements are.... I wonder if UltraVNC compiles in my Visual Studio .NET 2002...
-
- Posts: 3
- Joined: 2005-03-24 22:23
Okay, I've registered and cookied.
Just so you know, I'm not sure which weekend I'll *attempt* to look at UltraVNC. (I'll only do this scroll zone optimization if I can do it in 1 night).
I'll probably do the following:
1. Check development environment requirements. If reasonable, go to 2:
2. Grab a copy of the latest UltraVNC tree over CVS.
3. Attempt to compile and run the server.
4. If it works without too much time lost, then I'll begin playing with it.
5. If the code is well made (easily maintainable, well documented, modular), I may be able to pull off scroll zone optimization in just one night. My advantage is that I've done this before (albeit 10 years ago!)
Obviously, the DDI driver would be still superior, but there are many cases I'm not even *allowed* to install a driver on some of the systems I access, anyway... Plus this may be useful for other platforms in the future (i.e. the other open source projects, such as VNC on MacOS and Linux)
Might not happen for a week or two, but if there are any problems with my approach, now is the time to comment before I decide to spontaneously commit an evening to this challenge...
PS -- Do you have a developer's forum anywhere?
[Edit: Damn, looks like I'll have to install BCC... Perhaps in May, when I have more time, or someone else might want to pick up my idea. I've already added this to the Feature Tracker on SourceForge, just in case.]
Last edited by Mark Rejhon on 2005-03-24 22:43, edited 7 times in total.
- Rudi De Vos
- Admin & Developer
- Posts: 6862
- Joined: 2004-04-23 10:21
- Contact:
I took a chance and just loaded up the project in Visual Studio .NET 2003.
There were lots of errors (authad, auth, authloginuser) and I forced it to work offline, and I had to do a library exclusion, but I got the VNC Server compiled. I even executed it in DEBUG mode and it seems to function as a VNC Server, so it looks like I'm "debug ready" now. Looks like I can begin to play with the source code already
However... Actual development may wait for another day, I've at least determined that I do have an appropriate development environment, at least to limp along on.
- Rudi De Vos
- Admin & Developer
- Posts: 6862
- Joined: 2004-04-23 10:21
- Contact:
Ok, in that case, the only real mod needed to do to a default .NET 2003 install was exclude "libcd.dll" from the linker properties of the winvnc subproject. After I did that, I successfully compiled the server.
So I can confirm .NET 2003 seems to work fine for UltraVNC Server as far as I'm concerned (even though I never found any documentation telling me .NET 2003 works with UltraVNC).
Oh, yes... and one change was needed to eliminate one fatal compile error:
#include <iostream.h>
into
#include <iostream>
(Bear in mind: I'm using a boilerplate .NET 2003 install, so there are probably alternative methods)
Anyway, I'm headed home now. I'll load it up in .NET 2002 and see if it's similarly easy to compile in there too... I might even dare to attack this project tonight, if I'm not feeling too tired.
Mark Rejhon
"I am debating upgrading to RC20, but only if it improves the scrolling performance;"
Very, very nice to do.
I would encourage everyone who knows coding: improvements to the UltraVNC code are very welcome!
UltraVNC is much better with nice coders and helpers.
UltraVNC 1.0.9.6.1 (built 20110518)
OS Win: xp home + vista business + 7 home
only experienced user, not developer
Cool
I've not yet tried to compile Ultra with Visual Studio 2003, even though I'm working on it every day.
Thanks for your time. Feel free also to optimize the FastDetectChanges() function, which was developed in a rush in about 2 hours in December 2001, after I got upset reading on the net a "performance comparison" stating that r.admin was 30 times faster than VNC...
Last edited by UltraSam on 2005-03-25 10:28, edited 1 time in total.
UltraSam
-
- Posts: 3
- Joined: 2005-03-24 22:23
Confirmed:
UltraVNC compiles in .NET 2002 and .NET 2003, with the aforementioned minor edits. (iostream without the .h extension was required only for 2003, not 2002.)
One option I'd like to add is a 1-pixel subcycling (32 pixel pass, then a 4 pixel pass, then offset the whole shebang by 1 pixel when repeating the 32 and 4 pixel pass), since I occasionally notice some minor redraw artifacts. This will mean that tiny changes will be very slowly updated, but at least the update glitches will disappear eventually.
Another thing I might test is whether a scanline-based compare (a la memcmp()) is faster than scanning every 32 pixels or every 4 pixels horizontally in a single row of pixels. (Vertical stepping would remain the same as in the original algorithm.) This would be a further performance optimization. If so, then it may be a good idea to convert to comparing every 32 scanlines vertically. The CPU L1 data cache often preloads several pixels anyway (several bytes) when comparing single pixels, so there may be a performance benefit here in doing a raw memcmp (which often compares 64 bits of memory at a time in optimized memcmp routines). It may not be faster for 32 pixels, but it might be faster than comparing pixels every 4 pixels horizontally.... (has anyone done this?) If so, then this will probably make my life easier: we'd compare rows of pixels (scanlines) every 64 pixels, shifting in 32 pixel vertical shift cycles, a 16 pixel vertical shift sub-cycle, an 8 pixel vertical shift sub-cycle, then 4, then 2, then 1, you get the idea (a derivative of a binary search algorithm).
A 4,2,1 vertical shift cycle on a theoretical tiny 16-pixel-tall screen with all the attendant sub-cycles would be scanning the scanlines (rows of pixels) in this order, by row number from the top of the screen:
4,8,12,16,2,6,10,14,1,5,9,13,3,7,11,15
As you can see, that number pattern is very "binary-search" like, and might be more efficient and faster than your current algorithm, a theory worth performance-profiling and testing out....
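The row order listed above can be generated by visiting the vertical phases in bit-reversed order, which is what gives it the binary-search feel. The following is a minimal sketch under that reading of the sequence; `subcycle_row_order` and `reverse_bits` are names invented for the example, and the step is assumed to be a power of two.

```c
#include <assert.h>

/* Reverse the lowest `bits` bits of v (e.g. 01 -> 10 for bits == 2). */
static unsigned reverse_bits(unsigned v, int bits)
{
    unsigned r = 0;
    for (int i = 0; i < bits; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}

/* Fill order[] with 1-based row numbers in sub-cycled scan order.
 * `step` must be a power of two; order[] must hold `height` entries.
 * Returns the number of rows written. */
int subcycle_row_order(int height, int step, int *order)
{
    assert(order && height > 0 && step > 0 && (step & (step - 1)) == 0);
    int bits = 0;
    while ((1 << bits) < step)
        bits++;

    int n = 0;
    for (int i = 0; i < step; i++) {
        int phase = (int)reverse_bits((unsigned)i, bits);
        int first = (phase == 0) ? step : phase;   /* phase 0 starts at row `step` */
        for (int row = first; row <= height; row += step)
            order[n++] = row;
    }
    return n;
}
```

For a 16-row screen and a step of 4 this reproduces the sequence in the post. Whether it actually finds changes sooner than the existing 32/4 shifting would need profiling.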
It could even be a 32,4,1 sub-cycling, but I think this can be optimized even more to a scanline-based 128,64,32,16,8,4,2,1 sub-cycling, it will "hit" the screen changes faster than the current algorithm, I'd hope...
It might not be worth optimizing, since the current algorithm is very fast.... Scanline based would probably make my job easier, since reading a pixel will load several pixels in a row to the CPU L1 memory cache, so comparing two whole scanlines may be just as fast as comparing every 32 pixel steps horizontally.... I think my first test will be to determine if my theory is true (or not), then the first thing I'd do is convert your routine to compare rows of pixels every 32 pixels vertically, with a 4-pixel vertical shift. (And then add the faster and more efficient sub-cycling, to see if it yields any worthwhile further improvements). I may go for the "Don't Fix It If It Ain't Broke" and stick with your original 32 pixel and 4 pixel shifts, except convert it to pixel-row-based compares, since row-based compares would be a definite optimization for scroll zone detection for me... (makes my life easier) But I'll only do this if I determine that row-based compares are faster than horizontal steps of 32. (Vertical step would remain unchanged)
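The scanline variant discussed above might look like this: compare whole rows with memcmp(), but only every `step` rows, starting at a vertical phase that changes from cycle to cycle. The function name and the 32-bit pixel layout are assumptions for the sketch, and it makes no claim about which approach is faster. That is exactly the question the post proposes to measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compare one full scanline every `step` rows, starting at row `phase`.
 * Returns the first sampled row that differs, or -1 if none differs. */
int first_changed_scanline(const uint32_t *prev, const uint32_t *cur,
                           int width, int height, int step, int phase)
{
    assert(prev && cur && width > 0 && height > 0 && step > 0);
    assert(phase >= 0 && phase < step);
    size_t row_bytes = (size_t)width * sizeof(uint32_t);
    for (int y = phase; y < height; y += step) {
        if (memcmp(prev + (size_t)y * width,
                   cur + (size_t)y * width, row_bytes) != 0)
            return y;
    }
    return -1;
}
```

Unlike sparse pixel sampling, a row compare cannot miss a change anywhere along a sampled scanline, which is what would make it convenient as a starting point for scroll zone detection.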
As I did not use CVS (I downloaded RC19, really), I should probably set that up next... Although I don't think there have been any changes to that routine between RC19 and RC20? If not, then I'll just supply plain old code diffs or source files, and someone else with the privileges could add them to the master tree.
However, I am very tired after today, so I'll do actual playing with code later. However, I'm at least debug-ready
Last edited by Mark Rejhon on 2005-03-25 22:44, edited 8 times in total.