Update: UltraVNC 1.4.3.6 and UltraVNC SC 1.4.3.6: https://forum.uvnc.com/viewtopic.php?t=37885
Important: Please update to the latest version before creating a reply, a topic or an issue: https://forum.uvnc.com/viewtopic.php?t=37864
Join us on social networks and share our announcements:
- Website: https://uvnc.com/
- GitHub: https://github.com/ultravnc
- Mastodon: https://mastodon.social/@ultravnc
- Facebook: https://www.facebook.com/ultravnc1
- X/Twitter: https://x.com/ultravnc1
- Reddit community: https://www.reddit.com/r/ultravnc
- OpenHub: https://openhub.net/p/ultravnc
Performance measurements
I measured how long it takes to process an rfbFramebufferUpdate message.
The measurements were done with UltraVNC viewer 1.0.5.6 and 1.0.6.4.
I simply start a timer before the ReadScreenUpdate call and stop it when it returns.
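The timing approach described here can be sketched in portable C++. This is only an illustration under assumptions: `time_call_ms` is a hypothetical helper, and the callable passed to it stands in for the viewer's update-reading call, whose real signature is not shown in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
double time_call_ms(const std::function<void()>& work) {
    assert(work);  // an empty std::function cannot be called
    const auto start = std::chrono::steady_clock::now();
    work();  // stand-in for the call being measured
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A monotonic clock is used so that system clock adjustments during the test do not distort the measured interval.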
On the server I have a small program which generates screen changes.
This program generates 512x512 'noise' images or 512x512 checkerboard images.
I can also regulate the rate at which it performs the updates.
For this measurement I set it to 1 or 0.5 fps (so the server isn't stressed).
Basically I used ZRLE with 8 gray colors or 256 colors on a local network.
I also tried some other settings like full color and Hextile encoding, but in general the times I measure are the same.
I'm running the viewer on an 8 core Xeon E5420 @ 2.50 GHz with an nVidia Quadro NVS 290, but the target viewer systems are desktops and laptops, and remote viewing over lower bandwidth (ZRLE, 8 gray colors, 256 colors).
What I measure is a time between 225 ms and 400 ms to process an update (noise image).
I measure similar times when using checkerboard images.
(Full color, Hextile, checkerboard gives 10 ms, which is an exception.)
I also tried this with the RealVNC viewer, which results in a processing time of 10 ms with ZRLE and 20 ms when using raw encoding. Using checkerboard images and ZRLE results in an update time of 2 ms.
Of course a 512x512 noise image is a kind of worst case, but we control some systems which now and then generate noisy images quickly.
On the other hand, this worst-case situation makes the problem extremely visible.
Looking at the measurements, the update time is very long, resulting in an update speed of only about 4 fps (1000 ms / 250 ms). Compared to this, RealVNC is 20+ times faster.
I tried to find where most of the time is spent and found that a call to IMAGE_RECT uses most of the time. IMAGE_RECT can be broken down to a loop with SetPixelV. (RealVNC uses a loop with some memory copies.)
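To illustrate why a per-pixel loop tends to cost more than block copies, here is a portable sketch that writes a rectangle into a frame buffer in both ways. It does not use the Win32 API: `put_pixel` merely stands in for a per-pixel call such as SetPixelV, and `FrameBuffer`, `draw_per_pixel` and `draw_per_row` are hypothetical names, not code from either viewer. Both variants produce the same pixels, but for a 512x512 update the first makes 262,144 calls while the second makes 512 row copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 32-bit frame buffer; not the viewer's actual data structure.
struct FrameBuffer {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;
    FrameBuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
};

// Stand-in for a per-pixel API call: a bounds check plus one store.
inline void put_pixel(FrameBuffer& fb, int x, int y, std::uint32_t color) {
    if (x < 0 || y < 0 || x >= fb.width || y >= fb.height) return;
    fb.pixels[static_cast<std::size_t>(y) * fb.width + x] = color;
}

// Variant 1: one call per pixel. Returns the number of calls made.
std::size_t draw_per_pixel(FrameBuffer& fb, const std::uint32_t* src,
                           int x0, int y0, int w, int h) {
    std::size_t calls = 0;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            put_pixel(fb, x0 + x, y0 + y, src[static_cast<std::size_t>(y) * w + x]);
            ++calls;
        }
    }
    return calls;
}

// Variant 2: one memcpy per row. Returns the number of copies made.
std::size_t draw_per_row(FrameBuffer& fb, const std::uint32_t* src,
                         int x0, int y0, int w, int h) {
    // The rectangle must lie fully inside the frame buffer.
    assert(x0 >= 0 && y0 >= 0 && x0 + w <= fb.width && y0 + h <= fb.height);
    std::size_t copies = 0;
    for (int y = 0; y < h; ++y) {
        std::memcpy(&fb.pixels[static_cast<std::size_t>(y0 + y) * fb.width + x0],
                    src + static_cast<std::size_t>(y) * w,
                    static_cast<std::size_t>(w) * sizeof(std::uint32_t));
        ++copies;
    }
    return copies;
}
```

The real cost of SetPixelV also includes GDI overhead per call, which this sketch does not model; it only shows the difference in call count.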
Can someone explain why the UltraVNC viewer is that slow and, even better, what can be done to make the viewer faster?
Of course it could be that I missed something somewhere, so feel free to point it out so my/our understanding increases.
The easy answer, 'use RealVNC', is an option, but UltraVNC offers a lot of extra functionality.
I hope that my findings can contribute to a better (and faster) UltraVNC.
Last edited by MrScotty on 2009-11-22 13:23, edited 3 times in total.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
Were the tests done as follows?
server 1064 -> viewer 1064 (slow)
server 1064 -> realvnc viewer (fast)
I just want to exclude that it is a server issue and confirm that the problem is only
related to the viewer; this already eliminates 50% of the code.
Did 1056 and 1064 differ in performance?
If I remember correctly, the "ultra" encoder also uses memcpy. Does this encoder make a difference?
Re: Performance measurements
The server was 1.0.5.6.
The clients tested were 1.0.5.6, 1.0.6.4 and RealVNC.
The times for 1.0.5.6 and 1.0.6.4 are similar;
RealVNC is way faster.
As you stated, this is a viewer problem. However, I also suspect a server issue where it takes some time to handle an update request.
At the moment I only have a VMware session as a server, which I think is not representative for performance measurements.
A test could simply be done by measuring the begin and end of vncClient::SendRectangle(const rfb::Rect &rect).
I just tried the Ultra encoder, which results in a time of 30 ms.
Last edited by MrScotty on 2009-07-29 12:27, edited 1 time in total.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
A cycle is
capture changes + compress + send + decompress + show
If the VMware server is on the same PC as the viewer, send == 0 and you
see a big difference between the SetPixel and BitBlt methods. In a real situation send is the slowest part and the impact of show is smaller.
But this test is indeed a good method to measure performance without network impact.
Re: Performance measurements
To complete the picture of the test environment:
The VMware session runs on a different PC, which means that in my test setup there is some real network traffic. In the setup the network connection is 100 Mbit/s and ping reports <1 ms. The send time in this case, I assume, is less than the 200+ ms.
The cycle starts with sending an update request, which is sent after the show. Compared to RealVNC, you send it 200+ ms later.
Re: Performance measurements
I recently changed my router from 100 Mbps to 1 Gbps and since then have noticed that version 1.0.6.4 is very slow in refreshing.
I then installed the free version of RealVNC 4.1.3; it is quite fast compared to UVNC.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
Running 1064,
100 Mbps OK
1 Gbps slow
Same config, only the router changed?
Crazy...
If you hadn't mentioned that RealVNC works fast, I would suspect
a cable problem: the network is faster, but packets get lost. This causes
retransmits and huge lag. VNC is a network-heavy app, which exposes errors like this...
But it is fast with RealVNC, so I don't know.
MS built-in net stats
-----------------------
In a cmd window you can run
netstat -es
or
netstat -es 5 (redisplay every 5 sec)
while uvnc is connected and check whether you get a high number
of errors or retransmits (TCP).
Re: Performance measurements
vinitmodi and Rudi
Maybe uvnc doesn't know how to handle jumbo packets, which are enabled by default on some branded network interface cards and are handled properly by RealVNC 4.1 free edition?
I don't have a 1 Gbit/s network, so I can't verify it myself.
(For current uvnc, up to 1.0.6.4, it may be necessary to disable jumbo packets / jumbo frames, i.e. set the MTU back to the default value of 1500?)
UltraVNC 1.0.9.6.1 (built 20110518)
OS Win: xp home + vista business + 7 home
only experienced user, not developer
Re: Performance measurements
MrScotty
[topic=16009][/topic]
Could you please run a new performance test with winvnc 1.0.7.6.2 and vncviewer 1.0.6.4? Thank you.
Rudi De Vos wrote: Vnc send packets of 8192 byte
I found this on the net.
-----------------
1.) The mtu for my network adapter is set to 9000.
2.) I can type "tracepath <ip>" and it returns a pmtu of 9000 to another computer.
3.) My buffer is 153600 bytes. So I need to send 18 (8192 byte) packets or 37 (4096 byte) packets.
4.) The rate at which data is being sent is 153600*60/second...so lots of data.
5.) Using the sendto command, I can send packets of size 576, 1400, 4088, and 4096, (and anything less than 1400) without many problems.
6.) I cannot use 4000, 8192, 9000,....and lots of others, except the above, when calling sendto or the program cannot keep up with sending and gathering data.
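As a worked check of the numbers in the quoted list, the sketch below computes how many packets a 153600-byte buffer needs and the resulting data rate at 60 buffers per second. The function names are hypothetical. Note that the quoted figures of 18 and 37 packets are the rounded-down quotients; covering the whole buffer takes one more packet in each case (19 and 38), with the last packet only partly filled.

```cpp
#include <cassert>
#include <cstddef>

// Number of packets needed to send `total_bytes` when each packet carries at
// most `payload_bytes` (ceiling division; the last packet may be partly full).
std::size_t packets_needed(std::size_t total_bytes, std::size_t payload_bytes) {
    assert(payload_bytes > 0);
    return (total_bytes + payload_bytes - 1) / payload_bytes;
}

// Bytes per second when a buffer of `buffer_bytes` is sent `rate_hz` times
// per second (153600 * 60 in the quoted text).
std::size_t bytes_per_second(std::size_t buffer_bytes, std::size_t rate_hz) {
    return buffer_bytes * rate_hz;
}
```

For the quoted values this gives 9,216,000 bytes per second, i.e. roughly 74 Mbit/s before any protocol overhead.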
[topic=16009][/topic]
Last edited by redge on 2009-08-28 22:47, edited 1 time in total.
UltraVNC 1.0.9.6.1 (built 20110518)
OS Win: xp home + vista business + 7 home
only experienced user, not developer
Re: Performance measurements
I missed the request, which explains the slow response.
It's not completely clear what you want retested, so I performed similar tests and, as expected, the drawing time of the viewer is still around 250 ms (512x512 noise images etc.).
I don't have the possibility to play around with changing MTUs and packet sizes.
In cases as described, the reading part of processing the framebuffer update could have some influence. As far as I could measure, the main part of the 250 ms goes to the drawing.
I also looked a little into what is happening in the server.
When polling is needed, this is done every 100 ms.
So the total time for a complete 'round trip' (FramebufferUpdateRequest, FramebufferUpdate and then a new FramebufferUpdate) could take 100 ms (poll time) + 250 ms (viewer) + some grabbing and sending/receiving.
This means the total is some 400 ms, which gives you an update rate of 2.5 updates per second.
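The round-trip arithmetic can be written out as a small sketch. The 100 ms poll time and 250 ms viewer time come from the post; the 50 ms used in the usage note for grabbing and sending/receiving is an assumption chosen so that the total matches the "some 400 ms" mentioned. Function names are hypothetical.

```cpp
#include <cassert>

// Round trip as described in the post: poll interval + viewer draw time +
// grabbing and sending/receiving. All values in milliseconds.
double round_trip_ms(double poll_ms, double viewer_ms, double grab_and_transfer_ms) {
    return poll_ms + viewer_ms + grab_and_transfer_ms;
}

// Updates per second for a given cycle time in milliseconds.
double updates_per_second(double cycle_ms) {
    assert(cycle_ms > 0.0);
    return 1000.0 / cycle_ms;
}
```

With `round_trip_ms(100, 250, 50)` the cycle is 400 ms and `updates_per_second` yields 2.5; for the viewer's 250 ms alone it yields the 4 fps from the first post.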
Using the hook DLLs and/or video driver increases the performance (eliminating the poll time), but the applications should use 'normal' Windows painting behaviour.
In my case the image generator I use paints with SetDIBitsToDevice, which doesn't generate a Windows message. (The generator uses the CImg library.)
When setting the title of the window, the hook gets a message, which results in a screen update.
When making the title static, no updates are transmitted when polling is off.
I did a small experiment with the 'Detours' library of Microsoft Research (http://research.microsoft.com/en-us/projects/detours/).
In the vnchook I 'detoured' SetDIBitsToDevice. In the detoured function I get the window handle belonging to the device context and call SendDeferredWindowRect.
It works: I get the updates without polling and the update speed goes up (at least with the 1.0.5.6 server).
The 'scary level' of this 'detouring', however, is seriously high.
Hope this (late) input is still of use.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
Thanks for the feedback...
The only thing I don't understand is that SetDIBitsToDevice doesn't generate
an update when the mirror driver is used. If I'm correct, this is done using a DrvBitBlt(). The mirror driver updates the mirror surface and adds the rectangle used in DrvBitBlt to a ring buffer. VNC reads the ring buffer and uses the mirror surface to capture the screen data.
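The ring buffer of changed rectangles mentioned here can be pictured with the following sketch. It is a hypothetical, single-threaded illustration of the data structure only: the names `Rect` and `RectRing`, the fixed capacity and the behaviour when full are assumptions, not the mirror driver's actual layout. The producer plays the role of the driver and the consumer that of the VNC server.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical changed-rectangle record; field names are illustrative only.
struct Rect {
    int left;
    int top;
    int right;
    int bottom;
};

// Fixed-size FIFO ring buffer of rectangles (not thread-safe as written).
template <std::size_t Capacity>
class RectRing {
    static_assert(Capacity > 0, "capacity must be positive");

public:
    // Producer side: record a changed rectangle. Returns false when the
    // buffer is full, in which case the caller has to fall back to something
    // else (for example a full-screen refresh).
    bool push(const Rect& r) {
        assert(r.left <= r.right && r.top <= r.bottom);
        if (count_ == Capacity) return false;
        slots_[(head_ + count_) % Capacity] = r;
        ++count_;
        return true;
    }

    // Consumer side: fetch the oldest pending rectangle, if any.
    bool pop(Rect& out) {
        if (count_ == 0) return false;
        out = slots_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<Rect, Capacity> slots_{};
    std::size_t head_ = 0;   // index of the oldest entry
    std::size_t count_ = 0;  // number of pending entries
};
```

The consumer would pop each rectangle and copy just that region from the mirror surface, so no polling of the whole screen is needed.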
The hookdll only hooks Windows messages... and this doesn't capture function calls. Yep, Detours is a nice lib, but virus checkers are going to freak out on it and it may get broken with each MS update. Too risky for non-developer usage.
Using the mirror driver I can watch video at 640x480, 10-15 fps.
At least when the CPU throttle is disabled to allow max CPU usage.
MaxCpu=100 (ultravnc.ini)
edit: I also did a code change: 100ms was replaced by 25ms
(to make sure the driver was polled every 25ms)
But video is not noise...
512x512 noise images are not going to make the compressors happy;
zlib/jpeg/hextile just can't compress them.
They use a lot of time and the compression is going to be minimal. For noise images the raw encoder is going to be faster.
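The point about noise can be made concrete with a rough sketch. It counts runs of equal pixels per row as a simple proxy for how well a run-length style encoder could do; this is not zlib, JPEG, Hextile or ZRLE, just an assumption-laden stand-in. The 64-pixel checkerboard square size and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// 8-bit grayscale image filled with uniform random noise.
std::vector<std::uint8_t> make_noise(int w, int h, unsigned seed) {
    assert(w > 0 && h > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<std::uint8_t> img(static_cast<std::size_t>(w) * h);
    for (auto& p : img) p = static_cast<std::uint8_t>(dist(rng));
    return img;
}

// 8-bit checkerboard with squares of `square` x `square` pixels.
std::vector<std::uint8_t> make_checkerboard(int w, int h, int square) {
    assert(w > 0 && h > 0 && square > 0);
    std::vector<std::uint8_t> img(static_cast<std::size_t>(w) * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            img[static_cast<std::size_t>(y) * w + x] =
                ((x / square + y / square) % 2) ? 255 : 0;
    return img;
}

// Counts runs of equal pixels, row by row: the number of (value, length)
// pairs a simple run-length encoder would have to emit.
std::size_t count_runs(const std::vector<std::uint8_t>& img, int w, int h) {
    assert(w > 0 && h > 0 && img.size() == static_cast<std::size_t>(w) * h);
    std::size_t runs = 0;
    for (int y = 0; y < h; ++y) {
        const std::uint8_t* row = &img[static_cast<std::size_t>(y) * w];
        ++runs;  // every row starts a new run
        for (int x = 1; x < w; ++x)
            if (row[x] != row[x - 1]) ++runs;
    }
    return runs;
}
```

A 512x512 checkerboard with 64-pixel squares has 8 runs per row, 4096 in total, while a noise image of the same size has nearly one run per pixel, so an encoder spends time looking for redundancy that is not there.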
Last edited by Rudi De Vos on 2009-09-23 14:45, edited 1 time in total.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
There is an error in
WaitForMultipleObjects(6,m_desktop->trigger_events,FALSE,100);
* I want updates to happen every 100 ms (polling).
* But this function waits 100 ms even if an update already took 200 ms to process and send.
The 100 ms should actually be 100 ms minus the processing time, and in case of the driver
5ms is
----------
Code:
DWORD result;
// Current tick count in ms; oldtick holds the tick count kept from the previous cycle.
newtick = timeGetTime();
int waittime;
// Time left of the 100 ms polling interval after the time already spent.
waittime=100-(newtick-oldtick);
// When the driver is in use, aim for a 20 ms interval instead.
if (m_desktop->VideoBuffer() && m_desktop->m_hookdriver) waittime=20-(newtick-oldtick);
// Clamp the wait to the range 0..100 ms.
if (waittime<0) waittime=0;
if (waittime>100) waittime=100;
result=WaitForMultipleObjects(6,m_desktop->trigger_events,FALSE,waittime);
This should do the trick better.
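The wait-time correction can also be expressed as a small portable function, which makes the behaviour easy to check in isolation. This is a sketch under assumptions, not the server's code: `remaining_wait_ms` is a hypothetical name, and plain 32-bit integers replace the values that timeGetTime would return.

```cpp
#include <cassert>
#include <cstdint>

// Remaining wait in ms so that a new cycle starts every `interval_ms`,
// given millisecond tick counts taken at the previous and current iteration.
std::uint32_t remaining_wait_ms(std::uint32_t old_tick, std::uint32_t new_tick,
                                std::uint32_t interval_ms) {
    assert(interval_ms > 0);
    // Unsigned subtraction stays correct when the tick counter wraps around.
    const std::uint32_t elapsed = new_tick - old_tick;
    // If processing already took longer than the interval, do not wait at all.
    return elapsed >= interval_ms ? 0u : interval_ms - elapsed;
}
```

For example, with a 100 ms interval and 30 ms already spent the function returns 70, and after a 200 ms update it returns 0, so a slow update is no longer followed by an extra full wait.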
Re: Performance measurements
Rudi De Vos wrote:
"Thanks for the feedback...
I only don't understand that the SetDIBitsToDevice doesn't generate
an update when the mirror driver is used. When i'm correct this is done using a drvbitblt(). The mirror driver update the mirror surface and add the rectangle used in drvbitblt to a ringbuffer. Vnc read the ringbuffer and use the mirror surface to capture the screen data."
I didn't use the mirror driver, only the vnchook.dll. In my specific case I can't always use the mirror driver.
So this is not a 'problem' with the mirror driver, but more a specific situation.
Hope this clears up the misunderstanding.
Rudi De Vos wrote:
"The hookdll only hook windows messages...and this doesn't capture function calls. Yep Detour is a nice lib, but virus checkers gonna freak on it and possible get broken with each MS update. To risky for non developer usage."
I completely agree. It's a kind of
Rudi De Vos wrote:
"Using the mirror driver i can watch video 640x480, 10-15 fps.
At least when the cpu throttle is disabled to allow max cpu usage.
MaxCpu=100 (ultravnc.ini)
edit: i also did a code change 100ms was replaced by 25ms
( to make sure driver was polled every 25ms)"
Maybe it's an idea to make the poll time configurable.
Last edited by MrScotty on 2009-09-24 07:09, edited 1 time in total.
- Rudi De Vos
- Admin & Developer
- Posts: 6863
- Joined: 2004-04-23 10:21
- Contact:
Re: Performance measurements
Did some other tests...
There were some reports in the past that the viewer was slow.
Nobody ever mentioned that it was on a system running Aero...
I can finally reproduce it.
The viewer on Vista without Aero is a lot faster...
Last edited by Rudi De Vos on 2009-09-25 11:22, edited 1 time in total.
Re: Performance measurements
Rudi De Vos wrote:
"There were some reports in the past that the viewer was slow.
Nobody ever mentioned that it was on system running aero....
I final can repeat it.
viewer on vista without aero is a lot faster..."
Thank you, you found the bug.
Remark, for vncviewer on:
Windows Vista Ultimate: Aero is enabled by default if the GPU is supported!
Windows 7 Ultimate: Aero is enabled by default if the GPU is supported!
This means the viewer runs on a computer with Aero enabled, unless you have forced Windows to work in basic mode only.
UltraVNC 1.0.9.6.1 (built 20110518)
OS Win: xp home + vista business + 7 home
only experienced user, not developer
Re: Performance measurements
Fixed in final release 1.0.8.0.
UltraVNC 1.0.9.6.1 (built 20110518)
OS Win: xp home + vista business + 7 home
only experienced user, not developer