[winswitch] Using Xpra on multi-GPU laptops

Wed Oct 5 15:35:20 BST 2011

On 05/10/11 20:13, Eric Appleman wrote:
> The bottleneck is the client process.
I'll try to stress it a bit and see what can be done to optimize it.
>
> As for Xvfb, it's really hard to say. When the window (mplayer X11
> output) is displayed over VNC (is there a better way?), it's a slideshow. 
I am confused, what do you need VNC for here?
> When run through Xpra, the process is CPU-bound and performs no better
> than the real hardware X server.
OK, let me try to recap things:
* mplayer via xpra is (almost?) watchable, right? (albeit CPU-bound and
no faster than the onboard non-nvidia/slow graphics adapter - I think
this is what you called "real hardware X server"?)
* glxspheres via xpra shows lots of artifacts (as can be seen in your
YouTube video)

I think that modern hardware should be able to cope with copying a
screen region from one card to the other at 30fps: 1024*1024*32bit*30fps
= 120MB/s
So I may add a manual refresh mode just to test this hypothesis (it may
also be useful for stressing the client); if this works OK, we will know
that the artifacts issue can be solved one way or another.
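
To spell out the arithmetic above, here is a rough Python sketch. The
function name is made up for illustration, and it only covers the raw,
uncompressed copy rate, not any protocol or encoding overhead:

```python
def copy_rate_bytes_per_sec(width, height, bits_per_pixel, fps):
    """Raw bytes per second needed to copy a screen region at a given rate."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps


if __name__ == "__main__":
    rate = copy_rate_bytes_per_sec(1024, 1024, 32, 30)
    # Report in MiB/s (1 MiB = 1024 * 1024 bytes), matching the estimate above.
    print(rate / (1024 * 1024), "MiB/s")
```

Strictly speaking the 120 figure is in MiB/s (about 126 MB/s in decimal
units), but the order of magnitude is what matters here.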

Looking ahead though, it seems to me that for your use case having a
full network stack and copying screen regions to and from python
structures is very inefficient. Things would be a hell of a lot faster
using direct memory access (shared memory) - are there any constraints
with your hardware on that front?
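
To illustrate what I mean by shared memory, here is a minimal sketch
using Python's standard multiprocessing.shared_memory module (Python 3.8
or later). This is not how xpra works today: the frame size and the
write_frame/read_frame helpers are assumptions made up for the example,
and a real implementation would more likely use the X server's own
shared memory facilities.

```python
from multiprocessing import shared_memory

# Hypothetical frame geometry, chosen only for this example.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 64, 64, 4
FRAME_SIZE = WIDTH * HEIGHT * BYTES_PER_PIXEL


def write_frame(pixels):
    """Producer side: place one frame of pixel data in a new shared segment."""
    shm = shared_memory.SharedMemory(create=True, size=FRAME_SIZE)
    shm.buf[:FRAME_SIZE] = pixels
    return shm


def read_frame(name):
    """Consumer side: attach to the segment by name and copy the frame out."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:FRAME_SIZE])
    finally:
        shm.close()


def demo():
    """Round-trip one frame through shared memory; no socket is involved."""
    frame = bytes(range(256)) * (FRAME_SIZE // 256)
    shm = write_frame(frame)
    try:
        return read_frame(shm.name) == frame
    finally:
        shm.close()
        shm.unlink()
```

In practice the two sides would be separate processes, and only the
segment name (plus the damaged region) would need to go over the wire.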

Antoine

> On 10/05/2011 03:14 AM, Antoine Martin wrote:
>> On 05/10/11 13:57, Eric Appleman wrote:
>>> Setting BATCH_EVENTS to False and adjusting those other settings didn't
>>> help. I didn't tinker too much since I'm not sure what they do.
>>>
>>> Is it possible that the problem boils down to one of the Xpra processes
>>> taking 96% of my CPU in gnome-system-monitor? The second Xpra process
>>> uses 8% and mplayer2 uses an expected 24%.
>> Yes, clearly it is CPU bound. Can you please tell us which one it is,
>> client or server?
>>
>> I am really not sure why I can run mplayer without problems using Xvfb,
>> can you try that too?
>>
>> Antoine
>>
>>> Here's a screenshot: http://i.imgur.com/EGJx2.png
>>>
>>> If I run another mplayer2 process on the Intel screen without Xpra, it
>>> renders fine.
>>>
>>> On 10/05/2011 01:59 AM, Antoine Martin wrote:
>>>> On 05/10/11 01:49, Eric Appleman wrote:
>>>>> * VirtualGL doesn't expose the Nvidia hardware the way we'd like
>>>>> because VDPAU and NV17 are not accessible as video output.
>>>>> Furthermore, there is a significant performance penalty that we
>>>>> discovered is not present in more direct solutions such as
>>>>> hybrid-windump and Xpra.
>>>>>
>>>>> * I tried r202 last night with the following commands:
>>>>> xpra start --encoding=rgb24 -z 0 :8
>>>>> xpra attach --encoding=rgb24 -z 0 :8
>>>> That's right
>>>> (setting encoding and compression when "start"ing is not strictly
>>>> necessary if you do specify it when "attach"ing as this takes
>>>> precedence)
>>>>> Is it correct? I still had the screen update problem with PNG and
>>>>> RGB24.
>>>> OK, that's good to know as it means that the encoding performance is
>>>> probably not causing the problem.
>>>>> * I'm not familiar with Xvfb/Xdummy. Can you explain how to use it
>>>>> and how it relates to Xpra?
>>>> When you start Xpra the way you did above it will normally start an
>>>> Xvfb server, not a real X11 display, which will not allow you to take
>>>> advantage of hardware acceleration. If you have an existing
>>>> accelerated X display, you should use the "--use-display" switch.
>>>> I guess that ":8" is your accelerated display? Which means starting
>>>> the Xvfb failed and xpra should have terminated with an error - will
>>>> fix that.
>>>> Xdummy is just an alternative (probably not useful in your case since
>>>> you run against something else), you can find all about it here:
>>>> http://xpra.org/trac/ticket/10
>>>>> * glxgears is not the best visual test. Try glxspheres or
>>>>> mplayer2/smplayer through Xpra. The latter is pretty much unusable.
>>>> I've just tried glxspheres and I don't get any artifacts with either
>>>> Xvfb or Xdummy.
>>>> But it does render very very slowly (~5fps on a core2duo).
>>>> Watching a video with mplayer works well enough though.
>>>> Since I can't reproduce your problems, I have added a switch to turn
>>>> off all damage update batching (since I suspect your hardware does
>>>> more than 20fps) and upped the batching to not trigger until 30fps:
>>>> http://xpra.org/trac/changeset/203
>>>> Please try this, also with "BATCH_EVENTS = False" as well as
>>>> increasing the MAX_EVENTS and decreasing the BATCH_DELAY. Hopefully,
>>>> we can find more optimal settings for your hardware.
>>>>> * The process using Xpra (and sometimes Xpra itself) is taking up a
>>>>> lot of CPU resources. I have 4 cores and 8 threads (i7-2630QM). It's
>>>>> taking up an entire core at times. I'm not sure what you mean by
>>>>> CPU-bound.
>>>> There should be 2 processes, one for the client and one for the
>>>> server, which one is the bottleneck?
>>>> By CPU-bound I mean that CPU is the limiting factor.
>>>> It would also be interesting to know how much bandwidth is being used,
>>>> you should be able to work that out by using the tcp socket mode
>>>> ("--bind-tcp=").
>>>>> * The second Xserver has no physical video output. Some Optimus
>>>>> laptops have the HDMI ports wired directly to the Nvidia GPU, but we
>>>>> are not targeting these exceptions as a default case.
>>>> ok.
>>>>
>>>> Antoine
>>>>> Kind regards,
>>>>> Eric
>>>>>
>>>>> On 10/04/2011 10:57 AM, Antoine Martin wrote:
>>>>>> On 04/10/11 11:11, Eric Appleman wrote:
>>>>>>> Hi, I've been fiddling around with Xpra as a replacement to the
>>>>>>> Bumblebee Project's[1] VirtualGL backend. Bumblebee permits
>>>>>>> access to the Nvidia GPU on Optimus laptops. By using the X
>>>>>>> server it creates, we can start applications on the invisible
>>>>>>> Nvidia screen.
>>>>>>>
>>>>>>> I want Xpra to replace VirtualGL as the transport method, but I
>>>>>>> need the screen to display at a normal framerate.
>>>>>> Just out of curiosity, what makes you prefer Xpra over VirtualGL?
>>>>>> At first I thought VirtualGL would be a more natural fit, but...
>>>>>> maybe it isn't?
>>>>>>> I've included a demonstration that compares rendering through Xpra
>>>>>>> and without it. Feel free to drop by #bumblebee-dev on Freenode if
>>>>>>> you have ideas or questions.
>>>>>> I've looked at the video and it certainly looks interesting and
>>>>>> challenging for Xpra...
>>>>>>
>>>>>> You haven't specified which command line options you have used or
>>>>>> which version of Xpra you are using (trunk?).
>>>>>> I would suggest you try with trunk, turn compression off, try with
>>>>>> all 3 screen encoding options (rgb24, jpeg and png), with jpeg you
>>>>>> may want to try different quality settings.
>>>>>>
>>>>>> It is a little bit strange how the screen updates seem a little
>>>>>> disjointed, but I guess that is because it is falling too far
>>>>>> behind the real screen?
>>>>>> If we can come up with a test case to reproduce this (one that does
>>>>>> not involve me buying a new laptop preferably!), then I am sure
>>>>>> there are improvements that can be made. For example, there are
>>>>>> also some real issues with the python networking code which make it
>>>>>> less than optimal... Maybe tackling those would already make a
>>>>>> noticeable difference. Are the refresh rate and artifacts very
>>>>>> different from using Xpra with Xvfb/Xdummy?
>>>>>> I've tried running glxgears in xpra, and although it's not super
>>>>>> smooth, it looks ok-ish...
>>>>>>
>>>>>> I have recently added to trunk some code to rate-limit screen
>>>>>> updates at 20ups, maybe you hit this limit?
>>>>>> Are you CPU bound? I don't really understand the architecture of
>>>>>> bumblebee, do you get a secondary X11 display which is not actually
>>>>>> bound to any video ports?
>>>>>>
>>>>>> Cheers
>>>>>> Antoine
>>>>>>
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> - Eric
>>>>>>>
>>>>>>> [1]https://github.com/Bumblebee-Project/Bumblebee
>>>>>>>
>>>>>>> Example:
>>>>>>> http://www.youtube.com/watch?v=IN8heIMqTa8
>>>>>>>
>>>>>>> 1. Start Bumblebee X server on :8
>>>>>>> optirun glxinfo
>>>>>>>
>>>>>>> 2. Start Xpra server
>>>>>>> xpra start :8
>>>>>>>
>>>>>>> 3. Attach :8 to :0, the Intel screen
>>>>>>> xpra attach :8
>>>>>>>
>>>>>>> 4. Start applications on Nvidia screen
>>>>>>> DISPLAY=:8 glxgears
>>>>>>>
>>>>>>> 5. Repeat step 4 as necessary