Discussion:
Wayland/weston opens files in suspend/resume
Teemu K
2018-08-22 09:18:00 UTC
Hi,

I have custom i.MX6-based hardware running an image generated with
Yocto 2.4 (Weston 2.0.0) and Linux kernel 4.1.x, which I've been
testing with suspend/resume cycles.

I noticed that the open file count increases after each suspend/resume
cycle. In my testing I got through around 490 suspend/resume cycles
before libwayland reported an error that there were too many open files.
--
[14:29:52.385] libwayland: dup failed: Too many open files
[14:29:52.385] caught signal: 6
The Wayland connection broke. Did the Wayland compositor die?
--
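
The "dup failed: Too many open files" message is the EMFILE error a process gets once it reaches its per-process descriptor limit (RLIMIT_NOFILE). As a minimal sketch of that failure mode, not of Weston's actual code, the Python snippet below lowers the soft limit, deliberately leaks dup()ed descriptors until EMFILE is raised, and then cleans up. The function name `dups_until_emfile` and the limit of 64 are illustrative assumptions.

```python
import errno
import os
import resource


def dups_until_emfile(soft_limit=64):
    """Leak dup()ed descriptors under a lowered soft limit until EMFILE.

    Returns how many descriptors could be leaked before dup() failed.
    The original limit is restored and all descriptors are closed on exit.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    base = os.open(os.devnull, os.O_RDONLY)
    leaked = []
    try:
        # Hypothetical small limit so the failure shows up quickly.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
        while True:
            try:
                leaked.append(os.dup(base))  # never closed: simulated leak
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                break  # "Too many open files"
    finally:
        for fd in leaked:
            os.close(fd)
        os.close(base)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))
    return len(leaked)


if __name__ == "__main__":
    print("dup() failed after", dups_until_emfile(), "leaked descriptors")
```

The count is always below the limit because descriptors that were already open (stdin, stdout, stderr and so on) use up part of it.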
Looking at the lsof output after around 60 suspend/resume cycles, I can
see that, for example, this file appears 60 more times:
--
weston 921 root 11u CHR 29,0 0t0 5323 /dev/fb0
--

The 11u part changes, but the other parts stay the same. At first there
are 11u and 13u, but later it goes all the way up to 93u (some numbers
are missing).
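
A per-target count of open descriptors can also be taken straight from /proc, which makes it easy to compare snapshots before and after a suspend/resume cycle. The following is a rough Python sketch that assumes a Linux /proc filesystem. The helper names `count_fd_targets` and `fd_growth` are hypothetical, and reading another process's fd directory normally requires root or the same user.

```python
import collections
import os


def count_fd_targets(pid="self"):
    """Return a Counter mapping each open-file target to its fd count."""
    fd_dir = f"/proc/{pid}/fd"
    counts = collections.Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        counts[target] += 1
    return counts


def fd_growth(before, after):
    """Return the targets whose fd count grew between two snapshots."""
    return {t: after[t] - before[t] for t in after if after[t] > before[t]}
```

For example, take `count_fd_targets(921)` (the weston PID from the lsof output above) before and after one cycle and pass both snapshots to `fd_growth`; a leaking /dev/fb0 would show up with a positive count.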

Also, my Qt application, which is running during suspend, keeps files
like this one open, which I assume (I may be wrong) is because of
Weston.

--
qtapplicat 929 965 root 11u REG 0,18 48097 12272 /run/user/root/weston-shared-24Zz9a (deleted)
--

I know that this Weston is quite old, since I think 4.0.0 has already
come out, but is there any known bug that causes this? I did a search
but couldn't find any. I know it's rare to need over 400 suspend/resume
cycles without cutting power at some point, but I'm sure there'll be
that one customer who does it.

-Teemu K
Pekka Paalanen
2018-08-22 09:51:43 UTC
Hi,

I hope it would be more correct to say it's rare to see anyone care
about the fbdev backend. ;-)

Leaking file descriptors to /dev/fb0 over suspend/resume does not
surprise me, but I don't recall any bug reports or fixes exactly to that
effect. Also the fbdev device handling has been changed since Weston
2.0.0, so it's possible it might be already fixed. Or maybe not. I seem
to recall that VT switching with the fbdev-backend has been broken for
years.

I would not expect Weston to ever have made a dozen open file
descriptors to /dev/fb0. Are you sure you are not using proprietary EGL
drivers with the fbdev-backend? If you use EGL with fbdev, then we
certainly cannot help you.

If you found fd leaking with the DRM-backend and a more recent release,
there would be much more interest in it. However, I suppose we do still
take patches to fix bugs in the fbdev-backend for master branch, even
though we probably reject any feature additions (complicated ones at
least).

The weston-shared-* file might be a keymap; nothing else comes to mind
that Weston would allocate, and I hope Qt does not use Weston's name for
its own stuff.


Thanks,
pq
Teemu K
2018-08-23 05:30:35 UTC
Hi,

Well, since it's i.MX6 and NXP (still) doesn't provide a proper video
driver, the proprietary EGL driver is being used. I think they even made
their own fork of Weston from 2.0.0 onward so that they can keep using
their old driver instead of fixing it to work the way Wayland wants.

I did try the Etnaviv open-source graphics driver, but I couldn't get it
working even with a newer kernel, so I'm stuck with the old driver.

But it seems that, as I suspected, it's not so much a Wayland problem
as a driver issue.

-Teemu
Pekka Paalanen
2018-08-23 07:17:42 UTC
On Thu, 23 Aug 2018 08:30:35 +0300
Post by Teemu K
Well, since it's i.MX6 and NXP (still) doesn't provide a proper video
driver, the proprietary EGL driver is being used. I think they even made
their own fork of Weston from 2.0.0 onward so that they can keep using
their old driver instead of fixing it to work the way Wayland wants.
I did try the Etnaviv open-source graphics driver, but I couldn't get it
working even with a newer kernel, so I'm stuck with the old driver.
Hi,

I've heard mentions that etnaviv could actually work. Make sure you
have a very very fresh upstream kernel, possibly Mesa master branch,
and upstream Weston 4.0.94 at least. And no proprietary drivers
anywhere. The bits needed to support etnaviv span through the whole
graphics stack, but I believe they should be mostly in place by now.
Post by Teemu K
But it seems that, as I suspected, it's not so much a Wayland problem
as a driver issue.
Right. It could have been a Weston issue, and if there still is a
Weston bug, we would be happy to receive a fix. But, that means you
need to reproduce the issue with upstream Weston and without
proprietary drivers.

It would also be possible to run Weston with its software renderer
(Pixman) to avoid needing any GPU drivers, on the Weston DRM backend
with upstream display drivers. Of course, that doesn't help if you need
the GPU.

All that said, I'm not sure if anyone else is seriously testing
suspend/resume with Weston.


Thanks,
pq
Daniel Stone
2018-08-23 07:29:43 UTC
Hi,
Post by Pekka Paalanen
Post by Teemu K
Well, since it's i.MX6 and NXP (still) doesn't provide a proper video
driver, the proprietary EGL driver is being used. I think they even made
their own fork of Weston from 2.0.0 onward so that they can keep using
their old driver instead of fixing it to work the way Wayland wants.
I did try the Etnaviv open-source graphics driver, but I couldn't get it
working even with a newer kernel, so I'm stuck with the old driver.
I've heard mentions that etnaviv could actually work. Make sure you
have a very very fresh upstream kernel, possibly Mesa master branch,
and upstream Weston 4.0.94 at least. And no proprietary drivers
anywhere. The bits needed to support etnaviv span through the whole
graphics stack, but I believe they should be mostly in place by now.
Indeed, etnaviv + imx-drm does work very well now. It's one of my
primary test platforms.

Cheers,
Daniel

Ray, Ian (GE Healthcare)
2018-08-22 09:54:15 UTC
Post by Teemu K
Hi,
I have custom i.MX6-based hardware running an image generated with
Yocto 2.4 (Weston 2.0.0) and Linux kernel 4.1.x, which I've been
testing with suspend/resume cycles.
I noticed that the open file count increases after each suspend/resume
cycle. In my testing I got through around 490 suspend/resume cycles
before libwayland reported an error that there were too many open files.
Hi,

Commit f981d69553f52ca50aaf864bf821bb022ab7da82 fixes an fd leak, though
I’m not sure if it is relevant to your scenario.
_______________________________________________
wayland-devel mailing list
https://lists.freedesktop.org/mailman/listinfo/wayland-devel