Samsung Chromebook; Serious stability issues when following graphics and compute guide

Brief version: How can I debug my Chromebook woes when trying to run kernel-next with the Mali drivers?

I followed this guide: Graphics and Compute Development on Samsung Chromebook. It worked, but I'm left with an incredibly unstable system. I'm running a Samsung Chromebook, model number SNOW FREMONT A-E 0878.

A list of stability issues so far:

  • Most importantly, the whole MMC bus seems to give out soon after booting, no matter what. All I/O starts to fail, and after a while I see kernel messages reporting timeouts on the MMC bus; then the system locks up.
  • VT1 will typically get flooded with something like "udevd[187]: timeout: killing '/sbin/modprobe -bv sdio:c00v02DFd912A' [262]" until it eventually locks up or throws a kernel panic message.
  • The system will often run long enough to let me actually do something. I can connect to wifi, but shortly after doing so I will usually see the mwifiex_sdio driver hit a timeout, followed by a card reset, at which point the device disappears from the system.
  • Sometimes it doesn't even boot. It just stops during the upstart sequence, before any of the other VTs are created, and then eventually throws some sort of kernel panic. The actual kernel panic message is pushed off screen by the rest of the output.

I'm sure there's even more, but for now that should give a good idea of what's up: ridiculous instability. I might be able to put the image on a USB drive, use a USB wifi adapter, and skate by some of this, but that's really not a sufficient solution. Given that, and how unlikely it is that anyone here has already had or is having this issue, I'd really like it if someone could tell me how to debug these problems, as I don't have any clue how to approach it. I have a bit of familiarity with the Linux kernel and a ton with C, but not a darn clue how I'd go about debugging Linux running on a Chromebook.

Hope this is related enough to be discussed here - I certainly don't know where else to discuss problems with this guide.
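
One low-tech starting point for the debugging question above is to save whatever kernel log you can capture before the lockup and filter it down to the lines about the failing bus. The sketch below shows only that filtering step in Python. The function name `filter_kernel_log`, the keyword list, and the second sample line are assumptions made for illustration; only the udevd line is quoted from the post.

```python
import re

# Keywords are assumptions chosen from the symptoms described in this thread.
DEFAULT_KEYWORDS = ("mmc", "timeout", "oops", "panic", "mwifiex")


def filter_kernel_log(lines, keywords=DEFAULT_KEYWORDS):
    """Return (line_number, text) pairs for lines that mention any keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    hits = []
    for number, line in enumerate(lines, start=1):
        if pattern.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    sample = [
        # Quoted from the post above.
        "udevd[187]: timeout: killing '/sbin/modprobe -bv sdio:c00v02DFd912A' [262]",
        # Hypothetical, unrelated line to show that it is skipped.
        "usb 1-1: new high-speed USB device",
    ]
    for number, text in filter_kernel_log(sample):
        print(f"{number}: {text}")
```

Keeping the line numbers makes it easier to go back to the full log and read what happened just before the first timeout.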

  • Hi fattire,

    No update from our side; my only advice would be to use USB, as this is more stable.

    The kernel we are using (3.8) is still considered unstable, 3.4 being the stable kernel, but we decided to use 3.8 for performance reasons. 3.4 might fix this problem, but it wouldn't be officially supported by us and would likely result in significantly reduced performance. Strictly speaking we are only responsible for the Mali side of things. We do have a few kernel hackers on board who are aware of the problem, but I can't say that we WILL fix it, and ultimately the fix would very likely have to come from elsewhere in the community. It is at least on our radar, and we are keeping it in mind in case we can fix or work around it. If the problem is not present in 3.4, maybe someone can take a look at the differences?

    I've had a quick look at the instructions and they refer to a specific commit on an old repo, which has now moved and is active (last commit today!). It might be worth trying with HEAD on that, in case the problem has been solved. We will test with this and check if it's stable on our side, but you'll be better at telling us whether this has fixed it for you.

    Thanks,

    Chris

  • Can you explain how you got the system to boot from a USB stick? I built as per the development guide with a 16GB USB stick as the target, and the boot hangs once it detects the stick.

    --Mike

  • Heya-- thanks for the response-

    Looks like after the 3.8.11 tag they moved on to 3.10. [See below: you're talking about the chromeos-3.8 branch.] I'm tethering off my phone right now so don't wanna grab the latest source, but I can give it a shot in the next few days and get back to you. I hope they checked something in. I don't know if I mentioned it in another thread, but I did try applying the changes to the mainline kernel at HEAD (as recommended here), and there were SO many changes: code didn't match up, and some of the files didn't even exist any more. I started hand-checking stuff, matching up moved headers and adding files and such, but it quickly became a rabbit hole, and I gave up after a few hours.


    I'll try kernel-next's HEAD later and see what happens, and report back here.  I have a feeling though the source will be too different and/or the Samsung Chromebook has been long abandoned by the Google engineers...  but I'll letcha know.

    Thanks!  If anyone else wants to try too, lemme know.

    Update-- oh you mean this branch chromeos-3.8, not the raw v3.8 branch, huh?  I'll try this.  Also, I'll update this thread as I go...  hopefully it doesn't send a new email to you every time I change it... if it does, let me know.

    Update #2:  Build breaks here:

    drivers/gpu/arm/t6xx/kbase/src/platform/chromebook/mali_kbase_config_chromebook.c:376:2: error: unknown field ‘power_suspend_callback’ specified in initializer
      .power_suspend_callback = pm_callback_suspend,
      ^
    drivers/gpu/arm/t6xx/kbase/src/platform/chromebook/mali_kbase_config_chromebook.c:377:2: error: unknown field ‘power_resume_callback’ specified in initializer
      .power_resume_callback = NULL
      ^

    I commented both of the above out and the build went to the end, though I did see some "arch/arm/boot/compressed/head.S:978: Warning: (null)" when making the zImage.

    Next, in kernel.its, "arch/arm/boot/dts/exynos5250-snow.dtb" doesn't exist any more, so I guessed and used arch/arm/boot/dts/exynos5250-snow-rev5.dtb

    The rest of the kernel built normally and I have some kernel modules + a boot.img

    When booting, I get an Oops 5.... mmcqd in the same place: dw_mci_pre_dma_transfer.isra.... from dw_mci_pre_req... again at random times, sometimes on boot, sometimes after the UI starts... about 15 seconds. So not fixed. It actually seems worse, as I'm now trying just to get it to at least boot back to ChromeOS by default... :/

    (mmc2 / wifi having issues too, but I'm not worrying about that...)

    Any suggestions would be appreciated.
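
On the build break in Update #2: GCC reports "unknown field ... specified in initializer" when a designated initializer names a member that the struct definition visible at that point does not declare. A useful first check is therefore whether the identifier still appears in any header of the tree being built. Below is a minimal sketch of that search, assuming Python is available on the build machine. The function name `find_identifier` and the throwaway demo tree are hypothetical; only the field name comes from the error message above.

```python
import os
import re
import tempfile


def find_identifier(root, name, suffixes=(".h",)):
    """Return sorted (relative_path, line_number) pairs where `name` appears as a whole word."""
    word = re.compile(r"\b" + re.escape(name) + r"\b")
    hits = []
    for folder, _dirs, files in os.walk(root):
        for filename in files:
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(folder, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if word.search(line):
                        hits.append((os.path.relpath(path, root), number))
    return sorted(hits)


if __name__ == "__main__":
    # Throwaway tree standing in for a kernel checkout (hypothetical contents).
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "demo_config.h"), "w", encoding="utf-8") as handle:
            handle.write("struct demo_pm_callbacks {\n    void (*power_on_callback)(void);\n};\n")
        # An empty result means no header in the tree declares the field.
        print(find_identifier(root, "power_suspend_callback"))
```

An empty result against the real tree would suggest that the platform file and the driver headers come from different versions. Commenting the two initializers out, as done above, gets the build through, but it also means pm_callback_suspend is never registered.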

  • This seems to confirm the mmc issue is still present; thanks for trying. A lot of the wifi issues we've seen are userspace related (outdated udev), so people having trouble could try a more recent distribution. People in the office are using Arch, for example, without issue.

    mikewinter: Sorry for not being clear, I haven't actually got it BOOTING from USB. I meant to boot from the SD card and, as soon as possible, move to a USB stick/NFS mount for actual work, as this is far more stable.

  • Yeah it seems to be there still unfortunately.  I thought for a second it may have been related to interference from the USB (which was plugged in, and from which I had flashed boot.img) but even after pulling it, though it didn't crash as quickly it seemed, it still crashed..

    The outdated userspace stuff makes total sense... if this one bug were fixed I'd probably be rockin' and rolling in no time. So strange that the new 3.8 branch doesn't have something so basic fixed... I wonder if it's some kind of interaction with the changes I made... I dunno.

    If it would be helpful to anyone I can put the 3.8 kernel + the patches from arm + the new device driver on github just to save some time...