Samsung Chromebook; Serious stability issues when following graphics and compute guide

Brief version: How can I debug my Chromebook woes when trying to run kernel-next with the Mali drivers?

I followed this guide: Graphics and Compute Development on Samsung Chromebook. It worked, but I'm left with an incredibly unstable system. I'm running a Samsung Chromebook, model number SNOW FREMONT A-E 0878.

A list of stability issues so far:

  • Most importantly, the entire MMC bus seems to give out soon after booting up, no matter what. All I/O starts to fail, and after a while of waiting I see messages from the kernel reporting timeouts on the MMC bus, and then the system locks up.
  • VT1 will typically start to get flooded with something like "udevd[187]: timeout: killing '/sbin/modprobe -bv sdio:c00v02DFd912A' [262]" until it eventually locks up or throws a kernel panic message.
  • The system will often run long enough to actually let me do something. I can connect to wifi, but shortly after doing so, I will usually see the mwifiex_sdio driver hit a timeout, followed by a card reset, at which point the device disappears from the system.
  • Sometimes it doesn't even boot. It just stops during the upstart sequence, before any of the other VTs are created, and then it eventually throws some sort of kernel panic. The actual kernel panic message is pushed off screen by the remainder of the data.
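
One small debugging aid for the modprobe timeout quoted in the list above: the string "sdio:c00v02DFd912A" is an SDIO modalias, which encodes the function's class, vendor ID and device ID in hex. Decoding it shows which SDIO function udev was trying to load a module for. The sketch below is only an illustration; the struct and the helper name parse_sdio_alias are made up for this example and are not part of any kernel API.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of an SDIO modalias string such as "sdio:c00v02DFd912A". */
struct sdio_alias {
    unsigned int class_id;   /* two hex digits after 'c' */
    unsigned int vendor_id;  /* four hex digits after 'v' */
    unsigned int device_id;  /* four hex digits after 'd' */
};

/* Hypothetical helper: returns 0 on success, -1 if the string
 * does not look like an SDIO modalias. */
int parse_sdio_alias(const char *alias, struct sdio_alias *out)
{
    assert(alias != NULL && out != NULL);

    /* The field widths matter: 'd' is itself a hex digit, so without
     * %4x the vendor field would swallow the separator. */
    int n = sscanf(alias, "sdio:c%2xv%4xd%4x",
                   &out->class_id, &out->vendor_id, &out->device_id);
    return n == 3 ? 0 : -1;
}
```

Calling parse_sdio_alias("sdio:c00v02DFd912A", &a) fills in class 0x00, vendor 0x02DF and device 0x912A. Those values can then be compared against the alias lines that modinfo prints for candidate modules, to see which driver is the one timing out.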

I'm sure there's even more, but for now that should give a good idea of what's up: ridiculous instability. I might be able to throw the image on a USB drive, use a USB wifi adapter, and skate by some of this, but that's really not a sufficient solution. Given that, and how unlikely it is that anyone here has already hit this issue, I'd really like it if someone could tell me how to debug these problems, as I don't have any clue how to approach it. I have a bit of familiarity with the Linux kernel and a ton with C, but not a darn clue how I'd go about debugging Linux running on a Chromebook.

Hope this is related enough to be discussed here - I certainly don't know where else to discuss problems with this guide.

  • Good, glad to hear it's not just me.  I had the same mmc issue which I reported but didn't get any response.  Note that I'm using chrubuntu on my internal mmc, not the sd-card version.  I kind of diverged from the official instructions as described in the previous link.

    I was planning, if I had time, to look at the kernel to see what subsequent patches to dw_mci_pre_dma_transfer were made to kernel-next, in case there is a fix that ARM isn't including.

    I didn't notice those udev errors btw, though I was more focused on "why is the mmc dying all of a sudden".   Given the instability of the mmc, I wonder if fixing this might fix some of the other errors...

    I only had wifi working for like a second before everything died horribly.  Most of the time, it's not working at all... although my chrubuntu setup wasn't initially even loading all the necessary modules so that accounts for part of it...

    Anyhoo... if you're on IRC and want to talk through it, I'm "fattire" on freenode.

  • Hi guys,

    The mmc instability issue is strange as some chromebooks get it really bad and others do not. We will look into it, but for now using a USB drive is one way to sidestep the issue.

    Thanks,

    Chris

  • Hm, could it possibly be related to what kind of SD card is being used? I'm using a Sandisk Pixtor 32 GB Class 10 SD card. I ask because I vaguely remember similar problems with running the Raspberry Pi device with a greater-than-Class 6 SD card (although I did not experience them myself.)

    I'd like to test this hypothesis, but I don't have any other free SD cards on hand. I bought this one specifically to do this guide, in fact.

  • I've seen problems with specific SD vendors not playing nice in the past with certain boards, so it wouldn't surprise me! For what it's worth, I use a mixture of Sandisk 8GB SDHC class 4, and Sandisk Ultra 8GB MicroSDHC class 10, and I have SOME issues on the former but as long as I don't do anything too taxing, i.e. boot and immediately mount NFS/USB then it's fine. Maybe someone out there can fix and push upstream?

  • So Chris are you saying that you do have the problem with some SD cards and not others?  I'm having the problem with the internal mmc itself.

  • From what I can tell, it may be a combination of both. I've migrated over to USB for the time being, which works, although the USB stick is pretty crappy. However, Wifi still occasionally gives out, and in fact, you're right: I can still see the internal mmc crashing once in a while, even though I'm not doing anything with it.

    I don't know if I have enough free time to look for what the actual issue might be. Part of me wants to try stepping back through kernel-next's history to see if the issue still occurs with a tree from months ago, but then I don't know what patches I may need to leave in for the ARM Chromebook to still boot properly.

  • Hi fattire,

    Sorry, I misread your original post. I do see some instability sometimes on certain SD cards, but generally it doesn't stop me mounting NFS and testing things over that. I only really use the SD card for booting. I haven't tried using internal mmc, but it's surprising that doesn't work. I'll feed back these issues to the relevant team so they can look into them for a future release.

    Thanks,

    Chris

  • I had this issue a long time back with external mmc. The problem was there with 3.4 as well as 3.8. I got really frustrated trying to fix it, and finally decided to flash uboot and format the internal mmc. I have put the steps in a blog post: Bootloader unlock on samsung arm chromebook | kr blogs

    Please note that flashing uboot is very dangerous and risky.

  • Hmm... I'm certainly intrigued, but will it really solve my problems? I'm having issues with the MMC bus on only 3.8 - 3.4 stock with Chrubuntu works, and I can access the SD card just fine inside Cros, no signs of instability. However, even running off USB stick, excessive use of the SDIO WiFi device will cause my device to crash. (The crash leaves me at a screen with an empty VT, a non-blinking cursor at the top left, and the mouse cursor still on screen. I'm guessing that's just the result of a kernel panic happening.)

  • The hardware modifications seem a little more hardcore than I feel like risking.  You can if you like run your own u-boot using the second bootloader method.  I got it working, and you can read the steps I took here in this discussion as they differed from the "official" steps.

    That said-- people seem to be building the mainline kernel instead of kernel-next, which I don't think anyone has checked changes into for two months.  I'm wondering if this might be the way to go.  (also see this thread)  If no one gives it a try, I am thinking of giving it a shot by taking mainline, applying the patches, and then seeing what happens

    ft

    Update:  Hmm, I'm not seeing much gpu support-- there's no /drivers/gpu/arm directory in mainline, making it kind of hard to do the changes to mali_kbase_config_chromebook.c, considering that file doesn't exist. Looks like framebuffer only. I added in the gpu driver and I'm trying to see if it can be shoehorned into place... but I invite anyone who might know more than I do to chime in.  Can the mali open source device driver be dropped into 3.12?

  • Hey Chris,

    It's been a couple months.  Any words on fixes for those of us with internal emmc issues?

    Update:  I spent a little more time looking at the crash.  It's so totally random-- can be while booting, but can also be after 3-5 minutes after logging in.

    Since the panic doesn't seem to write anywhere, I grabbed a screenshot to see what was up...  looks like s5p_mfc_init_pm was failing to get a clock-gating control, whatever that means.  I found this, and applied it, along with this to the 3.8 kernel (had to tweak a little), but it didn't seem to do anything.

    Do I give up?  Someone out there must know what to do...

  • Hi fattire,

    I've let our team know you are still in need of some help here. In future if you can make a discussion as a question it will ensure it gets more visibility as unanswered questions remain at the top of the homepage. I've moved this to a question now so you will be able to select an answer as correct and helpful once this has been resolved.

    Thanks,

    Ryan

  • Hi fattire,

    No update from our side, my only advice would be to use USB as this is more stable.

    The kernel we are using (3.8) is still considered unstable, 3.4 being the stable kernel, but we decided to use 3.8 for performance reasons. 3.4 might fix this problem but wouldn't be officially supported by us, and would likely result in significantly reduced performance. We are strictly speaking just responsible for the Mali side of things, so while we do have a few kernel hackers on board who are aware of the problem, I can't say that we WILL fix this problem, and ultimately the fix would very likely have to come from elsewhere in the community, but it is at least on our radar and we are keeping it in mind in case we can fix/work around the problem. If the problem is not present in 3.4 maybe someone can take a look at the differences?

    I've had a quick look at the instructions and they refer to a specific commit on an old repo, which has now moved and is active (last commit today!). It might be worth trying with HEAD on that, in case the problem has been solved? We will test with this and check if it's stable on our side, but you'll be better at telling us whether this has fixed it for you.

    Thanks,

    Chris

  • Can you explain how you got the system to boot from a USB stick?  I built as per the development guide with a 16GB USB stick as the target, and the boot hangs once it detects the stick.

    --Mike

  • Heya-- thanks for the response-

    Looks like after the 3.8.11 tag they moved on to 3.10. [see below-- you're talking about the chromeos-3.8 branch...]  I'm tethering off my phone right now so don't wanna grab the latest source, but I can give it a shot in the next few days and get back to you.  I hope they checked something in.  I don't know if I mentioned in another thread I did try applying the changes to mainline kernel at HEAD (as recommended here), but there were SO many changes-- code didn't match up-- some of the files didn't even exist any more.  I started hand-checking stuff and matching up moved headers and adding files and such but it quickly became a rabbit hole, and I gave up after a few hours.


    I'll try kernel-next's HEAD later and see what happens, and report back here.  I have a feeling though the source will be too different and/or the Samsung Chromebook has been long abandoned by the Google engineers...  but I'll letcha know.

    Thanks!  If anyone else wants to try too, lemme know.

    Update-- oh you mean this branch chromeos-3.8, not the raw v3.8 branch, huh?  I'll try this.  Also, I'll update this thread as I go...  hopefully it doesn't send a new email to you every time I change it... if it does, let me know.

    Update #2:  Build breaks here:

    drivers/gpu/arm/t6xx/kbase/src/platform/chromebook/mali_kbase_config_chromebook.c:376:2: error: unknown field ‘power_suspend_callback’ specified in initializer
      .power_suspend_callback = pm_callback_suspend,
      ^
    drivers/gpu/arm/t6xx/kbase/src/platform/chromebook/mali_kbase_config_chromebook.c:377:2: error: unknown field ‘power_resume_callback’ specified in initializer
      .power_resume_callback = NULL
      ^

    I commented both of the above out and the build went to the end, though I did see some "arch/arm/boot/compressed/head.S:978: Warning: (null)" when making the zImage.
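
For anyone following along, "unknown field ... specified in initializer" is what GCC reports when a designated initializer names a member that the struct type does not declare, so presumably the callback struct in this tree no longer has those two members (an assumption; the header has not been checked here). The sketch below uses a made-up struct, pm_callbacks_sketch, with illustrative member names rather than the real Mali ones. It shows what commenting out such a line means in practice: a member that is not named in the initializer is zero-initialised, so no hook is installed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's power-management callback table.
 * Member names are illustrative, not copied from the Mali headers. */
struct pm_callbacks_sketch {
    int  (*power_on)(void);
    void (*power_off)(void);
    void (*power_suspend)(void);
};

static int  sketch_power_on(void)  { return 1; }
static void sketch_power_off(void) { }

/* power_suspend is not named, so C zero-initialises it to NULL.
 * Naming a member the struct does not declare at all (for example
 * ".power_resume = NULL" here) is what triggers "unknown field". */
const struct pm_callbacks_sketch sketch_callbacks = {
    .power_on  = sketch_power_on,
    .power_off = sketch_power_off,
};

/* Returns 1 if a suspend hook ran, 0 if none was installed. */
int sketch_suspend(const struct pm_callbacks_sketch *cb)
{
    assert(cb != NULL);
    if (cb->power_suspend == NULL)
        return 0;
    cb->power_suspend();
    return 1;
}
```

In the build log above, the resume callback was already NULL, so dropping that line changes nothing functionally. Dropping the suspend line means pm_callback_suspend is no longer registered, which is worth keeping in mind if suspend behaves differently afterwards.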

    Next, in kernel.its, "arch/arm/boot/dts/exynos5250-snow.dtb" doesn't exist any more, so I guessed and used arch/arm/boot/dts/exynos5250-snow-rev5.dtb

    The rest of the kernel built normally and I have some kernel modules + a boot.img

    When booting, I get an Oops 5.... mmcqd in the same place--  dw_mci_pre_dma_transfer.isra.... from dw_mci_pre_req...  again at random times... sometimes on boot, sometimes after UI starts... about 15 seconds.  So not fixed...  it actually seems worse, as I'm trying to get in to at least set it to boot back to chromeos by default... :/

    (mmc2 / wifi having issues too, but I'm not worrying about that...)

    Any suggestions would be appreciated.