Hi,
I am using r20p0 midgard drivers with kernel 4.14 (rc4) on an odroid xu4 board.
I have enabled DEVFREQ, and have simple_ondemand and performance governors available (default set to performance).
I get this kernel error when the Mali device is probed:
[ 4.492991] mali 11800000.mali: Continuing without Mali regulator control
[ 4.503602] mali 11800000.mali: GPU identified as 0x0620 r0p1 status 0
[ 4.511482] mali 11800000.mali: Protected mode not available
[ 4.518109] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
[ 4.525520] in_atomic(): 0, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 4.532000] 3 locks held by swapper/0/1:
[ 4.535675] #0: (&dev->mutex){....}, at: [<c04b69f8>] __driver_attach+0x78/0x120
[ 4.543424] #1: (&dev->mutex){....}, at: [<c04b6a08>] __driver_attach+0x88/0x120
[ 4.550960] #2: (rcu_read_lock){....}, at: [<c04af8a8>] kbase_devfreq_init+0x18/0x6e0
[ 4.558938] Preemption disabled at:
[ 4.559007] [<c0122e58>] irq_enter+0x44/0x88
[ 4.566504] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc4-02 #2
[ 4.572971] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 4.579044] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 4.586781] [<c010cabc>] (show_stack) from [<c0821484>] (dump_stack+0x98/0xc4)
[ 4.593969] [<c0821484>] (dump_stack) from [<c0144bcc>] (___might_sleep+0x264/0x2cc)
[ 4.601690] [<c0144bcc>] (___might_sleep) from [<c0837200>] (__mutex_lock+0x2c/0xa38)
[ 4.609484] [<c0837200>] (__mutex_lock) from [<c0837c28>] (mutex_lock_nested+0x1c/0x24)
[ 4.617459] [<c0837c28>] (mutex_lock_nested) from [<c04c6a88>] (_find_opp_table+0x20/0x5c)
[ 4.625692] [<c04c6a88>] (_find_opp_table) from [<c04c6cac>] (dev_pm_opp_get_opp_count+0xc/0x90)
[ 4.634451] [<c04c6cac>] (dev_pm_opp_get_opp_count) from [<c04af934>] (kbase_devfreq_init+0xa4/0x6e0)
[ 4.643648] [<c04af934>] (kbase_devfreq_init) from [<c049b008>] (kbase_platform_device_probe+0x5ec/0xc98)
[ 4.653182] [<c049b008>] (kbase_platform_device_probe) from [<c04b83f4>] (platform_drv_probe+0x4c/0xb0)
[ 4.662538] [<c04b83f4>] (platform_drv_probe) from [<c04b67d8>] (driver_probe_device+0x2d0/0x478)
[ 4.671370] [<c04b67d8>] (driver_probe_device) from [<c04b6a84>] (__driver_attach+0x104/0x120)
[ 4.679944] [<c04b6a84>] (__driver_attach) from [<c04b48a4>] (bus_for_each_dev+0x68/0x9c)
[ 4.688083] [<c04b48a4>] (bus_for_each_dev) from [<c04b5a84>] (bus_add_driver+0x1cc/0x264)
[ 4.696315] [<c04b5a84>] (bus_add_driver) from [<c04b7400>] (driver_register+0x78/0xf8)
[ 4.704284] [<c04b7400>] (driver_register) from [<c0101b64>] (do_one_initcall+0x44/0x170)
[ 4.712435] [<c0101b64>] (do_one_initcall) from [<c0c00df4>] (kernel_init_freeable+0x144/0x1d0)
[ 4.721105] [<c0c00df4>] (kernel_init_freeable) from [<c0834dd0>] (kernel_init+0x8/0x110)
[ 4.729247] [<c0834dd0>] (kernel_init) from [<c01088c8>] (ret_from_fork+0x14/0x2c)
[ 4.740536] devfreq devfreq0: Couldn't update frequency transition information.
[ 4.752374] mali 11800000.mali: Probed as mali0
Afterwards, if I change devfreq governor to simple_ondemand, I get the same error every second in syslog:
[ 1021.940152] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
[ 1021.947111] in_atomic(): 0, irqs_disabled(): 0, pid: 120, name: kworker/u16:1
[ 1021.954278] INFO: lockdep is turned off.
[ 1021.958110] Preemption disabled at:
[ 1021.958119] [< (null)>] (null)
[ 1021.964879] CPU: 3 PID: 120 Comm: kworker/u16:1 Tainted: G W 4.14.0-rc4-02 #2
[ 1021.973101] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 1021.979175] Workqueue: devfreq_wq devfreq_monitor
[ 1021.983863] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 1021.991568] [<c010cabc>] (show_stack) from [<c0821484>] (dump_stack+0x98/0xc4)
[ 1021.998762] [<c0821484>] (dump_stack) from [<c0144bcc>] (___might_sleep+0x264/0x2cc)
[ 1022.006475] [<c0144bcc>] (___might_sleep) from [<c0837200>] (__mutex_lock+0x2c/0xa38)
[ 1022.014269] [<c0837200>] (__mutex_lock) from [<c0837c28>] (mutex_lock_nested+0x1c/0x24)
[ 1022.022241] [<c0837c28>] (mutex_lock_nested) from [<c04c6a88>] (_find_opp_table+0x20/0x5c)
[ 1022.030472] [<c04c6a88>] (_find_opp_table) from [<c04c6e10>] (dev_pm_opp_find_freq_ceil+0x18/0x64)
[ 1022.039398] [<c04c6e10>] (dev_pm_opp_find_freq_ceil) from [<c0672b28>] (devfreq_recommended_opp+0x34/0x4c)
[ 1022.049018] [<c0672b28>] (devfreq_recommended_opp) from [<c04af4ec>] (kbase_devfreq_target+0x7c/0x408)
[ 1022.058287] [<c04af4ec>] (kbase_devfreq_target) from [<c0671214>] (update_devfreq+0xd4/0x1c4)
[ 1022.066778] [<c0671214>] (update_devfreq) from [<c0671400>] (devfreq_monitor+0x24/0x78)
[ 1022.074752] [<c0671400>] (devfreq_monitor) from [<c013827c>] (process_one_work+0x19c/0x504)
[ 1022.083070] [<c013827c>] (process_one_work) from [<c013861c>] (worker_thread+0x38/0x568)
[ 1022.091132] [<c013861c>] (worker_thread) from [<c013ed04>] (kthread+0x160/0x19c)
[ 1022.098500] [<c013ed04>] (kthread) from [<c01088c8>] (ret_from_fork+0x14/0x2c)
[ 1037.896456] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
[ 1037.903447] in_atomic(): 0, irqs_disabled(): 0, pid: 818, name: bash
[ 1037.909732] INFO: lockdep is turned off.
[ 1037.913667] Preemption disabled at:
[ 1037.913683] [<c0837208>] __mutex_lock+0x34/0xa38
[ 1037.921730] CPU: 4 PID: 818 Comm: bash Tainted: G W 4.14.0-rc4-02 #2
[ 1037.929143] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 1037.935217] [<c01102c8>] (unwind_backtrace) from [<c010cabc>] (show_stack+0x10/0x14)
[ 1037.942925] [<c010cabc>] (show_stack) from [<c0821484>] (dump_stack+0x98/0xc4)
[ 1037.950118] [<c0821484>] (dump_stack) from [<c0144bcc>] (___might_sleep+0x264/0x2cc)
[ 1037.957829] [<c0144bcc>] (___might_sleep) from [<c0837200>] (__mutex_lock+0x2c/0xa38)
[ 1037.965627] [<c0837200>] (__mutex_lock) from [<c0837c28>] (mutex_lock_nested+0x1c/0x24)
[ 1037.973600] [<c0837c28>] (mutex_lock_nested) from [<c04c6a88>] (_find_opp_table+0x20/0x5c)
[ 1037.981830] [<c04c6a88>] (_find_opp_table) from [<c04c6e10>] (dev_pm_opp_find_freq_ceil+0x18/0x64)
[ 1037.990758] [<c04c6e10>] (dev_pm_opp_find_freq_ceil) from [<c0672b28>] (devfreq_recommended_opp+0x34/0x4c)
[ 1038.000377] [<c0672b28>] (devfreq_recommended_opp) from [<c04af4ec>] (kbase_devfreq_target+0x7c/0x408)
[ 1038.009646] [<c04af4ec>] (kbase_devfreq_target) from [<c0671214>] (update_devfreq+0xd4/0x1c4)
[ 1038.018138] [<c0671214>] (update_devfreq) from [<c067357c>] (devfreq_performance_handler+0x34/0x48)
[ 1038.027150] [<c067357c>] (devfreq_performance_handler) from [<c0672578>] (governor_store+0xe0/0x168)
[ 1038.036250] [<c0672578>] (governor_store) from [<c0284254>] (kernfs_fop_write+0x104/0x208)
[ 1038.044482] [<c0284254>] (kernfs_fop_write) from [<c02144ec>] (__vfs_write+0x1c/0x128)
[ 1038.052367] [<c02144ec>] (__vfs_write) from [<c021476c>] (vfs_write+0xa4/0x168)
[ 1038.059646] [<c021476c>] (vfs_write) from [<c0214930>] (SyS_write+0x3c/0x90)
[ 1038.066666] [<c0214930>] (SyS_write) from [<c0108820>] (ret_fast_syscall+0x0/0x28)
After switching back to the performance governor, the error stops showing up.
Is there a fix for this?
Thanks.
Here's the patch for this issue, for those who want to use r20p0 with kernel 4.14 or above:
Author: memeka <mihailescu2m@gmail.com>
Date:   Fri Oct 13 10:25:00 2017 +1030

    mali/midgard devfreq: fix for double locking _find_opp_table()

    After commit 5b650b388844f26c61c70564865598836d05dcb3, _find_opp_table()
    increments the reference under the opp_table_lock. So now there is no
    need to take the opp_table_lock or rcu_read_lock().

    This patch drops the rcu_read_lock() around _find_opp_table() in the
    mali midgard r20p0 drivers.

diff --git a/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c b/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c
index d3e800e..2ba96f2 100644
--- a/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c
+++ b/drivers/gpu/arm/midgard/backend/gpu/mali_kbase_devfreq.c
@@ -89,10 +89,8 @@ kbase_devfreq_target(struct device *dev, unsigned long *target_freq, u32 flags)
 
 	freq = *target_freq;
 
-	rcu_read_lock();
 	opp = devfreq_recommended_opp(dev, &freq, flags);
 	voltage = dev_pm_opp_get_voltage(opp);
-	rcu_read_unlock();
 	if (IS_ERR_OR_NULL(opp)) {
 		dev_err(dev, "Failed to get opp (%ld)\n", PTR_ERR(opp));
 		return PTR_ERR(opp);
@@ -215,20 +213,16 @@ static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev,
 	unsigned long freq;
 	struct dev_pm_opp *opp;
 
-	rcu_read_lock();
 	count = dev_pm_opp_get_opp_count(kbdev->dev);
 	if (count < 0) {
-		rcu_read_unlock();
 		return count;
 	}
-	rcu_read_unlock();
 
 	dp->freq_table = kmalloc_array(count, sizeof(dp->freq_table[0]),
 				GFP_KERNEL);
 	if (!dp->freq_table)
 		return -ENOMEM;
 
-	rcu_read_lock();
 	for (i = 0, freq = ULONG_MAX; i < count; i++, freq--) {
 		opp = dev_pm_opp_find_freq_floor(kbdev->dev, &freq);
 		if (IS_ERR(opp))
@@ -236,7 +230,6 @@ static int kbase_devfreq_init_freq_table(struct kbase_device *kbdev,
 
 		dp->freq_table[i] = freq;
 	}
-	rcu_read_unlock();
 
 	if (count != i)
 		dev_warn(kbdev->dev, "Unable to enumerate all OPPs (%d!=%d\n",
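For anyone wondering why removing the rcu_read_lock() calls silences the warning: the first trace lists rcu_read_lock as held lock #2 (taken in kbase_devfreq_init), and the OPP helper then reaches mutex_lock() through _find_opp_table(), which may sleep. Below is a rough userspace model of that rule, not kernel code. Every name starting with model_, plus count_opps_old() and count_opps_patched(), is made up for illustration, and the return value 3 is arbitrary.

```c
#include <stdio.h>

/* Userspace model only: none of this is the kernel's real implementation. */
static int rcu_depth;    /* nesting level of model_rcu_read_lock() */
static int bug_reports;  /* number of warnings the model has printed */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

/* Stand-in for might_sleep(): complain when called inside a read-side section. */
static void model_might_sleep(const char *caller)
{
	if (rcu_depth > 0) {
		bug_reports++;
		fprintf(stderr, "BUG (model): sleeping function called from invalid context in %s\n",
			caller);
	}
}

/* Stand-in for an OPP helper that takes a mutex internally. */
static int model_opp_get_count(void)
{
	model_might_sleep(__func__);  /* a mutex_lock() would sit here and may sleep */
	return 3;                     /* arbitrary OPP count for the model */
}

/* Pattern used by the unpatched driver: helper called under the RCU read lock. */
static int count_opps_old(void)
{
	int count;

	model_rcu_read_lock();
	count = model_opp_get_count();
	model_rcu_read_unlock();
	return count;
}

/* Pattern after the patch: the helper does its own locking, no RCU wrapper. */
static int count_opps_patched(void)
{
	return model_opp_get_count();
}
```

In this model, count_opps_old() prints one warning per call, while count_opps_patched() returns the same count without any warning, which mirrors what the patch does to the driver.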
Hi marianmi,
Thanks for the report, I will ask someone from our driver team to have a look at the issue.
Regards,
Daniele