Hello there,
I'm developing a job scheduling policy for the Mali T628, but I am confused about Job Slots and Address Spaces:
1. What does Job Slot mean? What is the relationship between job slots and shader cores?
The Mali T628 has 8 shader cores, but the number of job slots I get is 3 (by printing gpu_props->num_job_slots with printk).
2. What does Address Space mean?
Does it mean a region of memory that stores a process's context during a context switch?
Is the context switch done by hardware?
3. What does the GPU do when a job is soft-stopped or hard-stopped?
Thank you!
Hi wlc,
For Q1, the job slots refer to different types of work being run on the GPU. JS 0, 1, and 2 refer to Fragment, Tiling/Vertex/Compute, and Compute respectively. They have no relationship to the number of cores.
Hi Chris,
Thanks for your reply.
So the job slots are just a concept at the driver level, am I right?
Furthermore, are the 8 shader cores identical and able to run any type of job, or can a shader core only run an assigned type of job?
best regards,
wlc
Some advice about Q2 and Q3?
Job slots are a concept in hardware as well, but not at the shader core level. All shader cores are identical, with the exception that they may be members of different "core groups". In GLES, this is transparent to the application, but for CL the core groups are exposed as different devices currently. All shader cores can execute any workload, regardless of which job "slot" the work happens to be coming from.
Q2 and Q3 I'll leave to other people, as it's not my area of expertise. Maybe peterharris knows? I will say that context switching occurs when the CPU changes to another thread/process, but the GPU does not do this. The GPU just consumes jobs fed to it by the kernel driver; it is decoupled from the CPU processes themselves by the kernel driver.
Hope this helps,
Chris
Is job slot a concept in InterCore Task Management in T628? How many shader cores are in each job slot?
Is Job Slot the same as "core groups"? And to some extent the job slot is like an SM (Streaming Multiprocessor) in the Nvidia CUDA architecture and a shader core is like a CUDA core, am I right?
Thanks very much!
I think that a "cuda core" would correspond to a small part of our arithmetic pipeline. The internet tells me that a cuda core can run scalar arithmetic operations, and a Mali T628 GPU can perform several vector operations in one pass through the arithmetic pipeline (see the question about floating point capacity of the core on this site). To me it sounds as if our concept of core is closer to their concept of an SM, although any such comparison will of course only be approximate.
>> Is job slot a concept in InterCore Task Management in T628? How many shader cores in each job slot?
No. The job slots - and as Chris has said there are 3 of them in total for the entire GPU - are the mechanism for submitting jobs. Chris' reply detailed which job slots are used for which type of job.
>> Is Job Slot the same as "core groups"?
No... core groups are a collection of cores. There can be a maximum of 2 core groups, each with up to 4 cores. You might - for example - have a Mali-T628 MP6, which refers to a total of 6 cores split between 2 core groups. These would be typically arranged as 4 cores in the first core group with 2 in the other. The three available job slots serve all the core groups and cores.
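The core group limits above can be sketched in a few lines. This is only an illustration: the function name and the "fill the first group before starting the second" policy are assumptions based on the typical MP6 arrangement described here, not something read back from hardware.

```python
MAX_CORE_GROUPS = 2       # maximum stated in the reply above
MAX_CORES_PER_GROUP = 4   # maximum stated in the reply above


def split_into_core_groups(num_cores):
    """Return the size of each core group for a given core count.

    Assumed policy: fill the first core group before starting the second.
    """
    if not 1 <= num_cores <= MAX_CORE_GROUPS * MAX_CORES_PER_GROUP:
        raise ValueError("unsupported core count")
    groups = []
    remaining = num_cores
    while remaining > 0:
        size = min(remaining, MAX_CORES_PER_GROUP)
        groups.append(size)
        remaining -= size
    return groups


print(split_into_core_groups(6))  # [4, 2], the MP6 example above
print(split_into_core_groups(8))  # [4, 4]
```

Note that the number of job slots does not appear anywhere in this calculation, which matches the point that the slots serve all core groups and cores.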
>> What does Address Space mean?
Can you clarify the context of "Address Space" in your question? For example, is this from a particular document you're referencing?
>> What the GPU do during soft and hard stop a job?
I assume you're after how the GPU recovers and resets itself after soft or hard stops. I believe for example that it can automatically stop a CL job that appears to have hung, or is just taking too long, but I'm not sure of the details. I'll see what else I can find for you.
Regards,
Tim
There is no hierarchical relationship between job slots and shader cores; they are distinct concepts. All shader cores are capable of processing any work from any job slot. The job slot mechanism is the concern of hardware outside of the shader core, and really is only there to facilitate the co-processing of vertex/compute and fragment loads in parallel.
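As a rough mental model only (this is not how the hardware or the kernel driver is implemented, and every name below is made up for illustration), the slots can be pictured as three submission queues that sit in front of a shared pool of identical cores:

```python
from collections import deque

# Slot purposes as described in this thread; the identifiers are illustrative.
SLOTS = ("fragment", "vertex_tiling_compute", "compute_only")

ACCEPTED_JOB_TYPES = {
    "fragment": {"fragment"},
    "vertex_tiling_compute": {"vertex", "tiling", "compute"},
    "compute_only": {"compute"},
}


class ToyGpu:
    """Toy model: a fixed set of slots in front of interchangeable cores."""

    def __init__(self, num_cores):
        self.cores = list(range(num_cores))
        # The number of queues depends on SLOTS, never on num_cores.
        self.queues = {slot: deque() for slot in SLOTS}

    def submit(self, slot, job_type):
        if job_type not in ACCEPTED_JOB_TYPES[slot]:
            raise ValueError(f"{slot} does not take {job_type} jobs")
        self.queues[slot].append(job_type)

    def eligible_cores(self, slot):
        # Any core may execute work from any slot.
        return list(self.cores)


gpu = ToyGpu(num_cores=8)
gpu.submit("fragment", "fragment")
gpu.submit("vertex_tiling_compute", "vertex")
print(len(gpu.queues))                  # 3
print(gpu.eligible_cores("fragment"))   # all 8 cores
```

Having separate queues is what lets fragment work and vertex/compute work be in flight at the same time in this model.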
>> Is Job Slot the same as "core groups"?
No, they are distinct. Core groups contain shader cores; job slots are not correlated with shader cores or core groups in any way. You will always have 3 job slots, regardless of the number of core groups.
>> And to some extent the job slot is like SM(Streaming Multiprocessor) in Nvidia cuda architecture and shader core is like cuda core, am I right?
Mali and NVidia architectures are really not comparable at this level in any meaningful sense. As Johan says, a CUDA "core" is probably most analogous to a part of a Mali ALU, and as T628 is vector based there are really 4 such things per ALU, so 8 per core, 48 per T628 MP6. This is a huge simplification.
Also, a CUDA core is a member of a warp, within which all cores and therefore threads must operate in lock step. T628 has no such restriction.
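The per-core arithmetic in this reply can be written out explicitly. The constant names are made up for illustration, and the figure of 2 ALUs per core is inferred from "4 per ALU, so 8 per core" rather than stated directly; the same "huge simplification" caveat applies.

```python
LANES_PER_ALU = 4   # "4 such things per ALU" in the reply above
ALUS_PER_CORE = 2   # inferred from "8 per core"


def scalar_lanes(num_cores):
    """Rough count of CUDA-core-like lanes for a given number of shader cores."""
    return LANES_PER_ALU * ALUS_PER_CORE * num_cores


print(scalar_lanes(1))  # 8
print(scalar_lanes(6))  # 48, the T628 MP6 figure above
```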
Address space = a "process" from the GPU's point of view, i.e. a unique address space with its own set of MMU tables. It is not necessarily a 1:1 mapping to a CPU process.
Soft-stop = suspend a GPU task prior to context switch.
Hard-stop = kill a badly behaved GPU process.
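One way to picture the difference between the two stops is as an escalation decision. The sketch below is a toy model only: the thresholds, the tick unit, and the function name are hypothetical, and the real timeout policy lives in the driver and hardware and is not described in this thread.

```python
SOFT_STOP_AFTER = 5    # hypothetical threshold, in scheduler ticks
HARD_STOP_AFTER = 50   # hypothetical threshold, in scheduler ticks


def watchdog_action(ticks_running, other_work_waiting):
    """Decide what to do with a running job in this toy model."""
    if ticks_running >= HARD_STOP_AFTER:
        # The job looks badly behaved: kill it.
        return "hard-stop"
    if other_work_waiting and ticks_running >= SOFT_STOP_AFTER:
        # Suspend the job so that another context can run.
        return "soft-stop"
    return "keep running"


print(watchdog_action(1, other_work_waiting=False))   # keep running
print(watchdog_action(10, other_work_waiting=True))   # soft-stop
print(watchdog_action(60, other_work_waiting=False))  # hard-stop
```

The point of the model is only that a soft-stop is a cooperative suspension that a well-behaved job survives, while a hard-stop is a last resort for work that does not finish in time.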
Thank you very much for your expert help! @Chris Varnsverry @Johan Gronqvist @timhar01 @Peter Harris .
Based on your answers, I now have a better understanding of job slots, shader cores, and address spaces, but I'm not sure whether it is right. Please give me more advice.
1. Job slot is a concept in both the driver and the hardware. It is used to classify jobs into 3 types so that different types of job can be scheduled separately for parallelism, load balancing, etc.
2. A shader core consists of several pipelines (ALU, Load/Store, Texture). All shader cores are identical and can execute any type of job.
Shader cores are grouped into core groups for efficiency. To support core groups, the MMU, L2 cache and AMBA interface are duplicated.
3. An address space is like virtual memory on the CPU. Different contexts (kbase_context in the driver) have different address spaces.
A context is associated with a CPU process, and a CPU process may have several contexts on the GPU. Every time a CPU process opens /dev/mali, a context is created in the driver.
4. Soft/hard stop is used to manage the job life cycle. It's all done in hardware.
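Point 3 can be illustrated with a small toy model. Nothing here is taken from the real driver: `ToyDriver`, `ToyContext`, `open_device` and the dictionary used as a page table are all hypothetical stand-ins for the idea that each open of /dev/mali yields a context with its own address space.

```python
import itertools


class ToyContext:
    """Stand-in for a GPU context with a private address space."""

    _ids = itertools.count(1)

    def __init__(self, pid):
        self.ctx_id = next(ToyContext._ids)
        self.pid = pid
        self.page_table = {}  # GPU virtual address -> physical address


class ToyDriver:
    def __init__(self):
        self.contexts = []

    def open_device(self, pid):
        # Each open creates a new context, even for the same CPU process.
        ctx = ToyContext(pid)
        self.contexts.append(ctx)
        return ctx

    def contexts_for(self, pid):
        return [c for c in self.contexts if c.pid == pid]


driver = ToyDriver()
a = driver.open_device(pid=100)
b = driver.open_device(pid=100)

# The same GPU virtual address maps to different memory in each context.
a.page_table[0x1000] = 0xA000
b.page_table[0x1000] = 0xB000

print(len(driver.contexts_for(100)))                 # 2
print(a.page_table[0x1000] == b.page_table[0x1000])  # False
```

This is why the mapping between CPU processes and GPU address spaces is not necessarily 1:1: one process can own several contexts.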
Some more questions:
1. Does the GPU automatically save and restore a job's state (registers, MMU, etc.) when a job is soft-stopped and later resumed? Where is the state saved?
2. Does the Mali T628 support preemption? That is to say, can the driver pause a running job to run another one, and resume the paused one later?
3. Are GPU commands stored in the per-context address space?
Thank you all again!
Apologies for the late response.
Your current understanding is correct on all points.
For your further questions:
1. This is too low level to go into detail on the forum. If you are an ARM licensee, you should ask those questions through the standard ARM support channel.
2. Yes
3. Yes
Thanks again, and let us know if you have further questions.
Kind Regards,
Michael McGeagh
Hi Peter,
Can you please explain more about "Hard-stop = kill a badly behaved GPU process"?
I am wondering what a badly behaving GPU process could look like, and in what situations it occurs.
Is it expected to come across hard stops and GPU reset?
Can any user application cause a GPU reset?
Basically, I am seeing some GPU resets on a Mali GPU while running a GLES composition test in a loop. The resets happen randomly when the tests run for a very long time.
Thanks.
Very very long running threads/pixels which don't complete quickly enough to stop the GPU work before the timeout.
Technically, yes, but the application would have to be exceptionally strange (e.g. taking longer than the hard-stop timeout to render a tiny screen region or small number of vertices or compute work groups).
Under normal usage, no.
If your synthetic test has an exceptionally high number of layers per pixel, and/or very long shader programs, then it is certainly possible (real applications wouldn't do this).
HTH, Pete