Hello,
I was hoping to hear your opinion about a serious problem I have - either I solve it or I reduce my LPC2478 CPU speed from 72[MHz] to 64[MHz] (an 11% loss; the problem does not seem to occur at lower MHz settings). I posted about this in the past but it was a long time ago. When I place a controller in an environmental chamber and increase the temperature to 80+ degrees Celsius, I often see data abort exceptions, and sometimes I get the impression that the PC takes a hike (even the firmware LED that blinks every 1 second becomes irregular for a while before it stops). The program is launched by a boot loader and has a lower level supporting firmware layer that handles some interrupts (not all). I also see that if RTX is not started at all (the application hangs in a "for (;;)" loop instead, so the bootloader and firmware layer are involved but the application is idle), the system never crashes! I have excluded, as far as I could tell, the role of external memory or RTX in this situation. However, I still suspect RTX a little (even though my test programs never crashed). My question: did you ever encounter such a situation? Where is the best place to look? Can this be the result of a misbehaving peripheral? NXP have confirmed the LPC2478 is not the reason.
No experience of this problem but just a random thought: how susceptible to high temperatures is the quartz crystal that you are using? A quick Google search came up with this statement from a manufacturer:
"Operating Temperature: Standard Operating Temperature ranges are generally considered as -20-+70 degrees Celsius (considered "commercial" Operating Temperature), and -40-+85 degrees Celsius (considered "Industrial" Operating Temperature)"
Thanks for your attention. There is no casing involved (yet): the product is required to operate at temperatures of up to 70 degrees, but it might exceed that if the casing is installed, thus the vigorous testing. The crystal does not seem to be the problem - at least our hardware people said that... NXP claim that the LPC2478 was tested at their labs at up to 105 degrees, and some applications allow for up to 120 degrees...!
yes.
In that case, a look at the thermal properties of the coating material might be in order. And maybe a test with an uncoated unit, if available.
This situation seems to be directly related to the external RAM being addressed by both the application and the LCD controller. If the refresh rate of the LCD controller is reduced, or another type of software that does not use the LCD controller is flashed on the controller, the issue disappears. I will have to adjust the timing parameters of my external RAM, it seems.
For the amount of stuff (and content of the stuff) you're posting, you would be better off with twitter :o
I don't know about you, but for me, these problems are the cream of the crop in this line of work. Almost every problem is a mystery, every problem can be solved (?) in different ways. Did I enjoy sitting 3 days in front of an environmental chamber (I have a few burn marks!)? No way, and the problem is not solved yet (but the probable cause is known). But in the end, it is/was a lot of fun!
I second this!!, thats a real engineer soul, we are in some way... masochist geeks :)
Remember that a loop continuously accessing the DRAM will look like a super-charged RAM refresh. For problems with RAM refresh, it is often better to fill the RAM with a known pattern and then make sure that the RAM is not touched for a long time so that the only refresh there is comes from the DRAM controller performing background refreshes. Then revisit the chip once every hour and verify that the pattern is still correct.
True. Then, for DRAM memories the test case should be: fast access and slow access.
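The retention test described above could be sketched in C roughly as follows. This is a minimal sketch, not code from this thread: the function names, the address-derived pattern and the seed parameter are assumptions made for illustration. On the target, the region would be the external DRAM, left untouched (no CPU access, no LCD frame buffer inside it) between the fill and each verify pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: give every word its own value, so that
   addressing faults show up as well as lost bits. */
static uint32_t pattern_for(size_t index, uint32_t seed)
{
    return seed ^ ((uint32_t)index * 2654435761u);
}

/* Write the pattern once, then leave the region alone. */
void dram_fill_pattern(volatile uint32_t *base, size_t words, uint32_t seed)
{
    for (size_t i = 0; i < words; i++)
        base[i] = pattern_for(i, seed);
}

/* Return the number of words that no longer hold the pattern. */
size_t dram_verify_pattern(const volatile uint32_t *base, size_t words,
                           uint32_t seed)
{
    size_t errors = 0;
    for (size_t i = 0; i < words; i++)
        if (base[i] != pattern_for(i, seed))
            errors++;
    return errors;
}
```

Keep in mind that the verify pass itself reads every row, so only the idle time before it exercises the background refresh.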
related to the external RAM being addressed by both the application and the LCD controller.
A conclusion you made impossible for anyone else to arrive at, by not mentioning anything about an LCD before, much less that it shared external RAM with the CPU. Is that dual-ported RAM, or how else do you organize shared access?
The data transfers LCDController <-> Memory are done by DMA, and there is an automatic mechanism for arbitration. This should not be a problem.
But it should be taken into account if the Video Buffer is located in DRAM and you want to test the DRAM using Per's suggestion.
"I second this!!, thats a real engineer soul, we are in some way... masochist geeks :)"
I would totally agree with that, I've been involved in plenty of projects where I've been totally engrossed for weeks/months on end, keeping a note pad by my bed for when I wake up with 'the ultimate answer' (much to the annoyance of my beloved wife).
But ... the difference is, most don't keep trying to share this random blabber out.
Have you ever been to a party and sat next to Mr. Boring?
Note that fast/slow accesses for a DRAM would most often be about the actual timing of the signals: how much time the signals get to settle or to hold, and the number of wait states.
There are quite a lot of tests needed for memory. Some for prototypes. Some for factory production. Some for every boot or maybe even regularly when run.
- correct supply voltages at all temperatures and loads
- correct timing of signals
- good flanks and high/low logic levels for signals
- all unused chip-selects etc having pull-up/pull-down
- correctly wired (no shorts/breaks)
- all memory cells working
- stability at maximum load at low/high temperature
- stability at zero load at low/high temperature (refresh working)
- low-power retention (mainly SRAM with super-cap or battery)
- ...
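For the "correctly wired" item, a common first check is a walking-ones test on the data bus. A minimal sketch, assuming 32-bit accesses (adjust the word type to the real bus width); the function name is made up for illustration, and the caller supplies any address inside the memory under test.

```c
#include <stdint.h>

/* Write a single 1 bit in each position and read it back.
   Returns 0 if every pattern read back correctly, otherwise the
   first pattern that failed. */
uint32_t data_bus_walking_ones(volatile uint32_t *addr)
{
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        *addr = bit;
        if (*addr != bit)
            return bit;
    }
    return 0;
}
```

This only exercises the data lines at one location. Address lines, individual cells and refresh need their own tests.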
I have learned a little more about this problem in the meantime and was wondering if you can enlighten me further. I am currently running a weekend test of a controller that utilizes the LCD controller of the LPC2478 vs. a controller that does not. The first one is reduced to 64[MHz] while the second one still runs at 72[MHz], and they communicate via an RS485 bus. Hopefully this remains stable, but either way, I have just reduced the display's processing capacity by 11%... 'Samsung' have promised me that their DRAM (K4S561632J) does not suffer from any issues and that the EMC timing settings used now should apply to the entire range of temperatures (maybe the controller was not warmed up entirely or long enough when I concluded otherwise). I am not sure about the refresh rate, but either way I did try to play with it without any positive results. I am aware that the signals to the DRAM should be measured, but that is not so simple at 80+ degrees. The latest LPC24xx data sheet elaborates on the AHBCFGx registers which determine the arbitration of the AHB busses (my LCD, DRAM and peripheral (the MCI interface, which uses GPDMA) hang on AHB1). This is a very fundamental setting that I have no experience changing. Do you think this could help me out? I did a few tests with a negative result, but I feel that I have not exhausted it. Either way, can you think of another system setting that might influence this particular problem? I have, for now, ruled out bad traces and noise, as another controller (without an LCD) uses the same hardware design and accesses external RAM (MCI DMA) without crashing.
What does the manual say?
Try www.embeddedrelated.com/.../35996.php
I have found this reference myself, but unfortunately NXP do not explain the impact of modifying these registers. It is of course exceedingly hard to solve a problem that you do not fully understand with tools you do not fully understand... I believe this has something to do with how DMA/LCD DMA and the processor interact with the AHB bus, which changes slightly when temperature rises. I asked NXP to confirm that they have tested the LCD controller of the LPC2478 at these extreme temperatures but they have not replied yet.
If only you hadn't upset Master Zeusti.
Right now I am willing to use just about any help - Zeusti, that Steve figure from above, anything. It is either I solve this, or (assuming the system survives the weekend test!) CPU speed for the display has to go down to 64[MHz] !
It is either I solve this, or (assuming the system survives the weekend test!) CPU speed for the display has to go down to 64[MHz] !
I quickread this thread and did not see it mentioned that the internal heat generated by the chip is proportional to the clock speed.
NXP claim that the LPC2478 was tested at their labs at up to 105 degrees, and some applications allow for up to 120 degrees...!
Under which operating conditions??
Erik
Erik,
Thanks for your comments. The answer to your questions is that I do not know: NXP did not elaborate, as far as I can tell, on the exact environmental conditions used to test the chip in any report I could get my hands on. I just don't have enough data to handle this properly...! And you are right: Going down to 64[MHz] might just mask a still existing problem. But at the moment, I don't have any other choice - product beta (thus, installation at the client site) phase is approaching.
OK, twittering continues. The display at 64[MHz] made it through the weekend. There are a couple of display related issues, but it is alive!
"...it is alive!"
I can breathe again. (Not to be confused with a yawn.)
I will be sure to keep you up to speed, Stunned Steve. Hang in there!
As promised, I have an update that might interest operators of a LPC2470/78 using the LCD controller. I have found that: 1. lowering the CPU clock speed to 64[MHz] at 80+ degrees seems to stabilize the system. There are no additional legal PLL settings between 72[MHz] and 64[MHz] that support USB, I'm afraid. 2. This code
AHBCFG1 &= ~1;
AHBCFG1 |= (3<<12);
AHBCFG1 |= (4<<16);
AHBCFG1 |= (2<<20);
AHBCFG1 |= (1<<24);
AHBCFG1 |= (5<<28);
when placed in main() {the data sheet does not specify that these fundamental settings are disallowed in application code and indeed they work}, will put the LCD in the most preferred position to access the AHB1 bus. This prevents image jitter and distortion when doing time consuming drawing on the LCD at 64[MHz].
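One caveat about the snippet above: `|=` only sets bits, so if any of the priority fields is non-zero beforehand, the result is the OR of the old and new values rather than the intended number. A minimal sketch of a clear-then-set variant follows. It works on a plain value so it can be tried on a PC; the helper and macro names are hypothetical, and the assumption that each priority is a 3-bit field at bits 12, 16, 20, 24 and 28 is inferred from the shifts used above and should be checked against the LPC24xx user manual.

```c
#include <stdint.h>

#define AHBCFG_SCHEDULER_BIT  (1u << 0)   /* bit 0, cleared as in the post */
#define AHBCFG_FIELD_MASK     0x7u        /* assumed 3-bit priority fields */

/* Clear one field, then write the new value into it. */
static uint32_t set_field(uint32_t reg, unsigned shift, uint32_t value)
{
    reg &= ~(AHBCFG_FIELD_MASK << shift);
    reg |= (value & AHBCFG_FIELD_MASK) << shift;
    return reg;
}

/* Same field values as the original snippet, applied to 'reg'
   without disturbing any other bits. */
uint32_t ahbcfg1_lcd_first(uint32_t reg)
{
    reg &= ~AHBCFG_SCHEDULER_BIT;
    reg = set_field(reg, 12, 3);
    reg = set_field(reg, 16, 4);
    reg = set_field(reg, 20, 2);
    reg = set_field(reg, 24, 1);
    reg = set_field(reg, 28, 5);
    return reg;
}
```

On the target this would be used as `AHBCFG1 = ahbcfg1_lcd_first(AHBCFG1);`, which performs a single read and a single write of the register.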
"(I have a new, interesting post)"
Where?
There's a new post, but surely interesting is not a true description?
Or was that an attempt at irony?
Not knowing your true name, I must wonder: have you ever posted something useful on this (or any other) forum? Please stop hijacking my thread. I'm trying to be informative (=helpful). If you are in the mood for child's play, I suggest you go to a kindergarten.
Is that you being serious again?
Have you tried looking for the Keil command line option -SetMaxTemp=80
You'll not find it. It's not a Keil related problem.
There are no additional legal PLL settings between 72[MHz] and 64[MHz] that support USB, I'm afraid.
Have you eliminated the PLL as the cause of the problem? We do some very high temperature stuff and have had problems with PLLs becoming unstable or ceasing to function altogether in devices that have worked fine with an external oscillator.
This really is beyond you, huh? This is an issue that might affect Keil USERS and the safety of their products, since Keil supports the LPC2478, so it is very much their concern. Apart from that, I don't remember asking you when, if or what I may or may not post. Just go away, it won't help you.
Jack,
As I have mentioned, I have another flavor of this product that does not use the LCD controller but is almost identical in every way (processor print, external RAM etc.). It does not crash under the same conditions. Our hardware guy promised me that the crystal is not the issue. He has not measured it yet, though.
Ok.
You're obviously happy with your tenuous links.
A small update: Our hardware guy has installed an extra resistor across the DRAM clock input and it seems to do wonders! We saw that the first 2 clock pulses of a refresh cycle were far from perfect. On the other hand, this could be an issue with the particular DRAM. Soon we will try another one, and will know for sure.
What do you mean "across"? Having a resistor in series with high-speed signals will slow down the edges (longer rise/fall times). It will reduce the amount of radiated noise but can also solve problems with ringing signals where the input side may see multiple toggles. Having too large a resistor will, on the other hand, produce a shark-tooth waveform where you either don't reach the full voltage swing fast enough or where the flanks are so slow that an input without a Schmitt trigger will get false readings.
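To put rough numbers on the series-resistor trade-off: treating the resistor and the input capacitance as a simple first-order RC low-pass, the 10%-90% rise time is about 2.2 * R * C. A minimal sketch; the function name is made up, the component values in the note below are invented examples rather than measurements from this board, and a real trace also behaves as a transmission line, which this ignores.

```c
/* 10%-90% rise time of a first-order RC low-pass: ln(9) * R * C.
   r_ohm in ohms, c_farad in farads, result in seconds. */
double rc_rise_time_s(double r_ohm, double c_farad)
{
    const double ln9 = 2.1972245773362196;
    return ln9 * r_ohm * c_farad;
}
```

For example, 33 ohm into 10 pF gives roughly 0.73 ns, while 330 ohm into the same 10 pF gives roughly 7.3 ns, which is already about half of a 72 MHz clock period (about 13.9 ns).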