Fixed
Blue screen on MD4-7 while operating
Hi,
Has anyone had blue screens occur during operation?
I currently have an intermittent blue screen problem on an MD4-7 running IQAN 6.06. Images of one event are attached. The bus load on the CAN buses is low, with no error frames.
What is the processor utilization of the application? It looks like there was a watchdog reset in the first 4 pictures. The last picture, I am not sure what happened there.
This MD4-7 unit runs with a cycle time of 10 ms, and the cycle utilization remains within 52% +/- 3%. We were able to reproduce this error while logging the cycle utilization on CAN, and there was no change prior to the event.
Should we look somewhere else? As for the last image, is there a way to fetch the latest logs? Since these blue screens happen randomly, it is unlikely that the user can pull out a cell phone and photograph the correct screen fast enough.
The last picture looks like the MD4 is scrolled up to show a historic error from July 16th 15:29 (year not included, but version was 5.04 at the time). That old one looks like it was too much outgoing CAN traffic on CAN-C.
The timestamps on the two more recent watchdog resets indicate that both occurred during startup.
Our average bus loads for CAN-A, B, C and D are 13.5%, 9.7%, 24.5% and 2.5%, respectively. We were able to reproduce this event while logging all CAN buses, and we did not notice a change in bus load prior to the event.
Hi Gustav,
Any possible avenues to resolve this problem? My colleague answered your questions.
Let us know if you need additional information.
Thanks
Thank you. At a 10 ms cycle, MD4 cycle utilization at 55% is quite high, but if this is the peak value it does not explain the error. A relevant test could be to modify the application to include a MEM channel that records the peak utilization. If it peaks at startup, it would be impossible to catch with an IQANrun graph.
On MD4, it is possible to get the full history of crash reports by getting a clone with debug info. The IQANscript get clone action has a property for retrieving this, but to extract the information, the clone with debug info would have to be sent to us at Parker and manually decoded here.
An MD4 clone with this information could be quite large, so it is good to be at the machine so that you can extract it via a local Ethernet connection.
I also recommend that you reach out to your local Parker rep about how to get the information across.
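The peak-recording idea above can be sketched in a few lines. This is only an illustration of the peak-hold logic in Python, not IQANdesign channel configuration; the `PeakHold` class and the sample values are hypothetical.

```python
class PeakHold:
    """Keeps the highest value seen so far (hypothetical helper, not an IQAN API)."""

    def __init__(self):
        self.peak = 0.0

    def update(self, sample):
        # Overwrite the stored value only when the new sample is higher.
        if sample > self.peak:
            self.peak = sample
        return self.peak


# Made-up cycle utilization samples (%), one per application cycle.
# The second sample stands in for a short spike at startup.
samples = [48.0, 95.0, 53.0, 52.0, 55.0]

peak = PeakHold()
for s in samples:
    peak.update(s)

print(f"peak utilization: {peak.peak}%")
```

A stored peak like this would still show a short startup spike after the fact, which a live graph could miss.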
Further troubleshooting will involve looking specifically at the application, so it is best to take it through the conventional support channels.
Hi Gustav, thanks for the reply. Do we need to purchase an IQANscript license to retrieve those debug logs? The IQANsimulate trial version says we can only connect to a simulator. It seems I can only retrieve "logs" with IQANrun and not "debug logs".
The script needs to be created with a licensed copy of IQANscript. If you do not have IQANscript, the local Parker rep can provide a script file for you. I believe someone from the team has already reached out to you by now.
I have a customer who is getting an intermittent watchdog reset on an MD4-5. According to the information provided, the CAN utilization is around mid 20% and the cycle time is set to 50 ms. The application file is minimal, and they say the utilization would be quite low. On the current MD4-5 it has happened once, after running for over an hour, and they have seen it happen on other MD4-5s.
They are using IQANdesign version 6.06.14.7676
I am curious as to what would cause the watchdog reset in their particular case, and also what triggers a watchdog reset in general.
Kerry,
When you say "the CAN utilization is around mid 20% and cycle time is set to 50 ms.", do you mean CAN bus utilization or cycle utilization?
CAN bus utilization is measured on each bus, and is the fraction of time that the bus is busy with traffic. In the IQANdesign/IQANrun system layout you see a rough measurement of this for every bus that is used, up to four different buses on the MD4. You can get a more accurate measurement with a CAN analyzer.
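As a rough illustration of "fraction of time the bus is busy", bus utilization can be estimated from frame rate, frame size and bit rate. This is a minimal sketch, not how IQANdesign or a CAN analyzer computes it: it assumes classic CAN data frames, counts nominal frame bits including the 3-bit interframe space, and ignores stuff bits, so it underestimates the real load. The traffic figures are made up.

```python
def frame_bits(data_bytes, extended=True):
    """Nominal bits on the wire for one classic CAN data frame.

    Includes the 3-bit interframe space and ignores stuff bits,
    so the real value is somewhat higher.
    """
    overhead = 67 if extended else 47  # 29-bit ID vs 11-bit ID
    return overhead + 8 * data_bytes


def bus_utilization(frames_per_second, data_bytes, bitrate, extended=True):
    """Estimated fraction of time the bus is busy, from 0.0 to 1.0."""
    return frames_per_second * frame_bits(data_bytes, extended) / bitrate


# Hypothetical traffic: 470 extended frames/s with 8 data bytes at 250 kbit/s.
load = bus_utilization(470, 8, 250_000)
print(f"estimated bus utilization: {load:.1%}")
```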
Cycle utilization is measured with a system information channel, and is the fraction of the application cycle time that is needed for calculating the application.
The first action if cycle utilization exceeds 100% is that the module skips calculating the next cycle. You see this as an error on the system information channel cycle utilization. If the application calculation is not completed within the watchdog timeout, the execution will stop and you will see the screen with the error message watchdog reset.
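The two stages described above (skipped cycle first, watchdog reset if the calculation runs past the timeout) can be sketched as a small classifier. This is only an illustration of the reasoning, not the module's firmware logic; the function name and the 100 ms watchdog timeout are assumptions, since the actual timeout is not stated in this thread.

```python
def classify_cycle(calc_ms, cycle_ms, watchdog_ms):
    """Return (utilization_percent, outcome) for one application cycle.

    watchdog_ms is an assumed timeout value, chosen for illustration only.
    """
    utilization = 100.0 * calc_ms / cycle_ms
    if calc_ms >= watchdog_ms:
        # Calculation did not finish within the watchdog timeout.
        outcome = "watchdog reset"
    elif utilization > 100.0:
        # Calculation overran the cycle but finished before the timeout.
        outcome = "next cycle skipped"
    else:
        outcome = "ok"
    return utilization, outcome


# Hypothetical calculation times for a 10 ms cycle.
for calc_ms in (5.2, 14.0, 120.0):
    utilization, outcome = classify_cycle(calc_ms, cycle_ms=10.0, watchdog_ms=100.0)
    print(f"{calc_ms} ms -> {utilization:.0f}% -> {outcome}")
```

Under these assumed numbers, a utilization slightly over 100% only costs a skipped cycle; the calculation would have to run many cycle times long before the watchdog stage is reached.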
Hi Gustav,
Would you mind sending me the script file?
The watchdog reset events may have been triggered by problems on MD4 CAN A or CAN B, see:
IQANdesign 7.02.33 update / Software / IQAN
I have a customer who got a blue screen event once last week. We thought this was linked to ESD (as discussed in another channel) because the affected MD4-10 is mounted on a cantilevered part of the dash. They installed a ground strap to make sure the metal parts holding the screen are tied to the same ground reference as the rest of the vehicle chassis. Today they had two complete machine shutdowns due to these blue screen episodes! It is crucial to understand what is going on here, since this screen is required for vehicle control. Sorry for the poor photo quality; the photo was provided by the user in the field!
Earlier this year we found and fixed a problem where CAN errors could trigger this.
I suggest updating to 7.02.33
(your photo shows 7.02.31)
I have a machine that intermittently produces the blue screen below while operating. Could this be an issue with the cycle time and/or CAN bus traffic? Might it be fixed with an update to 7.02.33?
Unfortunately, I don't have any detail on the cycle utilization or CAN bus utilization, but the cycle time is set at 20 ms and the master/diagnostic bus is shared with a J1939 module with only one J1939 frame and a transmit rate of 100 ms.
Yes, I suggest updating to 7.02.33.
It is always a good idea to record the application cycle utilization, but unless it is at several hundred percent in your 20 ms application, I don't think that is the reason for the problems you have had.
In light of case 60506, CAN wiring errors or similar on MD4 CAN-A or CAN-B seem like a more likely cause.
Note that a CAN trace that only looks at bus utilization would not be enough to see the faults. You would need to run a trace that counts error frames over time to see which bus has the problems.
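Counting error frames over time per bus could look roughly like the sketch below. It assumes the analyzer trace has already been converted into `(timestamp_s, bus, is_error_frame)` tuples; that tuple format, the function name and the sample data are all hypothetical, and a real trace would need its own parser.

```python
from collections import Counter


def error_frames_per_window(trace, window_s=10.0):
    """Count error frames per (bus, window index).

    `trace` is an iterable of (timestamp_s, bus, is_error_frame) tuples,
    a hypothetical format that an analyzer log would first be converted to.
    """
    counts = Counter()
    for timestamp_s, bus, is_error_frame in trace:
        if is_error_frame:
            counts[(bus, int(timestamp_s // window_s))] += 1
    return counts


# Made-up trace: only CAN-B shows error frames.
trace = [
    (0.5, "CAN-A", False),
    (3.2, "CAN-B", True),
    (4.1, "CAN-B", True),
    (12.7, "CAN-A", False),
    (15.0, "CAN-B", True),
    (18.3, "CAN-C", False),
]

counts = error_frames_per_window(trace)
for (bus, window), n in sorted(counts.items()):
    print(f"{bus} window {window}: {n} error frame(s)")
```

Grouping by time window as well as by bus makes it easier to line up bursts of error frames with the moment a watchdog reset occurred.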