Answered

MC41FS CAN Faults - Clarification please.

Julian Robinson 3 years ago in Master modules / MC4x, updated by Gustav Widén (System support) 4 months ago

Hi there, I have an application with an MC41-FS connected to an engine ECM and genset J1939 network on its CAN B input, running IQAN 6.05.18.6047. There are other devices on this J1939 network as well, such as instrumentation devices. These are all contained within a genset canopy and bolted under a train. The MC41 is used to control a single relay; that's its only function. It's not connected to any screens, nor to a "Multi Master" system over Diagnostics A CAN, for example. It does broadcast its status and other info like internal temps/status/fan speed etc. using custom PGNs onto the same J1939 network on CAN B.

This common J1939 CAN datalink is connected to the main group of Multi-Master controllers (MC43FS x3 and MD4-7 x2). So the MC41 is indirectly linked to the MC43FS on its CAN D using public J1939, just not via the official Parker CAN Diagnostics link, if you can follow that. It was not possible to use expansion modules for many reasons; the MC41FS had to operate independently of the train when the genset runs in isolation (not on the train) for testing etc.

There are 3 gensets on each train, therefore 3x MC41FS modules connecting to each MC43FS over 3x individual public J1939 datalinks.

I am also monitoring the J1939 using telematics, so I can see these devices operating semi real time.

Recently in service (12 October 21) all three MC41FS modules stopped communicating on CAN J1939 - all within about 20 minutes of each other. Train was in service and between stations. All other devices were fine and did not stop operating nor flag errors etc.

During monitoring, the internal temps of all 3x MC41FS modules were well below 50 deg C at all times. Temperature doesn't seem to be a problem.

Yesterday we were able to attend the train while it was being serviced. I attempted to connect to the MC41FS modules over J1939 CAN using IQANrun, and there was no response. We could not open the genset to check the lamp status, as this is a sealed unit, so our only option was to power down and on again. This cleared the issue and all the modules came back online on the CAN network.

  • I have read the article regarding the No Contact CAN errors and possible causes for this. There could be many reasons why this module may have problems, but the unit has been in service since July 2021, which is 3 months now. No other device has locked up (i.e. the MC43FS modules, which are connected to the same J1939 datalink).
  • Looking at the MC41FS System Logs, there were historical errors with the Relay (Critical Error), but these were in July 2021. Since then there have been 5-6 System Started log events, which to me means that the MC41FS was powered down and back up again as part of battery isolation etc. This was the period when the genset was being commissioned, and power supply and wiring issues were probably encountered regularly.
  • There have been no System Errors recorded in the log folders since August, and nothing at the time of the CAN Bus Off issue in service.
  • The MC41FS has TDA enabled, so the MC43FS is used to send TDA PGN to the MC41FS over J1939 for date time alignment

Questions:

  • The article on No Contact says that CAN errors accumulate before the CAN goes into Bus Off status. Are CAN faults / issues cleared on reset? I would have thought so. Would the errors have kept accumulating even though there were resets?
  • With reference to 2x nodes sending the same CAN identifier: does this refer to the entire 29-bit header ID or to the PGN only? I.e. I see no problem if two nodes send the same PGN, provided they do not have the same Source Address?
  • Have there been any helpful updates in IQAN FW releases after this 6.05.18 version that will help reset the CAN if there is a Bus Off event?

It's very hard to troubleshoot this situation from afar, I know, and the only thing I can possibly consider changing is the application of the MC41FS, or the FW, or both.

Does anyone have any helpful comments/suggestions?

Thanks

Julian

Yes, error counters for CAN are cleared when restarting. 
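
For readers unfamiliar with the counters, here is a minimal Python sketch of the standard CAN fault-confinement thresholds, simplified to the transmit error counter only (+8 per failed transmission, -1 per successful one). The function names are hypothetical, and this is not how IQAN implements it internally; it only illustrates why a restart brings a node back to the error-active state.

```python
def can_error_state(tec: int, rec: int = 0) -> str:
    """Classify a CAN node from its transmit/receive error counters."""
    if tec > 255:
        return "bus-off"
    if tec > 127 or rec > 127:
        return "error-passive"
    return "error-active"


def run_transmissions(outcomes, tec: int = 0) -> int:
    """Apply simplified counter rules: +8 on a failed send, -1 on success."""
    for ok in outcomes:
        if can_error_state(tec) == "bus-off":
            break  # a bus-off node no longer transmits
        tec = max(tec - 1, 0) if ok else tec + 8
    return tec


tec = run_transmissions([False] * 16)        # 16 failed sends in a row
print(tec, can_error_state(tec))             # 128 error-passive
tec = run_transmissions([False] * 16, tec)   # 16 more failures
print(tec, can_error_state(tec))             # 256 bus-off
tec = 0                                      # power cycle: counters start from zero
print(tec, can_error_state(tec))             # 0 error-active
```

Successful traffic between faults also counts the value back down, so occasional errors spread over months would not normally add up to a bus-off.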

The issue with identical identifier is the entire CAN identifier. Same PGN with different SA means the identifiers are different, so no risk of collision on the bus. 
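
To illustrate, here is a small Python sketch that splits a 29-bit J1939 identifier into priority, PGN and source address. The two example identifiers (a proprietary PGN 0xFF10 sent from SA 0x00 and SA 0x21) are hypothetical values chosen for the example, not taken from the application above.

```python
def decode_j1939_id(can_id: int) -> dict:
    """Split a 29-bit J1939 CAN identifier into its fields."""
    priority = (can_id >> 26) & 0x7
    edp_dp = (can_id >> 24) & 0x3      # extended data page + data page bits
    pf = (can_id >> 16) & 0xFF         # PDU format
    ps = (can_id >> 8) & 0xFF          # PDU specific
    sa = can_id & 0xFF                 # source address
    if pf < 240:
        # PDU1: PS is a destination address and is not part of the PGN
        pgn = (edp_dp << 16) | (pf << 8)
        da = ps
    else:
        # PDU2: broadcast, PS is a group extension and is part of the PGN
        pgn = (edp_dp << 16) | (pf << 8) | ps
        da = None
    return {"priority": priority, "pgn": pgn, "sa": sa, "da": da}


# Hypothetical example: same PGN broadcast by two different nodes
id_a = 0x18FF1000   # PGN 0xFF10 from SA 0x00
id_b = 0x18FF1021   # PGN 0xFF10 from SA 0x21

a, b = decode_j1939_id(id_a), decode_j1939_id(id_b)
print(a)
print(b)
print("same PGN:", a["pgn"] == b["pgn"], "| same CAN ID:", id_a == id_b)
```

Because the source address occupies the lowest 8 bits of the identifier, the two frames differ on the bus even though the PGN is the same.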

From the description, it does not sound like a CAN bus off event. If it were, there should have been records of a critical error on the bus in the MC41 system log. And since you are running 6.05, you have the CAN bus recovery function for MC4 (introduced in 6.03).

It sounds as if there could be a critical stop event on the MC41 module. But the only way to know is to check the blink code. 

The historic log record with a Critical error on a DOUT could possibly provide a clue; that error could have been triggered by extreme undervoltage.

Hi Gustav,

I posted a new topic on this report, as it's the same error but now with further information. The MC41 LED blink codes are provided, and they indicate a 4,1 critical error code.

Again, no system errors reported in the log folders.

Julian

As for the difference between CAN errors and Critical stop, I think this one is sort of answered. 

Continuing in MC41FS Critical Errors / Hardware / IQAN where the blink code is described.