Completed

Bus-off recovery function?

David Hansson 5 years ago in IQANdesign, updated by Gustav Widén (System support) 4 years ago

Hi.

I have noticed that sometimes my MC43 gets stuck in a "Bus-off" state (Error code 3.4) and it does not recover until I shut off the entire system. In my case the controller is used for steering, and I need it to be able to reboot and recover locally on its own, so that I can keep my steering functionality without shutting down my system.

We have put in a functional request for this, and I'm just reaching out to see if anyone else is experiencing the same problem?

Regards

David PH Sc


Hi David,

I've been working with IQAN modules for 13 years. Quite some time back, I had a similar experience using an MC2 as part of the control system for a military vehicle's rear-steer control system. In working through this with the IQAN personnel, I found that the low-level firmware in IQAN modules does not implement the auto-retry function that is prescribed by the CAN 2.0B specification. The rationale is that if a BusOff occurs, they don't want the module to (possibly) leave outputs in undefined states that could cause the affected hydraulic systems to behave abnormally.
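
For anyone unfamiliar with the mechanism, here is a minimal sketch of CAN fault confinement on the transmit side, as I understand the CAN 2.0 rules: the transmit error counter (TEC) goes up by 8 on a transmit error and down by 1 on a successful transmission, the node goes bus-off once the TEC exceeds 255, and a bus-off node may rejoin after seeing 128 occurrences of 11 consecutive recessive bits. The class, method names and the `auto_recover` flag are hypothetical and only illustrate the idea; this is not IQAN firmware, and the receive error counter is left out for brevity.

```python
# Simplified model of CAN transmit-side fault confinement.
# All names here are hypothetical; thresholds follow the CAN 2.0 rules.

ERROR_PASSIVE_LIMIT = 128   # TEC >= 128 -> error-passive
BUS_OFF_LIMIT = 256         # TEC >= 256 -> bus-off
RECOVERY_SEQUENCES = 128    # runs of 11 recessive bits needed to recover


class CanNode:
    def __init__(self, auto_recover=True):
        self.tec = 0                 # transmit error counter
        self.idle_sequences = 0      # recessive runs counted while bus-off
        self.bus_off = False
        self.auto_recover = auto_recover  # False models a node that stays latched

    @property
    def state(self):
        if self.bus_off:
            return "bus-off"
        if self.tec >= ERROR_PASSIVE_LIMIT:
            return "error-passive"
        return "error-active"

    def on_tx_error(self):
        if self.bus_off:
            return                   # a bus-off node does not transmit
        self.tec += 8
        if self.tec >= BUS_OFF_LIMIT:
            self.bus_off = True
            self.idle_sequences = 0

    def on_tx_success(self):
        if not self.bus_off and self.tec > 0:
            self.tec -= 1

    def on_idle_sequence(self):
        """Call once per observed run of 11 consecutive recessive bits."""
        if not self.bus_off or not self.auto_recover:
            return
        self.idle_sequences += 1
        if self.idle_sequences >= RECOVERY_SEQUENCES:
            self.tec = 0             # counters are reset on recovery
            self.bus_off = False
```

With `auto_recover=False` the node stays in bus-off no matter how long the bus is idle, which is the behaviour I ran into; with `auto_recover=True` it rejoins the bus without a power cycle.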

The workaround that was suggested to me at that time was to use a relay to supply power to the IQAN module through its NC contact. Under normal bus conditions, the relay is de-energized. The relay is controlled by an output from the module, such that it is energized when a BusOff is detected. This in turn cycles power on the affected module, resetting the bus so it comes back on, as well as de-energizing the relay.

We did not go down this path as the amount of time required to cycle the module and bring the bus back on-line was not acceptable for a critical control/safety function like steering. Instead we designed our own in-house control module based on a Microchip part that was independent of what was being controlled using IQAN.

I would suggest that Parker IQAN needs to make the auto-restart behavior user selectable, as sometimes their "default safety" approach can actually make things unsafe.

In the meantime, you may want to spend some time determining what is causing your BusOffs. It could be physical layer issues (connectivity, termination, etc.) or possibly bus utilization. We ended up putting our steering controls on a dedicated bus, to ensure that other bus traffic could not take it down.
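
If you want a rough feel for bus utilization, a back-of-the-envelope estimate along these lines may help. It is only a sketch: it assumes classic CAN data frames, counts the 3-bit interframe space, and uses the commonly quoted worst-case bit-stuffing bound. The function names and the traffic figures in the usage note are made up for illustration, not taken from any real system.

```python
def frame_bits(data_bytes, extended=False, worst_case_stuffing=True):
    """Bits on the wire for one classic CAN data frame, incl. interframe space."""
    if not 0 <= data_bytes <= 8:
        raise ValueError("classic CAN carries 0 to 8 data bytes")
    # Fixed overhead: 47 bits (11-bit ID) or 67 bits (29-bit ID), plus payload.
    base = (67 if extended else 47) + 8 * data_bytes
    if not worst_case_stuffing:
        return base
    # Only part of the frame is subject to bit stuffing.
    stuffable = (54 if extended else 34) + 8 * data_bytes
    return base + (stuffable - 1) // 4


def bus_utilization(messages, bitrate):
    """messages: iterable of (data_bytes, period_seconds, extended) tuples."""
    bits_per_second = sum(
        frame_bits(n, extended) / period for n, period, extended in messages
    )
    return bits_per_second / bitrate
```

For example, ten hypothetical 8-byte standard-ID messages, each sent every 10 ms on a 250 kbit/s bus, give `bus_utilization([(8, 0.01, False)] * 10, 250_000)`, which is about 0.54. That is a worst-case figure, so real utilization is normally lower, but it shows how quickly periodic traffic adds up.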

Good Luck,

Mark Bevington

Under review

Thanks for posting the idea and for sharing some of your experience. 

For the sake of the discussion, it should be mentioned that the IQAN system is designed based on the assumption that by switching off the controller, the machine is brought to a safe state. See for example the MC4x safety manual on safe state.

In an application where this isn't a valid assumption (for example steering at speeds above 20-30 km/h), the system has to be designed with redundant controllers to improve availability of the function. 


With this said, I think that a bus-off recovery function could be useful, and there is nothing wrong in including functions that can recover after a fault as long as the restart conditions are considered.

For example, after an error on a COUT or DOUT, the output is kept off in the error status until its command is zero; only after this can one attempt to reactivate it. This behavior on COUT and DOUT is the same regardless of the type of error: it doesn't matter whether it is a wiring error such as an open load, or a No contact with an expansion module where the outputs are located.


So in principle, a bus-off recovery wouldn't be any different; the restart behavior could be the same as after a "normal" timeout.
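
To illustrate the restart condition described above, here is a minimal sketch of such an interlock. It is not the actual IQAN implementation; the class and method names are hypothetical, and the error flag stands in for any error source (open load, No contact, or a bus-off that has since recovered).

```python
class OutputInterlock:
    """Keeps an output off after an error until its command has returned to zero."""

    def __init__(self):
        self.locked = False

    def update(self, command, error_active):
        """Return the value actually driven on the output for this cycle."""
        if error_active:
            self.locked = True       # any error forces the output off
            return 0
        if self.locked:
            if command == 0:
                self.locked = False  # command released: re-arm for the next cycle
            return 0
        return command
```

With this logic, an output that was commanded to 50 % when the error occurred stays at zero after the error clears, and only follows the command again once the operator has brought the command back to zero first.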

For some general advice on troubleshooting the root cause of a CAN bus off, also see:

https://forum.iqan.se/knowledge-bases/2/articles/621-no-contact-and-critical-can-bus-error

Completed

Implemented in 6.03 for MD4, MC4x/MC4xFS and XC4x.