APM 3.3-rc8 beta crash log and forensics. Help appreciated

@pneves Please do not take my reply as an attempt to convince you that there is nothing wrong; it is just my thought process.

My point was that there is definitely something non-standard going on and we do not fully understand what it is. We need to get to the bottom of this and make sure that we add some protection to keep it from happening again.

Please try a fresh image and APM, restore the data from the corrupted SD card, and post back.

Crystal clear,

thanks a lot for this information…

regards

@pneves

Here's a log from 3.3-rc8 on a hexa with a wifi dongle attached. But it looks to be a moot point.

https://drive.google.com/file/d/0B7HD5XkwsmeJNnJEOGk0STh4OUE/view

My compass panic was due to the different unit of measure mentioned earlier, and my log issue was a user error: I needed to update my GCS.

Looking forward to seeing the outcome of this. I plan to test out the Navio+ on my octa and will report back with results.

@aquila It seems like the log 78.bin in your link is by @porelmundo. Do you have any particular question about it?

@igorvereninov, that was a mistake, sorry about that. I meant to post a log from my flight.

@igor.vereninov

Sorry to be reviving the post, but we have some preliminary conclusions as to why the behavior that caused the crash happened. It seems that it is due to some USB functionality problems.

When we have just the USB Wifi connected, nothing significant happens in the NLon and MaxT parameters and the logs are good. On the other hand, when we plug in an Arduino dongle and have it communicate with Ardupilot in harmless ways, NLon and MaxT go bad as described in the initial logs.

We have confirmed that it is not the communication per se that causes the problem, because we flew with the same communication workload through ttyAMA0 (UART) and everything was normal. Only when the exact same communication happens through USB (ttyUSB0) do things go bad.

I am open to opinions or additional information requests.

Paulo Neves
Airborne Projects Lda

One question: can pigpiod cause bad timings in combination with the recent AP, which is not using pigpiod anymore?
On my old Pi1, pigpio is eating 25% CPU time, even if not used.

Yes it can and it will! Basically because you will be trying to access the same resource at a very high rate.
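A quick way to rule pigpiod in or out before a flight is to look for it in the process table. The following is only a minimal sketch, assuming a Linux system where each process exposes its name in `/proc/<pid>/comm`; the function name `find_processes` is made up for this example.

```python
import os

def find_processes(name, proc_root="/proc"):
    """Return the sorted PIDs whose command name equals `name`.

    `proc_root` is a parameter only so the function can be tried
    against a fake directory tree; on a real Pi leave the default.
    """
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            # The process exited between listdir() and open(); skip it.
            continue
    return sorted(pids)
```

Calling `find_processes("pigpiod")` on the Pi should return an empty list if the daemon is really not running.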

@pneves

Can you reproduce this behaviour in a ground test? So if you plug in an Arduino board over USB, does performance drop immediately?

As weird as it sounds, on the ground (no props) we never get to see the bad MaxT behaviour with the USB, only when flying. It is true that the ground tests we do are much shorter than a full flight, but in a real flight with the USB dongle you can clearly see something is wrong.

When we connect the dongle directly to ttyAMA0 everything is normal.

The logs include an RPM field because I am running slightly modified custom code that adds support for RPM measurement. Even so, this behavior also occurs with the stock ArduCopter 3.3RC11 firmware provided by Emlid.

https://s3-eu-west-1.amazonaws.com/airborneprojects/BOUSSON_REDE.BIN – Same dongle in ttyAMA0 success
https://s3-eu-west-1.amazonaws.com/airborneprojects/MAXT_BOM_CHAO_USB.BIN – Dongle USB Ground, no problem
https://s3-eu-west-1.amazonaws.com/airborneprojects/MAXT_FDD_VOO_USB.BIN – USB Flight, MAXT very high. No crash.
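To compare logs like the three linked above without eyeballing graphs, the performance records can be boiled down to a few numbers. This is a rough sketch, not the way any ground station does it: it assumes the performance messages have already been decoded into dictionaries by a log tool, that the fields are called `NLon`, `NLoop` and `MaxT` (only NLon and MaxT are named in this thread), that MaxT is in microseconds, and that 10000 is a sensible limit. All of those are assumptions.

```python
def summarize_pm(records, max_t_limit_us=10000):
    """Summarize already-decoded performance records.

    Each record is a dict with "NLon" (long loops), "MaxT" (worst loop
    time, assumed microseconds) and optionally "NLoop" (total loops).
    `max_t_limit_us` is a hypothetical threshold, not an official one.
    """
    records = list(records)
    if not records:
        return None
    total_loops = sum(r.get("NLoop", 0) for r in records)
    long_loops = sum(r["NLon"] for r in records)
    return {
        "worst_max_t": max(r["MaxT"] for r in records),
        "long_loops": long_loops,
        # Ratio is None when the log carries no loop counts.
        "long_loop_ratio": long_loops / total_loops if total_loops else None,
        "records_over_limit": sum(
            1 for r in records if r["MaxT"] > max_t_limit_us
        ),
    }
```

Running it on the UART flight and the USB flight side by side would make the difference in `worst_max_t` and `long_loop_ratio` easy to quote in a post.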

@pneves
Have you actually ensured that pigpiod is disabled? It runs as standard (I think) in the image.
A second thing you can try is to power the USB dongle externally through a USB hub.
This is actually what I do. Bigger dongles don't work reliably on the RPi.
I used a USB hub, removed the shell, and soldered JST (I think) connectors onto it.
I have never had any problems so far with this solution.
In my opinion the pigpiod issue should be declared in the docs, as it is relatively dangerous.

Best, Daniel

@dgrat

We do not run pigpiod by default, only when APM starts.

Then it was maybe a config problem of my distro.
I noticed that pigpiod was running on startup.
I was not sure why.

I still do not understand this issue about pigpio.

I am running a 3.3-rc11 version of APM. If it doesn't need the pigpio service, then why does it launch it?

Paulo Neves
Airborne Projects


@pneves

You're running 3.3-rc11 and you're absolutely positive that pigpiod gets launched? That is actually very unlikely because it was removed a while ago. Is there a chance it is run during boot (/etc/rc.local might mention it)?
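Checking a boot script such as /etc/rc.local for a leftover pigpiod line can be done by hand, but a small scan that skips commented-out lines is easy to sketch. The function name `find_active_mentions` is hypothetical, and the comment handling is deliberately naive (a `#` inside quotes would be treated as a comment start).

```python
def find_active_mentions(text, needle="pigpiod"):
    """Return (line_number, line) pairs where `needle` appears
    outside of a shell comment in the given script text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Naive comment stripping: everything after the first '#'.
        code = line.split("#", 1)[0]
        if needle in code:
            hits.append((lineno, line.strip()))
    return hits
```

On the Pi it could be used as `find_active_mentions(open("/etc/rc.local").read())`; an empty list means no active line in that file mentions pigpiod, although other startup mechanisms would still need checking.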

The issue is quite simple! pigpio and the new code that handles PPM conflict. There's no way they could work together because, as has been said above, they use the same resources.

Oh, it seems that pigpio was indeed running. But we installed from the previous Raspbian image. Hopefully this will eliminate the problem.

Will report back after a flight
Paulo Neves

I can confirm that pigpiod is not launched when RC11 is launched, which makes this problem unrelated to pigpiod. We just can't trust the USB bus not to behave strangely, it seems.

One interesting point is that the *dwc_otg processes always seem to consume around 12% of the processor each, even if no USB devices are connected. I know that the network card is internally connected through USB, but I have 3 dwc_otg processes. I suppose they are related to USB On-The-Go.

@george.staroselskiy Do you need any additional feedback on this problem? I have several clients and new products waiting on a Navio launch (we develop on the Navio), and they are asking when the USB problem will be solved.
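For anyone who wants to put a number on per-process CPU use instead of reading it off top, the tick counters in `/proc/<pid>/stat` can be sampled twice and compared. This is a sketch under the usual Linux assumptions: utime and stime are fields 14 and 15 of that file, and the tick rate comes from `os.sysconf("SC_CLK_TCK")`. The helper names are invented for this example.

```python
def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before tokenizing.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 is rest[11], field 15 rest[12].
    return int(rest[11]) + int(rest[12])

def cpu_percent(ticks_before, ticks_after, interval_s, clk_tck):
    """CPU share of one core, in percent, over the sampling interval."""
    return 100.0 * (ticks_after - ticks_before) / (clk_tck * interval_s)
```

To use it, read the stat file of one of the processes in question, sleep for a known interval, read it again, and pass both tick counts plus the tick rate to `cpu_percent`.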

Thank you for your attention
Paulo Neves
Airborne Projects Lda

@pneves

It is really hard to tell much; we will need more info. My initial guess is that the USB dongle starts consuming too much power in flight as it gets further from the base, taking power away from the processor so that it has to start throttling. A power consumption increase associated with increased distance would explain why it happens in flight only.


On my small copter I have no problem with a tiny USB dongle.
However, for my bigger copter I use a USB Hub which is powering more demanding hardware.
This circumvents voltage drops. I don't really think that anything other than a voltage drop at the Pi's end can cause this problem.
Maybe you can try to reproduce this problem with an externally powered USB hub and your problem dongle.
In general, powering a USB dongle is a very old problem from the times of the RPi1. Often such dongles cause timeouts and reconnect to the network again …
I am not sure what happens on the Pi under such conditions. Maybe the Pi2 is more sensitive or maybe not.
I was not really trying to find out.
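One way to test the voltage-drop and throttling guesses after a flight is to ask the Pi firmware whether it has seen under-voltage since boot. Nobody in this thread used it, so treat this as a suggestion: it assumes the firmware supports `vcgencmd get_throttled`, which prints a line like `throttled=0x50005`. The sketch below only decodes that line; the flag descriptions are my own wording for the commonly documented bits.

```python
# Bit positions as commonly documented for `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
}

def decode_throttled(output):
    """Decode a 'throttled=0x...' line into a list of active flags."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [
        desc
        for bit, desc in sorted(THROTTLE_FLAGS.items())
        if value & (1 << bit)
    ]
```

If the list is non-empty right after a flight with the problem dongle and empty after a flight with the externally powered hub, that would support the power explanation.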
