[PLS8-X linux 3.10.y ] network not responding after resume from mem suspend | Telit Cinterion IoT Developer Community
January 17, 2019 - 9:19pm
Hello
I have a fleet of PLS8-X modems deployed on ARM boards running Linux 3.10.26. We use pppd to initiate the connection with the modem. The modems are running firmware 3.017.
First a bit of context:
We can establish a connection with the modem and keep it running until we start suspending the system (echo mem > /sys/power/state). Usually after 10-20 sleep/wake-up cycles (no exact pattern found yet) the network connection stops responding (ping times out), even though the ppp0 network interface is still up and pppd shows no sign of error or disconnection. The USB ports ACM0 and ACM1 are still present.
The cdc_acm driver log for the modem clearly shows that we enter suspend and resume.
For now the only way to get the connection back is to power off the modem using the IGT/EMERGENCY_OFF pin, which re-triggers the whole process of adding the USB device and driver.
Issuing the AT command CERR shows nothing special.
My experience with modems is limited, so I have a few questions:
- Is the PLS8-X known to support Linux suspend and to resume with the network connection still active, without interruption, when using the ACM modem mode?
- Would the WWAN mode support suspend/resume and keep the network connection without any interruption?
- Would there be an efficient way to debug the modem state?
Feel free to ask for any configuration details I might have.
Any help would be greatly appreciated.
Alexandre Leblanc
Hello,
This should work with ACM and WWAN. Can you give more details about the use case for this sleep? Do you intend to use remote wakeup by the network interface, or is the device suspended and resumed on a regular basis? How long do these 10-20 sleep/wake-up cycles take; is it a more or less constant time? How about the USB interface, is it also suspended? Is it possible to communicate with the module on some other interface when this happens?
Regards,
Bartłomiej
Hello,
Using echo mem > /sys/power/state, all devices (disk, USB, CPU, video, etc.) will sleep. See dump A.
We usually sleep/resume every 3 minutes, but the issue can still be observed if we randomize the timing. We don't use remote wakeup for now.
As soon as we resume the system, yes, we still have access to the ACM1 port and we can send AT commands.
We use sleep to prevent the board and the peripherals from draining too much energy, as we run on battery.
Also, yesterday I was able to find something else. Right before we lose the connection,
we can see:
It seems that acm_tty_hangup is only called when "if (!acm->clocal && (acm->ctrlin & ~newctrl & ACM_CTRL_DCD))" evaluates to true (in drivers/usb/class/cdc-acm.c).
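To make the quoted condition easier to reason about, here is a small Python sketch of the same bit test. It is only an illustration, not the driver code: the function name `hangup_condition` is made up, and the value 0x01 for ACM_CTRL_DCD is an assumption used for the example. Read literally, the expression is true only when clocal is off, the DCD bit is set in the previous state (ctrlin) and the DCD bit is cleared in the new state (newctrl).

```python
# Illustrative model of the condition quoted from cdc-acm.c.
# ACM_CTRL_DCD = 0x01 is an assumed bit value for this sketch.
ACM_CTRL_DCD = 0x01


def hangup_condition(clocal: bool, ctrlin: int, newctrl: int) -> bool:
    """Mirror of: !acm->clocal && (acm->ctrlin & ~newctrl & ACM_CTRL_DCD)."""
    # True only if DCD is set in the old state and cleared in the new one.
    dcd_dropped = bool(ctrlin & ~newctrl & ACM_CTRL_DCD)
    return (not clocal) and dcd_dropped


if __name__ == "__main__":
    cases = [
        (False, 0x01, 0x00),  # DCD was set, now cleared
        (False, 0x01, 0x01),  # DCD stays set
        (False, 0x00, 0x00),  # DCD was never set
        (True, 0x01, 0x00),   # DCD dropped but clocal is on
    ]
    for clocal, ctrlin, newctrl in cases:
        result = hangup_condition(clocal, ctrlin, newctrl)
        print(clocal, hex(ctrlin), hex(newctrl), result)
```

Only the first case satisfies the condition in this model.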
BEGIN DUMP A (only the printk output for cdc_acm is active)
[ 251.020782] PM: Preparing system for mem sleep
[ 251.043884] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 251.069580] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 251.089508] PM: Entering mem sleep
[ 251.093109] Suspending console(s) (use no_console_suspend to debug)
[ 251.100952] cdc_acm 1-1.4:1.9: acm_suspend
[ 251.100952] cdc_acm 1-1.4:1.8: acm_suspend
[ 251.100952] cdc_acm 1-1.4:1.7: acm_suspend
[ 251.100982] cdc_acm 1-1.4:1.6: acm_suspend
[ 251.100982] cdc_acm 1-1.4:1.5: acm_suspend
[ 251.100982] cdc_acm 1-1.4:1.4: acm_suspend
[ 251.101013] cdc_acm 1-1.4:1.3: acm_suspend
[ 251.117858] cdc_acm 1-1.4:1.2: acm_suspend
[ 251.117858] cdc_acm 1-1.4:1.1: acm_suspend
---- Sleeping
[ 251.404052] cdc_acm 1-1.4:1.0: acm_resume
[ 251.404083] cdc_acm 1-1.4:1.1: acm_resume
[ 251.404113] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 0
[ 251.404113] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 1
[ 251.404144] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 2
[ 251.404144] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 3
[ 251.404174] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 4
[ 251.404174] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 5
[ 251.404205] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 6
[ 251.404205] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 7
[ 251.404235] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 8
[ 251.404235] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 9
[ 251.404266] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 10
[ 251.404266] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 11
[ 251.404266] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 12
[ 251.404296] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 13
[ 251.404296] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 14
[ 251.404327] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 15
[ 251.404327] cdc_acm 1-1.4:1.2: acm_resume
[ 251.404357] cdc_acm 1-1.4:1.3: acm_resume
[ 251.404357] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 0
[ 251.404388] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 1
[ 251.404388] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 2
[ 251.404418] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 3
[ 251.404418] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 4
[ 251.404449] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 5
[ 251.404449] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 6
[ 251.404479] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 7
[ 251.404479] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 8
[ 251.404510] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 9
[ 251.404510] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 10
[ 251.404541] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 11
[ 251.404541] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 12
[ 251.404541] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 13
[ 251.404571] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 14
[ 251.404571] cdc_acm 1-1.4:1.3: acm_submit_read_urb - urb 15
[ 251.404602] cdc_acm 1-1.4:1.4: acm_resume
[ 251.404602] cdc_acm 1-1.4:1.5: acm_resume
[ 251.404602] cdc_acm 1-1.4:1.6: acm_resume
[ 251.404632] cdc_acm 1-1.4:1.7: acm_resume
[ 251.404632] cdc_acm 1-1.4:1.8: acm_resume
[ 251.404632] cdc_acm 1-1.4:1.9: acm_resume
[ 251.477996] usb 1-1.2: reset full-speed USB device number 5 using ohci-omap3
[ 251.597045] PM: resume of devices complete after 419.433 msecs
[ 252.842468] PM: Finishing wakeup.
[ 252.845977] Restarting tasks ... done.
END DUMP A
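When comparing many suspend/resume cycles, it can help to check such a dump mechanically. The following Python sketch collects the acm_suspend and acm_resume events per USB interface and lists interfaces whose counts differ. The function names and the regular expression are my own, and the sample text is just a few lines copied from dump A.

```python
import re
from collections import defaultdict

# Matches e.g. "cdc_acm 1-1.4:1.9: acm_suspend"; read-URB lines are ignored.
EVENT_RE = re.compile(r"cdc_acm (\S+): (acm_suspend|acm_resume)\s*$")

SAMPLE = """\
[ 251.117858] cdc_acm 1-1.4:1.2: acm_suspend
[ 251.117858] cdc_acm 1-1.4:1.1: acm_suspend
[ 251.404052] cdc_acm 1-1.4:1.0: acm_resume
[ 251.404083] cdc_acm 1-1.4:1.1: acm_resume
[ 251.404113] cdc_acm 1-1.4:1.1: acm_submit_read_urb - urb 0
[ 251.404327] cdc_acm 1-1.4:1.2: acm_resume
"""


def suspend_resume_events(log_text):
    """Return {interface: [event, ...]} in the order the events appear."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match:
            events[match.group(1)].append(match.group(2))
    return dict(events)


def unbalanced_interfaces(events):
    """Interfaces whose suspend and resume counts are not equal."""
    return sorted(
        iface
        for iface, seen in events.items()
        if seen.count("acm_suspend") != seen.count("acm_resume")
    )


if __name__ == "__main__":
    print(unbalanced_interfaces(suspend_resume_events(SAMPLE)))
```

In dump A as pasted, interface 1-1.4:1.0 has a resume line but no suspend line, so it would be reported; that may simply be an artefact of how the log was captured.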
Alexandre Leblanc
Hello,
Could you paste the AT&V output? You could also attach the pppd log. Have you checked the network quality or the network connection when this problem happens?
It seems, based on the condition from the cdc-acm.c code, that it is fulfilled when the DCD line is set by the module (to indicate that the connection is alive) and the application pulls DTR down. So it is probably not a network issue but the system (software or hardware) that causes the hangup initiated by the driver.
As for remote wakeup: if you don't wake the system on network traffic, there is a risk that the input buffers may overflow if there is a transmission from the network. Is this 3-minute interval intended to prevent this, or is there other logic behind it? Remote wakeup on USB could be used to wake up the system. Is the system woken up only to transmit data, or for other reasons as well, so that it doesn't communicate every time it is up?
Best regards,
Bartłomiej
Hello !
The system wakes up to send a heartbeat, which then signals a remote server that it is OK to send or request data. The system can also be woken up by an external sensor. There is a TCP socket kept open at all times by a Java application on the ppp0 interface managed by pppd.
"It seems based on the condition from cdc-acm.c code that it's fulfilled when DCD line is set by the module (to indicate that the connection is alive) and the application pulls DTR down. So it's rather not the network issue but the system (software or hardware) that causes hangup initiated by the driver."
AT&CESQ
AT&V
Alexandre Leblanc
Hello,
According to the pppd log, it seems that the connection termination starts when pppd receives signal 2, which is SIGINT (interrupt from keyboard):
Jan 18 11:10:50 pppd[954]: Terminating on signal 2
This signal is sent to all processes started from a certain terminal when the user presses the interrupt key (for instance CTRL-C), and by default it terminates the process.
Then cdc_acm toggles the DTR line (based on your previous log) and the module is configured to react to that (AT&D2 setting, which is the default and a correct configuration). Additionally, pppd sends a termination request on the PPP layer:
Jan 18 11:10:50 pppd[954]: sent [LCP TermReq id=0x2 "User request"]
So it seems that the termination request comes from your application; you need to verify where this SIGINT comes from.
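To spot such terminations across many wake-up cycles, the pppd log can be scanned for the "Terminating on signal" message and the signal number mapped to its name. This is a minimal Python sketch; the function name `find_terminations` is hypothetical, and the sample line is the one quoted above. It only reports which signal pppd received, not which process sent it.

```python
import re
import signal

# Matches the pppd message quoted above, capturing the PID and signal number.
TERM_RE = re.compile(r"pppd\[(\d+)\]: Terminating on signal (\d+)")


def find_terminations(log_text):
    """Return a list of (pid, signal_number, signal_name) tuples."""
    results = []
    for line in log_text.splitlines():
        match = TERM_RE.search(line)
        if not match:
            continue
        pid = int(match.group(1))
        signum = int(match.group(2))
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on the platform running this script.
            name = "UNKNOWN"
        results.append((pid, signum, name))
    return results


if __name__ == "__main__":
    sample = "Jan 18 11:10:50 pppd[954]: Terminating on signal 2"
    for pid, signum, name in find_terminations(sample):
        print(f"pppd pid {pid} terminated by signal {signum} ({name})")
```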
Best regards,
Bartłomiej
OK, I'll check for and remove any SIGINT sent to pppd and come back to you.
Alexandre Leblanc