On Turris OS 4.0.3, I configured the WAN interface of my MOX Classic to use the PPPoE protocol (behind a VDSL modem in bridge mode).
The connection itself works, but it is unreliable after a reboot: it flaps up and down and never gets fully connected. If I run /etc/init.d/network restart manually after the reboot, the connection comes up and everything is fine until the next reboot.
This issue is really annoying and critical because it needs manual action after each reboot, and the device is not remotely accessible until someone connects from the LAN and fixes the PPP connection.
WAN interface configuration in /etc/config/network (generated from Foris) is:
I also experienced this issue during the first DSL test about half a year ago, and @roburka was also unable to set up a DSL connection on MOX with the 4.0 beta. So I think this is not a unique situation, and I won't mark this as Unconfirmed.
@vmyslivec Can I ask you to try it on the Turris OS 5.0 release? You can find it in the HBL branch for now. It was reported to me that one of our users, who is using a Turris 1.1, has a similar (or even the same) issue.
This is apparently a SIGTERM sent to the pppd process, which may or may not be related to **[1]**
There is a new option, child-timeout, which sets the length of time that pppd will wait for child processes (such as the command specified with the pty option) to exit before exiting itself. It defaults to 5 seconds. After the timeout, pppd will send a SIGTERM to any remaining child processes and exit. A value of 0 means no timeout.
To get more verbose debug output for PPP, you could enable the debug flag in /etc/ppp/options (and, if convenient, set a log file path with the logfile option).
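As a minimal sketch of the suggestion above, the following helper adds the pppd debug and logfile options to an options file and replaces any active entries that are already there. The function name enable_ppp_debug and the log path /tmp/pppd.log are assumptions made for illustration; only the debug and logfile options themselves come from pppd.

```shell
#!/bin/sh
# Hypothetical helper: turn on pppd debug logging in an options file.
#   $1 = options file, e.g. /etc/ppp/options
#   $2 = log file path (assumed; any writable location works)
enable_ppp_debug() (
    options_file="$1"
    log_path="$2"
    tmp_file="$options_file.tmp.$$"

    touch "$options_file" || exit 1

    # Keep every line except active "debug" and "logfile ..." entries;
    # commented-out lines such as "#debug" are left untouched.
    grep -v -e '^debug$' -e '^logfile[[:space:]]' "$options_file" > "$tmp_file" || true

    # Append the two options with the requested log path.
    printf 'debug\nlogfile %s\n' "$log_path" >> "$tmp_file"

    # Overwrite in place so the original file's permissions are preserved.
    cat "$tmp_file" > "$options_file"
    rm -f "$tmp_file"
)
```

It could be called as enable_ppp_debug /etc/ppp/options /tmp/pppd.log, followed by /etc/init.d/network restart so that pppd is started again with the new options.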
The ISP's PPPoE server may fail to respond if it does not receive the discovery packet in the first place, which could be due to the physical link between modem and router being down at that point in time. PPPoE discovery is unaware of the physical link state; I am not sure how often it retries before giving up.
@kkoci if I unplug the physical cable and plug it in again, it starts working normally. I expect ip link down/ip link up would have the same effect. I will try it next time.
@n8v8R it's in an infinite loop. Link diodes keep flashing on and off. I am out of ideas.
The log is still showing
Not sure what I should do about it...
Anyway, I am waiting for the HBK build to become ready so I can test TOS 5.0.
I reckon that OpenWrt (netifd) suffers from some issues with the link state when connected to external (incl. SFP) modems/modules. These are not likely to go away with 19.07 (5.x) but might improve with the code in the development branch.
The TOS forum has various reports about link state issues with external modems (since 18.06.x), and PPPoE seems to be suffering the most, perhaps because it is unaware of the link state.
Log excerpt from my node (HBD) upon boot
kernel: [ 54.649804] mvneta f1034000.ethernet eth2: Link is Down
kernel: [ 54.668505] mvneta f1034000.ethernet eth2: configuring for 802.3z/1000base-x link mode
kernel: [ 54.668561] mvneta f1034000.ethernet eth2: Link is Up - 1Gbps/Full - flow control off
kernel: [ 54.674523] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
insmod: module is already loaded - ppp_generic
insmod: module is already loaded - pppox
insmod: module is already loaded - pppoe
pppd[4865]: Plugin rp-pppoe.so loaded.
pppd[4865]: RP-PPPoE plugin version 3.8p compiled against pppd 2.4.7
pppd[4865]: pppd 2.4.7 started by root, uid 0
pppd[4865]: Timeout waiting for PADO packets
pppd[4865]: Unable to complete PPPoE Discovery
pppd[4865]: Exit.
netifd: Interface 'wan' is now down
netifd: Interface 'wan' is disabled
netifd: Interface 'wan' is enabled
netifd: Interface 'wan' is setting up now
insmod: module is already loaded - slhc
insmod: module is already loaded - ppp_generic
insmod: module is already loaded - pppox
insmod: module is already loaded - pppoe
pppd[5038]: Plugin rp-pppoe.so loaded.
pppd[5038]: RP-PPPoE plugin version 3.8p compiled against pppd 2.4.7
pppd[5038]: pppd 2.4.7 started by root, uid 0
pppd[5038]: Timeout waiting for PADO packets
pppd[5038]: Unable to complete PPPoE Discovery
pppd[5038]: Exit.
netifd: Interface 'wan' is now down
netifd: Interface 'wan' is disabled
netifd: Interface 'wan' is enabled
netifd: Interface 'wan' is setting up now
insmod: module is already loaded - slhc
insmod: module is already loaded - ppp_generic
insmod: module is already loaded - pppox
insmod: module is already loaded - pppoe
pppd[5188]: Plugin rp-pppoe.so loaded.
pppd[5188]: RP-PPPoE plugin version 3.8p compiled against pppd 2.4.7
pppd[5188]: pppd 2.4.7 started by root, uid 0
pppd[5188]: PPP session is 8978
pppd[5188]: Connected to 78:ba:f9:73:f5:74 via interface eth2
pppd[5188]: Renamed interface ppp0 to pppoe-wan
pppd[5188]: Using interface pppoe-wan
pppd[5188]: Connect: pppoe-wan <--> eth2
pppd[5188]: PAP authentication succeeded
pppd[5188]: peer from calling number 78:BA:F9:73:F5:74 authorized
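To compare boots, it can help to count how many discovery attempts failed before a session came up. The sketch below greps a saved copy of such a log for the pppd messages seen in the excerpt above; the function name summarize_pppoe_attempts and the idea of saving the log to a file first are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize PPPoE attempts found in a saved log file.
#   $1 = path to a text file containing the system log
summarize_pppoe_attempts() (
    log_file="$1"

    # grep -c prints the number of matching lines; "|| true" keeps a zero
    # count from being treated as an error.
    starts=$(grep -c 'pppd .* started by' "$log_file" || true)
    timeouts=$(grep -c 'Timeout waiting for PADO packets' "$log_file" || true)
    sessions=$(grep -c 'PPP session is' "$log_file" || true)

    echo "starts=$starts pado_timeouts=$timeouts sessions=$sessions"
)
```

A growing pado_timeouts count with sessions staying at 0 would match the flapping described in this issue.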
Yep, I am suspicious of OpenWrt/netifd as well.
If we can isolate the issue, we should push fixes upstream or at least patch it on our side.
But is the SIGTERM still exhibited or not, considering you are mentioning "crash loop"? If it does it would seem somewhat unique. Did you try with a 5.x medkit from scratch?
My only idea is to get some non-Turris router, install plain OpenWrt on it, and test the PPP connection. If it works, then test on a Turris router with plain OpenWrt. That can help to isolate the problem.
@mmatejek, you can borrow my Xiaomi router, which runs plain OpenWrt. But if you have a spare Turris Omnia router, you can flash it with plain OpenWrt as well.
I could test it on an Alix APU and a Xiaomi router (OpenWrt 19.07) to rule out a hardware issue, but I'm not very familiar with PPP and not sure whether I can create a PPP testing environment at home.
In newer OpenWrt (TOS 4.0+), the default lcp-echo-failure and lcp-echo-interval pppd options moved from the /etc/ppp/options file to the keepalive option in the uci network config (handled via /lib/netifd/proto/ppp.sh).
This change also causes the keepalive_adaptive default value of 1 (true) to be passed as the pppd option lcp-echo-adaptive, which is a Debian/OpenWrt pppd enhancement (not included in the original pppd).
This option means pppd will send LCP echo requests only if the link is idle, that is, if no traffic was received from the peer since the last echo request.
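The mapping described above can be illustrated with a small shell function. This is a simplified sketch, not a copy of the handler in ppp.sh: the function name keepalive_to_pppd_opts is hypothetical, and the fallback interval of 5 seconds for a single-number keepalive value is an assumption.

```shell
#!/bin/sh
# Hypothetical illustration: translate a UCI-style keepalive value
# ("<failures> <interval>") and the keepalive_adaptive flag into pppd options.
#   $1 = keepalive value, e.g. "5 1" or just "5"
#   $2 = keepalive_adaptive, 1 (default) or 0
keepalive_to_pppd_opts() (
    keepalive="$1"
    adaptive="${2:-1}"

    # Text before the first space/comma is the failure count,
    # text after the last space/comma is the interval in seconds.
    failure="${keepalive%%[, ]*}"
    interval="${keepalive##*[, ]}"

    # A single number contains no separator, so both expansions return the
    # whole string; fall back to an assumed interval of 5 seconds then.
    [ "$interval" != "$keepalive" ] || interval=5

    opts=""
    if [ "${failure:-0}" -ge 1 ]; then
        opts="lcp-echo-failure $failure lcp-echo-interval $interval"
        [ "$adaptive" -ge 1 ] && opts="$opts lcp-echo-adaptive"
    fi
    echo "$opts"
)
```

Calling keepalive_to_pppd_opts "5 1" 0 shows what pppd would receive with the adaptive behaviour switched off, which is one way to test whether lcp-echo-adaptive plays a role in the flapping.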
When I apply the patch mentioned above (sleep 10), PPP starts and the connection is established.
In my case, it works with sleep 5 as well. But I feel like a cargo cult member when I try to tune this "parameter". In only 1 out of about 6 test reboots, IPv6 was not established correctly. IPv4 came up and worked in all cases.
We should really find out why this sleep helps. cc @kkoci @mhrusecky
In the meantime, as it works, I'm in favor of applying that sleep to fix the issue that some of our users might have. Then we can investigate it more.
If this were to become a permanent or long-term solution, I would introduce a UCI config option to control the length of this sleep. Users who don't experience this issue may want to turn it off to speed up the Internet connection after a reboot.
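A minimal sketch of how such an option could be read, assuming a hypothetical per-interface option named pppoe_start_delay in /etc/config/network (no such option exists today) and a default of 10 seconds to match the current patch:

```shell
#!/bin/sh
# Hypothetical helper: print the start delay (in seconds) for an interface.
#   $1 = interface section name in /etc/config/network (default: wan)
ppp_start_delay() (
    iface="${1:-wan}"
    default_delay=10

    # "uci -q get" prints the option value, or fails quietly if it is unset.
    # pppoe_start_delay is an assumed option name used only for this sketch.
    delay="$(uci -q get "network.$iface.pppoe_start_delay" 2>/dev/null)" \
        || delay="$default_delay"

    # Accept only non-negative integers; anything else falls back to the default.
    case "$delay" in
        ''|*[!0-9]*) delay="$default_delay" ;;
    esac

    echo "$delay"
)
```

The workaround would then run sleep "$(ppp_start_delay wan)", and users who are not affected could set the option to 0.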
I think that we should look deeper into it. For now, I would go with the 10 seconds used previously, and I would include it in TOS 5.0.
I would consider that a clear hack. The sleep clearly helps because something has to happen first, and it should be replaced with at least a busy-loop wait for that event. The question we have to investigate is why it helps and what exactly happens in the meantime.
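One possible shape for such a wait, as a sketch only: poll for a condition with a timeout instead of sleeping for a fixed time. The condition used here, the carrier attribute of the underlying interface in sysfs, is an assumption; this thread has not yet established which event the sleep is actually covering for. The function name wait_for_carrier and the SYSFS_NET override (there to allow testing against a fake directory) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: wait until an interface reports carrier, or time out.
#   $1 = interface name, e.g. eth2
#   $2 = timeout in seconds (default 10, mirroring the fixed sleep)
# Returns 0 once carrier is seen, 1 if the timeout expires first.
wait_for_carrier() (
    ifname="$1"
    timeout="${2:-10}"
    carrier_file="${SYSFS_NET:-/sys/class/net}/$ifname/carrier"
    waited=0

    while :; do
        # The file contains "1" while the link is up. Reading it fails when
        # the interface is administratively down, which counts as "no carrier".
        if [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]; then
            exit 0
        fi
        [ "$waited" -ge "$timeout" ] && exit 1
        sleep 1
        waited=$((waited + 1))
    done
)
```

Compared with a fixed sleep 10, this returns as soon as the condition holds, so unaffected setups would not pay the full delay. If carrier turns out not to be the relevant event, only the check inside the loop needs to change.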
We will see. I would still aim for 5.0.1 or even 5.1 with a better fix, but let's prepare and hopefully merge the sleep 10 hack into 5.0 now.
Vodafone CZ VDSL with Comtrend VR-3031eu modem in bridge mode: I’m completely unable to connect to my ISP without the mentioned workaround. I tried /etc/init.d/network restart, but even that did not help without the sleep.