Node cannot start behind MikroTik router, ports are open

Issue Report

Environment

  • Ubuntu 20.04
  • Ryzen 9 5950X
  • 128 GB RAM
  • 1 TB SSD
  • 850 GB
  • Docker

Problem

I never get the log message "Discovered new external address for our node:" and the node cannot start syncing.
If I replace the MikroTik with a cheap TP-Link router, it works, but I need the MikroTik.

Steps to reproduce

Expected result

Node should report that it cannot register to network and farmer should not fill logs with retries.

What happens instead

I never get "Discovered new external address for our node:" and the node cannot start syncing.
The ports are open, and I can telnet to them from another location.
Here are my MikroTik rules:

add action=dst-nat chain=dstnat dst-port=30333 protocol=tcp to-addresses=192.168.88.102 to-ports=30333
add action=dst-nat chain=dstnat dst-port=30433 protocol=tcp to-addresses=192.168.88.102 to-ports=30433
add action=dst-nat chain=dstnat dst-port=30533 protocol=tcp to-addresses=192.168.88.102 to-ports=30533
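The telnet check mentioned above can be scripted so that all three forwarded ports are tested in one go from an outside machine. This is only a minimal sketch: the function names are made up for illustration, and a successful TCP connect shows that the forward works, not that the node's own protocol traffic gets through.

```python
import socket

# Ports taken from the dst-nat rules above.
FORWARDED_PORTS = (30333, 30433, 30533)


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports=FORWARDED_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Calling `check_ports("<router public address>")` from a machine outside the LAN returns a dictionary such as `{30333: True, ...}`, which is the same information the manual telnet test gives.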

I also get this error every time:

node_1 | 2023-03-26 11:23:38 [PrimaryChain] :x: Error while dialing /dns/telemetry.subspace.network/tcp/443/x-parity-wss/%2Fsubmit%2F: Custom { kind: Other, error: Timeout }

But I can connect to it:

telnet telemetry.subspace.network 443
Trying 165.227.120.232...
Connected to telemetry.subspace.network.
Escape character is '^]'.
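A telnet connect only shows that the TCP handshake completes, while the address the node dials (`x-parity-wss`) layers TLS and WebSocket on top of it. The sketch below, with a hypothetical `probe` helper, checks DNS, TCP and TLS as separate stages so that a timeout can be attributed to one of them. It does not attempt the WebSocket upgrade, and it assumes Python is available wherever it is run (the host, or the container if it has an interpreter).

```python
import socket
import ssl


def probe(host: str, port: int = 443, timeout: float = 5.0, use_tls: bool = True) -> dict:
    """Try DNS, TCP and (optionally) TLS in turn and record each outcome."""
    result = {"dns": None, "tcp": None, "tls": None}

    # Stage 1: name resolution.
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns"] = "ok"
    except OSError as exc:
        result["dns"] = f"failed: {exc}"
        return result

    # Stage 2: plain TCP connect (what telnet verifies).
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
        result["tcp"] = "ok"
    except OSError as exc:
        result["tcp"] = f"failed: {exc}"
        return result

    if not use_tls:
        sock.close()
        return result

    # Stage 3: TLS handshake on the same socket, still bound by the timeout.
    try:
        context = ssl.create_default_context()
        with context.wrap_socket(sock, server_hostname=host) as tls:
            result["tls"] = f"ok ({tls.version()})"
    except OSError as exc:
        result["tls"] = f"failed: {exc}"
        sock.close()
    return result
```

For example, `probe("telemetry.subspace.network")` returning `tcp: ok` together with a failed `tls` entry would suggest the problem lies above the TCP layer, which telnet cannot reveal.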


Attaching to subspace_node_1
node_1 | 2023-03-26 11:23:16 Subspace
node_1 | 2023-03-26 11:23:16 :v: version 0.1.0-4c28b5505e20a0367074dd1a59b8fbbd4f536f59
node_1 | 2023-03-26 11:23:16 :heart: by Subspace Labs https://subspace.network, 2021-2023
node_1 | 2023-03-26 11:23:16 :clipboard: Chain specification: Subspace Gemini 3c
node_1 | 2023-03-26 11:23:16 :label: Node name: gzivdo
node_1 | 2023-03-26 11:23:16 :bust_in_silhouette: Role: AUTHORITY
node_1 | 2023-03-26 11:23:16 :floppy_disk: Database: ParityDb at /var/subspace/chains/subspace_gemini_3c/paritydb/full
node_1 | 2023-03-26 11:23:16 :chains: Native runtime: subspace-1 (subspace-0.tx0.au0)
node_1 | 2023-03-26 11:23:17 [PrimaryChain] Storage provider cache loaded - 256 items.
node_1 | 2023-03-26 11:23:17 [PrimaryChain] DSN instance configured. allow_non_global_addresses_in_dht=false peer_id=12D3KooWR2S7eE5wcKR2PrEBaH9TXGkHpucC15dXiNg3UewMFJoa
node_1 | 2023-03-26 11:23:17 [PrimaryChain] Subspace networking initialized: Node ID is 12D3KooWR2S7eE5wcKR2PrEBaH9TXGkHpucC15dXiNg3UewMFJoa
node_1 | 2023-03-26 11:23:17 [PrimaryChain] Starting archiving from genesis
node_1 | 2023-03-26 11:23:17 [PrimaryChain] Archiving already produced blocks 0..=0
node_1 | 2023-03-26 11:23:18 [PrimaryChain] Processing a segment. segment_index=0
node_1 | 2023-03-26 11:23:18 [PrimaryChain] Segment publishing was successful. segment_index=0
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :label: Local node identity is: 12D3KooWR2S7eE5wcKR2PrEBaH9TXGkHpucC15dXiNg3UewMFJoa
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :adult::ear_of_rice: Starting Subspace Authorship worker
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Operating system: linux
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: CPU architecture: x86_64
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Target environment: gnu
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: CPU: AMD Ryzen 9 5950X 16-Core Processor
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: CPU cores: 16
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Memory: 128722MB
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Kernel: 5.15.0-67-generic
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Linux distribution: Ubuntu 20.04.5 LTS
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :computer: Virtual machine: no
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :package: Highest known block at #0
node_1 | 2023-03-26 11:23:18 [PrimaryChain] Running JSON-RPC HTTP server: addr=127.0.0.1:9933, allowed origins=[""]
node_1 | 2023-03-26 11:23:18 [PrimaryChain] Running JSON-RPC WS server: addr=0.0.0.0:9944, allowed origins=[""]
node_1 | 2023-03-26 11:23:18 [PrimaryChain] :part_alternation_mark: Prometheus exporter started at 127.0.0.1:9615
node_1 | 2023-03-26 11:23:23 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
node_1 | 2023-03-26 11:23:28 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
node_1 | 2023-03-26 11:23:33 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
node_1 | 2023-03-26 11:23:38 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
node_1 | 2023-03-26 11:23:38 [PrimaryChain] :x: Error while dialing /dns/telemetry.subspace.network/tcp/443/x-parity-wss/%2Fsubmit%2F: Custom { kind: Other, error: Timeout }
node_1 | 2023-03-26 11:23:43 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
node_1 | 2023-03-26 11:23:47 Accepting new connection 1/100
node_1 | 2023-03-26 11:23:47 Accepting new connection 1/100
node_1 | 2023-03-26 11:23:47 Accepting new connection 2/100
node_1 | 2023-03-26 11:23:48 [PrimaryChain] :zzz: Idle (0 peers), best: #0 (0xab94…46b7), finalized #0 (0xab94…46b7), :arrow_down: 0 :arrow_up: 0
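When the node runs for days, it can help to summarize the log instead of reading it line by line. The following is a rough sketch that looks only for the messages quoted in this report: the "Discovered new external address" line, the periodic "Idle (N peers)" status, and the telemetry dial error. The function name and the summary keys are illustrative, not part of any Subspace tooling.

```python
import re

EXTERNAL_ADDR_MARKER = "Discovered new external address for our node"
TELEMETRY_MARKER = "Error while dialing /dns/telemetry.subspace.network"
IDLE_RE = re.compile(r"Idle \((\d+) peers\)")


def summarize_log(lines) -> dict:
    """Count the symptoms described in this report in an iterable of log lines."""
    summary = {
        "external_address_seen": False,
        "idle_reports": 0,
        "max_peers": 0,
        "telemetry_dial_errors": 0,
    }
    for line in lines:
        if EXTERNAL_ADDR_MARKER in line:
            summary["external_address_seen"] = True
        if TELEMETRY_MARKER in line:
            summary["telemetry_dial_errors"] += 1
        match = IDLE_RE.search(line)
        if match:
            summary["idle_reports"] += 1
            # Track the highest peer count ever reported.
            summary["max_peers"] = max(summary["max_peers"], int(match.group(1)))
    return summary
```

It can be fed a saved copy of the log, for instance `summarize_log(open("node.log"))`, where `node.log` is a hypothetical file name. A result with `external_address_seen` still `False` and `max_peers` at 0 after a long run matches the behavior reported here.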

This is a common occurrence and not an issue with your node. All this error says is that your node didn't connect to the telemetry server at https://telemetry.subspace.network.

From the logs you posted, it looks like your node is still starting up and has only just begun accepting peer connections.

It stays in this condition for days. I have tried many times, and it only happens with the MikroTik. If I replace the router, it starts syncing immediately. But the ports are 100% open, and I see traffic in both directions with tcpdump.
I can telnet to telemetry at any time, but the node fails to connect every time.
It has been like this from the beginning (a year ago). But I need the MikroTik for other purposes, and the situation is really strange.
Here is a dump: 52Mb for just 1 minute of operation after the node starts (filter: port 30333 or port 30433 or port 30533).

Which version are you using? We had some networking issues on some of the prior versions.

I just switched to gemini-3d-2023-apr-05 and the result is the same; before that I used gemini-3d-2023-mar-29.
But the issue has existed since the beginning (a year ago). I have reported it many times in Discord, and I have tried different MikroTik devices (consumer and corporate grade). I can't debug it myself.
I suppose that opening only 3 ports is not sufficient.

Have you enabled outbound? I see the rules to open inbound traffic, but what are your outbound rules?

It seems you have (I see you say you can telnet to telemetry, presumably from the node). Since you know it is a router issue, I think there must be other rules that conflict, prevent discovery, and so on.

All outbound traffic is allowed; only incoming traffic is filtered by rules.
I can connect from that PC to telemetry every time. I have not tried from inside Docker, but with another router, Subspace in Docker works on the same PC without any changes.
All other outbound connections work, and other nodes work too.

I can only imagine that some other ports are required. The outbound traffic does reach the remote hosts, because they reply, and there is a lot of traffic (50-100Mb per minute) when the node is up and about 500Kb of incoming connections when it is down.

If you want, you can run a RouterOS VM in VirtualBox to help debug the issue, or I can run whatever tests and actions you suggest to find the root of the problem.

Just confirming you have port 30333 & 30433 forwarded?

Related Docs: Simple CLI (Recommended) | Farm from Anywhere

Yes, sure, and I can telnet to them from outside.


@shamil Looks like we're still having some weird issues with MikroTik routers on the new version; just wanted to bring this to your attention.

gemini-3d-2023-apr-14 - the same


gemini-3d-2023-apr-21 - the same

@gzivdo you mention that you are running multiple nodes that are working, are they all on the same network? Is it possible there is a clash between two or more nodes on the same router?

Thank you for continuing to report back :smile:

Yes, that was some time ago. Right now I am running only Subspace. And again: if I change the router, it starts working right away; if I go back to the MikroTik, it cannot start.
But the ports are open, I can telnet to them from the internet, and I see that the app is accepting the connection.
I also see a lot of traffic to and from the node (you can see it in the dump too).

I can help debug, but I don't know what to check now. I can only suppose that the open ports are not enough and some kind of UPnP is required, which is not enabled on the MikroTik.

Hey there! I was just wondering if you have any other firewall or NAT rules configured on your Mikrotik router that could be preventing the connections to the farmer/node? It’s possible that there may be conflicts between rules that could be causing this issue. Can you please provide more details about your router configuration (maybe even model, and such), so we can help you troubleshoot the problem?

Here is a small box for testing with a clean default config and only 3 NAT rules:

/ip firewall/nat/export
# apr/25/2023 09:43:26 by RouterOS 7.6
# software id = 1574-7UCF
#
# model = RB941-2nD
/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" ipsec-policy=out,none out-interface-list=WAN
# no interface
add action=masquerade chain=srcnat out-interface=*A
add action=dst-nat chain=dstnat dst-port=30333 protocol=tcp to-addresses=192.168.88.102 to-ports=30333
add action=dst-nat chain=dstnat dst-port=30433 protocol=tcp to-addresses=192.168.88.102 to-ports=30433
add action=dst-nat chain=dstnat dst-port=30533 protocol=tcp to-addresses=192.168.88.102 to-ports=30533
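To compare exports from several routers, the `add` lines can be parsed into dictionaries and the port forwards listed with their protocol. This is an illustrative sketch with hypothetical helper names; it only reads the export text and makes no claim about what the node actually requires. For the export above it would list three TCP-only forwards to 192.168.88.102. Nothing in this thread establishes whether anything beyond those TCP forwards is needed.

```python
import shlex


def parse_nat_rules(export_text: str) -> list:
    """Turn each 'add ...' line of a RouterOS NAT export into a dict of key=value fields."""
    rules = []
    for line in export_text.splitlines():
        line = line.strip()
        if not line.startswith("add "):
            continue  # skip comments, section headers and blank lines
        fields = {}
        # shlex keeps quoted values such as comment="defconf: masquerade" together.
        for token in shlex.split(line)[1:]:
            key, sep, value = token.partition("=")
            if sep:
                fields[key] = value
        rules.append(fields)
    return rules


def forwarded_ports(rules) -> list:
    """Return sorted (protocol, dst-port, target address) tuples for dst-nat rules."""
    return sorted(
        (rule.get("protocol", "any"), int(rule["dst-port"]), rule.get("to-addresses", ""))
        for rule in rules
        if rule.get("action") == "dst-nat" and rule.get("dst-port", "").isdigit()
    )
```

Running `forwarded_ports(parse_nat_rules(text))` on exports from both the RB941-2nD and the main router would make any difference in forwarded ports or protocols easy to spot.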

My main device is RB2011UiAS-IN
Behavior is the same.