[Jool-list] NAT64 performance

Alberto Leiva ydahhrk at gmail.com
Mon Sep 4 19:09:00 CDT 2017


Hmmmmmmmmmmmmmmmmmmmmmmmmmm. I have a theory. There might be an
optimization that could fix this to a massive extent. Although I have
the feeling I've thought about this before, so there might be a good
reason why I didn't apply it already. Then again, I could simply have
forgotten about it while developing.

How many addresses does the attacker machine have? I don't really know
the nature of TRex's test, but I can picture it trying to use as many
of its node's IPv6 addresses and ports as possible to open as many
connections as possible through Jool. Considering your configuration,
if this machine has more than one address... then I think I see the
problem. It's a very, very stupid oversight of mine, and one that I'm
extremely surprised managed to slip past the performance tests so
easily.

Then again, I haven't had to dive deep into the session code for a while,
so I might simply be missing this optimization. I badly need to look into
this.

> Q2: what can we do to improve this?

Ok, here's my attempt to explain it:

There should (in theory) be a quick way for Jool to tell whether pool4
has been exhausted (i.e. there is already one BIB entry for every
available pool4 transport address). If that information were available,
Jool could skip *an entire tree traversal* for every translated packet
that needs a new BIB entry.
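
Roughly what I have in mind, as a sketch (all names here are made up
for illustration; this is not actual Jool code): keep a counter of
borrowed pool4 transport addresses, and bail out cheaply once it hits
capacity:

    /* Hypothetical sketch, not actual Jool code. */
    #include <linux/atomic.h>
    #include <linux/spinlock.h>

    struct pool4_table {
        spinlock_t lock;
        unsigned int capacity; /* total (address, port) pairs in pool4 */
        atomic_t in_use;       /* pairs currently owned by a BIB entry */
    };

    /*
     * Fast path: if pool4 is exhausted, fail immediately instead of
     * walking the whole tree just to confirm that nothing is free.
     */
    static bool pool4_maybe_available(const struct pool4_table *pool)
    {
        return atomic_read(&pool->in_use) < (int)pool->capacity;
    }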

In your case, the tree traversal is `65535 - 1025 + 1 = 64511` nodes long.
And yes, this operation has to lock, because otherwise it can end up with a
corrupted tree. So yeah, this is really bad.
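
To make the cost concrete, here's the kind of worst-case probe loop I
mean (hypothetical, reusing the sketch above; port_is_taken() is a
made-up stand-in for the real tree lookup): with pool4 exhausted,
every new connection pays the full scan while holding the lock:

    /* Hypothetical stand-in for the per-port tree lookup. */
    static bool port_is_taken(struct pool4_table *pool, int port);

    /* Worst case: probe every port in 1025-65535 under the lock, find
     * all 64511 of them taken, and give up empty-handed. */
    static int pool4_pick_port(struct pool4_table *pool)
    {
        int port;

        spin_lock_bh(&pool->lock);
        for (port = 1025; port <= 65535; port++) {
            if (!port_is_taken(pool, port)) {
                spin_unlock_bh(&pool->lock);
                return port;
            }
        }
        spin_unlock_bh(&pool->lock);
        return -1; /* exhausted, after 64511 pointless iterations */
    }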

Now, I think I'm starting to realize why I might not have implemented
this: because pool4 is intended to be a mostly static database (which
helps minimize locking), and usage counters would break that. So
implementing this might end up being a tradeoff. But give me a few
days; I'll think about it.
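
To illustrate the tradeoff (again hypothetical names, reusing the
sketch above): a usage counter means every session birth and death now
writes to shared pool4 state, so its cache line starts bouncing between
every CPU that's creating or expiring sessions:

    /* Every BIB entry creation/removal now touches pool4, which is
     * exactly what the mostly-static design was trying to avoid. */
    static void pool4_mark_taken(struct pool4_table *pool)
    {
        atomic_inc(&pool->in_use);
    }

    static void pool4_mark_released(struct pool4_table *pool)
    {
        atomic_dec(&pool->in_use);
    }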

Maybe I also assumed that the admin would always grant enough
addresses to pool4. If pool4 has enough addresses to actually serve
the traffic, Jool won't waste so much time iterating pointlessly.
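
For example, something along these lines (Jool 3 userspace syntax; the
addresses are placeholders) would double the number of TCP transport
addresses available to new connections:

    jool --pool4 --add --tcp 192.0.2.1 1025-65535
    jool --pool4 --add --tcp 192.0.2.2 1025-65535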

> PS: Developers that want to work on this and would like access to my lab
boxes (Dell R630 with lots of cores, memory and some 10Gbit/s NICs): feel
free to contact me!

That'd be interesting. If my theory proves to be a flop, this would be
helpful in spotting the real bottleneck; I don't have anything right now
capable of overwhelming Jool.

> Lockless/LRU structures?

I'd love to find a way to do this locklessly, but I haven't proven to
be that creative.

The problem is that there are two separate trees that need to be kept
in harmony (one for IPv4 BIB lookups, another for IPv6 BIB lookups);
otherwise Jool will inevitably create conflicting entries, and traffic
will start going in mismatched directions. At the same time, the BIB
needs to be kept consistent with pool4, so the creation of a BIB entry
also depends on many other BIB entries.
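
Illustratively (the real structures differ, but this is the shape of
the problem): a single entry lives in two indexes at once, so both
inserts, plus the pool4 borrow, have to look atomic with respect to
each other:

    #include <linux/rbtree.h>
    #include <linux/spinlock.h>

    struct bib_entry {
        struct rb_node hook6; /* node in the IPv6-indexed tree */
        struct rb_node hook4; /* node in the IPv4-indexed tree */
        /* ... transport addresses, protocol, expiration ... */
    };

    struct bib_table {
        struct rb_root tree6; /* lookup by IPv6 transport address */
        struct rb_root tree4; /* lookup by IPv4 transport address */
        spinlock_t lock;      /* guards both trees and pool4 usage */
    };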

On top of that, this all happens in interrupt context, so Jool can't
afford the luxury of a mutex. It *has* to be a dirty spinlock.

Come to think of it, Jool 4 might not necessarily translate in
interrupt context, so the latter constraint could be overcome.
Interesting.
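
In other words (sketch, not actual Jool code): mutex_lock() may sleep
on contention, and sleeping is forbidden in softirq context, so the
critical section has to look like this, with contending CPUs spinning
at 100% instead of yielding. That's what all those pegged ksoftirqd
threads below are doing:

    static void bib_critical_section(struct bib_table *table)
    {
        spin_lock_bh(&table->lock); /* never sleeps, so softirq-safe */
        /* ... find/insert in tree6 and tree4, update pool4 ... */
        spin_unlock_bh(&table->lock);
        /* mutex_lock() here instead would risk sleeping, which is
         * illegal in this context. */
    }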

> "f-args": 10,

This should be contributing to the problem. But not by much, I think.

Actually, if TRex's traffic is random, it's probably not contributing
much at all. It would be more of an issue in a real environment.
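
For reference, in case it helps others reading this (double-check me
against the docs): f-args is a bitfield selecting which packet fields
feed RFC 6056's port-selection function F(): 8 = source address, 4 =
source port, 2 = destination address, 1 = destination port. So 10
hashes only the two addresses:

    /* f-args = 10 = 0b1010; bit meanings per my reading of the Jool
     * docs, so treat them as an assumption. */
    #include <stdio.h>

    int main(void)
    {
        unsigned int f_args = 10;
        printf("source address:      %d\n", !!(f_args & 8)); /* 1 */
        printf("source port:         %d\n", !!(f_args & 4)); /* 0 */
        printf("destination address: %d\n", !!(f_args & 2)); /* 1 */
        printf("destination port:    %d\n", !!(f_args & 1)); /* 0 */
        return 0;
    }

With only addresses feeding F(), every flow between the same two nodes
starts its port search at the same offset; random test traffic hides
that, real traffic wouldn't.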

> kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s! [kworker/u769:0:209211]

Yeah... this is not acceptable. I really hope I can find a fix.

On Mon, Sep 4, 2017 at 4:23 PM, Sander Steffann <sander at steffann.nl> wrote:

> Hi,
>
> > Just before the box freezes there are a lot of ksoftirqd threads quite busy:
> >
> >   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> >    20 root      20   0       0      0      0 R 100.0  0.0   1:27.29 [ksoftirqd/2]
> >    60 root      20   0       0      0      0 R 100.0  0.0   1:16.64 [ksoftirqd/10]
> >    90 root      20   0       0      0      0 R 100.0  0.0   1:44.79 [ksoftirqd/16]
> >   160 root      20   0       0      0      0 R 100.0  0.0   1:32.92 [ksoftirqd/30]
> >   220 root      20   0       0      0      0 R 100.0  0.0   1:42.52 [ksoftirqd/42]
> >    50 root      20   0       0      0      0 R 100.0  0.0   1:26.21 [ksoftirqd/8]
> >   120 root      20   0       0      0      0 R 100.0  0.0   1:24.15 [ksoftirqd/22]
> >   190 root      20   0       0      0      0 R 100.0  0.0   1:47.05 [ksoftirqd/36]
> >   200 root      20   0       0      0      0 R 100.0  0.0   1:49.58 [ksoftirqd/38]
> >   210 root      20   0       0      0      0 R 100.0  0.0   1:58.60 [ksoftirqd/40]
> >   230 root      20   0       0      0      0 R 100.0  0.0   1:35.77 [ksoftirqd/44]
> >   240 root      20   0       0      0      0 R 100.0  0.0   1:59.37 [ksoftirqd/46]
> >   250 root      20   0       0      0      0 R 100.0  0.0   1:41.46 [ksoftirqd/48]
> >   260 root      20   0       0      0      0 R 100.0  0.0   1:28.37 [ksoftirqd/50]
> >   280 root      20   0       0      0      0 R 100.0  0.0   1:17.73 [ksoftirqd/54]
> >   100 root      20   0       0      0      0 R  86.3  0.0   1:35.04 [ksoftirqd/18]
> >   150 root      20   0       0      0      0 R  86.3  0.0   1:36.73 [ksoftirqd/28]
> >    30 root      20   0       0      0      0 R  85.3  0.0   1:18.45 [ksoftirqd/4]
> >    40 root      20   0       0      0      0 R  85.3  0.0   1:42.14 [ksoftirqd/6]
> >    70 root      20   0       0      0      0 R  85.3  0.0   1:27.51 [ksoftirqd/12]
> >   110 root      20   0       0      0      0 R  85.3  0.0   1:22.93 [ksoftirqd/20]
> >   130 root      20   0       0      0      0 R  85.3  0.0   1:37.32 [ksoftirqd/24]
> >   140 root      20   0       0      0      0 R  85.3  0.0   1:36.85 [ksoftirqd/26]
> >     3 root      20   0       0      0      0 S  84.3  0.0   2:47.02 [ksoftirqd/0]
> >    80 root      20   0       0      0      0 R  84.3  0.0   1:43.63 [ksoftirqd/14]
> >   270 root      20   0       0      0      0 R  84.3  0.0   1:21.22 [ksoftirqd/52]
> >   170 root      20   0       0      0      0 R  66.7  0.0   1:43.50 [ksoftirqd/32]
> >   180 root      20   0       0      0      0 R  51.0  0.0   0:52.70 [ksoftirqd/34]
> >   444 root      20   0       0      0      0 R  46.1  0.0   1:41.75 [kworker/34:1]
> > 205389 root      20   0       0      0      0 R  19.6  0.0   0:04.53 [kworker/u769:2]
> >   892 root      20   0  110908  69888  69556 S  18.6  0.1   4:06.82 /usr/lib/systemd/systemd-journald
> >   644 root      20   0       0      0      0 S  15.7  0.0   0:19.73 [kworker/14:1]
> >   740 root      20   0       0      0      0 S  15.7  0.0   0:08.38 [kworker/52:1]
> > 26633 root      20   0       0      0      0 S  15.7  0.0   0:35.72 [kworker/6:0]
> > 209211 root      20   0       0      0      0 S  15.7  0.0   0:03.96 [kworker/u769:0]
> >   111 root      20   0       0      0      0 S  14.7  0.0   0:16.67 [kworker/20:0]
> >   541 root      20   0       0      0      0 S  14.7  0.0   0:16.59 [kworker/12:1]
> >  2698 root      20   0       0      0      0 S  14.7  0.0   0:09.77 [kworker/28:1]
> > 12100 root      20   0       0      0      0 S  14.7  0.0   0:17.94 [kworker/18:1]
> > 40451 root      20   0       0      0      0 S  14.7  0.0   0:12.69 [kworker/24:0]
> > 40592 root      20   0       0      0      0 S  14.7  0.0   0:23.30 [kworker/32:1]
> > 212923 root      20   0       0      0      0 S  14.7  0.0   0:02.89 [kworker/4:1]
> > 300255 root      20   0       0      0      0 S  14.7  0.0   0:17.89 [kworker/26:0]
> >
> > At least I'm making some use of those 28 cores ;)
>
> Fun addition: the kernel just warned me right after I could log back in:
>
> Message from syslogd at tr3.retevia.eu at Sep  4 23:15:42 ...
>  kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s! [kworker/u769:0:209211]
>
> Very busy indeed :)
>
> Cheers!
> Sander
>