[Jool-list] NAT64 performance
Alberto Leiva
ydahhrk at gmail.com
Mon Sep 4 19:23:07 CDT 2017
> Considering your configuration, if this machine has more addresses than
one...
Scratch that; I just noticed you're using more than one address per
protocol.
This means that the traversal is `(65535 - 1025 + 1) * 16` nodes long. Oops.
On Mon, Sep 4, 2017 at 7:09 PM, Alberto Leiva <ydahhrk at gmail.com> wrote:
> Hmmmmmmmmmmmmmmmmmmmmmmmmmm. I have a theory. There might be an
> optimization that could fix this to a massive extent. Although I have the
> feeling that I've already thought about this before, so there might be a
> good reason why I didn't apply this optimization already. Then again, I
> could just have forgotten while developing.
>
> How many addresses does the attacker machine have? I don't really know the
> nature of TRex's test, but I can picture it trying to use as many of its
> node's IPv6 addresses and ports to open as many connections as possible
> through Jool. Considering your configuration, if this machine has more
> addresses than one... then I think I see the problem. It's a very, very
> stupid oversight of mine, and one that I'm surprised also managed to
> slip past the performance tests so easily.
>
> Then again, I haven't had to dive deep into the session code for a while,
> so I might simply be missing this optimization. I badly need to look into
> this.
>
> > Q2: what can we do to improve this?
>
> Ok, here's my attempt to explain it:
>
> There should (in theory) exist a quick way for Jool to tell whether
> pool4 has been exhausted (i.e. there is one existing BIB entry for every
> available pool4 address). If this information were available, Jool
> should be able to skip *an entire tree traversal* for every translated
> packet that needs the creation of a new BIB entry.
>
> In your case, the tree traversal is `65535 - 1025 + 1 = 64511` nodes long.
> And yes, this operation has to lock, because otherwise it can end up with a
> corrupted tree. So yeah, this is really bad.
>
> Now, I think I'm starting to realize why I might not have implemented
> this: Because pool4 is intended as a mostly static database (because this
> helps minimize locking), and usage counters would break this. So
> implementing this might end up being a tradeoff. But give me a few days;
> I'll think about it.
>
> Maybe I also assumed that the admin would always grant enough addresses to
> pool4. If pool4 has enough addresses to actually serve the traffic, it
> won't waste so much time iterating pointlessly.
>
> > PS: Developers that want to work on this and would like access to my lab
> boxes (Dell R630 with lots of cores, memory and some 10Gbit/s NICs): feel
> free to contact me!
>
> That'd be interesting. If my theory proves to be a flop, this would be
> helpful in spotting the real bottleneck; I don't have anything right now
> capable of overwhelming Jool.
>
> > Lockless/LRU structures?
>
> I'd love to find a way to do this locklessly, but I haven't proven that
> creative.
>
> The problem is that there are two separate trees that need to be kept in
> harmony (one for IPv4 BIB lookups, another for IPv6 BIB lookups), otherwise
> Jool will inevitably create conflicting entries and traffic will start
> going in mismatched directions. At the same time, the BIB needs to be kept
> consistent with pool4, so the creation of a BIB entry also depends on many
> other BIB entries.
>
> On top of that, this all happens in interrupt context so Jool can't afford
> itself the luxury of a mutex. It *has* to be a dirty spinlock.
>
> Come to think of it, Jool 4 might not necessarily translate in interrupt
> context, so the latter constraint could be overcome. Interesting.
>
> > "f-args": 10,
>
> This should be contributing to the problem. But not by much, I think.
>
> Actually, if TRex's test is random, it's probably not contributing much
> at all. It would be more of an issue in a real environment.
>
> > kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
> [kworker/u769:0:209211]
>
> Yeah... this is not acceptable. I really hope I can find a fix.
>
> On Mon, Sep 4, 2017 at 4:23 PM, Sander Steffann <sander at steffann.nl>
> wrote:
>
>> Hi,
>>
>> > Just before the box freezes there are a lot of ksoftirqd threads quite
>> busy:
>> >
>> >    PID USER PR NI   VIRT   RES   SHR S  %CPU %MEM   TIME+ COMMAND
>> >     20 root 20  0      0     0     0 R 100.0  0.0 1:27.29 [ksoftirqd/2]
>> >     60 root 20  0      0     0     0 R 100.0  0.0 1:16.64 [ksoftirqd/10]
>> >     90 root 20  0      0     0     0 R 100.0  0.0 1:44.79 [ksoftirqd/16]
>> >    160 root 20  0      0     0     0 R 100.0  0.0 1:32.92 [ksoftirqd/30]
>> >    220 root 20  0      0     0     0 R 100.0  0.0 1:42.52 [ksoftirqd/42]
>> >     50 root 20  0      0     0     0 R 100.0  0.0 1:26.21 [ksoftirqd/8]
>> >    120 root 20  0      0     0     0 R 100.0  0.0 1:24.15 [ksoftirqd/22]
>> >    190 root 20  0      0     0     0 R 100.0  0.0 1:47.05 [ksoftirqd/36]
>> >    200 root 20  0      0     0     0 R 100.0  0.0 1:49.58 [ksoftirqd/38]
>> >    210 root 20  0      0     0     0 R 100.0  0.0 1:58.60 [ksoftirqd/40]
>> >    230 root 20  0      0     0     0 R 100.0  0.0 1:35.77 [ksoftirqd/44]
>> >    240 root 20  0      0     0     0 R 100.0  0.0 1:59.37 [ksoftirqd/46]
>> >    250 root 20  0      0     0     0 R 100.0  0.0 1:41.46 [ksoftirqd/48]
>> >    260 root 20  0      0     0     0 R 100.0  0.0 1:28.37 [ksoftirqd/50]
>> >    280 root 20  0      0     0     0 R 100.0  0.0 1:17.73 [ksoftirqd/54]
>> >    100 root 20  0      0     0     0 R  86.3  0.0 1:35.04 [ksoftirqd/18]
>> >    150 root 20  0      0     0     0 R  86.3  0.0 1:36.73 [ksoftirqd/28]
>> >     30 root 20  0      0     0     0 R  85.3  0.0 1:18.45 [ksoftirqd/4]
>> >     40 root 20  0      0     0     0 R  85.3  0.0 1:42.14 [ksoftirqd/6]
>> >     70 root 20  0      0     0     0 R  85.3  0.0 1:27.51 [ksoftirqd/12]
>> >    110 root 20  0      0     0     0 R  85.3  0.0 1:22.93 [ksoftirqd/20]
>> >    130 root 20  0      0     0     0 R  85.3  0.0 1:37.32 [ksoftirqd/24]
>> >    140 root 20  0      0     0     0 R  85.3  0.0 1:36.85 [ksoftirqd/26]
>> >      3 root 20  0      0     0     0 S  84.3  0.0 2:47.02 [ksoftirqd/0]
>> >     80 root 20  0      0     0     0 R  84.3  0.0 1:43.63 [ksoftirqd/14]
>> >    270 root 20  0      0     0     0 R  84.3  0.0 1:21.22 [ksoftirqd/52]
>> >    170 root 20  0      0     0     0 R  66.7  0.0 1:43.50 [ksoftirqd/32]
>> >    180 root 20  0      0     0     0 R  51.0  0.0 0:52.70 [ksoftirqd/34]
>> >    444 root 20  0      0     0     0 R  46.1  0.0 1:41.75 [kworker/34:1]
>> > 205389 root 20  0      0     0     0 R  19.6  0.0 0:04.53 [kworker/u769:2]
>> >    892 root 20  0 110908 69888 69556 S  18.6  0.1 4:06.82 /usr/lib/systemd/systemd-journald
>> >    644 root 20  0      0     0     0 S  15.7  0.0 0:19.73 [kworker/14:1]
>> >    740 root 20  0      0     0     0 S  15.7  0.0 0:08.38 [kworker/52:1]
>> >  26633 root 20  0      0     0     0 S  15.7  0.0 0:35.72 [kworker/6:0]
>> > 209211 root 20  0      0     0     0 S  15.7  0.0 0:03.96 [kworker/u769:0]
>> >    111 root 20  0      0     0     0 S  14.7  0.0 0:16.67 [kworker/20:0]
>> >    541 root 20  0      0     0     0 S  14.7  0.0 0:16.59 [kworker/12:1]
>> >   2698 root 20  0      0     0     0 S  14.7  0.0 0:09.77 [kworker/28:1]
>> >  12100 root 20  0      0     0     0 S  14.7  0.0 0:17.94 [kworker/18:1]
>> >  40451 root 20  0      0     0     0 S  14.7  0.0 0:12.69 [kworker/24:0]
>> >  40592 root 20  0      0     0     0 S  14.7  0.0 0:23.30 [kworker/32:1]
>> > 212923 root 20  0      0     0     0 S  14.7  0.0 0:02.89 [kworker/4:1]
>> > 300255 root 20  0      0     0     0 S  14.7  0.0 0:17.89 [kworker/26:0]
>> >
>> > At least I'm making some use of those 28 cores ;)
>>
>> Fun addition: the kernel just warned me right after I could log back in:
>>
>> Message from syslogd at tr3.retevia.eu at Sep 4 23:15:42 ...
>> kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
>> [kworker/u769:0:209211]
>>
>> Very busy indeed :)
>>
>> Cheers!
>> Sander
>>
>>
>> _______________________________________________
>> Jool-list mailing list
>> Jool-list at nic.mx
>> https://mail-lists.nic.mx/listas/listinfo/jool-list
>>
>>
>