[Jool-list] NAT64 performance
Alberto Leiva
ydahhrk at gmail.com
Mon Sep 4 19:23:07 CDT 2017
> Considering your configuration, if this machine has more addresses than
one...
Scratch that; I just noticed you're using more than one address per
protocol.
This means that the traversal is `(65535 - 1025 + 1) * 16` nodes long. Oops.
On Mon, Sep 4, 2017 at 7:09 PM, Alberto Leiva <ydahhrk at gmail.com> wrote:
> Hmmmmmmmmmmmmmmmmmmmmmmmmmm. I have a theory. There might be an
> optimization that could fix this to a massive extent. Although I have the
> feeling that I've already thought about this before, so there might be a
> good reason why I didn't apply this optimization already. Then again, I
> could just have forgotten while developing.
>
> How many addresses does the attacker machine have? I don't really know the
> nature of TRex's test, but I can picture it trying to use as many of its
> node's IPv6 addresses and ports to open as many connections as possible
> through Jool. Considering your configuration, if this machine has more
> addresses than one... then I think I see the problem. It's a very, very
> stupid oversight of mine, and one that I'm surprised also managed to
> slip past the performance tests so easily.
>
> Then again, I haven't had to dive deep into the session code for a while,
> so I might simply be missing this optimization. I badly need to look into
> this.
>
> > Q2: what can we do to improve this?
>
> Ok, here's my attempt to explain it:
>
> There should (in theory) exist a quick way for Jool to tell whether
> pool4 has been exhausted (i.e. there is one existing BIB entry for every
> available pool4 address). If this information were available, Jool
> should be able to skip *an entire tree traversal* for every translated
> packet that needs the creation of a new BIB entry.
>
> In your case, the tree traversal is `65535 - 1025 + 1 = 64511` nodes long.
> And yes, this operation has to lock, because otherwise it can end up with a
> corrupted tree. So yeah, this is really bad.
>
> Now, I think I'm starting to realize why I might not have implemented
> this: Because pool4 is intended as a mostly static database (because this
> helps minimize locking), and usage counters would break this. So
> implementing this might end up being a tradeoff. But give me a few days;
> I'll think about it.
>
> Maybe I also assumed that the admin would always grant enough addresses to
> pool4. If pool4 has enough addresses to actually serve the traffic, it
> won't waste so much time iterating pointlessly.
>
> > PS: Developers that want to work on this and would like access to my lab
> boxes (Dell R630 with lots of cores, memory and some 10Gbit/s NICs): feel
> free to contact me!
>
> That'd be interesting. If my theory proves to be a flop, this would be
> helpful in spotting the real bottleneck; I don't have anything right now
> capable of overwhelming Jool.
>
> > Lockless/LRU structures?
>
> I'd love to find a way to do this locklessly, but I haven't proven that
> creative.
>
> The problem is that there are two separate trees that need to be kept in
> harmony (one for IPv4 BIB lookups, another for IPv6 BIB lookups), otherwise
> Jool will inevitably create conflicting entries and traffic will start
> going in mismatched directions. At the same time, the BIB needs to be kept
> consistent with pool4, so the creation of a BIB entry also depends on many
> other BIB entries.
>
> On top of that, this all happens in interrupt context so Jool can't afford
> itself the luxury of a mutex. It *has* to be a dirty spinlock.
>
> Come to think of it, Jool 4 might not necessarily translate in interrupt
> context, so the latter constraint could be overcome. Interesting.
>
> > "f-args": 10,
>
> This should be contributing to the problem. But not by much, I think.
>
> Actually, if TRex's test is random, it's probably not contributing much
> at all. It would be more of an issue in a real environment.
>
> > kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
> [kworker/u769:0:209211]
>
> Yeah... this is not acceptable. I really hope I can find a fix.
>
> On Mon, Sep 4, 2017 at 4:23 PM, Sander Steffann <sander at steffann.nl>
> wrote:
>
>> Hi,
>>
>> > Just before the box freezes there are a lot of ksoftirqd threads quite
>> busy:
>> >
>> >    PID USER PR NI   VIRT   RES   SHR S  %CPU %MEM   TIME+ COMMAND
>> >     20 root 20  0      0     0     0 R 100.0  0.0 1:27.29 [ksoftirqd/2]
>> >     60 root 20  0      0     0     0 R 100.0  0.0 1:16.64 [ksoftirqd/10]
>> >     90 root 20  0      0     0     0 R 100.0  0.0 1:44.79 [ksoftirqd/16]
>> >    160 root 20  0      0     0     0 R 100.0  0.0 1:32.92 [ksoftirqd/30]
>> >    220 root 20  0      0     0     0 R 100.0  0.0 1:42.52 [ksoftirqd/42]
>> >     50 root 20  0      0     0     0 R 100.0  0.0 1:26.21 [ksoftirqd/8]
>> >    120 root 20  0      0     0     0 R 100.0  0.0 1:24.15 [ksoftirqd/22]
>> >    190 root 20  0      0     0     0 R 100.0  0.0 1:47.05 [ksoftirqd/36]
>> >    200 root 20  0      0     0     0 R 100.0  0.0 1:49.58 [ksoftirqd/38]
>> >    210 root 20  0      0     0     0 R 100.0  0.0 1:58.60 [ksoftirqd/40]
>> >    230 root 20  0      0     0     0 R 100.0  0.0 1:35.77 [ksoftirqd/44]
>> >    240 root 20  0      0     0     0 R 100.0  0.0 1:59.37 [ksoftirqd/46]
>> >    250 root 20  0      0     0     0 R 100.0  0.0 1:41.46 [ksoftirqd/48]
>> >    260 root 20  0      0     0     0 R 100.0  0.0 1:28.37 [ksoftirqd/50]
>> >    280 root 20  0      0     0     0 R 100.0  0.0 1:17.73 [ksoftirqd/54]
>> >    100 root 20  0      0     0     0 R  86.3  0.0 1:35.04 [ksoftirqd/18]
>> >    150 root 20  0      0     0     0 R  86.3  0.0 1:36.73 [ksoftirqd/28]
>> >     30 root 20  0      0     0     0 R  85.3  0.0 1:18.45 [ksoftirqd/4]
>> >     40 root 20  0      0     0     0 R  85.3  0.0 1:42.14 [ksoftirqd/6]
>> >     70 root 20  0      0     0     0 R  85.3  0.0 1:27.51 [ksoftirqd/12]
>> >    110 root 20  0      0     0     0 R  85.3  0.0 1:22.93 [ksoftirqd/20]
>> >    130 root 20  0      0     0     0 R  85.3  0.0 1:37.32 [ksoftirqd/24]
>> >    140 root 20  0      0     0     0 R  85.3  0.0 1:36.85 [ksoftirqd/26]
>> >      3 root 20  0      0     0     0 S  84.3  0.0 2:47.02 [ksoftirqd/0]
>> >     80 root 20  0      0     0     0 R  84.3  0.0 1:43.63 [ksoftirqd/14]
>> >    270 root 20  0      0     0     0 R  84.3  0.0 1:21.22 [ksoftirqd/52]
>> >    170 root 20  0      0     0     0 R  66.7  0.0 1:43.50 [ksoftirqd/32]
>> >    180 root 20  0      0     0     0 R  51.0  0.0 0:52.70 [ksoftirqd/34]
>> >    444 root 20  0      0     0     0 R  46.1  0.0 1:41.75 [kworker/34:1]
>> > 205389 root 20  0      0     0     0 R  19.6  0.0 0:04.53 [kworker/u769:2]
>> >    892 root 20  0 110908 69888 69556 S  18.6  0.1 4:06.82 /usr/lib/systemd/systemd-journald
>> >    644 root 20  0      0     0     0 S  15.7  0.0 0:19.73 [kworker/14:1]
>> >    740 root 20  0      0     0     0 S  15.7  0.0 0:08.38 [kworker/52:1]
>> >  26633 root 20  0      0     0     0 S  15.7  0.0 0:35.72 [kworker/6:0]
>> > 209211 root 20  0      0     0     0 S  15.7  0.0 0:03.96 [kworker/u769:0]
>> >    111 root 20  0      0     0     0 S  14.7  0.0 0:16.67 [kworker/20:0]
>> >    541 root 20  0      0     0     0 S  14.7  0.0 0:16.59 [kworker/12:1]
>> >   2698 root 20  0      0     0     0 S  14.7  0.0 0:09.77 [kworker/28:1]
>> >  12100 root 20  0      0     0     0 S  14.7  0.0 0:17.94 [kworker/18:1]
>> >  40451 root 20  0      0     0     0 S  14.7  0.0 0:12.69 [kworker/24:0]
>> >  40592 root 20  0      0     0     0 S  14.7  0.0 0:23.30 [kworker/32:1]
>> > 212923 root 20  0      0     0     0 S  14.7  0.0 0:02.89 [kworker/4:1]
>> > 300255 root 20  0      0     0     0 S  14.7  0.0 0:17.89 [kworker/26:0]
>> >
>> > At least I'm making some use of those 28 cores ;)
>>
>> Fun addition: the kernel just warned me right after I could log back in:
>>
>> Message from syslogd at tr3.retevia.eu at Sep 4 23:15:42 ...
>> kernel:NMI watchdog: BUG: soft lockup - CPU#38 stuck for 23s!
>> [kworker/u769:0:209211]
>>
>> Very busy indeed :)
>>
>> Cheers!
>> Sander
>>
>>
>> _______________________________________________
>> Jool-list mailing list
>> Jool-list at nic.mx
>> https://mail-lists.nic.mx/listas/listinfo/jool-list
>>
>>
>