[Jool-list] Moving to jool

Nico Schottelius nico.schottelius at ungleich.ch
Thu Nov 14 11:54:36 CST 2019


Good evening everyone,

thanks for following up and also for adding cool new pictures to the
Jool website - it's cool to see some movement - AND to that guy who
created an Alpine package - I owe you a drink of your choice!

So many things happening... today I've started to look into joold,
because we are considering running 2 active routers with the same
available IPv4 addresses.

Regarding the UDP loss, Stefan Brudny (see mail below) actually
pointed me to an open bug in iperf that I wasn't aware of either. So
this might all be a false positive, for which I'm very sorry if it
cost anyone time!

Best regards and many motivated greetings from the mountains,

Nico

p.s.: From a performance point of view and remembering how P4 lets you
modify packets, I think Jool should be able to handle native forwarding
/ line speed, as the actual modifications are very small and the
information required even fits into every L1 cache. Minus the obvious OS
overhead.


--------------------------------------------------------------------------------
From: Stefan Brudny <stefan.brudny at gmail.com>
To: Nico Schottelius <nico.schottelius at ungleich.ch>
Subject: Re: [Jool-list] Moving to jool
Flags: replied, seen
Date: Thu 07 Nov 2019 11:11:37 PM CET
Maildir: /ungleich/2019

Gents,

Blind shot for packet loss: I was experiencing extreme UDP packet loss
in Azure, not related to any NAT64, on a different service. I was using
iperf. It turned out that iperf has a bug and sometimes, in some
environments and configurations, it misbehaves.

https://github.com/esnet/iperf/issues/296

I switched to nttcp and UDP packet loss dropped from 90% to 0.1%.

BTW, I used Jool for a PoC to find a solution for pfSense, and it works
perfectly. Heads up too.

SB
--------------------------------------------------------------------------------

Alberto Leiva <ydahhrk at gmail.com> writes:

> Ok, I was able to replicate 8 Gbit/sec by using virtualization (since
> my physical hardware cannot keep up at all). I can confirm that
>
> - according to top, the NAT64 machine refuses to exceed 100% CPU utilization
>   (which allegedly signifies that only one CPU is being used), and
> - according to /proc/interrupts, most traffic that shares an incoming
>   interface also shares CPU:
>
>     $ cat /proc/interrupts
>     CPU0       CPU1
>     3551       112239    enp0s8
>     4825321    49        enp0s3
> (Output trimmed to only relevant rows and columns)
>
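
To make the reading of those counters explicit, here is a minimal sketch
that turns /proc/interrupts-style text into a per-CPU share for each
interface. The names cpu_shares and SAMPLE are made up for illustration;
SAMPLE just repeats the trimmed output quoted above, and the parser only
assumes a "CPUn" header row followed by rows of per-CPU counts ending in
a device name (an optional leading "NN:" IRQ label is skipped).

```python
def cpu_shares(text):
    """Return {device: [share_cpu0, share_cpu1, ...]} for /proc/interrupts-style text."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        return {}
    ncpu = len(lines[0])  # header row: CPU0 CPU1 ...
    shares = {}
    for fields in lines[1:]:
        if fields[0].endswith(":"):  # untrimmed output starts rows with an IRQ label
            fields = fields[1:]
        # Skip rows that do not carry one count per CPU plus a name.
        if len(fields) <= ncpu or not all(f.isdigit() for f in fields[:ncpu]):
            continue
        counts = [int(f) for f in fields[:ncpu]]
        total = sum(counts)
        if total:
            shares[fields[-1]] = [c / total for c in counts]
    return shares


# The trimmed output quoted in the mail above.
SAMPLE = """
CPU0       CPU1
3551       112239    enp0s8
4825321    49        enp0s3
"""

for device, per_cpu in cpu_shares(SAMPLE).items():
    print(device, " ".join(f"{share:.1%}" for share in per_cpu))
```

With these numbers, nearly all interrupts of each interface land on a
single CPU, which is the pattern described above.
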
> I don't know when this started happening, but considering that
> performance is (in my experience) most people's main concern, I do
> think this is a problem that needs immediate attention.
>
> I don't think this is a Jool bug; it's simply the way the kernel is
> configured to handle interrupts by default. However, it's certainly
> worth a note in the documentation, to ease the solution for people who
> need to squeeze as much performance out of their translator as
> possible. I just hope it doesn't require a custom kernel...
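
For context on the interrupt handling mentioned here: Linux exposes a
per-IRQ CPU bitmask in /proc/irq/<N>/smp_affinity, written in hex, with
bit n standing for CPU n. The helper below (affinity_mask is a made-up
name) is only a sketch of how such a mask is built. It does not write
anything, it ignores the comma-separated grouping the kernel uses on
machines with more than 32 CPUs, and whether changing affinity resolves
the bottleneck described in this thread is not something the thread
establishes.

```python
def affinity_mask(cpus):
    """Build the hex bitmask for a list of CPU indexes (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


print(affinity_mask([0]))     # 1
print(affinity_mask([1]))     # 2
print(affinity_mask([0, 1]))  # 3
```
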
>
> I will try to figure this out and should come back in a few days with
> more information.
>
> ------------------------------------------
>
> I still haven't figured out what's with the "Datagram Lost" column.
> Sometimes iperf's output is quite nonsensical; I have seen it report
> literally a 100% datagram loss rate and yet the reported "speed" is 8
> Gbit/sec. I don't understand what's up with this. Maybe it's a
> checksum problem (i.e. the packets arrive but the checksum is incorrect
> so iperf reports them as arrived and lost at the same time), but then
> it's strange that I can't identify any artifacts in video streams.
> This needs to be investigated further.
>
> Working...
>
> On Thu, Nov 7, 2019 at 12:40 PM Nico Schottelius
> <nico.schottelius at ungleich.ch> wrote:
>>
>>
>> Good evening Jordi, Alberto,
>>
>>
>> JORDI PALET MARTINEZ <jordi.palet at consulintel.es> writes:
>>
>> > Hi Nico,
>> >
>> > I have read your complete document when you sent it to the list, and I want to thank you for it.
>> >
>> > I'm a frequent user of Jool, and teach about it to the community and
>> > customers.
>>
>> Very nice!
>>
>> > I was also surprised by your UDP failures; I've never seen that before, so, as you just said, it may be due to your specific configuration. I recall first testing Jool on Ubuntu 16.x, but I often try to upgrade the kernel to the latest available release, etc.
>> >
>> > In fact, I usually check and adjust CPU affinity myself (I even do that on my OpenWRT routers!).
>> >
>> > One suggestion, in case you can invest a bit of extra time in this to make your work more comprehensive, would be to also test using VPP:
>> >
>> > https://docs.fd.io/vpp/17.07/nat64_doc.html
>>
>> Interesting! I have added it to my backlog, I wasn't aware of nat64 in
>> vpp!
>>
>> > I would actually say, if you allow me, "forget Tayga": it doesn't
>> > scale, is no longer maintained, and Jool and VPP are much better
>> > targets to focus on!
>>
>> I assumed so. However, there is one really, really big advantage of
>> tayga: it is included in every distribution. This was actually the
>> reason why we chose tayga in 2017 for datacenterlight.ch.
>>
>> Now that we hit CPU limitations we are more willing to manually maintain
>> it and it is somewhat "ok", because we only have 6 routers. I'm actually
>> considering spending some of our resources to package Jool for Alpine
>> Linux, which is our target OS for the new router generation.
>>
>> Either way, I have to thank you guys, you did quite an impressive job
>> with Jool!
>>
>> Best regards from Switzerland,
>>
>> Nico
>>
>>
>> --
>> Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch


