[Jool-list] Jool at the Internet2 Tech Exchange

Alberto Leiva ydahhrk at gmail.com
Wed Sep 6 17:40:18 CDT 2017


Awesome!

> Also, if there are Jool/SIIT experts here who would be willing to look
over our plans and make suggestions, that would be good too.

Having only a small 464XLAT setup in production, with hardly enough
traffic to make it challenging, I must unfortunately admit that my
experience with deploying Jool is probably lacking. As the lead developer
of the project, however, I can offer you technical advice and support.

> I had planned to do my own load tests, and then the recent T-Rex thread
came up. It looks like the levels of load that you have been talking about
are higher than what I'm likely to encounter with a few hundred laptops and
smart phones at a conference.

Well, I suppose it would be a good idea to explain the problem a little
better, to make sure that everyone understands its nature *and its
workarounds*. I'm aware that you might already understand most of what's
to come, but here it is anyway.

(Though it should be noted that this is still just a hypothesis; I still
need to do all the benchmarking.)

The most likely bottleneck of any NAT64 is the BIB/session management.

(Note: BIB and session are technically different things, but they are
strongly interrelated, and as Tore mentioned in the T-Rex threads, most of
the time the BIB-to-session ratio is 1:1, so for the purposes of this
explanation, you can assume that they are the same thing.)

As you might be aware, whenever a client creates a connection that will
cross the NAT64, Jool creates a session and stores in it the
connection-related information needed to translate future packets from
that connection consistently.
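
Roughly speaking, a session boils down to something like the following.
This is just an illustrative sketch; the names don't match Jool's actual
structs:

#include <stdint.h>
#include <netinet/in.h>
#include <time.h>

/* Illustrative sketch of what a NAT64 session stores; the names are
 * made up and do not match Jool's actual definitions. */
struct session_sketch {
    /* IPv6 side: the client's transport address and the destination
     * it used (pool6 prefix + embedded IPv4 address). */
    struct in6_addr remote6_addr;
    uint16_t        remote6_port;
    struct in6_addr local6_addr;
    uint16_t        local6_port;

    /* IPv4 side: the transport address Jool picks from pool4 (the
     * "Local IPv4 (transport) address") and the real v4 destination. */
    struct in_addr  local4_addr;
    uint16_t        local4_port;
    struct in_addr  remote4_addr;
    uint16_t        remote4_port;

    uint8_t         l4_proto;   /* TCP, UDP or ICMP */
    time_t          expiration; /* when the session times out */
};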

When the session already exists in the database, retrieving it to serve
a packet is very fast; it's just a database lookup. In Jool's case, the
session database is a red-black tree, so finding a session is generally
an O(log n) operation. In other words, this degrades very gracefully and
I don't think there is a problem with it.
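
To give a feel for how cheap that fast path is: serving a packet from an
established connection is just an ordered search keyed on the
connection's addresses. Jool's implementation lives in the kernel and
uses a red-black tree; the userspace sketch below uses a sorted array and
bsearch() instead, purely to illustrate the O(log n) lookup (again, all
names are made up):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <netinet/in.h>

/* Just enough of a key to tell sessions apart in this sketch. */
struct session_key {
    struct in6_addr remote6_addr;
    uint16_t        remote6_port;
    uint8_t         l4_proto;
};

struct session {
    struct session_key key;
    /* ... the rest of the translation state ... */
};

static int session_cmp(const void *a, const void *b)
{
    const struct session *x = a, *y = b;
    int gap = memcmp(&x->key.remote6_addr, &y->key.remote6_addr,
                     sizeof(struct in6_addr));
    if (gap)
        return gap;
    if (x->key.remote6_port != y->key.remote6_port)
        return x->key.remote6_port < y->key.remote6_port ? -1 : 1;
    return (int)x->key.l4_proto - (int)y->key.l4_proto;
}

/* O(log n) lookup over a database kept sorted by session_cmp(). */
static struct session *session_find(struct session *db, size_t count,
                                    const struct session_key *key)
{
    struct session needle = { .key = *key };
    return bsearch(&needle, db, count, sizeof(*db), session_cmp);
}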

Where it gets complicated is the first packet of a connection. In this
case, the session does not exist (yet) in the database. Therefore, after
performing the tree lookup and failing, Jool finds itself needing to
create a new session. This is a straightforward procedure for every field
in the session except for what Jool's "--session --display" output refers
to as the "Local IPv4 (transport) address". This is an address that has
to match the pool4 configuration and, at the same time, not collide with
certain other sessions' local v4 addresses.

To compute this address, Jool uses algorithm 3 of RFC 6056. The details
of the algorithm are not important here; suffice it to say that it
degrades approximately according to the graph I have attached to this
mail as "graph1.png". In other words, the algorithm is *EXTREMELY* fast
until most of pool4's transport addresses have been "taken" by the
BIB/session database.
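
For reference, the gist of algorithm 3 is the following. This is a
simplified sketch (Jool's real code iterates over pool4's transport
addresses with more bookkeeping, and the helper names here are made up):
a keyed hash of the connection's addresses picks a starting candidate,
and the code then probes sequentially until it finds a transport address
that no other session owns.

#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Sketch of RFC 6056's algorithm 3, applied to pool4.
 * - taken[i] answers "is candidate transport address i already owned
 *   by some BIB/session entry?"
 * - offset stands in for F(): a keyed hash of the connection's
 *   addresses, so the probing start point looks random to outsiders.
 * Returns the index of a free candidate, or -1 if pool4 is exhausted.
 */
static long choose_local4(const bool *taken, size_t pool4_size,
                          uint32_t offset)
{
    size_t attempts;

    for (attempts = 0; attempts < pool4_size; attempts++) {
        size_t candidate = (offset + attempts) % pool4_size;
        if (!taken[candidate])
            return (long)candidate;
    }

    return -1; /* every pool4 transport address is in use */
}

When most of pool4 is free, the loop almost always exits within the first
few probes; that is the flat left side of graph1.png. When most of pool4
is taken, every new connection runs the loop close to pool4_size times;
that is the peak on the right.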

(Let's hope that this mail doesn't get filtered because of the images.)

-----------------

I should probably explain the graph more. Here goes. If you want to skip
this, go to the next "-----".

Only one BIB entry can own a particular pool4 transport address at a
time, so it follows that you can only have as many BIB entries as there
are pool4 transport addresses. For example, if your pool4 has the
following contents:

$ jool --pool4 --display
0    TCP    192.0.2.1    1-4
0    TCP    192.0.2.2    1-2

Then your NAT64 can have at most 6 BIB entries (i.e. one for each of
192.0.2.1:1, 192.0.2.1:2, 192.0.2.1:3, 192.0.2.1:4, 192.0.2.2:1 and
192.0.2.2:2).

The horizontal axis represents the number of pool4 transport addresses
that are "taken" by some BIB/session entry.

The vertical axis is the "slowness" of the algorithm. You want this number
to be as low as possible.

It is important to highlight that the horizontal axis does *not*
represent time. Sessions *time out*, which means that the "Number of BIB
entries" goes up and down depending on how many simultaneous connections
the NAT64 is translating at any given moment. New connections push the
state towards the right, while expiring connections push it towards the
left. This means that, if your pool4 size is somewhat larger than your
connection population, you will remain comfortably on the left side of
the graph; in other words, the algorithm will never become slow.

Just for a bit of context: most of the connections I see while analyzing
traffic are HTTP. While those are many, they are also very short-lived.
Most of these connections are opened and closed almost instantly, after
which their sessions linger until they time out, 4 minutes later by
default (https://jool.mx/en/usr-flags-global.html#--tcp-trans-timeout).

-----------------

I believe Sander's machine is slowing down to a crawl because I suspect
the T-Rex test attempts to create as many connections as possible and/or
keep them alive for as long as possible. Jool gets stuck on the far right
of the graph and therefore takes a very long time trying to allocate
pool4 transport addresses for new BIB entries. (And then, most of the
time, it fails anyway because there are *no* available pool4 transport
addresses left.)

By the way: T-Rex's flood is probably not an accurate representation of
natural traffic, but Jool should be able to hold its ground nonetheless,
because an attacker might try to exploit this very weakness.

So, as you might realize, one way to minimize the damage would be to
keep pool4 very small. Sander's pool4 has 16 addresses and almost 64k
ports per address, so at the peak of the graph the local v4 address
computation takes almost 16 * 64k (roughly one million) iterations per
new connection. If you had, say, one address, that would be about 64k
iterations. And so on. Of course, this solution is hardly viable if you
need to serve a large number of v6 clients.

The solution becomes somewhat more viable if you throw --mark into the
mix (https://jool.mx/en/usr-flags-pool4.html#--mark). The reason is that
pool4 entries bearing different marks are basically members of different
pools, so if you have this pool4, for example (note the first column,
which represents the mark):

$ jool --pool4 --display
0    TCP    192.0.2.1    1-65535
1    TCP    192.0.2.2    1-65535
2    TCP    192.0.2.3    1-65535
3    TCP    192.0.2.4    1-65535

Then the peak of the graph will be 65535 iterations for any connection
(as opposed to 4 * 65535). This has the added benefit that, if one of
your users is attacking you, they will only mess up their own slice of
pool4 and not affect everyone else.
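
In case it helps, the marks themselves would be assigned on the IPv6 side
by your firewall; something along these lines with ip6tables' MARK target
(the prefixes are just documentation addresses for illustration; adapt
them and the rule placement to your setup):

# ip6tables -t mangle -A PREROUTING -s 2001:db8:1::/64 -j MARK --set-mark 1
# ip6tables -t mangle -A PREROUTING -s 2001:db8:2::/64 -j MARK --set-mark 2
# ip6tables -t mangle -A PREROUTING -s 2001:db8:3::/64 -j MARK --set-mark 3

Traffic that matches none of these keeps mark 0 and therefore lands in
the first pool4 entry.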

Ironically, another solution would be to keep pool4 *as big as
possible*. This is because, again, if your pool4 size is somewhat larger
than your connection population, you will remain comfortably on the left
side of the graph. Assuming you're not being attacked, that is. I suppose
that, if you fear an attack, you could always filter out users that open
abnormal numbers of connections. I'm not fluent with firewalls, though,
so I don't know how to do this.

And the third solution off the top of my head would be to test the
waters with the fake NAT64 Tore proposed a few mails back. That thing
makes it very, very difficult to force the RFC 6056 algorithm to iterate
a lot.

-----------------

And finally: What I plan to do in an attempt to fix this mess through code:

<Insert attached graph2.png here>

If I can find a way for the code to keep track of whether pool4 has been
exhausted by the BIB/session database, I can skip the address search
entirely when it has. Of course, this will not get rid of the peak, but
since the attacker is trying to keep the state as far right as possible,
that last (practically no-op) dot should speed things up considerably.

Under that kind of load, the number of attempted connections will most
of the time exceed the number of transport addresses pool4 can hold, so
the state will sit at the rightmost point of the axis, where the number
of iterations snaps to zero.

Also, it might be a good idea to avoid the peak entirely if pool4 is too
big. I'm thinking of using some heuristic along the lines of "if I expect
to iterate more than 100 times on average, pretend that pool4 is
exhausted". This would avoid the peak pretty much completely, at the cost
of somewhat suboptimal pool4 utilization. A configuration switch could
toggle this feature.
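
In rough pseudo-C, the check I have in mind would look something like
this (none of it is actual Jool code; the names are placeholders and the
threshold would be the configurable knob):

#include <stdbool.h>
#include <stddef.h>

struct pool4_state {
    size_t size;  /* total transport addresses in this pool4 entry */
    size_t taken; /* how many are currently owned by BIB entries */
};

/*
 * Decide whether running the RFC 6056 loop is even worth it. If pool4
 * is exhausted, or so close to it that the expected number of
 * iterations (roughly size / free) exceeds the threshold, fail fast
 * instead of climbing the peak of the graph.
 */
static bool worth_iterating(const struct pool4_state *pool,
                            size_t max_expected_iterations)
{
    size_t free_slots = pool->size - pool->taken;

    if (free_slots == 0)
        return false; /* the far-right, practically no-op dot */

    return pool->size / free_slots <= max_expected_iterations;
}

The taken counter would be incremented when a BIB entry claims a
transport address and decremented when its session times out, so keeping
it up to date should be cheap.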

Though don't expect me to fix this anywhere near as quickly as I coded
the fake NAT64 prototype; optimization is hard.


On Wed, Sep 6, 2017 at 4:56 PM, Alan Whinery <whinery at hawaii.edu> wrote:

> Sorry APNIC44, not APAN44. APAN44 was in China last week...
>
>
> On 9/6/2017 9:39 AM, Alan Whinery wrote:
> > Thanks Jordi -- good to hear I'm not-yet-confirmed-crazy.
> >
> > I have seen that ripe74 slide set, looking forward to seeing the one
> > after APAN44.
> >
> > Travel well, and safe. I will share a directory with some deployment
> > notes in it with you, for when you get home.
> >
> >
> > Alan
> >
> > On 9/6/2017 8:00 AM, JORDI PALET MARTINEZ wrote:
> >> Hi Alan,
> >>
> >> Nothing crazy at all.
> >>
> >> I’ve been doing it for some time now.
> >>
> >> In fact, we have a plan to do the same at the next LACNIC meeting …
> >>
> >> I’ve used Jool (for both the NAT64 and the CLAT) in Ubuntu VMs. I’ve
> also used a VM with OpenWrt as the CLAT.
> >>
> >> I’m also doing some testing with ISPs that have customers in residential
> networks.
> >>
> >> https://ripe74.ripe.net/presentations/151-ripe-74-
> ipv6-464xlat-residential-v2.pdf
> >>
> >> I’ve done many presentations on this, some in Spanish; you can google
> for them. Starting the day after tomorrow I’m also giving workshops using
> Jool (NAT64, CLAT) at APNIC44 in Taiwan. You will be able to download the
> slides in a couple of days; I’ll work on a last review during the flight.
> >>
> >> I may be able to help if you have questions. Note that my
> responses may be slightly delayed over the next few days, as I’m very
> busy during APNIC.
> >>
> >> Regards,
> >> Jordi
> >>
> >>
> >> -----Original Message-----
> >> From: <jool-list-bounces at nic.mx> on behalf of Alan Whinery <
> whinery at hawaii.edu>
> >> Reply-To: <whinery at hawaii.edu>
> >> Date: Wednesday, September 6, 2017, 19:24
> >> To: <jool-list at nic.mx>
> >> Subject: [Jool-list] Jool at the Internet2 Tech Exchange
> >>
> >>     Greetings,
> >>     I have been working with Jool at the University of Hawaii, and I
> have had good results setting up NAT64 networks with it. So far, when I do
> an additional 464xlat piece, I get some issues with SSL and packet size (I
> assume), but I'm hoping to clean that up.
> >>
> >>     When I look at the developmental open source NAT64 solutions, it
> seems like Jool is the most "serious" design, especially in terms of
> scalability.
> >>
> >>     As co-chair of the Internet2 IPv6 Working Group, I am facilitating
> the development of some IPv6 tutorials for the Internet2 Technology
> Exchange in San Francisco next month, to include how to set up IPv6-only
> networks, with NAT64/DNS64/464XLAT to access legacy v4 Internet. I am
> planning to promote Jool as a toolkit to do NAT64 on a campus scale, and
> also talk about Cisco IOS, etc.
> >>
> >>     I have also talked Internet2 into planning a v6-only SSID for the
> conference. Since their road kit can set up VMs easily, I was planning to
> do this with Jool and Linux based NAT64/464XLAT.
> >>
> >>     My first question to this group -- am I doing something crazy? I
> had planned to do my own load tests, and then the recent T-Rex thread came
> up. It looks like the levels of load that you have been talking about are
> higher than what I'm likely to encounter with a few hundred laptops and
> smart phones at a conference.
> >>
> >>     Also, if there are Jool/SIIT experts here who would be willing to
> look over our plans and make suggestions, that would be good too.
> >>
> >>     Regards,
> >>     Alan Whinery
> >>     Chief Internet Engineer
> >>     University of Hawaii System
> >>     Co-chair Internet2 IPv6 Working Group
> >>
>
> _______________________________________________
> Jool-list mailing list
> Jool-list at nic.mx
> https://mail-lists.nic.mx/listas/listinfo/jool-list
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph1.png
Type: image/png
Size: 11944 bytes
Desc: not available
URL: <http://mail-lists.nic.mx/pipermail/jool-list/attachments/20170906/3781e69e/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph2.png
Type: image/png
Size: 11678 bytes
Desc: not available
URL: <http://mail-lists.nic.mx/pipermail/jool-list/attachments/20170906/3781e69e/attachment-0003.png>

