IPSec
From TLUGWiki
IPSec is a backport of the security features found in IPv6 into IPv4. Yes, it seems like we're going to be stuck with IPv4 for at least a while longer.
Introduction
Why IPSec? Well, there are reasons. One of them is that it's an open standard that should, in theory, allow products from different vendors to interoperate. Routers such as Cisco's are known to support IPSec (I'm not sure how good the support is, as I've never configured one), which implies that you can, for example, set up IPSec between your Linux box and a Cisco router.
OpenVPN, my other choice, is also very configurable. It works with actual tun/tap devices, which simplifies routing issues, something I will address later on. Another difference is that IPSec is more of a peer-to-peer architecture, whereas OpenVPN follows the server/client model for initial negotiation.
There are many pages describing what IPSec is and how it works, including how to configure it. The best ones I've found are the ipsec-howto at www.ipsec-howto.org, the NetBSD howto at http://www.netbsd.org/Documentation/network/ipsec/ (yes, it's for NetBSD, but the Linux ipsec-tools is based on KAME and the clone is pretty accurate), and the IPSec chapter of the Linux Advanced Routing and Traffic Control guide at http://lartc.org/howto/lartc.ipsec.html. Most of these pages are linked from the ipsec-tools homepage, which is hosted on SourceForge: http://ipsec-tools.sourceforge.net/.
So if there is so much information out there, why yet another guide? Well, it took me 24 hours just to get my first ESP packet to successfully go from one end to the other. I spoke a lot with Simeon Miteff whilst trying to figure this all out, and he questioned me a lot on why I would want to get it working when I've personally already configured OpenVPN and know that it works. I also spoke a lot with Andrew, who claims that 24 hours from starting to getting it working isn't bad. Well, I'm used to getting things done a lot quicker.
So instead of trying to explain how to do IPSec, I'm rather going to explain what I've gone through and the things I've learned and figured out that are not explained in the other guides. Hopefully this information will be helpful to other people.
Why IPSec
Because I'm planning to build what I call an N:N VPN mesh: N distinct nodes, be they networks in their own right (which is what I'll mostly be working with) or simple hosts. The idea is to not have all the traffic go via a central server and, if there has to be a central server (I can't get away from it entirely at this stage), to have as little interaction with it as possible. OpenVPN can do this, but the configuration is a nightmare. I've done it for four nodes, and eventually it just becomes so damn hard to keep track of all the "subnets" that you have to declare for all the tunnels (2 nodes => 1 tunnel, 3 nodes => 3 tunnels, 4 nodes => 6 tunnels, n nodes => n(n-1)/2 tunnels). Generating unique keys for all of this becomes a nightmare at best. In addition, if I'm using tunneling mode I need n-1 tun devices on every node. That is not a serious problem (I often have 6+ devices show up in ifconfig anyway).
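To put a number on how quickly a full mesh grows, here is a quick sketch. Every pair of nodes needs its own tunnel, so n nodes need n(n-1)/2 tunnels, and each node terminates n-1 of them. The function name tunnel_count is mine, purely for illustration.

```shell
#!/bin/bash
# Number of point-to-point tunnels in a full mesh of n nodes: n * (n - 1) / 2.
# tunnel_count is a hypothetical helper name, not part of any tool.
tunnel_count() {
    local n=$1
    echo $(( n * (n - 1) / 2 ))
}

# Print the tunnel count and per-node device count for a few mesh sizes.
for n in 2 3 4 10; do
    echo "$n nodes => $(tunnel_count "$n") tunnels, $(( n - 1 )) tun devices per node"
done
```

Each of those tunnels needs its own key material and its own subnet declaration, which is where the bookkeeping gets out of hand.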
Hamachi does something similar for Windows, where you can create a virtual network. This never did work for me, nor is this the "market" I'm aiming at. Also, Hamachi connects individual machines together into a single network, and I have no clue as to the security of your data. I'm not even sure if it's possible to connect different networks together using Hamachi.
Another advantage that IPSec has for me is the fact that I don't need yet more IP addresses on my machines. I only need two IPs on my routers (for the purposes of this "howto" I'm going to assume lo, eth0, eth1 and ppp0 on each of my two routers), the internal IP address and my external ADSL assigned IP address.
Further motivation can be found if you are a performance maniac. Consider what happens to a packet that comes off the network and needs to be forwarded:
OpenVPN: received by kernel, routed to tun/tap device (possible switch to userspace); openvpn issues read (switch back to kernel space); packet passed back (switch to userspace); packet processed and UDP packet sent to peer using write (switch back to kernel space); kernel queues packet for ppp0 device (switch back to userspace); openvpn issues read on tun/tap device (switch back to kernel space, and probably blocks).
IPSec: received by kernel, routed to ppp0, processed and queued.
Right, no context switching at all in the IPSec case! Some smart person is bound to mention iptables. In both cases the packet traverses the tables twice; however, I suspect that even here the IPsec case comes out ahead in terms of the number of actual chains that need to be traversed.
The hardware
Right, I'm going to assume that our "routers" are both Linux boxes with two network cards, eth0 and eth1, where eth1 is connected to an ADSL modem/router that is put into bridging mode. Thus the Linux machine does the dialing, and thus we have ppp0 on the box. Actually, this is kinda a requirement, as IPSec, and more specifically the ESP packets, do not like being passed over a NAT.
Any router that can be put into bridging mode will work. Getting the router into bridging mode is your baby - if you have one of those Telkom routers - yea, they should work: in theory.
The software
The Linux Kernel
Obviously we need a Linux kernel (I used 2.6.20). There are a few options that you should enable. Hopefully the following gets the lot; many of the mentioned howtos do not specify all the options you need (note that some options, like AH transformation, may not be needed depending on your setup):
- Networking
  - Networking Options
    - Transformation user configuration interface
    - PF_KEY sockets
    - IP: AH transformation
    - IP: ESP transformation
    - IP: IPComp transformation
    - IP: IPsec transport mode
    - IP: IPsec tunnel mode
    - IP: IPsec BEET mode
    - Network packet filtering
      - Core Netfilter Configuration
        - Netfilter Xtables support
          - "ESP" match support
          - IPsec "policy" match support
      - IP: Netfilter Configuration
        - IP tables support
          - "AH" match support
- Cryptographic API
  - HMAC support
  - MD5 digest algorithm
  - CBC support
  - AES cipher algorithms
Please note that you only need the netfilter stuff if you are going to be running a firewall on the same machine as you're running ipsec on. I kinda do NAT on the same box (something which is highly discouraged, but when your endpoints are dynamic IPs on ADSL you really don't have a choice, especially if you have full-blown networks behind those IPs). Also, the options I selected here are only the ones required for IPsec-related stuff. You probably want other options like NAT, multiport, protocol and connection tracking, and a bunch of others too, if you are serious about firewalls. Usually I just compile all the iptables options as modules.
When looking at the crypto stuff I usually just select everything (as modules at least).
Recompile, install, reboot. Yes, you probably have to reboot.
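Before rebooting it is worth double-checking that nothing from the list was missed. A minimal sketch, assuming the CONFIG_ symbol names below match the menu entries above (that mapping is mine and should be verified against your kernel version) and using a helper name, missing_ipsec_options, that I made up:

```shell
#!/bin/bash
# Report which IPsec-related kernel options are missing from a kernel
# config read on stdin. missing_ipsec_options is a hypothetical helper;
# the CONFIG_ names are my mapping of the menu entries listed above.
missing_ipsec_options() {
    local config opt
    config=$(cat)
    for opt in XFRM_USER NET_KEY INET_AH INET_ESP INET_IPCOMP \
               INET_XFRM_MODE_TRANSPORT INET_XFRM_MODE_TUNNEL \
               CRYPTO_HMAC CRYPTO_MD5 CRYPTO_CBC CRYPTO_AES; do
        # An option counts as present when built in (=y) or modular (=m).
        if ! printf '%s\n' "$config" | grep -Eq "^CONFIG_${opt}=[ym]\$"; then
            echo "CONFIG_${opt}"
        fi
    done
}

# Typical use, assuming the running kernel exposes its config:
#   zcat /proc/config.gz | missing_ipsec_options
# or against a source tree:
#   missing_ipsec_options < /usr/src/linux/.config
```

No output means every option in the list is either built in or available as a module.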
ipsec-tools
This includes the setkey program which is used to set up ipsec policies (Security Policy Database - SPD) and keys (Security Association Database - SAD), and racoon, which can be used to maintain the SAD. I've currently got version 0.6.3 set up.
iproute2
I recommend using iproute2. I became an evangelist a while back. The ip command, whilst initially slightly awkward, is apparently more in line with what is standard on routers (like the Cisco ones), and is most definitely way more powerful. In the IPsec case there is also a very particular reason to use it (we want to explicitly use the "wrong" source IP in some cases in order to enforce using the right ipsec policy). I currently use version 2.6.19.20061214.
iptables
Only needed if you are also running a firewall on the same machine. I currently have version 1.3.5, although if I were to update right now I'd probably get 1.3.6 but since I haven't updated in like forever this isn't the case.
The network
I started out this whole exercise trying to connect up my office network (192.168.42.0/24) with my home network (192.168.0.0/24), each of which is connected to the internet using ADSL. I've got dynamic DNS configured for both systems, at the office it's atlantis.dyndns.uls.co.za and for home it's krooninfsys.dyndns.uls.co.za (used to be krooninfsys.dyndns.org ... want to make it xacatecas.dyndns.uls.co.za at some point). So, please forgive my ASCII art - but I like pictures, this ends up looking something like:
/-----------------\                                                                                                        /----------------\
|                 |                                                                                                        |                |
| 192.168.42.0/24 |-- 192.168.42.1--<atlantis>--ADSL--[[ big bad www with many routers ]]--ADSL--<xacatecas>--192.168.0.1--| 192.168.0.0/24 |
|                 |                                                                                                        |                |
\-----------------/                                                                                                        \----------------/
(Hint hint: It would be really cool if someone can draw a nice picture - with the correct UML symbols)
The actual network is a wee bit more complicated with atlantis being connected to 192.168.10.0/24 as well, and there being small /30 networks between both routers and the ADSL routers.
The process
Initially I figured I'd simply get everything working using manually keyed connections. This turned out to be the hardest part of everything, but without this crucial learning process I still would not have understood everything as I do now. Because I struggled with this, I've had the opportunity to sit through it and actually observe what is going on.
The first thing that you should realize is there are two separate databases (as I've already mentioned in passing), the SPD, which specifies WHAT we want encrypted, and the SAD, which specifies HOW. My SPD worked out of the box, and your SAD should pretty much work out of the box too if you've got all the modules I specified above (I missed the transport module, which is why my initial transport setup didn't work).
transport level encryption
This basically encrypts only the traffic that is placed inside of the IP packet, that is, the IP payload. So if you encrypt a UDP packet, only the UDP part of the packet will be encrypted, and the original IP header will stay intact. This is good for connecting two hosts. I decided to start off by trying to encrypt all the traffic between my two external ppp0 interfaces using transport mode.
You should be able to do this pretty simply with a script like the following (hopefully what is happening here is pretty clear; I find that reading remote vs local is easier than mentally decoding the IPs the whole time):
#! /bin/bash
xacatecas=$(dnsip krooninfsys.dyndns.uls.co.za)
atlantis=$(dnsip atlantis.dyndns.uls.co.za)
local_name=$(hostname)
[[ $local_name == "atlantis" ]] && remote_name=xacatecas || remote_name=atlantis
local_ip=${!local_name}
remote_ip=${!remote_name}
# Flush the existing SPD (policies) and SAD (keys)
setkey -FP
setkey -F
# Load the new keys (add) and policies (spdadd)
setkey -c <<EOF
add $xacatecas $atlantis esp 1 -m transport -E aes-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef -A hmac-md5 0x12345678123456781234567812345678;
add $atlantis $xacatecas esp 2 -m transport -E aes-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef -A hmac-md5 0x12345678123456781234567812345678;
spdadd $local_ip $remote_ip any -P out ipsec esp/transport//require;
spdadd $remote_ip $local_ip any -P in ipsec esp/transport//require;
EOF
If you were to run "setkey -DP" now you should see something like:
196.209.31.174[any] 196.209.72.128[any] any
in prio def ipsec
esp/transport//require
created: Mar 1 23:56:54 2007 lastused:
lifetime: 0(s) validtime: 0(s)
spid=800 seq=1 pid=7497
refcnt=1
196.209.72.128[any] 196.209.31.174[any] any
out prio def ipsec
esp/transport//require
created: Mar 1 23:56:54 2007 lastused:
lifetime: 0(s) validtime: 0(s)
spid=793 seq=2 pid=7497
refcnt=1
196.209.31.174[any] 196.209.72.128[any] any
fwd prio def ipsec
esp/transport//require
created: Mar 1 23:56:54 2007 lastused:
lifetime: 0(s) validtime: 0(s)
spid=810 seq=0 pid=7497
refcnt=1
That is the SPD, or the WHAT. It basically says require encryption on all incoming packets from the peer, and require encryption on outgoing packets to the peer.
The SAD should look something like (setkey -D):
196.209.72.128 196.209.31.174
esp mode=transport spi=2(0x00000002) reqid=0(0x00000000)
E: aes-cbc deadbeef deadbeef deadbeef deadbeef
A: hmac-md5 12345678 12345678 12345678 12345678
seq=0x00000000 replay=0 flags=0x00000000 state=mature
created: Mar 1 23:56:54 2007 current: Mar 1 23:58:07 2007
diff: 73(s) hard: 0(s) soft: 0(s)
last: hard: 0(s) soft: 0(s)
current: 0(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 0 hard: 0 soft: 0
sadb_seq=1 pid=7498 refcnt=0
196.209.31.174 196.209.72.128
esp mode=transport spi=1(0x00000001) reqid=0(0x00000000)
E: aes-cbc deadbeef deadbeef deadbeef deadbeef
A: hmac-md5 12345678 12345678 12345678 12345678
seq=0x00000000 replay=0 flags=0x00000000 state=mature
created: Mar 1 23:56:54 2007 current: Mar 1 23:58:07 2007
diff: 73(s) hard: 0(s) soft: 0(s)
last: hard: 0(s) soft: 0(s)
current: 0(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 0 hard: 0 soft: 0
sadb_seq=0 pid=7498 refcnt=0
If you get no output (not even "No SAD entries.") then, in my experience, you're probably missing some kernel modules. Try loading other modules, although from a clean boot just having the modules on my system causes the required ones to be auto-loaded. If setkey -D does not give any output, try setkey -Da, which will most likely show your keys but declare them to be in state=dead. Yes, this is bad. This took me close to 24 hours to figure out (I worked only on this for that period, other than eating, sleeping, driving to/from work and about 2 hours of band practice). The only other symptom I could see was:
atlantis ~ # ping krooninfsys.dyndns.uls.co.za
connect: No such process
atlantis ~ #
Well, my guess is that the kernel notes that it has to encrypt the data, so it goes off looking for an SA it can use. It doesn't find one, so it tries to contact userspace to initiate IKE (Internet Key Exchange), but since the racoon daemon isn't running this isn't possible. So it simply declares that the process isn't running and returns "No such process". Simple, effective, highly cryptic.
Right, once your ping is working you can rest assured that IPSec is doing its business. If you want to confirm this, you can do a tcpdump, which will show you that the packets are being encrypted:
atlantis ~ # tcpdump -i ppp0 host krooninfsys.dyndns.uls.co.za
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ppp0, link-type LINUX_SLL (Linux cooked), capture size 68 bytes
00:10:38.307840 IP c1-128-9.tbnb.isadsl.co.za > c1-174-16.nngy.isadsl.co.za: ESP(spi=0x07039d13,seq=0x30), length 116
00:10:38.380744 IP c1-174-16.nngy.isadsl.co.za > c1-128-9.tbnb.isadsl.co.za: ESP(spi=0x0afa3427,seq=0x2f), length 116
00:10:39.306835 IP c1-128-9.tbnb.isadsl.co.za > c1-174-16.nngy.isadsl.co.za: ESP(spi=0x07039d13,seq=0x31), length 116
00:10:39.377836 IP c1-174-16.nngy.isadsl.co.za > c1-128-9.tbnb.isadsl.co.za: ESP(spi=0x0afa3427,seq=0x30), length 116
00:10:40.309973 IP c1-128-9.tbnb.isadsl.co.za > c1-174-16.nngy.isadsl.co.za: ESP(spi=0x07039d13,seq=0x32), length 116
00:10:40.384544 IP c1-174-16.nngy.isadsl.co.za > c1-128-9.tbnb.isadsl.co.za: ESP(spi=0x0afa3427,seq=0x31), length 116
00:10:41.310031 IP c1-128-9.tbnb.isadsl.co.za > c1-174-16.nngy.isadsl.co.za: ESP(spi=0x07039d13,seq=0x33), length 116
00:10:41.381595 IP c1-174-16.nngy.isadsl.co.za > c1-128-9.tbnb.isadsl.co.za: ESP(spi=0x0afa3427,seq=0x32), length 116
8 packets captured
16 packets received by filter
0 packets dropped by kernel
atlantis ~ #
Things look slightly different in tunnel mode for some reason, but I've been assured that it functions as advertised.
tunnel
Progressing from transport to tunnels (where the entire IP packet is wrapped up inside an ESP packet, which is transmitted inside a new IP packet) is as easy as falling out of a tree, provided you can get your routing right (which happens to be correct for most people, because they don't use prohibit rules). So first I'll take a look at my routing table:
atlantis ~ # ip route show
196.209.64.1 dev ppp0 proto kernel scope link src 196.209.72.128
10.0.0.0/30 dev ethadsl proto kernel scope link src 10.0.0.2
192.168.0.0/24 dev ppp0 scope link src 192.168.42.1
192.168.42.0/24 dev ethlan proto kernel scope link src 192.168.42.1
prohibit 192.0.0.0/24 metric 65535
192.168.10.0/24 dev ethtt proto kernel scope link src 192.168.10.100
prohibit 223.255.255.0/24 metric 65535
prohibit 192.0.2.0/24 metric 65535
prohibit 128.0.0.0/16 metric 65535
prohibit 169.254.0.0/16 metric 65535
prohibit 192.168.0.0/16 metric 65535
prohibit 191.255.0.0/16 metric 65535
prohibit 198.18.0.0/15 metric 65535
prohibit 172.16.0.0/12 metric 65535
prohibit 0.0.0.0/8 metric 65535
prohibit 39.0.0.0/8 metric 65535
prohibit 10.0.0.0/8 metric 65535
127.0.0.0/8 dev lo scope link
prohibit 14.0.0.0/8 metric 65535
prohibit 224.0.0.0/4 metric 65535
prohibit 240.0.0.0/4 metric 65535
default via 196.209.64.1 dev ppp0
As you can see, the routes are pretty standard (except that most people don't have a bunch of prohibit rules, and the route to 192.168.0.0/24 is a tad strange). I feel that at this stage I need to motivate the prohibit rules. I don't want to have to struggle with iptables to filter out all my illegal ranges, so instead I add prohibit rules and enable rp_filter on all devices (the default) to let these issues be sorted out before packets even get to iptables. This also means that nobody is going to be spoofing any private networks, and I don't have to repeat this for my INPUT and FORWARD chains. The private networks I am connected to have appropriate routing entries, allowing the kernel to sort out the mess highly effectively. I obtained those prohibit ranges from http://en.wikipedia.org/wiki/IPv4.
This means that without the explicit route for 192.168.0.0/24, this range would be caught by the 192.168.0.0/16 prohibit rule and be routed into oblivion, with an ICMP error (communication administratively prohibited) going back to the sender. Most people will thus be able to survive with the default route pointing to dev ppp0, except that if you do that, any traffic you initiate to 192.168.0.0/24 from the router itself will use your external IP as source, and will then bypass the ipsec rules (unless you go and explicitly add those policies).
The solution? Add a routing rule that informs the kernel to use 192.168.42.1 (internal IP) as the source for any connections initiated to 192.168.0.0/24, but still route them out via ppp0:
# ip route add 192.168.0.0/24 dev ppp0 src 192.168.42.1
Obviously ranges and IPs need to be reversed on the other side. Yes, it's possible to configure these routes as part of your standard configuration. For example, in Gentoo you can add:
routes_ppp0=(
"192.168.0.0/24 src 192.168.42.1"
)
The Gentoo scripts automatically add "dev ppp0" to the parameters passed to ip route add.
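The prohibit routes in the routing table earlier do not have to be typed one at a time. A minimal sketch that prints the commands rather than running them, so they can be inspected first; the function name prohibit_routes is mine, and the ranges shown are only a subset of the ones in my table:

```shell
#!/bin/bash
# Print (not run) the "ip route add prohibit" commands for a list of
# ranges that should never be routed. prohibit_routes is a hypothetical
# helper; the ranges below are a subset of those in the routing table.
prohibit_routes() {
    local range
    for range in "$@"; do
        echo "ip route add prohibit $range metric 65535"
    done
}

prohibit_routes 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16

# Once the output looks right, pipe it to a root shell to apply it:
#   prohibit_routes 10.0.0.0/8 172.16.0.0/12 | sh
```

Because the kernel always prefers the most specific matching route, the explicit 192.168.0.0/24 route still wins over the 192.168.0.0/16 prohibit entry.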
Now we want to actually set up ipsec to be in tunnel mode. A slight modification and a bit of an extension to the script above makes this rather easy as well:
#! /bin/bash
xacatecas=$(dnsip krooninfsys.dyndns.uls.co.za)
atlantis=$(dnsip atlantis.dyndns.uls.co.za)
xacatecas_net=192.168.0.0/24
atlantis_net=192.168.42.0/24
local_name=$(hostname)
[[ $local_name == "atlantis" ]] && remote_name=xacatecas || remote_name=atlantis
local_ip=${!local_name}
remote_ip=${!remote_name}
local_net_name=${local_name}_net
remote_net_name=${remote_name}_net
local_network=${!local_net_name}
remote_network=${!remote_net_name}
echo "${local_name}: ${local_ip} (${local_network})"
echo "${remote_name}: ${remote_ip} (${remote_network})"
setkey -F
setkey -FP
setkey -c <<EOF
add $xacatecas $atlantis esp 1 -m tunnel -E aes-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef -A hmac-md5 0x12345678123456781234567812345678;
add $atlantis $xacatecas esp 2 -m tunnel -E aes-cbc 0xdeadbeefdeadbeefdeadbeefdeadbeef -A hmac-md5 0x12345678123456781234567812345678;
spdadd $local_network $remote_network any -P out ipsec esp/tunnel/$local_ip-$remote_ip/require;
spdadd $remote_network $local_network any -P in ipsec esp/tunnel/$remote_ip-$local_ip/require;
EOF
That should be it. The script once more automatically figures out which end of the tunnel it is on and adjusts the commands passed to setkey appropriately. This worked pretty much out of the box for me, after I had figured out the routing as explained above, of course. At this point all machines inside the two networks should be able to communicate with each other, but none of the machines on your network (except the routers) will be able to communicate with the outside world, since at this point you should not have any NAT going on.
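The keys in both scripts are obviously placeholders. If you stay with manual keying for any length of time, at least use random ones. A sketch, assuming od and /dev/urandom are available; random_hex_key is a name I made up. Both aes-cbc and hmac-md5 as used above take a 128-bit (16-byte) key:

```shell
#!/bin/bash
# Print a random key of the given length in bytes as 0x-prefixed hex,
# the form used for the keys in the setkey scripts above.
# random_hex_key is a hypothetical helper name.
random_hex_key() {
    local bytes=$1
    printf '0x%s\n' "$(od -An -v -tx1 -N"$bytes" /dev/urandom | tr -d ' \n')"
}

echo "aes-cbc key:  $(random_hex_key 16)"
echo "hmac-md5 key: $(random_hex_key 16)"
```

The same keys have to end up on both routers, so generate them once and copy them over a secure channel such as ssh.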
dynamic keying
The above setup has one serious drawback. Your keys are static. They don't change over the lifetime of your ipsec setup, at all. Generating and distributing them is a pain. That pain can be taken away with X509 certificates, but in my opinion even that is overkill. What I don't want, though, is to have these keys stay the same for too long. I'm willing to have a master key that is pre-shared (via ssh) and can be used to periodically generate new keys. I'm guessing IKE uses some kind of keyed Diffie-Hellman exchange to renegotiate keys; frankly I don't really care, as long as it happens securely.
Here enters the racoon daemon, the missing piece that caused the "No such process" error earlier. How do you configure it? Well, it's dead simple, like falling out of a tree. Unfortunately you need a different config file for each end; I'll only give one, as the only change is the fqdn I use:
path pre_shared_key "/etc/racoon/psk.txt";
remote anonymous
{
exchange_mode aggressive,main;
doi ipsec_doi;
situation identity_only;
my_identifier fqdn "atlantis.dyndns.uls.co.za";
lifetime time 2 min; # sec,min,hour
initial_contact on;
proposal_check obey; # obey, strict or claim
proposal {
encryption_algorithm aes;
hash_algorithm sha1;
authentication_method pre_shared_key;
dh_group 2 ;
}
}
sainfo anonymous
{
pfs_group 1;
lifetime time 2 min;
encryption_algorithm aes;
authentication_algorithm hmac_sha1;
compression_algorithm deflate;
}
Basically this says we use AES for encryption and SHA-1 for hashing and authentication. AES is used in CBC mode by default. An example of an ESP SA generated by the above is:
196.209.72.128 196.209.31.174
esp mode=tunnel spi=247345712(0x0ebe3230) reqid=0(0x00000000)
E: aes-cbc 58c89085 0ff06a6b 3b1cf5b6 6be13884
A: hmac-sha1 0a58cdf4 6b799b2b 340306f7 b56e65ce 6b533a74
seq=0x00000000 replay=4 flags=0x00000000 state=mature
created: Mar 2 00:43:06 2007 current: Mar 2 00:43:52 2007
diff: 46(s) hard: 120(s) soft: 96(s)
last: Mar 2 00:43:15 2007 hard: 0(s) soft: 0(s)
current: 22656(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 148 hard: 0 soft: 0
sadb_seq=1 pid=12672 refcnt=0
Now the Gentoo startup scripts require that you provide /etc/ipsec.conf, which is essentially your ipsec policy that is passed to setkey -f. I just create a blank file, preferring to run the script above (without the lines adding the keys) after starting racoon. Well, my startup order is ppp0, and then the script and racoon in any order. Note that the shutdown for racoon flushes both the SPD and SAD (on Gentoo at least, this is done in the shutdown script). All that remains now is to "pre-share" the secret. Here things differ again; on atlantis I've got (obviously not the real secret):
krooninfsys.dyndns.uls.co.za somephrase
Once again, just change the fqdn when moving to the other peer.
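Since the only difference between the two psk.txt files is whose name goes in front of the secret, the line can be generated. A small sketch; psk_line is a hypothetical helper of mine, and "somephrase" is the same placeholder secret as above:

```shell
#!/bin/bash
# Print the psk.txt line a peer needs: the *other* side's identifier
# followed by the shared secret. psk_line is a hypothetical helper name.
# Arguments: local fqdn, first peer fqdn, second peer fqdn, secret.
psk_line() {
    local local_fqdn=$1 peer_a=$2 peer_b=$3 secret=$4
    if [ "$local_fqdn" = "$peer_a" ]; then
        printf '%s\t%s\n' "$peer_b" "$secret"
    else
        printf '%s\t%s\n' "$peer_a" "$secret"
    fi
}

a=atlantis.dyndns.uls.co.za
b=krooninfsys.dyndns.uls.co.za
psk_line "$a" "$a" "$b" somephrase   # line for atlantis
psk_line "$b" "$a" "$b" somephrase   # line for xacatecas
```

Redirect the relevant line into /etc/racoon/psk.txt on each router. As far as I know racoon refuses to use a psk.txt that other users can read, so a chmod 600 on the file is a good idea either way.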
Fire up racoon. I got an error about -4 not being a valid option. I don't have IPv6 support on any of my machines (yet), so I'm guessing that option is only available when racoon needs to pick between IPv4 and IPv6, and since I haven't compiled in IPv6 support it's simply not there.
Run the script, and if all went well you'll once more be able to ping between your networks.
Firewalling
There are various issues to consider with firewalling. You still don't want anybody to access anything that they would not normally have access to. So in this case, you don't want more services to suddenly open up on ppp0 for somebody who now has an IP in your remote range (or has the ability to spoof the remote range). So a rule such as:
# iptables -A INPUT -i ppp0 -s 192.168.42.0/24 -j ACCEPT
on xacatecas is a BAD rule. Well, just the fact that you specify IP ranges makes this a relatively bad rule in my opinion. I usually firewall purely based upon interfaces if at all possible, and specify IP ranges only if I really need to.
Also, you need to remember that IKE needs to be able to function, and in the above configuration it makes use of UDP port 500, which is plain (non-ESP/AH, in other words non-IPsec) traffic. Also, you are going to first see an ESP packet on INPUT and then the decapsulated packet on either INPUT or FORWARD.
For input, I simply state that I expect esp packets to come in on ppp0:
# iptables -A INPUT -i ppp0 -p esp -j ACCEPT
What about invalid/old packets? Well, just let the ipsec code in the kernel sort those out. A firewall imho is there to filter out traffic that we're not interested in. Do you make your firewall block all requests that would result in an HTTP 404 response? Didn't think so.
The next step is to allow the required INPUT and FORWARD. Fortunately iptables has a policy module that can help us with identifying packets that came in off ipsec. I trust both networks fully, so I simply want to allow all legit esp decapsulated packets coming in off ppp0, this is simply achieved with:
# iptables -A INPUT -i ppp0 -m policy --dir in --pol ipsec -j ACCEPT
# iptables -A FORWARD -i ppp0 -o ethlan -m policy --dir in --pol ipsec -j ACCEPT
If you want to restrict the services that can be accessed then you can restrict it by either passing to another chain instead of ACCEPT, or by having a number of rules that filters and/or allows specific traffic. I'm always for default DROP and ACCEPT on exception when I do filter (aka, ppp0 INPUT/FORWARD), which is something I always do unless I fully control the entire network that can input to the input device. If you want to match packets that did not come in off ipsec then you can use "--pol none".
Right, now our networks are happily communicating. However, since you are not performing any NAT at this stage, your internal networks can't access the internet (even though they can traverse the much more complex ipsec route). Now, all the howto documents seem to recommend running a separate NAT and ipsec box. Frankly, I don't see exactly how they can even make that work unless they've got at least a small pool of public IPs (and even then the routing complexity just boggles my mind). So what is the problem with the standard one-liner:
# iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
Well, it also NATs the packets that need to traverse ipsec. Changing the source means the packet no longer originates within the network that the ipsec policy matches, so you force a plain-text packet out onto the internet, where it won't arrive since its destination is an unrouteable address. Instead, what we want to do is ACCEPT packets that are coming off our network, going out on ppp0 and destined for the other network WITHOUT performing any NAT whatsoever. Initially I did this (on xacatecas):
# iptables -t nat -A POSTROUTING -o ppp0 -s 192.168.0.0/24 -d 192.168.42.0/24 -j ACCEPT
# iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
Which works nicely enough, but what if the packet isn't actually going to be processed by ipsec? Then we are essentially sending a packet into oblivion, something which in South Africa is especially stupid (1. it doesn't make sense to knowingly send a packet to a slow death, I'd rather just put it out of its misery; 2. one word: CAP). So instead we once more use the policy module. It allows us to check whether a packet is GOING TO BE encapsulated by some ipsec policy, so the above simply turns into:
# iptables -t nat -A POSTROUTING -o ppp0 -m policy --dir out --pol ipsec -j ACCEPT
# iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
Then on input you need to allow UDP port 500 to make IKE work correctly (4500 as well if you are doing NAT-T, something which I've thankfully managed to avoid). Note that racoon sets per-socket properties to prevent ipsec from being applied to IKE-related packets. This is perfectly correct; after you've fired racoon up you'll see quite a number of entries in the SPD (setkey -DP).
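Pulling the IPsec-specific rules from this section together in one place makes the required ordering easier to see. A sketch that only prints the commands; ipsec_rules is a hypothetical helper of mine, the interface names are passed as arguments, and the rest of a real firewall (state matching, default policies, other services) is deliberately left out:

```shell
#!/bin/bash
# Print (not run) the IPsec-related iptables rules from this section, in
# the order they should be added. ipsec_rules is a hypothetical helper;
# $1 is the external interface and $2 the internal one.
ipsec_rules() {
    local ext=$1 int=$2
    cat <<EOF
iptables -A INPUT -p udp --dport 500 -j ACCEPT
iptables -A INPUT -i $ext -p esp -j ACCEPT
iptables -A INPUT -i $ext -m policy --dir in --pol ipsec -j ACCEPT
iptables -A FORWARD -i $ext -o $int -m policy --dir in --pol ipsec -j ACCEPT
iptables -t nat -A POSTROUTING -o $ext -m policy --dir out --pol ipsec -j ACCEPT
iptables -t nat -A POSTROUTING -o $ext -j MASQUERADE
EOF
}

ipsec_rules ppp0 ethlan
```

The one ordering that really matters is in the nat table: the policy ACCEPT has to come before MASQUERADE, since the first matching rule wins and anything that reaches MASQUERADE gets its source rewritten.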
To finish off, here is my complete firewall I've got running on xacatecas (simpler of the two):
xacatecas ~ # iptables -t nat -L -v -n
Chain PREROUTING (policy ACCEPT 3 packets, 338 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 10 packets, 1519 bytes)
pkts bytes target prot opt in out source destination
0 0 ACCEPT all -- * ppp0 0.0.0.0/0 0.0.0.0/0 policy match dir out pol ipsec
9 684 MASQUERADE all -- * ppp0 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT 17 packets, 1893 bytes)
pkts bytes target prot opt in out source destination
xacatecas ~ # iptables -L -v -n
Chain INPUT (policy DROP 3 packets, 504 bytes)
pkts bytes target prot opt in out source destination
0 0 logdrop all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
2055 132K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
19 1845 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0
6 1262 ACCEPT all -- br0 * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 22,80,1723 tcp flags:0x17/0x02
0 0 ACCEPT udp -- * * 0.0.0.0/0 0.0.0.0/0 multiport dports 500,1194
0 0 ACCEPT esp -- ppp0 * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- ppp0 * 0.0.0.0/0 0.0.0.0/0 policy match dir in pol ipsec
Chain FORWARD (policy DROP 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 logdrop all -- * * 0.0.0.0/0 0.0.0.0/0 state INVALID
11 734 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
0 0 TCPMSS tcp -- * ppp+ 0.0.0.0/0 0.0.0.0/0 tcp flags:0x17/0x02 TCPMSS clamp to PMTU
2 310 ACCEPT all -- br0 * 0.0.0.0/0 0.0.0.0/0
0 0 ACCEPT all -- ppp0 br0 0.0.0.0/0 0.0.0.0/0 policy match dir in pol ipsec
Chain OUTPUT (policy ACCEPT 1641 packets, 211K bytes)
pkts bytes target prot opt in out source destination
Chain logdrop (2 references)
pkts bytes target prot opt in out source destination
0 0 LOG all -- * * 0.0.0.0/0 0.0.0.0/0 LOG flags 0 level 4
0 0 DROP all -- * * 0.0.0.0/0 0.0.0.0/0
xacatecas ~ #
I reckon you should note that I've got a bridge between a tap0 interface (OpenVPN) and ethint (internal ethernet interface). So you'll most likely have an ethX instead of br0.
traceroute
Just figured I might as well post this. Since the packets get encapsulated, routing seems a bit strange for the hosts on either end. It looks like packets from a (non-router) host on one network make exactly two hops to reach a (non-router) host on the other network:
pug ~ # traceroute 192.168.42.10
traceroute to 192.168.42.10 (192.168.42.10), 30 hops max, 46 byte packets
 1  xacatecas (192.168.0.1)  0.392 ms  0.154 ms  0.145 ms
 2  192.168.42.1 (192.168.42.1)  68.488 ms  68.914 ms  69.694 ms
 3  192.168.42.10 (192.168.42.10)  69.429 ms  68.907 ms  69.626 ms
pug ~ #
pug has an IP of 192.168.0.3.

