

Linux Gateway

Due to the wholesale failure of my ISP to do things in a nice way, my ADSL router was rendered useless and I was forced to either set up my own router or be content with a single PC on the internet in a house of 6 computers… I chose to set up my own gateway.

See also Traffic Shaping

Home Setup

Notes to self on how I set up my home network.

Physical Topology

Device    Notes
modem     ADSL modem with 1x phone line socket and 1x ethernet socket. Tends to get clogged for some reason (high latency, but connection stays up)
gateway   Linux host with *one* network adaptor, and nothing much to do
LAN       4 or so PCs, Wii, Xbox, couple of Nintendo DS consoles, etc

Virtual Topology

Here's how it works in practice when a PC is connected up:

  1. PC broadcasts via DHCP for an IP address
  2. Modem (LAN, 192.168.1.1) responds with an IP address + static settings
    • Gateway IP = Gateway (eth0, 192.168.1.2)
    • Primary DNS = Gateway (eth0, 192.168.1.2)
    • Secondary DNS = Modem (LAN, 192.168.1.1)
  3. User of PC starts to browse example.com
  4. PC queries Gateway (eth0, 192.168.1.2) for IP of example.com (1.2.3.4)
    • If Gateway's DNS service does not know the IP it will contact the internet via Modem, as below
  5. PC connects to example.com (1.2.3.4) via Gateway (eth0, 192.168.1.2)
  6. Gateway applies traffic shaping
  7. Gateway forwards the shaped traffic to Modem (LAN)
  8. Modem (WAN) forwards connection to ISP
  9. ISP do their thing
  10. ISP sends response to Modem (WAN)
  11. Modem (LAN) forwards response to Gateway
  12. Gateway applies traffic shaping and forwards response to PC

Configuration

Modem

It's a ZyXEL P-660R-D1 ADSL Modem.

The web interface is fairly limited, so enable the Telnet interface (Advanced → Remote MGMT → Telnet).

$ telnet 192.168.1.1
Trying 192.168.1.1...
Connected to 192.168.1.1.
Escape character is '^]'.
 
Password: ******
Copyright (c) 1994 - 2007 ZyXEL Communications Corp.
P-660R-D1> lan index 1        # Select LAN port 1 (of 1)
enif0 is selected
P-660R-D1> lan dhcp server gateway 192.168.1.2
P-660R-D1> lan save
lan: save ok
P-660R-D1> ip dhcp enif0 status
DHCP on iface enif0 is server
     Start assigned IP address: 192.168.1.2/24
     Number of IP addresses reserved: 192
     Hostname prefix: dhcppc
     DNS server: 192.168.1.2 212.159.13.49
     WINS server: 0.0.0.0 0.0.0.0
     Domain Name : 
     Default gateway: 192.168.1.2
     Lease time: 259200 seconds
     Renewal time: 129600 seconds
     Rebind time: 226800 seconds
     Probing count: 4
slot    state      timer   type  hardware address     hostname
   0  UNCERTAIN          0    0  00 
   1  UNCERTAIN          0    0  00 
   2  UNCERTAIN          0    0  00 
   3  UNCERTAIN          0    0  00 
...
Status:
     Packet InCount: 0, OutCount: 0, DiscardCount: 0
P-660R-D1> exit
Connection closed by foreign host.

Changes are immediate, and persistent. Renew your DHCP lease to get the updated setting, and run route -n to check the routing table, which should look like this:

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     1      0        0 eth0
0.0.0.0         192.168.1.2     0.0.0.0         UG    0      0        0 eth0

Important feature:

  • The default gateway (destination + mask of 0.0.0.0 means “any”) is 192.168.1.2.
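A quick way to check this on a client without eyeballing the table is to pull the default route out of the route -n output. This is only a sketch: default_gateway is a made-up helper name, and it assumes the column layout shown above (Destination, Gateway, Genmask in the first three columns).

```shell
#!/bin/bash
# Print the default gateway from `route -n` style output read on stdin.
# The default route is the row whose Destination and Genmask are both 0.0.0.0.
# (default_gateway is a hypothetical helper, not a standard tool.)
default_gateway() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { print $2; exit }'
}
```

On a client with a fresh lease, route -n | default_gateway should then print the gateway's address rather than the modem's.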

Gateway

Desired routing table:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     1      0        0 eth0
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0

Important features:

  • All LAN traffic goes via eth0
  • The rest (internet traffic) should be forwarded through 192.168.1.1 (Modem LAN)

Desired DNS server list:

# Generated by NetworkManager
nameserver 127.0.0.1
nameserver 192.168.1.1

Configuration files to edit:

/etc/dhcp3/dhclient.conf 1):

option rfc3442-classless-static-routes code 121 = array of unsigned integer 8;

send host-name "<hostname>";
send dhcp-requested-address 192.168.1.2;
supersede domain-name "local robmeerman.co.uk";
supersede routers 192.168.1.1;
prepend domain-name-servers 127.0.0.1;

request subnet-mask, broadcast-address, time-offset, routers,
        domain-name, domain-name-servers, domain-search, host-name,
        netbios-name-servers, netbios-scope, interface-mtu,
        rfc3442-classless-static-routes, ntp-servers;
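After renewing the gateway's own lease, the resulting resolver list can be checked against the desired DNS server list above. The helpers below are a sketch with invented names (nameservers, local_dns_first); they only read a resolv.conf-style file and assume the local caching server is expected at 127.0.0.1.

```shell
#!/bin/bash
# Print nameserver addresses in the order they appear in the file.
# $1 = file to read (defaults to /etc/resolv.conf).
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed only if the first nameserver listed is the local caching server.
local_dns_first() {
    [ "$(nameservers "$1" | head -n 1)" = "127.0.0.1" ]
}
```

For example, local_dns_first || echo "local DNS is not first" flags a lease that did not pick up the prepend line.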

IP Forwarding and NAT

On-the-fly:

Taken from http://www.technize.com/2007/05/03/configuring-a-nat-gateway-in-linux/

echo 1 > /proc/sys/net/ipv4/ip_forward

Persistent:

Edit /etc/sysctl.conf:

# Uncomment the next line to enable packet forwarding for IPv4
net.ipv4.ip_forward=1
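To confirm that forwarding is actually on (for instance after a reboot, to check the sysctl.conf change took effect), the same /proc file can be read back. A minimal sketch; forwarding_enabled is a hypothetical helper name, and the optional argument exists only so the check can be pointed at a different file.

```shell
#!/bin/bash
# Succeed if IPv4 forwarding is switched on.
# $1 = file to read (defaults to the kernel's live setting).
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}
```

Usage: forwarding_enabled && echo "forwarding on" || echo "forwarding off".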

Disabling ICMP Host Redirection

As you probably noticed from the physical topology diagram, there is only one network interface on the gateway PC, and so you may find that the gateway PC informs all of its clients that they can talk to the modem directly:

PING google.com (173.194.37.104) 56(84) bytes of data.
From skuld.local (192.168.1.2): icmp_seq=1 Redirect Host(New nexthop: 192.168.1.1)
64 bytes from lhr14s02-in-f104.1e100.net (173.194.37.104): icmp_seq=1 ttl=57 time=15.4 ms

This can be disabled on-the-fly via:

echo 0 | sudo tee /proc/sys/net/ipv4/conf/*/accept_redirects
echo 0 | sudo tee /proc/sys/net/ipv4/conf/*/send_redirects

Update 2013-10: This guide used to update /proc/sys/net/ipv4/conf/all/accept_redirects, but now uses * in place of all. That was bad as the all configuration merely sets the default, but won't alter any existing interfaces. Thanks to unix.stackexchange.com for this tip.

Or permanently by adding the following to /etc/sysctl.conf. Again, be on the safe side and explicitly name your interfaces:

net/ipv4/conf/all/accept_redirects = 0
net/ipv4/conf/all/send_redirects = 0
net/ipv4/conf/eth0/accept_redirects = 0
net/ipv4/conf/eth0/send_redirects = 0

See http://www.itsyourip.com/Security/how-to-disable-icmp-redirects-in-linux-for-security-redhatdebianubuntususe-tested/
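If the gateway has more interfaces than just eth0, the sysctl.conf lines can be generated rather than typed out. This is a sketch only: redirect_sysctl_lines is an invented helper that prints lines in the same slash-separated style used above, for "all" plus whichever interfaces are named.

```shell
#!/bin/bash
# Print sysctl.conf lines that disable ICMP redirects for "all" and for
# every interface named on the command line. Nothing is written to disk.
redirect_sysctl_lines() {
    local iface key
    for iface in all "$@"; do
        for key in accept_redirects send_redirects; do
            echo "net/ipv4/conf/$iface/$key = 0"
        done
    done
}
```

Running redirect_sysctl_lines eth0 reproduces the four lines listed above; review the output before appending it to /etc/sysctl.conf.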

DNS Service

sudo aptitude install bind9
  1. Enable caching
    • sudoedit /etc/bind/named.conf.options
    • Uncomment the forwarders section and add ISP DNS server IPs:
          forwarders {
              212.159.13.49;
              212.159.13.50;
          };
    • sudo service bind9 restart
  2. Alias ikari.robmeerman.co.uk (real public domain name) to a private IP. (This is not required if you have search robmeerman.co.uk in /etc/resolv.conf)
    • sudoedit /etc/bind/named.conf.local
    • // LAN hosts
      zone "ikari.robmeerman.co.uk" {
          type master;
          file "/etc/bind/db.lan.ikari";
      };   
    • sudoedit /etc/bind/db.lan.ikari
    • ;
      ; BIND data file for local area network (LAN)
      ;   
      $TTL    604800
      @       IN      SOA     ns.localhost. root.localhost. (
                                    1         ; Serial
                               604800         ; Refresh
                                86400         ; Retry
                              2419200         ; Expire
                               604800 )       ; Negative Cache TTL
      ;   
      @       IN      NS      ns.localhost.
      
      @       IN      A       192.168.1.2 ; Zone's address
      *       IN      A       192.168.1.2 ; Wildcard (all sub-domains)
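If the private IP ever changes, the zone file can be regenerated rather than edited by hand. The function below is a sketch that prints the same records as the file above for a given address; lan_zone and its arguments are made up for illustration.

```shell
#!/bin/bash
# Print a wildcard zone file that points a name and all of its sub-domains
# at a single address.
# $1 = IPv4 address, $2 = serial number (defaults to 1, as in the file above).
lan_zone() {
    local ip=$1 serial=${2:-1}
    cat <<EOF
;
; BIND data file for local area network (LAN)
;
\$TTL    604800
@       IN      SOA     ns.localhost. root.localhost. (
                        $serial         ; Serial
                        604800         ; Refresh
                        86400          ; Retry
                        2419200        ; Expire
                        604800 )       ; Negative Cache TTL
;
@       IN      NS      ns.localhost.

@       IN      A       $ip ; Zone's address
*       IN      A       $ip ; Wildcard (all sub-domains)
EOF
}
```

Something like lan_zone 192.168.1.2 2 | sudo tee /etc/bind/db.lan.ikari would write it out; named-checkzone ikari.robmeerman.co.uk /etc/bind/db.lan.ikari can then sanity-check the result before restarting bind9.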

Traffic Shaping

sudo aptitude install wondershaper
 
# Assuming downlink == 3712 kbps / uplink == 448 kbps
# (wondershaper takes both rates in kbit/s)
sudo wondershaper eth0 3712 448

I used to use Ubuntu's stock wondershaper package, but now use my own adaptation of it that does *not* shape or police LAN traffic. This allows my gateway PC to double as a file server: internet traffic is shaped and policed to match my ADSL line speeds, while file-server (local) traffic runs at gigabit speeds.

See my ADSL project on GitHub: https://github.com/meermanr/adsl

wondershaper
#!/bin/bash -e
 
# Adapted from http://lartc.org/wondershaper/
 
DEV=$1
DOWNLINK=$2
UPLINK=$3
 
# The following fudge factors allow you to express the usable % of your link.  
#
# Experimentation has shown that ~75% of the author's ADSL downlink can be used 
# before upstream congestion starts to affect round-trip times. In other words, 
# by throttling our download speeds we can ensure that our ISP does not queue 
# any packets on our behalf, giving us full control over congestion.
DOWNFACTOR='74/100'
UPFACTOR='75/100'
 
if [ "x$DEV" = "x" ]
then
    echo "Usage: $0 (DEV) [ 'clear' | (DOWNLINK kbit/s) (UPLINK kbit/s) ]"
    exit 0
fi
 
# Display status when DOWNLINK/UPLINK are omitted
if [ "x$DOWNLINK" = "x" ]
then
    #echo "--------------------------------------------------------------------------------"
    #iptables -nvL -t mangle
    #echo "--------------------------------------------------------------------------------"
    tc -s filter ls dev $DEV
    echo "--------------------------------------------------------------------------------"
    #tc -s qdisc ls dev $DEV
    #echo "--------------------------------------------------------------------------------"
    tc -s class ls dev $DEV
    exit 0
fi
 
# Clear both IN and OUT
tc qdisc del dev $DEV root    2> /dev/null > /dev/null || true
tc qdisc del dev $DEV ingress 2> /dev/null > /dev/null || true
 
# Flush and delete all filter and mangle rules
iptables -F 
iptables -X 
iptables -t mangle -F 
iptables -t mangle -X 
 
if [ "x$DOWNLINK" = "xclear" ]
then
    echo "Cleared traffic rules on $DEV"
    exit 0
fi
 
trap "$0 $1 clear" ERR
 
# Calculations
#
# Target latency is < 50ms. This means max burst length should be limited to 
# 1/20th the queue's rate.
 
 
LOCALIP=$(ifconfig $DEV | sed -ne 's/^.*inet addr:\([0-9.]\+\).*/\1/p')
 
# =============================================================================
# Queues and Classes
# =============================================================================
# 1: ROOT
# |-- 1:ff LOCAL_TRAFFIC (to/from this host itself)
# | `-- ff: (sfq)
# |-- 1:1 INTERNET->LAN (downlink)
# | `-- 10: (red) Drop traffic as link approaches congestion
# `-- 1:2 LAN->INTERNET (uplink)
#   |-- 1:21: High priority
#   | `-- 21: (sfq)
#   |-- 1:22: Medium priority
#   | `-- 22: (sfq)
#   `-- 1:23: Low priority
#     `-- 23: (sfq) Low priority
 
# ROOT
tc qdisc add dev $DEV root handle 1: htb
 
    # LOCAL TRAFFIC
    tc class add dev $DEV parent 1: classid 1:ff htb \
        rate 100mbit \
        burst $((100/20))mbit \
        cburst $((100/20))mbit \
        prio 1
 
        # .. and its actual queue that holds the packets
        tc qdisc add dev $DEV parent 1:ff handle ff: sfq perturb 10
 
    # INTERNET->LAN (downlink)
    tc class add dev $DEV parent 1: classid 1:1 htb \
        rate $(($DOWNLINK*$DOWNFACTOR))kbit \
        ceil $(($DOWNLINK*$DOWNFACTOR))kbit \
        burst $(($DOWNLINK*$DOWNFACTOR/20))kbit \
        cburst $(($DOWNLINK*$DOWNFACTOR/20))kbit \
        prio 10
 
        # .. and its actual queue that holds the packets
        # Note: All values are in BYTES. It doesn't seem to accept "kbit"
        #
        # The burst calculation needs to be increased by one so as to avoid an 
        # internal assert in the qdisc (seems our target and their min 
        # acceptable burst are one and the same)
        tc qdisc add dev $DEV parent 1:1 handle 10: red \
            limit $(($DOWNLINK*$DOWNFACTOR*1000/8)) \
            avpkt 1500 \
            burst $((($DOWNLINK*1000/8/20/1500)+1)) \
            min   $(($DOWNLINK*1000/8/20)) \
            max   $(($DOWNLINK*$DOWNFACTOR*1000/8)) \
            ecn \
            probability 1
 
    # LAN->INTERNET (uplink)
    tc class add dev $DEV parent 1: classid 1:2 htb \
        rate $(($UPLINK*$UPFACTOR))kbit \
        ceil $(($UPLINK*$UPFACTOR))kbit \
        burst $(($UPLINK/20))kbit \
        cburst $(($UPLINK/20))kbit \
        prio 20
 
        # High priority
        tc class add dev $DEV parent 1:2 classid 1:21 htb \
            rate $(($UPLINK*$UPFACTOR*4/6))kbit \
            ceil $(($UPLINK*$UPFACTOR))kbit \
            prio 0
 
        # Medium priority
        tc class add dev $DEV parent 1:2 classid 1:22 htb \
            rate $(($UPLINK*$UPFACTOR*2/6))kbit \
            ceil $(($UPLINK*$UPFACTOR))kbit \
            prio 1
 
        # Low priority
        tc class add dev $DEV parent 1:2 classid 1:23 htb \
            rate $(($UPLINK*$UPFACTOR*1/6))kbit \
            prio 2
 
        # .. and their actual queues that hold the packets
        for ID in 21 22 23
        do
            tc qdisc add dev $DEV parent 1:$ID handle $ID: sfq
            ## tc qdisc add dev $DEV parent 1:$ID handle $ID: red \
            ##     limit $(($UPLINK*$UPFACTOR*1000/8)) \
            ##     avpkt 1500 \
            ##     burst $((($UPLINK*1000/8/20/1500)+1)) \
            ##     min   $(($UPLINK*1000/8/20)) \
            ##     max   $(($UPLINK*$UPFACTOR*1000/8)) \
            ##     ecn \
            ##     probability 1
        done
 
 
# =============================================================================
# Filters
# =============================================================================
 
# -----------------------------------------------------------------------------
# LOCAL TRAFFIC
# Mark traffic generated by this host itself (INPUT + OUTPUT, but not FORWARD)
iptables -t mangle -A INPUT  -p all -i $DEV -j MARK --set-mark 0xff
iptables -t mangle -A OUTPUT -p all -o $DEV -j MARK --set-mark 0xff
 
# ("fw" means the handle refers to a MARK, rather than a qdisc)
tc filter add dev $DEV parent 1: protocol ip prio 1 handle 0xff fw classid 1:ff
 
# -----------------------------------------------------------------------------
# INTERNET->LAN (downlink)
#
# Note: We assume that LAN->LAN traffic is *not* forwarded through this host, 
# and so we need only check the destination of a given packet. We've already 
# taken care of this host's own traffic above.
 
iptables -t mangle -N DOWNLINK
iptables -t mangle -A DOWNLINK -p all -j MARK --set-mark 0x1
tc filter add dev $DEV parent 1: protocol ip prio 2 handle 0x1 fw classid 1:1
 
for SUBNET in 192.168.0.0/16 10.0.0.0/8 172.16.0.0/12 
do
    iptables -t mangle -A PREROUTING -p all -i $DEV ! -s $SUBNET -d $SUBNET -j DOWNLINK
done
 
 
# -----------------------------------------------------------------------------
# LAN->INTERNET (uplink)
#
# Note: Assumes that all downlink and private traffic have already been 
# classified, so no source checks are performed.
 
iptables -t mangle -N UPLINK
iptables -t mangle -A UPLINK -p all -j MARK --set-mark 0x22     # Default to medium priority
#for CHAIN in PREROUTING INPUT FORWARD OUTPUT POSTROUTING DOWNLINK UPLINK
#do
#    iptables -t mangle -I $CHAIN -p tcp --sport 12345 -j MARK --set-mark 0/0
#    iptables -t mangle -I $CHAIN -p tcp --dport 12345 -j MARK --set-mark 0/0
#done
 
for SUBNET in 192.168.0.0/16 10.0.0.0/8 172.16.0.0/12 
do
    iptables -t mangle -A PREROUTING -p all -i $DEV -s $SUBNET ! -d $SUBNET -j UPLINK
done
 
##
## HIGH PRIORITY ##
##
 
# TOS Minimum Delay (ssh, NOT scp)
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip tos 0x10 0xff \
    flowid 1:21
 
# ICMP (ip protocol 1) in the interactive class so we can do measurements & 
# impress our friends:
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip protocol 1 0xff \
    flowid 1:21
 
# Prioritize small packets (<64 bytes)
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip protocol 6 0xff \
    match u8 0x05 0x0f at 0 \
    match u16 0x0000 0xffc0 at 2 \
    flowid 1:21
 
# Prioritise ACK packets (but only if they are small)
# IP protocol 6,
# IP header length 0x5(32 bit words),
# IP Total length 0x34 (ACK + 12 bytes of TCP options)
# TCP ack set (bit 5, offset 33)
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip protocol 6 0xff \
    match u8 0x05 0x0f at 0 \
    match u16 0x0000 0xffc0 at 2 \
    match u8 0x10 0xff at 33 \
    flowid 1:21
 
# Traffic headed to robmeerman.co.uk (typically SSH proxying to else where)
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip dst 85.119.82.218/32 \
    flowid 1:21
 
# Traffic originating from the Xbox should be treated as urgent
tc filter add dev $DEV parent 1: protocol ip prio 20 u32 \
    match ip src 192.168.1.2/32 \
    flowid 1:21
 
 
##
## LOW PRIORITY ##
##
 
# # WiiU, while it's downloading purchases
# tc filter add dev $DEV parent 1: protocol ip prio 30 u32 \
#     match ip src 192.168.1.5/32 \
#     flowid 1:23
 
# TOS High Throughput
tc filter add dev $DEV parent 1: protocol ip prio 30 u32 \
    match ip tos 0x8 0xff \
    flowid 1:23
 
# If no other filter has classified the packet, then use FW markers (set by 
# iptables -j MARK). All UPLINK packets are marked as 0x22 by default (see 
# iptables command earlier)
tc filter add dev $DEV parent 1: protocol ip prio 40 handle 0x21 fw classid 1:21 # High priority
tc filter add dev $DEV parent 1: protocol ip prio 40 handle 0x22 fw classid 1:22 # Medium priority
tc filter add dev $DEV parent 1: protocol ip prio 40 handle 0x23 fw classid 1:23 # Low priority
 
 
# Reset counters, so that packet counts are in sync (it takes time to add 
# rules, and during that time the first rule added may be hit, leading to 
# confusing packet counts: "But these rules should always apply to the same 
# packets! How can their hit count be different?")
iptables -t mangle -Z
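The fudge factors in the script are plain strings spliced into shell arithmetic, so every rate is worked out with integer maths, left to right. The sketch below repeats those expressions outside of tc so the resulting numbers can be inspected; shaped_rates is a made-up name, and the expressions are copied from the class definitions above.

```shell
#!/bin/bash
# Reproduce the script's rate arithmetic for a given line speed in kbit/s.
# The factors expand as text, e.g. 3712*74/100, and are evaluated left to
# right with integer division at each step.
DOWNFACTOR='74/100'
UPFACTOR='75/100'

# $1 = downlink kbit/s, $2 = uplink kbit/s
shaped_rates() {
    local DOWNLINK=$1 UPLINK=$2
    echo "downlink_rate=$(($DOWNLINK*$DOWNFACTOR))"
    echo "downlink_burst=$(($DOWNLINK*$DOWNFACTOR/20))"
    echo "uplink_rate=$(($UPLINK*$UPFACTOR))"
    echo "uplink_high=$(($UPLINK*$UPFACTOR*4/6))"
    echo "uplink_medium=$(($UPLINK*$UPFACTOR*2/6))"
    echo "uplink_low=$(($UPLINK*$UPFACTOR*1/6))"
}
```

For the 3712/448 kbit/s line used earlier, shaped_rates 3712 448 gives a downlink rate of 2746 kbit and an uplink rate of 336 kbit, with the uplink split into 224, 112 and 56 kbit for the high, medium and low priority classes.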

Transparent Web Proxy

sudo aptitude install squid

to install Squid v2.7.

Then edit /etc/squid/squid.conf so that

  1. the http_port tag is set to http_port 3128 transparent
  2. the http_access allow localnet is uncommented

Restart Squid (sudo service squid restart) and then forcibly redirect web traffic to the proxy:

iptables -t nat -A PREROUTING ! -d 192.168.0.0/16 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 3128

The 1 NIC problem

At first I didn't have a PC with two network cards, so I found a way to do it with one network card and a lot of ugly hacks and tricks. Sadly I did this long before I wrote this page, so I can't recall the details. But for those having similar problems here was my solution.

My Solution

I use WinXP on my laptop, and happened to have a copy of VMware 2) installed, so I set up a new virtual machine with two NICs and inserted my trusty Knoppix Linux LiveCD 3). Once booted, I used the Linux IP Masquerade HOWTO to get things going.

Amazingly, this worked! I had 3 IPs on one NIC: 2 for the virtual machine running Knoppix, and 1 for Windows itself. IIRC, all 3 had separate MAC addresses too.

I didn't keep this setup for long, as my laptop is portable and I didn't want it tied to the house.

My rc.firewall-iptables script

The famous (perhaps even “standard”) way of making a Linux platform into a NAT router is to use a script called rc.firewall-iptables from the Linux IP Masquerade HOWTO. While this definitely works, it's a bit tricky to use, especially when adding new port-forwarding rules, which is something I do fairly regularly.

So I spent an afternoon doing a bit of BASH scripting and, based on the original script, produced the script below, which I hope some will find useful.

Download rc.firewall (14kB)

What's so special about it?

Well, it has a very nice block where you can set up port forwarding via simple lists using the Windows computer names, which means that if your network uses DHCP and the IP addresses of your computers change sometimes, you'll have no problem if you simply schedule the script to run periodically. It is also nice in that it closes ports when the computer they are being forwarded to is offline.

Example of configuration block of script:

        EXTIP=`ifconfig eth0 | egrep -o '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n1`
        echo "   External IP detected as $EXTIP";
 
        # Local services exposed
        LOCALTCPPORTS="22"
        LOCALUDPPORTS=""
 
        # PCs to forward connections to, using names in /etc/hosts or NetBIOS
        PORTFWPC[0]="Ikari"
        TCPPORTS[0]="80 26346 113 4899 1024 5190"
        UDPPORTS[0]="26346"
 
        PORTFWPC[1]="Kirara"
        TCPPORTS[1]="5443 2902 56881"
        UDPPORTS[1]="2902 56881"
 
        PORTFWPC[2]="Mum"
        TCPPORTS[2]="4662 26346"
        UDPPORTS[2]="4672 26346"

Notes

  • My internet connection is on eth0, my first network interface card (NIC), and the IP address changes when we have a power-cut or my ISP decides to cut us off due to bad management and faulty hardware :-|
  • I want port 22 of the gateway machine to be exposed. (Anything not listed there is closed to the public)
  • The computers are called “Ikari”, “Kirara” and “Mum”.

Running the script will produce output like so:

Loading simple rc.firewall version 0.78..

   External Interface:  eth0
   Internal Interface:  eth1
   loading modules:
----------------------------------------------------------------------
ip_tables, ip_conntrack, ip_conntrack_ftp, ip_conntrack_irc, iptable_nat, ip_nat_ftp,
----------------------------------------------------------------------
   Done loading modules.

   Enabling forwarding..
   Clearing any existing rules and setting default policy..
   External IP detected as 10.150.47.24
   Closing all external ports but allowing ICMP...
    - TCP 22 reopened
  Allowing existing and related connections to local servives, rejecting all other non-ICMP traffic
   Ikari found in /etc/hosts
   Forwarding incoming connections to Ikari (192.168.0.4) by port...
   - TCP 80
   - TCP 26346
   - TCP 113
   - TCP 4899
   - TCP 1024
   - TCP 5190
   - UDP 26346
   Using NetBIOS to ask for Kirara
   Forwarding incoming connections to Kirara (192.168.0.12) by port...
   - TCP 5443
   - TCP 2902
   - TCP 56881
   - UDP 2902
   - UDP 56881
   Using NetBIOS to ask for Mum
    Unable to obtain valid IP address, skipping Mum
   FWD: Allow all connections OUT and only existing and related ones IN
   Enabling SNAT (MASQUERADE) functionality on eth0

rc.firewall-iptables v0.78 done.

Notice that it skips “Mum” as it (the computer) is not on at the moment.
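For readers who want the gist without downloading the script, the loop over the configuration arrays might look roughly like the sketch below. It is not the actual rc.firewall code: resolve_host and emit_forward_rules are invented names, the lookup only uses getent (no NetBIOS), the sample arrays are cut down, and the rules are printed rather than applied.

```shell
#!/bin/bash
# Parallel arrays in the style of the configuration block (sample values).
PORTFWPC=("Ikari" "Mum")
TCPPORTS=("80 113" "4662")
UDPPORTS=("26346" "4672")

# Hypothetical resolver: ask the system resolver (/etc/hosts, DNS) for a name.
# Prints nothing if the name cannot be resolved.
resolve_host() {
    getent hosts "$1" | awk '{ print $1; exit }'
}

# Print (not run) one DNAT rule per forwarded port, skipping hosts that do
# not resolve. $1 = external interface, $2 = resolver function to use.
emit_forward_rules() {
    local extif=$1 resolver=${2:-resolve_host} i ip port proto
    for i in "${!PORTFWPC[@]}"; do
        ip=$("$resolver" "${PORTFWPC[$i]}")
        if [ -z "$ip" ]; then
            echo "# skipping ${PORTFWPC[$i]}: no address found"
            continue
        fi
        for proto in tcp udp; do
            # Pick the port list that matches the protocol
            if [ "$proto" = tcp ]; then ports=${TCPPORTS[$i]}; else ports=${UDPPORTS[$i]}; fi
            for port in $ports; do
                echo "iptables -t nat -A PREROUTING -i $extif -p $proto --dport $port -j DNAT --to-destination $ip:$port"
            done
        done
    done
}
```

Calling emit_forward_rules eth0 prints the rules for review. Because the addresses are looked up on every run, scheduling it periodically picks up DHCP changes, which is the property described above.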

In case you're thinking “External IP of 10.x.x.x??”, you're quite right. But that's the IP my ADSL provider gives me, so for all intents and purposes, it's my external IP, even if it isn't what the rest of the net sees.

Request for Feedback: Feel free to contact me about this script, or anything else mentioned/implied by this page.

1)
The <hostname> text is literal; it seems that dhclient expands it at the right time somehow.
2)
A “PC Emulator”, it creates a blank virtual PC for you to do what you like with.
3)
This is a bootable copy of Debian Linux, which is famous for having a complete toolset and great hardware auto-detection
unix/gateway.1381667905.txt.gz · Last modified: 2013/10/13 12:38 by robm