next up previous
Next: User space Utilities Up: PPP and IP tunneling Previous: History

The sk_tunnel Kernel Framework

General Design

The basic design is fairly minimalistic. It uses existing code and interfaces whenever possible. The crucial component which made this possible is the new generic PPP layer present in Linux 2.4.x.

The old (async-only) PPP implementation was tty-centric: it was implemented as a tty line discipline module and could therefore only be interfaced to tty device drivers. The new ppp_generic implementation provides a special object, a struct ppp_channel, which interfaces lower layers to the PPP protocol. The ppp_channel interface is network-centric: it communicates with the other layers by passing sk_buffs, in a manner similar to Linux network device drivers.

Thus, what is essentially needed is a ppp_channel interface to the data path of the socket. The X.25 connection control and any policy decisions remain the task of a user space process. That process actively or passively initiates an X.25 connection to a peer by means of the native socket API (the connect() or accept() system call). Once the control process has decided that it really wants to use the X.25 connection for PPP tunneling, it performs a special ioctl().

Inside the kernel, that ioctl() registers a ppp_channel with the ppp_generic subsystem. After that, the data path of the socket (which is usually connected to user space and accessible by means of the sendmsg() and recvmsg() system calls) is redirected to the ppp_channel. While the data path is redirected to the ppp_channel, the socket also honors certain PPP-related ioctl()s which are needed by pppd.

A new struct sktn_channel is provided which serves as the bridge between the socket's protocol stack and the ppp_generic layer. In terms of the object-oriented programming paradigm, it is a class derived from ppp_channel. In addition to the ppp_channel interface, it supports exchanging sk_buffs with the protocol stack. Its design also accounts for two important generalizations:

First, it was felt that passing sk_buffs between a ppp_channel and an arbitrary existing Linux protocol stack would require methods similar to those needed for the X.25 protocol stack. Thus, sktn_channel was designed to be independent of the protocol. It provides a library API which might be used to implement PPP tunnels on top of other Linux protocol stacks as well.

Second, PPP over X.25 (RFC 1598) is not the only standardized method to transport IP packets by means of an X.25 connection. Another method (RFC 1356) consists of encapsulating the raw IP packets directly inside the X.25 payload data. Passing raw IP packets between the upper layer and the X.25 protocol stack requires code similar to the PPP scenario. Thus, the sktn_channel provides another abstraction and isolates the code which is independent of the upper (PPP or raw-IP) layer. For a raw-IP upper layer, the sk_tunnel framework also implements special sk_tunnel network interfaces, and it supports attaching sktn_channels to such interfaces as an alternative to ppp_generic.

Figure: The sk_tunnel Framework - General Design

Thus, the API provides a generic socket tunnel paradigm which is not specific to any protocol family. It basically adds an ioctl() to the native Linux socket API which attaches the data path of a connected socket to the upper layer. The kernel-side implementation is a library approach which implements the core functionality independently of the protocol family. Thus, although the initial implementation is targeted towards PF_X25, other protocol families can share the code.

Flow Control

One of the core pieces of functionality supported by an sktn_channel is the mapping of output flow control between the different layers.

The Linux network core supports output flow control by maintaining a write memory account for each socket. When a new sk_buff is allocated by means of sock_alloc_send_skb(), the socket's memory counter is increased by the sk_buff's size. When the network device driver has sent out the sk_buff, it calls dev_kfree_skb(), which implicitly decreases the memory counter. There are a high and a low water mark for each socket (adjustable by the SO_SNDBUF socket option), which are usually used to control the state of the writing process. When the allocated write memory exceeds the high water mark, the writing process is blocked. After enough buffers have been freed that the amount of allocated write memory falls below the low water mark, the sleeping writer is woken up again.

Linux 2.4.x network interfaces flow-control the upper layer by calling the netif_stop_queue() and netif_wake_queue() functions (or by setting the dev->tbusy flag in 2.2.x and below). These calls are honored by the Linux network scheduler, which starts or stops sending new TX frames to the device.

Linux 2.4.x ppp_channels are similar. The channel driver calls ppp_output_wakeup() in order to tell the ppp_generic layer that it is ready to accept more frames for transmission. A busy condition is communicated to ppp_generic by means of the return value of ppp_start_xmit().

The sk_tunnel framework maps these mechanisms onto each other. First, it ensures that an output frame received from the upper layer is charged to the tunneling socket's write memory account. Then it compares that account with the threshold values. Depending on the result and the attached upper layer, it calls the appropriate netif_wake_queue(), netif_stop_queue(), or ppp_output_wakeup() function, or controls the return value of ppp_start_xmit().



Henner Eisen
Tue Sep 26 22:25:35 MEST 2000