The power of Erlang bit syntax

By Simon Thompson
March 13, 2009 | Comments: 1

Just finishing off the chapter on bit syntax and pattern matching over bit strings for our Erlang book. We wanted to put in a realistic example, and chose a TCP segment as described here.

It's amazing how expressive the notation can be, and we get a definition which pretty much mirrors the diagram and explanation in the link above: nothing like doing it for yourself to convince you that it works.

decode(Segment) ->
    case Segment of 
	<< SourcePort:16, DestinationPort:16,
	   SequenceNumber:32,
	   AckNumber:32,
	   DataOffset:4, _Reserved:4, Flags:8, WindowSize:16,
	   Checksum:16, UrgentPointer:16,
	   Payload/binary>> when DataOffset>4
	->
	    OptSize = (DataOffset - 5)*32,
	    << Options:OptSize, Message/binary >> = Payload,
	    << CWR:1, ECE:1, URG:1, ACK:1, PSH:1, RST:1, SYN:1, FIN:1 >> = << Flags:8 >>,
	    %% Can now process the Message according to the
	    %% Options (if any) and the flags CWR, ..., FIN. 
	    binary_to_list(Message)
    end.

What is so gratifying about this definition is that the pattern in the case expression is a readable (yet formal) definition of what a segment looks like. It begins with two 16-bit words representing the source and destination ports, which are followed by 32-bit fields for the sequence and acknowledgement numbers.

So far we have matched on byte boundaries, but next we match

DataOffset:4, _Reserved:4, Flags:8, WindowSize:16

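giving the DataOffset, 4 bits which are reserved (and so matched to a "don't care" variable), 8 bits of Flags and the 16-bit WindowSize. After matching the Checksum and UrgentPointer fields, the remainder of the binary is matched to Payload/binary. The DataOffset is the header length in 32-bit words, and the fixed header is five words (20 bytes) long, so the match is guarded by a check that the DataOffset is at least 5 (written DataOffset>4).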
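
As a quick check of the header pattern, here is a minimal sketch that builds a made-up segment and matches its fixed fields. The module name tcp_header_sketch, the functions sample/0 and header_fields/1, and every field value are assumptions for illustration, not code from the book.

```erlang
-module(tcp_header_sketch).
-export([sample/0, header_fields/1]).

%% Build a hypothetical segment: ports 1234 -> 80, DataOffset 5
%% (five 32-bit words, i.e. a 20-byte header with no options),
%% only the SYN flag set, and a 5-byte message.
sample() ->
    <<1234:16, 80:16,
      1000:32,
      0:32,
      5:4, 0:4, 2#00000010:8, 65535:16,
      0:16, 0:16,
      "hello">>.

%% Match the fixed header fields. The guard rejects any segment
%% whose DataOffset is below the 5-word minimum.
header_fields(<<SourcePort:16, DestinationPort:16,
                _SequenceNumber:32,
                _AckNumber:32,
                DataOffset:4, _Reserved:4, Flags:8, _WindowSize:16,
                _Checksum:16, _UrgentPointer:16,
                Payload/binary>>) when DataOffset > 4 ->
    {SourcePort, DestinationPort, DataOffset, Flags, Payload}.
```

Calling tcp_header_sketch:header_fields(tcp_header_sketch:sample()) returns {1234,80,5,2,<<"hello">>}. With DataOffset 5 there are no options, so the payload here is just the message.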

The body of the clause also uses pattern matching. In the statement

<< Options:OptSize, Message/binary >> = Payload,

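any Options are taken from the front of the Payload, where OptSize is their size in bits. A segment with a size but no type specifier is matched as an integer, so Options is bound to an unsigned integer OptSize bits wide; writing Options:OptSize/bitstring would bind it to a binary instead. If there are no options then OptSize is 0, Options is matched to the integer 0 (or to the empty binary <<>> with the /bitstring specifier), and the whole Payload is the Message.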
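
The difference between matching the options as an integer and as a bit string is easy to try out. The following is only a sketch: the module tcp_options_sketch and the functions split_as_integer/2 and split_as_bits/2 are hypothetical names, and the payloads are invented.

```erlang
-module(tcp_options_sketch).
-export([split_as_integer/2, split_as_bits/2]).

%% DataOffset counts 32-bit words, so the options occupy
%% (DataOffset - 5) * 32 bits at the front of the payload.

%% No type specifier: Options is bound to an unsigned integer.
split_as_integer(DataOffset, Payload) when DataOffset > 4 ->
    OptSize = (DataOffset - 5) * 32,
    <<Options:OptSize, Message/binary>> = Payload,
    {Options, Message}.

%% With /bitstring: Options is bound to a binary of OptSize bits.
split_as_bits(DataOffset, Payload) when DataOffset > 4 ->
    OptSize = (DataOffset - 5) * 32,
    <<Options:OptSize/bitstring, Message/binary>> = Payload,
    {Options, Message}.
```

With no options, split_as_integer(5, <<"hi">>) gives {0,<<"hi">>} while split_as_bits(5, <<"hi">>) gives {<<>>,<<"hi">>}. With one word of options, split_as_bits(6, <<1,2,3,4,"hi">>) gives {<<1,2,3,4>>,<<"hi">>}. The bit string form is probably the more convenient one if the options are to be decoded further by pattern matching.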

In either case, the pattern match

<< CWR:1, ECE:1, URG:1, ACK:1, PSH:1, RST:1, SYN:1, FIN:1 >> = << Flags:8 >>,

simultaneously extracts the eight one-bit flags from the Flags byte.
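
To see the flag match in isolation, here is a small sketch that turns a flags byte into a labelled list. The module tcp_flags_sketch, the function flags/1 and the lower-case atom names are assumptions made for this example.

```erlang
-module(tcp_flags_sketch).
-export([flags/1]).

%% Re-pack the flags integer as one byte, then match it against
%% eight one-bit segments, most significant bit first.
flags(Flags) when is_integer(Flags), Flags >= 0, Flags < 256 ->
    <<CWR:1, ECE:1, URG:1, ACK:1, PSH:1, RST:1, SYN:1, FIN:1>> = <<Flags:8>>,
    [{cwr, CWR}, {ece, ECE}, {urg, URG}, {ack, ACK},
     {psh, PSH}, {rst, RST}, {syn, SYN}, {fin, FIN}].
```

For a SYN/ACK segment, tcp_flags_sketch:flags(2#00010010) returns a list in which only ack and syn are 1.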


1 Comment

(Markup/rendering problem:) in Firefox 3.0.14 on WinXP SP3, the source code line

<< CWR:1, ECE:1, URG:1, ACK:1, PSH:1, RST:1, SYN:1, FIN:1 >> = << Flags:8 >>,

is rendered as:

> = >,

which had me baffled for a couple of minutes.
