Source Wars - Return of the Puffy: What's New in OpenBSD 4.4

By Federico Biancuzzi
November 3, 2008

A long time ago in a galaxy far, far away...

Jedi apprentice Federico Biancuzzi contacted the Council and interviewed 27 Master Developers to talk about how they liberated OpenBSD 4.4 from the Empire. Details on the operation are not completely disclosed yet, but you can already see a picture of the Uniform, of the Team, and of the elite PuffySet.

Puffy Sourcewalker chose to dedicate this victory against the Empire to Keith Bostic, Mike Karels, Kirk McKusick, and all of those who contributed to making Net/2 and 4.4BSD-Lite free. The Source runs strong in you!

The Masters discussed buffer cache improvements, the new malloc(), the work to make the math library more C99 compliant, what is new in the SCSI area, crypto support for softraid, the fundamental work that happened in PF, whether Reyk joined the dark side, a new tool to merge configuration files during upgrades, the status of OpenCVS, some cool features of OpenSSH 5.1, the initial support for USB webcams, the never-ending work on improving and extending the sensors framework, and mooore.

May the Source be with you!

Kernel-land and Libraries

How did you modify the buffer cache subsystem to improve its performance so much?

Artur Grabowski: The goal of the work we did wasn't to improve performance. I suspected that there would be some performance loss, and I had already started preparing excuses for when people started to whine. We were actually surprised by the performance gains.

The buffer cache used to be a static number of pages mapped into a static amount of virtual space. What we did is to leave the physical pages unmapped until they are actually needed. We need this change to be able to have many more physical pages in the buffer cache, to be able to do some interesting tricks for DMA down to the hardware, and also so that we can map pages other than strictly buffer cache pages into buffers.

This was mostly just laying foundation for future work on the I/O subsystem.

I saw this entry in the changelog: "Disallow processes from mapping their own page 0". Would you like to tell us more about it?

Miod Vallat: This is a simple change, designed to make exploits through NULL pointer dereferences in the kernel more difficult.

On many architectures, the kernel (supervisor) address space and the userland address space are shared. The lower addresses are used for the userland processes, and the kernel uses a small part at the top of the address space.

If a malicious userland application uses the mmap() system call to map executable memory at address zero, fills this memory with code, and manages to get the kernel to dereference a NULL function pointer, it will end up running the code the application has set up.

There was such a malicious application presented a couple of years ago, which caused a local root compromise through a bug in an ioctl handler in the kernel on i386.

This was caused by a missing pointer validation check -- the kernel ought to have checked the callback pointer before invoking it. Of course, that bug has long been fixed, but there might be other unnoticed occurrences of such a bug.

On systems with so-called "split" address spaces, this problem cannot occur -- address zero in the kernel is different from address zero in userland. These architectures (such as sparc64) are immune to this kind of problem.

On vulnerable architectures, the ideal solution would be to disable the userland part of the address space when entering the kernel, and re-enable it when returning to userland, or when copying data between kernel and userland is necessary. However, this would be quite painful to implement, and it would have a significant performance impact because of translation buffer invalidations.

So, after some thinking, it was decided to go the simplest path and simply prevent userland code from mapping memory at address zero (and thus controlling its contents). Of course, this does not fix the potential problem per se; but it raises the bar, as getting the kernel to invoke a function pointer with a specific, not NULL-based, value is much more difficult.

How is the work on rthreads going?

Philip Guenther: Kurt Miller and I managed to track down and close a race between pthread_join() and pthread_exit() in librthread that could crash your process, and then implemented wrapper functions for fork() and vfork(). I had hit the need for the latter while testing using rthread with the pidgin IM client: it crashed immediately at startup when it forked and then tried to do DNS queries from the child, as the structures used by rthreads to provide thread-local storage to libc's resolver needed to be reinitialized in the child. So librthread needed to wrap fork() for the same reasons as pthread, albeit without as much complexity.

On the kernel side, signal handlers are now properly shared between rthreads. There are still issues around signal handling with rthreads in 4.4, as the patches in that area weren't stable enough to beat the 4.4 tree lock, but they're starting to go in: -current already includes a race-free sigwait(), proper threads-per-uid tracking for the per-user NPROC rlimit, and closes a race in cleanup when a process with multiple threads exits. It's better, but there's still a goodly ways to go before rthreads can replace pthreads, of course.

How is work on Position Independent Executables (PIE) going?

Kurt Miller: For 4.4 I started working on PIE support in OpenBSD. While 4.4 will not support PIE yet, some important progress was made during the release cycle. In particular, improvements were made so that the runtime linker would relocate the executable if it was loaded at a different address than it was linked at. Also, the kernel was extended to recognize and load a PIE program at a random address. Work on PIE is continuing post-4.4.

Would you like to present the new version of malloc(3) you have developed?

Otto Moerbeek: The main change in malloc(3) has been the way used pages are being tracked. Malloc has a long history, and the main data structure was written for an sbrk(2) based system where malloc gets contiguous memory from the kernel. When we moved to a mmap(2) based malloc a few years back, the pages given to malloc by the kernel got a random location. The central data structure used by malloc was adapted for that, but suffered from being too complex and inefficient. So I rewrote that to make use of a data structure adapted for random page addresses, and made some changes along the way. Now we have a malloc that's faster, wastes fewer resources and has more features: it randomizes chunk allocation and cache maintenance. Furthermore, it has the nice trick of being able to move some small allocations to the end of a page, so more buffer overflows (both reads and writes) get caught since they are likely to access unmapped memory.
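
The "end of a page" trick can be pictured with a little address arithmetic. The following is only a sketch: the 4096-byte page size, the 16-byte alignment, and the function names are assumptions made for illustration, not values or code taken from OpenBSD's malloc.

```python
PAGE_SIZE = 4096   # assumed page size, for illustration only
ALIGNMENT = 16     # assumed alignment requirement, for illustration only


def place_at_page_end(page_base: int, size: int) -> int:
    """Return the start address that puts `size` bytes as close to the
    end of the page as the alignment allows."""
    if not 0 < size <= PAGE_SIZE:
        raise ValueError("size must fit in one page")
    # Push the allocation towards the end, then round down to the alignment.
    offset = (PAGE_SIZE - size) & ~(ALIGNMENT - 1)
    return page_base + offset


def slack_before_next_page(page_base: int, size: int) -> int:
    """Bytes between the end of the allocation and the next page."""
    start = place_at_page_end(page_base, size)
    return page_base + PAGE_SIZE - (start + size)
```

With these assumed values, a 64-byte allocation ends exactly at the page boundary, so an overflow of even one byte touches the next page, which is likely to be unmapped. A 100-byte allocation leaves 12 bytes of slack because of the alignment.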

I heard that we can finally use more than 4GB of RAM on amd64. Is that true? What did you have to fix to achieve this remarkable result?

Tobias Weingartner: Ahh, bigmem. Well, it was removed like hours before the final 4.4 was cut. Turned out that there was a problem with the diff. I'm currently working on getting the final pieces finished so that we can re-enable the bigmem stuff.

The main reason that the bigmem diff was turned off again was the lack of testing from the community. It is sad that such things are happening. Possibly they think that OpenBSD is so good, that they don't need to test diffs anymore? :)

What does strtof(3) offer? Why did you add it to libc?

Martynas Venckus: strtof(3) is used for string conversion to float representation. Since it is one of the standard ANSI C99 functions, applications expect to use it.

It all started three months ago. Landry Breuil (landry@) needed strtof(3) for applications he was porting to OpenBSD. Landry wrote a simple wrapper with the bounding check, to call strtod(3).

Recently, Theo pointed me to David M. Gay's gdtoa implementation. I've replaced our ancient dtoa with it. This also brought us the real implementations of ANSI C99 functions strtof(3) and strtold(3), the extended-precision version.

I found an interesting entry in the changelog about a more accurate version of tan() in libm, and a bug fix for pow() and powf(). Would you like to tell us more?

Martynas Venckus: pow(3) and powf(3) gave incorrect results when x was close to -1.0 and y was very large. Correct results are close to -e and -1/e. tan(3) and tanf(3) were switched to use a more accurate algorithm. Previously, the error was >1 ulp, and the worst error was at least 1.45 ulp. This affected applications using those functions. The changes were merged from fdlibm 5.3, which our libm is based on.
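
One way to see where -e and -1/e come from is the classic limit of (1 + 1/n)^n. This reading of the remark is an assumption, and the sketch below uses Python's math.pow rather than OpenBSD's libm; the function name and the choice of n are illustrative.

```python
import math


def pow_near_minus_one(n: int):
    """Evaluate pow(x, y) for x just below and just above -1 with y = n.

    n must be odd so that a negative base gives a negative result.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    y = float(n)
    below = math.pow(-(1.0 + 1.0 / n), y)   # tends to -e as n grows
    above = math.pow(-(1.0 - 1.0 / n), y)   # tends to -1/e as n grows
    return below, above


toward_minus_e, toward_minus_inv_e = pow_near_minus_one(1_000_001)
```

A pow() that mishandles this region would return values far from these two limits.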

A very important effort that's being worked on is getting the math library more C99 compliant. Since the functions are standard ANSI C99 functions, applications expect to use them. Our libm misses quite a lot of them.

Some months ago I once again needed to get FreeMat (an application from ports that Steven Mestdagh (steven@) originally pointed me at) up to date. FreeMat is quite a problematic application to work on; every release exposes a bug in either FreeMat or in our base. The last releases needed fixes for gcc and libm.

This time FreeMat (and other applications, such as clisp) needed tgamma(3), the Gamma function, which OpenBSD's libm didn't have. Some ports add (usually broken) work-arounds for missing functions; I did not feel that's the way to go. So we were stuck with the older versions.

Recently I've added tgamma, and other math functions (single and double-precision versions) such as exp2 (which Charles Longeau (chl@) pointed me to), nan, remquo, fpclassify, is*, signbit to libc and libm. Applications were happy.

What work was necessary to make the statfs(2) system call compatible with large filesystems? What are its size limits now?

Otto Moerbeek: Basically the issue is just adapting some fields holding block counts to 64 bits. So, the maximum size statfs(2) can handle will be 2^63-1 * 512 (some of these numbers are signed). The multiplication factor will be larger for file systems using a larger blocksize. But to keep being able to run old executables using newer kernels, a compatibility mechanism was introduced by assigning the syscalls new numbers and providing backward compatible code for the old syscall numbers. Probably more time went into writing and verifying the backward compatibility code than into the new code itself.
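
The arithmetic behind that limit is easy to check. This sketch only restates the numbers given above (a signed 64-bit block count and 512-byte blocks); the function names are illustrative.

```python
MAX_BLOCK_COUNT = 2**63 - 1   # largest value of a signed 64-bit block count


def max_filesystem_bytes(block_size: int = 512) -> int:
    """Largest size, in bytes, expressible with the given block size."""
    return MAX_BLOCK_COUNT * block_size


def in_zebibytes(nbytes: int) -> float:
    """Convert a byte count to ZiB (2^70 bytes)."""
    return nbytes / 2**70
```

With 512-byte blocks this comes to 2^72 - 512 bytes, just under 4 ZiB, and the limit scales linearly with larger block sizes.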


I read that you reduced the memory usage of fsck_ffs(8) by 20%. How did you achieve this result?

Otto Moerbeek: This was a joint effort. It started with Dale Rahn noticing fsck_ffs(8) used two arrays of bytes to hold some data for each inode. Both these arrays only used a few bits per byte, so they could be packed together into a single byte per inode. Dale made a prototype diff, which I took, reviewed, and adapted slightly. After testing and ok's it was committed during the hackathon.
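
The packing idea can be sketched as follows. The 4-bit/4-bit split and the field names "state" and "ftype" are assumptions for illustration; the actual layout used by fsck_ffs(8) is not described here.

```python
class PackedInodeInfo:
    """Two small per-inode values stored in one byte per inode
    (hypothetical layout: low nibble = state, high nibble = file type)."""

    def __init__(self, ninodes: int):
        self._data = bytearray(ninodes)   # one byte per inode instead of two

    @staticmethod
    def _check(value: int) -> None:
        if not 0 <= value <= 0x0F:
            raise ValueError("value must fit in 4 bits")

    def set_state(self, ino: int, state: int) -> None:
        self._check(state)
        self._data[ino] = (self._data[ino] & 0xF0) | state

    def set_ftype(self, ino: int, ftype: int) -> None:
        self._check(ftype)
        self._data[ino] = (self._data[ino] & 0x0F) | (ftype << 4)

    def state(self, ino: int) -> int:
        return self._data[ino] & 0x0F

    def ftype(self, ino: int) -> int:
        return self._data[ino] >> 4

    def nbytes(self) -> int:
        return len(self._data)
```

Each setter masks out only its own nibble, so updating one value never disturbs the other.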

What's new in the scsi subsystem?

Kenneth R Westerback: There are the usual batch of small improvements. For example ATAPI devices are now shown during probe by replacing 'SCSIn' with 'ATAPI' when the device claims to be ATAPI. And the bus initiator is also identified during probe, which allows some dmesg simplification and consistency.

SCSI tape drives can now be detached, making USB tape drives a bit more convenient.

Device discovery has been improved so that more fibre channel devices can be found, and dmesg spam is reduced for devices that don't happen to have media loaded during boot.

The largest improvement internally is more extensive use of the TEST UNIT READY command during device discovery. By more correctly using TEST UNIT READY to clear out spurious error conditions, devices like the YE-Data USB floppy and various IOMEGA units are now recognized correctly and left online if media is loaded. This also required removing another quirk - ADEV_NOTUR, which is always a good thing.

While working on these issues the SCSI debug mechanism was enhanced to display outgoing and incoming data. This will help debugging future problems more easily.

A whole class of devices that returned odd results to indicate they are becoming ready are now properly allowed to finish coming ready, reducing further the number of 'drive offline' messages that the user will see.

Finally, another class of devices that wouldn't yield up their configuration unless the media was locked in place are now properly locked before the configuration is read. The exemplar device for this was the BlackBerry Pearl.

Crypto support for softraid? Please tell us everything about it!

Marco Peereboom: The crypto softraid discipline was written by Hans-Joerg Hoexer after Damien Miller added AES-XTS (a tweakable block cipher geared towards disk encryption) support. During the general hackathon we were able to sit next to each other so that Damien Miller could answer AES-XTS specific questions and I could answer softraid specific questions. This enabled Hans-Joerg to make very rapid progress, and he was able to get the code enabled, albeit still experimental. The reason for it being marked that way is that we still lack the tools to manage the crypto discipline properly (think changing passwords, boot support etc.). It has been put through its paces by quite a few folks and it seems to work just fine. More testing is obviously always welcome.

What's new in the NFS support?

Nikolay Sturm: At C2K8 I ported rpc.statd(8) and rpc.lockd(8) from NetBSD, giving us server side NFS file locking. Currently this is only interesting for people who use NFS clients that do file locking, like Linux. OpenBSD still does not support NFS file locking on the client side, which is much more complicated.


What's new in the trunk interface?

Marco Pfatschbacher: We imported IEEE 802.3ad/LACP support from NetBSD/FreeBSD. If trunk(4) is used on LACP capable switches, it enables the switch to also balance incoming traffic. The feature however is still "work in progress" as I haven't been able to test and debug it thoroughly yet.

Why did you implement a tcpstate tracker in PF which does not look at sequence numbers?

Henning Brauer: The full-blown tcp state tracker we have in pf does a lot of work. Namely, it checks the tcp sequence numbers against the allowed window. When pf doesn't see all of the connection (as in, packets in one direction traverse your pf firewall, but you never see packets in the other direction, they can take another path in some scenarios), it doesn't see the window size announcements and thus cannot match the sequence numbers it sees to the window. This situation, and only this, is what the sloppy states are for.

The sloppy state tracker does the bare minimum: move the state's state forward solely based on the tcp flags it sees. It also detects the above mentioned half connection and moves the other side of the state forward based on the flags from the side it sees.
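
The idea of advancing a state purely from TCP flags, including pushing the unseen side forward, can be sketched like this. It is a deliberately simplified model: the state names, the transition rules, and the class are assumptions for illustration and do not reproduce pf's actual state machine. Only the flag bit values are the standard TCP header ones.

```python
from enum import IntEnum

# Standard TCP header flag bits.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10


class Peer(IntEnum):
    """Hypothetical, simplified per-peer states, ordered by progress."""
    CLOSED = 0
    SYN_SENT = 1
    ESTABLISHED = 2
    CLOSING = 3


class SloppyState:
    """Tracks a connection when only one direction is visible."""

    def __init__(self):
        self.seen = Peer.CLOSED     # the side whose packets pass through us
        self.unseen = Peer.CLOSED   # the side we never observe

    def update(self, flags: int) -> None:
        """Advance states from the flags of a packet sent by the seen side.
        No sequence numbers or windows are checked; states never move back."""
        if flags & RST:
            self.seen = self.unseen = Peer.CLOSING
            return
        if flags & SYN:
            self.seen = max(self.seen, Peer.SYN_SENT)
        if flags & ACK:
            # An ACK implies the other side has spoken, even though
            # we did not see it: move both sides forward.
            self.seen = max(self.seen, Peer.ESTABLISHED)
            self.unseen = max(self.unseen, Peer.ESTABLISHED)
        if flags & FIN:
            self.seen = max(self.seen, Peer.CLOSING)
```

Feeding it SYN, then ACK, then FIN|ACK moves the seen side through SYN_SENT, ESTABLISHED and CLOSING, while the unseen side is inferred to be ESTABLISHED from the first ACK.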

Initially, I wanted to gain performance by only doing the full-blown state tracking on one side of the firewall and sloppy on the other in case of forwarded connections. The state linking code makes that possible. I even implemented that at c2k8, the state on the outbound side was automagically downgraded to sloppy (the term "autosloppy" came up). Then I ran my benchmarks. And it was considerably slower. We think this was due to instruction cache thrashing. Then I benched sloppy on both sides versus full-blown on both sides. No difference. The full-blown tracker does more magic and causes the CPU slightly more work. But that is really not the bottleneck; it is memory access. And the sloppy state tracker needs to access as much memory as the full-blown one. So instead of idling waiting for the next page of memory the CPU does the extra checks :) Needless to say, I sent my autosloppy code directly to /dev/null.

Why did you disable PF counters on table addresses by default?

Henning Brauer: Most people do not use and do not need the per-entry counters on tables. Yet, they take a considerable amount of memory - per table entry, and tables can be very very big. Now the counters are only there when they are explicitly requested - scarce resource, kernel memory, saved for the typical uses.

You rewrote PF state logic?! Why? How?

Henning Brauer: This was a big undertaking. Ryan and I had been talking about it for years, and we had worked on prerequisites for ages. At n2k8, the network mini-hackathon in Japan, it was due. pf was using a single struct for a state entry, with three addresses embedded, both endpoints and one address that, depending on context, could be the address to nat to or to rdr to. This was very hard to follow and the source of some errors. It also imposed the cost of NAT (as in, any form of NAT, be it nat, rdr, binat) on everybody, whether using NAT or not, and dictated the "rdr inbound, nat outbound" rule.

Now the state itself doesn't have any address in it any more. Instead, it has pointers to 'state keys' which contain family (v4/v6), addresses, protocol (tcp/udp/...) and ports. These state keys can get (and are in some scenarios) shared by multiple states. They have a list of states attached, so states are linked to state keys and state keys are linked to states. Interface bound states are implemented by that logic too, in the list the if-bound states are before the floating ones, and the first match wins.

The state itself now has two state key pointers: one we call "wire side" and one we call "stack side". "wire side" has the addresses as seen on the wire. "stack side" is the other side, when, for inbound packets, pf is done with it. For outbound packets we see it on the stack side first of course. Deciding whether any form of NAT has to be done is now as simple as comparing these pointers. The state table now also allows for nat (src address/port rewrite) and rdr (destination) at any time, inbound or outbound, though we still enforce the 'nat outbound, rdr inbound' further up in pf -- for now.

Finally, the factored-out state keys contain a "reverse" pointer, pointing to the state key that is, surprise, the exact reverse of the state key in question, as in, addresses and ports swapped, everything else the same. We now store a pointer to the state key used in the mbuf header. For forwarded connections, on the outbound side, we can take the pointer to the state key from the mbuf header, follow its reverse pointer to the state key to use (we reverse addresses and ports for outbound lookups, that is needed to correctly match the return traffic later) and skip the state lookup completely. If the reverse pointer isn't there we fall back to the traditional lookup and then establish the state key linking by setting up the reverse pointers. Since the state keys are not the states themselves but contain lists of states (that, most of the time, have just one entry) we can do that without violating the semantics of interface bound states. And since we save state lookups we gain performance - considerably. Unfortunately the state key linking had to be disabled last minute before the 4.4 release as bugs showed up way too late in the process, but it is enabled in -current and will be in 4.5. And in the 4.5 interview I'll tell you about the other crazy linkings we do based on this :)
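
A rough model of the structures described above may help. Everything here is a sketch: the class and field names, the dictionary standing in for the state table, and the lookup counter are assumptions for illustration, not pf's kernel data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]   # (address, port)


@dataclass(eq=False)
class StateKey:
    af: str                  # address family, e.g. "inet"
    proto: str               # e.g. "tcp"
    src: Endpoint
    dst: Endpoint
    states: List["State"] = field(default_factory=list)
    reverse: Optional["StateKey"] = None   # key with src/dst swapped

    def ident(self):
        return (self.af, self.proto, self.src, self.dst)

    def reversed_ident(self):
        return (self.af, self.proto, self.dst, self.src)


@dataclass(eq=False)
class State:
    wire: StateKey    # addresses as seen on the wire
    stack: StateKey   # addresses as seen by the stack

    def translated(self) -> bool:
        # NAT of any kind is in play exactly when the two keys differ.
        return self.wire is not self.stack


class StateTable:
    def __init__(self):
        self.keys: Dict[tuple, StateKey] = {}
        self.lookups = 0   # counts the "traditional" lookups performed

    def insert(self, key: StateKey) -> None:
        self.keys[key.ident()] = key

    def find_outbound(self, inbound_key: StateKey) -> Optional[StateKey]:
        """Follow the reverse pointer if present; otherwise do a full
        lookup and link the two keys for next time."""
        if inbound_key.reverse is not None:
            return inbound_key.reverse
        self.lookups += 1
        found = self.keys.get(inbound_key.reversed_ident())
        if found is not None:
            inbound_key.reverse = found
            found.reverse = inbound_key
        return found
```

In this model the first forwarded packet pays for one table lookup and every later one just follows a pointer, which is the saving described above.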

Anything else about PF?

Henning Brauer: Much more. For example, rule accounting now has a counter to record how many states in total have been created by a rule.

The kill states feature in pfctl(8) now supports two additional match targets: Kill by rule label or state ID. Killing by label can be very convenient.

We save valuable kernel memory by collapsing two 8 bit ints into one in the PF state structure.

The scrub rules can now use 'tagged' as matching criteria, allowing you to tag packets on the inbound interface and later on the outbound one scrub them based on that tag. Also scrub rules can now modify the ToS (Type of Service) field in the IP header, used for queueing decisions by many devices (namely, many better switches).

pf consults the net.inet.(tcp|udp).baddynamic lists for NAT port allocations now to not use these ports.

There is a new "divert" keyword which allows packets to be redirected to a local socket without modifying the packet. This allows for easier implementation of transparent proxies in userland, since they don't need to call back into pf any more to find the real destination address as they need to when using rdr.

This release has seen more work on pf than we had in a long time. There were many cleanups, reorganizations, little fixes and the bigger changes outlined above, and there are many plans for cool stuff to do based on these changes in the future. I spent a lot of time with kernel profiling and benchmarking to find bottlenecks in pf, and we found and fixed a few. We also learned some lessons in the process: I made many changes that we thought would help but that actually hurt or had no effect. Future performance improvements will mostly come from redesigning parts of pf or things pf uses (e.g. pool) or other stuff in the network stack.

Applications and Management

I saw this in the changelog: "Make syslogd(8) drop messages when writing to a pipe that is too slow to process input". Why did you implement this feature? Isn't it dangerous from a security point of view?

Marco Pfatschbacher: The pipe logging feature has always been a "best effort" mechanism. If the receiving process is too slow and the pipe buffer runs full, we used to close the pipe and fork a new process. The log message was lost anyway, because syslogd(8) does no blocking wait on logging pipes. Instead of reforking we now drop the message and try to write to the pipe process again when the next message arrives. syslogd(8) is now also creating a rate limited error message in this case.
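
The drop-instead-of-block behaviour can be sketched with a non-blocking pipe. This is an illustration of the general technique on a Unix-like system, not syslogd(8) code; the class name and the drop counter are assumptions, and messages are kept small so that each write is all-or-nothing.

```python
import os


class PipeLogger:
    """Best-effort writer: never blocks, drops a message if the pipe is full."""

    def __init__(self, write_fd: int):
        self.fd = write_fd
        self.dropped = 0
        os.set_blocking(self.fd, False)   # writes fail instead of waiting

    def log(self, message: bytes) -> bool:
        """Return True if the message was written, False if it was dropped."""
        try:
            os.write(self.fd, message)
            return True
        except BlockingIOError:
            # Reader is too slow: drop this message and simply try
            # again when the next one arrives.
            self.dropped += 1
            return False
```

A caller could use the drop counter to emit an occasional summary rather than one error per lost message, in the spirit of the rate limited error message mentioned above.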

What is the new tool called ypldap(8)?

Pierre-Yves Ritschard: YPldap is an upcoming solution whose role is to allow logins to OpenBSD boxes based on user information stored in LDAP directories. OpenBSD's stance on pluggable authentication and authorization mechanisms has always been that they should not be able to compromise security and should always respect POSIX semantics.

Other operating systems have favored solutions such as PAM and NSS to solve that problem. These systems rely on dynamically loadable modules to extend libc functions, thus allowing injection of third party code into the address space of libc. OpenBSD's response to PAM was bsd_auth, a simple protocol which forks programs and talks through stderr and stdout.

The rationale behind YPldap was that an extension for retrieving user and group info already existed in libc: NIS. The extension mechanism relies on RPC, and respects all POSIX semantics.

I took advantage of the presence at the c2k8 hackathon of Theo and maja@, who wrote much of the OpenBSD yp tools, to delve into them and understand their guts.

ypldap was written to replace ypserv and relies on a simple configuration file which maps ldap directories to yp maps. Initially it relied on OpenLDAP libraries, but aschrijver@ stepped in and wrote the ldap search code needed by ypldap.

It's not in 4.4, but it is now active and linked to the tree, and development will continue to provide a complete tool for 4.5.

I read that you modified netstart(8) to check that hostname.if files have the correct permissions. Why? What problem did you notice?

Todd Fries: Someone, perhaps Theo, prodded me to take a look at file permissions.

It was becoming clear that /etc/hostname.* contained secure key information, and as such it should not be world readable. Specifically wireless keys. We do not let people read the information from the kernel via ifconfig(8), so why should we let them read it via /etc/hostname.* files instead?

How did you implement redundancy in dhcpd(8)?

Bob Beck: This was relatively straightforward. Since the dhcp protocol allows for multiple answers to be sent to a request, and the client gets to pick one, I put this in by borrowing from reyk's hmac'ed synchronization code that was added to spamd(8) a little while back, allowing the dhcp servers to send hmac authenticated messages to each other to synchronize their lease files. It does assume that you start with the lease files the same, but in that case, when all the servers offer up a lease, the client will accept it from one, and that server will then broadcast that information to the other servers, who synchronize their lease files (enabling them to answer future requests and know about this client's existing lease).
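
The general shape of an HMAC-authenticated sync message can be sketched as follows. The message layout, the use of JSON and SHA-256, the function names, and the sample lease are all assumptions for illustration; they are not dhcpd(8)'s actual wire format.

```python
import hashlib
import hmac
import json
from typing import Optional

TAG_LEN = hashlib.sha256().digest_size   # 32 bytes


def pack_lease(key: bytes, lease: dict) -> bytes:
    """Serialize a lease update and prepend an HMAC over the payload."""
    payload = json.dumps(lease, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def unpack_lease(key: bytes, message: bytes) -> Optional[dict]:
    """Return the lease if the HMAC verifies, otherwise None."""
    tag, payload = message[:TAG_LEN], message[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None   # forged or corrupted update: ignore it
    return json.loads(payload.decode("utf-8"))
```

A peer that shares the key accepts the update and can record the lease; a message altered in transit or sent with the wrong key is rejected.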

Is there any interesting change in the ports system? What is the new target for ports makefiles "update-or-install"?

Marc Espie: There is actually very little that's new in the ports tree infrastructure per se, which shows that it's getting mature. "update-or-install" is very simple: it means just that. I was getting annoyed at trying to make sure something IS installed and up-to-date, when I don't remember whether I installed it in the first place.

One milestone for 4.4 though is that we finally got all the infrastructure documented, including MODULES and such, which means a developer will find his way more easily for creating new ports and updating old ones.

Most of the changes for 4.4 were small reliability stuff. Getting stuff to work in weird corner cases, or fixing little things that could go wrong during some installs or updates. There are better things brewing, but they won't happen until 4.5 (and maybe not in 4.5 if we're unlucky enough).

What's new in the install process?

Kenneth R Westerback: For the traditionalists, recent breakage in the Sparc miniroot install media was finally fixed. For the rest of us serial console support was expanded to most architectures and enhanced with more automatic speed detection; FTP is now using keepalive packets so fewer restarts are required; the size of partitions being created is now displayed in human scaled units; and the ntp setup questions are clarified.

In addition, OpenBSD can now be installed on Extended Partitions in any architecture that supports MBR disk partitioning.

This release includes a new tool that makes it easier to modify configuration files during an upgrade. How does it work?

Antoine Jacoutot: It is called sysmerge(8). Actually, let me take this opportunity to thank Tobias Weingartner (weingart@) for finding the name... I got lots of weird name proposals before his ;-)

Now, this is not a "new" tool per se but rather a rewrite of an existing tool: mergemaster, which was already available as a package (the diff and merge loop functions were adapted from mergemaster, the rest has been rewritten). So, it is basically just a simple shell script. It works a lot like its parent and it is used to update configuration files after an upgrade to a new release (or current snapshot). Several OpenBSD specific changes were made and several safe-guards were added, but if you used to be a mergemaster addict, you should feel at home or at least, close to it.

Until now, merging the new configuration files was done using a patch file available on the "Upgrade Guide" or manually. Needless to say, the patch file cannot be applied in every situation and the manual merge is a lot of work. In the end, I realized that most people needing an easier way to update their configuration files were either using the mergemaster package or writing their own script.

Note that sysmerge(8) is not the perfect solution, it is rather an additional solution. That said, several improvements are on the way...

The way it works is really easy. Basically, it populates a fake root from the new updated sets (etcXX.tgz and/or xetcXX.tgz) which is then compared against the current installation. I won't go into details, it's better if you refer to the man page.
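
The compare step can be pictured with a short sketch. sysmerge(8) itself is a shell script, so this is only an illustration of the idea of walking a freshly populated tree and comparing each file with its installed counterpart; the function names and the three categories are assumptions.

```python
import filecmp
import os
import tempfile


def compare_trees(new_root: str, installed_root: str) -> dict:
    """Classify every file under new_root against installed_root."""
    report = {"missing": [], "differs": [], "same": []}
    for dirpath, _dirnames, filenames in os.walk(new_root):
        for name in filenames:
            new_path = os.path.join(dirpath, name)
            rel = os.path.relpath(new_path, new_root)
            cur_path = os.path.join(installed_root, rel)
            if not os.path.exists(cur_path):
                report["missing"].append(rel)      # candidate for install
            elif filecmp.cmp(new_path, cur_path, shallow=False):
                report["same"].append(rel)         # nothing to do
            else:
                report["differs"].append(rel)      # candidate for merge
    for names in report.values():
        names.sort()
    return report


def demo() -> dict:
    """Build two tiny throwaway trees and compare them."""
    with tempfile.TemporaryDirectory() as new, \
            tempfile.TemporaryDirectory() as cur:
        for root, files in ((new, {"a.conf": "v2", "b.conf": "x", "c.conf": "n"}),
                            (cur, {"a.conf": "v1", "b.conf": "x"})):
            for name, text in files.items():
                with open(os.path.join(root, name), "w") as fh:
                    fh.write(text)
        return compare_trees(new, cur)
```

In the demo, a.conf is reported as differing, b.conf as identical, and c.conf as missing from the installed tree.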

What is the status of OpenCVS?

Tobias Stoeckmann: Since the OpenBSD 4.3 release we have made a lot of progress in creating a full replacement of GNU cvs for the OpenBSD project. One of the finest pieces of work coming from outside the developer crowd is the support for so-called "trigger files" from Jonathan Armani. Trigger files are responsible for sending out mails to source-changes, for example. Although the syntax of these files changed between versions of the original cvs, we were able to create a parser which handles both "on the fly", keeping compatibility with our tree (top priority of course) and foreign cvs configurations which based their repository on newer versions.

Joris Vink improved OpenCVS' speed in general and especially on server-side dramatically by removing the need to create temporary files for a fresh checkout, as well as keeping information about directories in memory. Although this sounds like a possible waste of memory, we have a very slim CVS implementation, which copes with very large repositories easily.

Talking about my part in the show, I further improved the handling of branches (and subbranches as well as multiple branches from the same revision) as well as the creation and handling of the history file. Further improvements have also been made in the reliable handling of malicious RCS files, which -- although complying with the RCS specifications -- would turn a lot of CVS commands into good sources of core files.

Yet we are unfortunately missing some elements of CVS protocol which made an official release together with OpenBSD 4.4 impossible. The biggest show stoppers might be missing common options to some of the commands like log, rlog etc. which are used by developers and some of the programs in our ports tree. Talking about the internals of OpenCVS, we are also missing support for transmission of diffs between server and clients when it comes to modifications (although sending the whole file works, it's obviously a waste of resources).

Please remember that we have a bunch of AnonCVS mirrors running OpenCVS, so if you are about to work with your source trees, consider running "opencvs" (no excuses, it's installed by default!).

It seems you did a lot of work on aucat(1). Would you like to give an overview of the enhancements you worked on?

Alexandre Ratchov: I've been working on a framework for non-blocking audio i/o, mixing, demultiplexing, resampling and various format conversions. Instead of adding yet another binary to OpenBSD we preferred replacing aucat(1) by a front-end to this framework, it's a kind of swiss army knife for audio. It focuses on fully supporting *all* devices especially those with fixed rates, large number of channels or unusual encodings (eg. envy(4)); basically it can be used as a minimalistic command-line multi-tracker. For instance, you could simultaneously play 4 audio files (of different formats) while recording in full-duplex each of your 4 input channels into separate files. All necessary conversions are done on the fly.

Besides that, aucat(1) contains most of the required ingredients for an audio server. From the developer perspective it allows to quickly prototype audio servers, new audio APIs and more generally to evaluate various approaches to audio on OpenBSD. The long term purpose of this is to rework audio conversion code, improving usability of audio applications; this will also be the opportunity to simplify the audio sub-system with emphasis on robustness and correctness.

What are the major new features and changes included in OpenSSH 5.1?

Damien Miller: OpenSSH 5.1 was our biggest release in quite a few years. This was due to a lot of work being done at two hackathons: the n2k8 network mini-hackathon in Japan and the general c2k8 hackathon in Edmonton. I attended both of these and got to work face to face with Markus Friedl at n2k8, Darren Tucker and Alexander von Gernler (both at c2k8). We got a lot of work done - fixing some very old and tricky bugs and adding a few new features.

There really are too many features and bugfixes to mention. A full list is in the release notes, but here is a selection:

Fixing the old bugs was the most difficult and (for me at least) the most rewarding part of the work. The oldest of these dated back over seven years and turned out to be due to a protocol deficiency: it was impossible to mark an SSH protocol 2 channel as "half closed for reading", so a command like "ssh -2 localhost od /bin/ls | true" would not detect the early closure of the channel (caused by "true" finishing quickly). Markus added an extension to the protocol to allow signalling of this condition. Markus also sped up ssh/sshd by about 10%, using a clever trick to avoid extra malloc/free/memcpy when receiving channel data.

I spent a bit of time improving the multiplexing support. The most user-visible part of this work is usable ~[char] escapes on multiplexed sessions, which never used to work. I also added a MaxSessions knob to sshd_config to allow a server administrator to modify the number of supported multiplexed sessions per transport session, audited the server to make it cope better with out-of-file-descriptor conditions when setting up sessions (e.g. if an admin increases MaxSessions too far) and fixed up the client so it can deal more gracefully with being told that a session setup has failed.

Darren added an "effective configuration" test mode to sshd that allows dumping of the configuration that results from the application of "Match" restrictions in sshd_config(5). Apart from allowing for better debugging of rules, this is a great help for our regression tests.

I added CIDR address matching to sshd_config "Match address" blocks and to "from" restriction clauses in ~/.ssh/authorized_keys. Quite a few people had asked for this over the years and it is now possible to write sshd_config(5) rules like:

PasswordAuthentication no
Match address
        PasswordAuthentication yes
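
The idea behind CIDR matching can be sketched in a few lines of Python with the standard ipaddress module. This only illustrates the matching rule and is not how sshd implements it; the helper name address_matches and the sample addresses (taken from the reserved documentation ranges) are assumptions.

```python
import ipaddress

def address_matches(client, cidr):
    """Return True if the client address falls inside the CIDR block."""
    return ipaddress.ip_address(client) in ipaddress.ip_network(cidr)

# A client inside the /24 matches; one outside it does not.
inside = address_matches("192.0.2.17", "192.0.2.0/24")
outside = address_matches("198.51.100.5", "192.0.2.0/24")
```

A rule written with a prefix length therefore covers a whole network in one line, instead of one entry per host.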

We also added support for a "df" command in sftp(1), via a protocol extension that implements a statvfs(3)-like operation in sftp-server(8). This was based on a patch by Miklos Szeredi and polished by Darren Tucker and myself. The protocol extension is also useful for userspace filesystems that use the sftp protocol.
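
As a rough illustration of what a statvfs(3)-like operation reports, the following Python sketch derives df-style numbers from a local os.statvfs() call. It says nothing about the wire format of the sftp extension; the function name disk_usage is hypothetical, and os.statvfs is only available on Unix-like systems.

```python
import os

def disk_usage(path):
    """Return (total, free, available) bytes for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize   # size of the whole filesystem
    free = st.f_bfree * st.f_frsize     # free space, including reserved blocks
    avail = st.f_bavail * st.f_frsize   # free space usable by unprivileged users
    return total, free, avail

total, free, avail = disk_usage("/")
```

A "df" command is then mostly a matter of formatting these three numbers for display.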

openssh-5.1 also includes documentation for all the divergences from and extensions to the standard SSH protocols, including full documentation for the ssh-agent protocol (which was never standardised at all). This was quite a bit of work but will help anyone building an SSH implementation that wants to interoperate with OpenSSH.

How does the new ssh fingerprint visualization system work?

Alexander von Gernler: By setting the option "VisualHostKey yes" in your ~/.ssh/config, you make ssh(1) display a nice little piece of ASCII art every time you log in to a host. The image corresponds to the server's key, and it is generated out of the hex fingerprint in a deterministic way. The idea is that you learn the image over time (e.g. you log in to that host every day), and you will be able to reject images that don't look like the one you know for this specific server.

Even though it may sound tempting, you can only reject servers based on the image you see, not positively identify them. The reason for this is that the amount of information conveyed in one image is significantly smaller than the number of bits in an SSH server key fingerprint. But the pictures are perfect for telling apart keys that are really different.
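
To make "generated in a deterministic way" more concrete, here is a toy Python sketch that turns digest bytes into ASCII art by walking across a small grid and counting visits. It is not the ssh(1) code: the field size, the symbol set and the rule of two bits per move are all assumptions made for illustration, and the function name walk_art is hypothetical.

```python
def walk_art(digest, width=17, height=9):
    """Render digest bytes as ASCII art via a deterministic walk on a grid."""
    symbols = " .o+=*BOX@%&#/^"
    visits = [[0] * width for _ in range(height)]
    x, y = width // 2, height // 2
    start = (x, y)
    for byte in digest:
        for _ in range(4):                     # four 2-bit moves per byte
            x += 1 if byte & 0x1 else -1       # low bit: right or left
            y += 1 if byte & 0x2 else -1       # next bit: down or up
            x = max(0, min(width - 1, x))      # stay inside the field
            y = max(0, min(height - 1, y))
            visits[y][x] = min(visits[y][x] + 1, len(symbols) - 1)
            byte >>= 2
    end = (x, y)
    lines = ["+" + "-" * width + "+"]
    for row in range(height):
        cells = []
        for col in range(width):
            if (col, row) == end:
                cells.append("E")              # where the walk finished
            elif (col, row) == start:
                cells.append("S")              # where the walk began
            else:
                cells.append(symbols[visits[row][col]])
        lines.append("|" + "".join(cells) + "|")
    lines.append("+" + "-" * width + "+")
    return "\n".join(lines)

print(walk_art(bytes.fromhex("00112233445566778899aabbccddeeff")))
```

The same digest always yields the same picture, while many different digests can share one picture, which is why such an image can be used to reject a key but not to identify one.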

Although everyone likes the algorithm at first sight and it is very unlikely that the computation of the images will have to change, I still have no proof that the algorithm is really good for all situations. To give the good feeling many people have a solid backing, I am researching the algorithm together with Dirk Loss, a guy who enjoys analyzing such things. He already came up with some Markov models, and we'll see what we can dig up together.

If you want to see how easy it is to spot dupes in your known_hosts file using nothing more than the powerful pattern recognition engine your brain has to offer, just try out the following command to get a good grasp on what fingerprint visualization is all about:

$ ssh-keygen -lvf ~/.ssh/known_hosts | less

Anyway, thanks fly out to Dan Kaminsky, whose talk at some CCC conference gave me the idea to do this at all!


What's new in cwm?

Okan Demirmen: cwm(1) has again undergone many changes, but probably the biggest noticeable change for the user is the move to a configuration file, cwmrc(5). This has paved the way for new features in 4.4, such as configurable mouse bindings, 'gap' support, vi(1) keybindings and others.

While new features are nice, an in-depth review of the cwm(1) code base helped us find and fix quite a few obscure and very annoying bugs, from window cycling problems to non-us/uk keymap issues. Major parts of the code have been completely re-written and/or re-worked to be more efficient, 'calm' and more readable, including the documentation. Our users, testers and contributors have been instrumental to cwm(1) in 4.4.

Why did you replace the via X.Org driver for VIA video cards with the openchrome(4) driver?

Matthieu Herrb: The situation with VIA drivers has been bad for several years. From time to time, users of VIA chipsets reported that the via(4) driver didn't work for them, but the openchrome one did. Since via(4) is not really being developed anymore, we decided to switch to openchrome(4), which looks more promising for future support. Since then the openchrome developers have moved their development repository to X.Org (on the freedesktop servers) and their driver is an official component of X.Org 7.4. We're also looking forward to seeing how the VIA open source initiative announced in April, with the documentation and code they are making available, is going to help the developers of the openchrome driver. For now it looks like VIA didn't disclose anything really new or important, but let's wait a bit.

How is the work on DRI/DRM going on?

Matthieu Herrb: Owain has done tremendous work porting the latest drm code to OpenBSD and fixing bugs that he found there. The code in 4.4 is starting to be usable on a subset of the Intel and Radeon chipsets, on both i386 and amd64. It's compiled in but disabled by default in 4.4, so that users can try it on their hardware without having to rebuild everything from scratch. There are 2 main reasons not to enable it by default for now: first, there are still some cases where drm will make the X server lock up or crash because of bugs; second, even with the work we've done in libdrm to make it work with the privilege separation in the X server, we're not yet confident enough in the security of the drm code for it to be compatible with OpenBSD's "secure by default" approach.

But in the long term, we believe that X.Org's approach of moving everything that needs hardware access privileges (mode setting and, for some cards, 2D acceleration) to the kernel using the drm interface is going to result in a more secure X. But this is a lot of new code that needs to be ported, audited and made robust before it will be OpenBSD's default.

Hardware Support and Drivers

Would you like to give us an overview of your work on preliminary support for USB webcams?

Marcus Glocker: This release will ship with very basic UVC support in the uvideo(4) driver, which is enabled in GENERIC for the alpha, amd64, i386, macppc, and sparc64 architectures. Most UVC compatible devices will attach. They may initialize correctly, but only a few will really work smoothly, since a lot of stuff had to be tested and fixed post-4.4. USB isochronous support for high-speed devices was also added post-4.4, so devices will just attach at full speed, which restricts them to lower image resolutions.
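
A back-of-the-envelope calculation shows why full speed limits the resolution. The Python sketch below compares the data rate of an uncompressed stream with the payload a single isochronous endpoint can carry per second. The assumptions are 2 bytes per pixel (as in a YUY2-style format), no protocol overhead, and the USB 2.0 per-endpoint maxima of 1023 bytes per 1 ms frame at full speed and 3 x 1024 bytes per 125 us microframe at high speed. The function names are hypothetical, and the numbers say nothing about what uvideo(4) actually negotiates.

```python
def stream_rate(width, height, fps, bytes_per_pixel=2):
    """Bytes per second needed for an uncompressed video stream."""
    return width * height * bytes_per_pixel * fps

# Upper bounds for one isochronous endpoint, ignoring protocol overhead.
FULL_SPEED = 1023 * 1000        # one 1023-byte packet per 1 ms frame
HIGH_SPEED = 3 * 1024 * 8000    # three 1024-byte packets per 125 us microframe

def fits(width, height, fps, budget):
    """Return True if the stream fits into the given bytes-per-second budget."""
    return stream_rate(width, height, fps) <= budget

small_ok = fits(160, 120, 15, FULL_SPEED)    # 576,000 B/s needed
vga_full = fits(640, 480, 30, FULL_SPEED)    # 18,432,000 B/s needed
vga_high = fits(640, 480, 30, HIGH_SPEED)
```

Under these assumptions only very small uncompressed frames fit at full speed, while VGA at 30 frames per second needs high speed; compressed formats would need less bandwidth.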

In the -current state of the uvideo(4) and ehci(4) drivers, isochronous USB support is still not working perfectly. We still have bugs in the recently implemented high-speed isochronous support. In particular, we have open issues with isochronous transfers on Intel USB Host Controllers, so most uvideo(4) devices attached to an Intel USB Host Controller will not currently deliver a clean stream.

Some devices do not initialize correctly, which means we still need to fine-tune our initialization sequence. BULK USB transfer support is still missing in uvideo(4). This is mostly because of the lack of BULK capable devices.

It would be nice if one of our uvideo(4) developers could get a BULK capable UVC device donated, e.g. an EeePC 701 which includes a BULK cam, so we can start an attempt to add BULK support.

Of course, reports from users who are facing problems are always welcome. If you send us a problem report, we would like to see a full dmesg with "option UVIDEO_DEBUG" enabled in the kernel configuration. Some V4L applications are included in our ports tree (graphics/luvcview and graphics/fswebcam). We recommend them for simple testing.

What type of cellular modems are supported by this release?

Yojiro Uo: umsm(4) in 4.4 supports various wireless high-speed modems, including the following ones, which need a special sequence to enable their modem functions:

  • Huawei E220/E618 HSDPA modem (aka Emobile D01HW/D02HW)
  • Option GlobeSurfer Icon 7.2 and its variants

See 'man 4 umsm' to get the complete list.

I read that you have removed the pccom(4) driver and replaced it with com(4). Why? What are the advantages of com(4)?

Mark Kettenis: Having two drivers for the same device in the tree isn't desirable from a maintenance standpoint. The idea has always been to switch all platforms to use com(4) and remove pccom(4) from the tree. Unfortunately some infrastructure to make com(4) work on i386 was missing. I just added the missing code. The direct benefit to our users is probably not very big, but consistency across platforms is a good thing!

What is new in OpenBSD/sparc64?

Mark Kettenis: We've made quite a big step forward in hardware support. OpenBSD 4.4 now supports Sun's new sun4v architecture and runs on machines with UltraSPARC T1 and T2 processors. These processors are radically different from Sun's older UltraSPARC CPUs, providing support for virtualization through the sun4v hypervisor architecture. This required quite extensive changes in the low-level kernel code. Unlike Solaris, the same OpenBSD kernel runs on the older sun4u architecture and the new sun4v architecture.

OpenBSD 4.4 is also the first release to support Fujitsu SPARC64 processors. In fact, it is the first (and only) open source operating system to run on the SPARC64-V processor found in Fujitsu's PRIMEPOWER machines. Support for SPARC64-VI/VII processors was added too, but unfortunately OpenBSD triggers a serious hardware/firmware bug in the Sun/Fujitsu Enterprise MX000 machines. Clearing the resulting fault requires assistance from a support engineer, so running OpenBSD on these machines is not recommended. Hopefully this issue will be resolved for OpenBSD 4.5.

Last but not least, support for older Sun hardware has been improved too. UltraSPARC IV processors are supported now (and UltraSPARC IV+ should work as well) and OpenBSD now runs on many mid-range and some high-end systems like the Sun Fire V1280 and Enterprise 10000. There is a good chance it will even run on Sun Fire 3800/4800/4810/6900 and E2900/E4900/E6900 machines. Anyone with access to such hardware, please try!

What does this release offer regarding ACPI support and power management?

Marco Peereboom: Not much new here. Most ACPI hacking went into replacing the parser with a brand new one that is far more readable. The ACPI drivers were adapted to work with the new parser. The new parser also works with far more boxes than the previous one. We now run relatively well on historically problematic boxes such as HP/Compaq/Asus etc. Work remains to be done in the ACPI code because we still sometimes route interrupts wrong because of misinformation in the BIOS or a bug we haven't discovered yet.

What's new in the sensors framework?

Constantine A. Murenin: As with every recent release, we've had several new sensor drivers added to the tree and enabled in the GENERIC kernel. But I feel that this release is somewhat special in the sense that OpenBSD 4.4 is the first release of any operating system to include and enable the drivers for the following two sets of devices.

The first of these two unique drivers is sdtemp(4), which supports temperature sensors compliant with JEDEC JC-42.4. These can come either as stand-alone temperature sensor chips or as part of the SPD chips, and are meant to be used in memory modules. These chips have been in production for a while, but are not yet common in the memory modules you might have in your boxes. The driver was written by Theo de Raadt after he managed to get a few chips at his disposal, soldered them onto a board, and connected them to the I2C/SMBus via an older memory module (it appears to be the easiest way to access I2C for an external component). He actually posted a story about it with some photographs, which is worth a look.

The second driver is km(4) for AMD Family 10h processors, currently popular under the Opteron Barcelona and Phenom brand names. This has been written and tested by me after I got a new quad-core Phenom box to do some hacking. I recall that the chip itself must have about a dozen temperature sensors, but AMD has made only one value available to us. The driver itself is pretty simple, and can serve as a good guide for someone wishing to start playing with sensors. Other than its direct purpose of monitoring the temperature of the CPU, it might also be useful in comparing sensor frameworks, as it demonstrates how few lines of code are needed to write a sensor driver for OpenBSD, and how straightforward the logic is.

Neither of these two drivers is in Linux yet. NetBSD does have the driver for the Family 10h chips, but it hasn't been tagged for a stable branch yet. ;-)

As far as the total number of drivers that use the sensors framework goes, the number is now at 68 drivers in OpenBSD 4.4! It's a calculation based on the number of files that call the sensordev_install(9) function; for example, lm(4) can attach on both I2C and ISA busses, and supports a huge number of chips from multiple vendors, but it is only counted as one of the sixty-eight drivers, since the sensordev_install() call is made only in the bus-independent code inside /sys/dev/ic/lm78.c. In the previous release, OpenBSD 4.3, we had 61 drivers using the framework, so this time it is 7 new drivers in a 6-month release cycle, which continues to look quite impressive, in my opinion.
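
The counting method described above can be approximated with a short Python sketch that walks a source tree and counts the C files containing a sensordev_install( call. The function name count_sensor_drivers is hypothetical, and the result is only an approximation: a plain text search also counts the file that defines the function, as well as any file that merely mentions it in a comment.

```python
import os

def count_sensor_drivers(root, needle="sensordev_install("):
    """Count C source files under root that contain the given call."""
    count = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            # Tolerate odd bytes in old source files instead of failing.
            with open(path, errors="replace") as src:
                if needle in src.read():
                    count += 1
    return count
```

Run against a checked-out kernel tree (for example, count_sensor_drivers("/sys")), this counts each file once no matter how many buses or chips the driver supports, which matches the way lm(4) is counted above.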

Where can I find the details about the MX000 problems?

@kib, you can find the discussion on the bugtraq mailing list.

It was mostly ignored and treated as a non-issue by ignorant people.

What's the state of SMP in OpenBSD? FreeBSD and NetBSD have been making a lot of strides in that direction. Just wanted to know how OBSD handles/scales on multiple cores.

OpenBSD still uses one big lock, like FreeBSD 4.x.

They might choose rthread to exploit SMP.

Does the missing bigmem support for amd64 mean that I can't install OpenBSD 4.4 on an amd64 box with 8GB RAM?

I think what would happen is that the box would just use 4GB.

I suggest downloading an ISO from the -current directory, so you can see how it works with the patch.

Then diff the two dmesg outputs.
