Ottawa Linux Symposium, Day 2

Author: David "cdlu" Graham

The second sitting of the 7th session of the Ottawa Linux Symposium saw several interesting, highly technical discussions. Here are my reports on Trusted Computing, the ext3 filesystem, the e1000 network driver, and SELinux.

The morning session

I attended a morning session called Trusted Computing and Linux, led by Emily Ratcliff and Tom Lendacky of the IBM Linux Technology Centre. The two presenters took turns frequently throughout their presentation.

Ratcliff described trusted computing as an industry standard, implemented in hardware, meant to address a range of security problems. Take peer-to-peer networking, for one. In theory, trusted computing could protect peer-to-peer file sharing networks from file poisoning attacks, where bogus files are shared in an effort to corrupt people’s downloads and make them unusable.

The concept could also let a user ask a remote computer to check that a local terminal is clean before logging in from it.

Lendacky introduced the TPM – Trusted Platform Module – as a physical device that provides protection capabilities. The TPM is generally a hardware chip that can be described as a cryptographic co-processor. It protects encryption and signature keys. It uses non-volatile memory to store endorsement, storage root, and data integrity keys and volatile memory to store platform configuration registers.

Ratcliff explained that the TPM works by performing a series of measurements. The BIOS measures the boot loader. The boot loader measures the operating system kernel. The operating system kernel manages the applications through a software stack. At each level, the next step is checked for integrity against the TPM.

The bootloader’s responsibility is to measure the kernel and configuration files prior to handing over control to the kernel. This functionality is now available for both popular Linux boot loaders, grub and lilo.
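The talk, as reported here, did not go into how a measurement is recorded. As a rough illustration only, the sketch below follows the "extend" operation of TPM 1.2, in which a platform configuration register is replaced by the SHA-1 hash of its old value concatenated with the hash of the component being measured. The component names are made up, and a real measurement is taken by the firmware, boot loader and TPM hardware, not by a script.

```python
import hashlib

PCR_SIZE = 20  # length in bytes of a SHA-1 digest, the hash used by TPM 1.2


def extend(pcr: bytes, component: bytes) -> bytes:
    """Return the register value after measuring one component."""
    measurement = hashlib.sha1(component).digest()
    # The old value is folded into the new one, so the result depends on
    # every component measured so far and on the order they were measured in.
    return hashlib.sha1(pcr + measurement).digest()


def measure_chain(components) -> bytes:
    """Measure each stage in boot order, starting from an all-zero register."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Hypothetical stand-ins for the boot loader, kernel and configuration files.
good = measure_chain([b"boot loader", b"kernel", b"config"])
tampered = measure_chain([b"boot loader", b"kernel (modified)", b"config"])

print(good.hex())
print(good != tampered)  # a change at any stage changes the final value
```

Because a register can only be extended, never set directly, a modified stage cannot simply write back the value a clean boot would have produced.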

Lendacky said that kernel 2.6.12 incorporates TPM chip support. There’s only meant to be one way to use TPM, he said, and that is through the software stack.

Ratcliff introduced TrouSerS, the code name for the TPM software stack. It includes an access control list that lets an administrator grant or deny remote users access to a system’s API.

There were a number of questions following the presentation. The first was “How can TPM keys be validated?”

The answer, according to the presenters, is by the user entering a password. That prompted another question, asking if passwords were the most secure option available, since they tend not to be very secure.

Ratcliff referred to the authentication system known as attestation. TPM chips are meant to be credentialed by their manufacturers. The system gets a platform credential and, together, the two yield an attestation ID. That is the theory, at least: manufacturers are not keeping the public side of TPM keys, and as a result the system is not working as intended.

The TPM keys must be verifiable by the manufacturer for this to work. That is particularly important for trusted systems such as bank machines and automatic voting machines.

The ext3 filesystem

My first session after lunch was entitled: “State of the art: Where we are with the ext3 filesystem”, presented by Mingming Cao and Stephen Tweedie.

Cao discussed the Linux extended 3 (ext3) journaling filesystem. Although it is a young filesystem, she said, it is increasingly widely used. The people working on it are trying to make ext3 faster and more scalable.

Cao listed some features ext3 has acquired in the 2.6 kernel. Among them are extended attributes and online resizing, that is, changing the size of the partition without taking the drive down.

As a means to fight the problem of filesystem fragmentation, Cao explained a system of block pre-allocation, where files can be allocated an amount of space on disk appropriate to their eventual needs and can thus hopefully remain contiguous on disk.

Cao spent a good deal of time explaining extents and related work. Extents allow block allocation to be delayed until more is known about a file, which makes for more contiguous allocation. This is especially useful for temporary files.
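To show why contiguous allocation and extents go together, here is a minimal sketch that is not ext3 code: it collapses a list of physical block numbers into (start, length) runs, which is the general idea behind an extent. The function name and the block numbers are invented for the example.

```python
def to_extents(blocks):
    """Collapse a list of physical block numbers into (start, length) runs."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # This block directly follows the previous run, so grow that run.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Otherwise a new run begins here.
            extents.append((block, 1))
    return extents


contiguous = list(range(1000, 1256))              # 256 blocks in a row
fragmented = [1000, 1001, 5000, 5001, 5002, 9000]

print(to_extents(contiguous))   # [(1000, 256)]
print(to_extents(fragmented))   # [(1000, 2), (5000, 3), (9000, 1)]
```

A file laid out contiguously needs a single entry however large it grows, while a fragmented file needs one entry per fragment, so the better the allocator keeps files together, the more compact the mapping becomes.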

Cao said the ext3 team wants to improve the ext3 filesystem, but that this could result in some filesystem format changes. Because of the nature of filesystems and filesystem changes, adoption of any revisions would likely be very slow.

Among the work in progress is a reduction in file unlink/truncate latency. Truncating large indirect-mapped files is slow and synchronous, explained Cao.

Time-stamps under ext3, until recently, were updated about once per second. Ext3 had no way to store high resolution time-stamps. The kernel is capable of storing nanosecond time-stamps on an extended inode, but only one-second time-stamps on normal inodes.

Solutions proposed for ext3 include parallelised directory operations and serialising concurrent file operations in a single directory.

The future holds more improvements for ext3 in the mainline kernel distribution, with a 64 terabyte maximum partition size coming.

Cao expected to get a copy of her presentation up on the ext2 project website soon. Questions on the presentation were answered by Stephen Tweedie, who explained that ext2 and ext3 are, for all intents and purposes, exactly the same filesystem. If an ext3 filesystem were to be mounted under Linux kernel 1.2 as an ext2 filesystem, provided it didn’t exceed normal ext2 parameters of the time for file and partition sizes, it would be able to mount fine, albeit without use of the filesystem’s journaling features.

A member of the audience queried why we don’t just go directly to a 1024-bit filesystem, citing the progression of 12 to 16 to 32 to 64 bit filesystems he’d seen in his career. Tweedie replied that any filesystem that large would be simply unmanageable, taking weeks to fsck.

A case for the e1000

Intel’s John A. Ronciak presented a talk called “Networking driver performance and measurement, e1000: a case study”.

Ronciak’s goal in his case study was to improve the performance of the e1000 gigabit ethernet chip under kernel 2.6.

He found that, in terms of throughput, kernel 2.4 outperformed kernel 2.6 with the chip in every test and under every configuration, and he concluded that kernel 2.6 still has room for improvement.

In its day, a 10/100 network interface card sitting on a 32 bit, 33 MHz PCI bus could bring a system to its knees from input/output overload. Today, the same can be said of a 10 gigabit ethernet device on a modern motherboard, noted Ronciak.

Ronciak noted that Linux lacks a decent common utility for generating performance data across platforms and operating systems, to, as he put it, compare apples with apples when measuring performance. Lacking such free tools, he showed us data collected using a program called Chariot by IXIA as a test tool.

His results showed that kernel 2.4 always outperformed kernel 2.6 in data throughput, and that performance within the 2.4 and 2.6 series varied widely between revisions and depending on whether the NAPI or UP configuration options were used in the kernel.

NAPI, he said, is an interface commonly used to improve network performance. Under kernel 2.4, NAPI caused CPU usage to go down for the same amount of throughput. Under kernel 2.6, his results showed that CPU usage with NAPI actually went up compared with a NAPI-less but otherwise identical kernel at the same throughput.

In measuring performance, he cautioned, the size of the frame is an important factor. With large frames, the volume of data carried is significant enough to measure. With small packets, it is more useful to measure packet counts than the amount of data shoved through a connection. His slides showed a chart to emphasise this point.
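A bit of arithmetic shows why frame size matters so much. The sketch below is not from Ronciak's slides. It works out the theoretical maximum frame rate of a gigabit ethernet link for the smallest and largest standard frames, counting the 8 bytes of preamble and the 12 byte-times of inter-frame gap that accompany every frame on the wire.

```python
PREAMBLE_AND_SFD = 8        # bytes sent ahead of every frame
INTERFRAME_GAP = 12         # idle byte-times required between frames
LINK_BPS = 1_000_000_000    # gigabit ethernet


def max_frames_per_second(frame_bytes, link_bps=LINK_BPS):
    """Theoretical upper bound on frames per second for one frame size."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD + INTERFRAME_GAP) * 8
    return link_bps // wire_bits


for size in (64, 1518):     # minimum and maximum standard frame sizes
    fps = max_frames_per_second(size)
    mbps = fps * size * 8 / 1e6
    print(f"{size:>5} byte frames: {fps:>9} frames/s, {mbps:7.1f} Mbit/s of frame data")
```

At 64 bytes the link can carry roughly 1.49 million frames per second but only about 762 Mbit/s of frame data, while at 1518 bytes it carries about 81,000 frames per second at close to the full line rate. With small packets the per-packet work dominates, which is why packet counts are the more telling number.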

Ronciak found that in his initial testing, NAPI actually caused an increase in lost packets, though with a change in NAPI weight values, packet loss could be reduced. The problem, he explained, was that the input buffer was not being cleared as fast as the data was coming in, resulting in lost packets. A driver change was required to fix the problem. He suggested that a modifiable weight system for NAPI would be useful in some circumstances, but noted that the issue is up for debate.
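The buffer problem Ronciak described can be pictured with a toy model. This is only a sketch under simple assumptions, not a model of NAPI or of the e1000 driver, and it says nothing about which weight values work best. It just counts what happens when a fixed-size receive buffer is emptied more slowly than it fills. All names and numbers are invented.

```python
def simulate(ticks, arrivals_per_tick, drained_per_tick, ring_size):
    """Return (delivered, dropped) packet counts for a fixed-size receive buffer."""
    queued = delivered = dropped = 0
    for _ in range(ticks):
        # New packets arrive; whatever does not fit in the buffer is lost.
        space = ring_size - queued
        accepted = min(arrivals_per_tick, space)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # The host then removes at most drained_per_tick packets.
        taken = min(queued, drained_per_tick)
        queued -= taken
        delivered += taken
    return delivered, dropped


print(simulate(1000, 10, 12, 256))  # buffer drained faster than it fills
print(simulate(1000, 10, 8, 256))   # buffer drained more slowly than it fills
```

In the first run nothing is lost. In the second the buffer absorbs the shortfall for a while, then fills up, and from that point on every packet beyond the drain rate is dropped.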

Among the problems Ronciak found with NAPI is a tendency to poll the interface (that is, to check whether any new data is waiting for the kernel) faster than the interface could handle incoming requests, resulting in wasted system resources and less efficient operation. His suggested fix is to enforce a minimum delay between polls based on the speed of the network interface.

Ronciak noted that one thing he learned with this project that he could pass on is never to be afraid to ask the community for help. A call to the community got his testing and code a number of bug fixes, patches, and improvements as simple as whitespace cleanup.

He concluded that he intends to keep working to improve network performance under Linux, but that he is looking for help in the matter. Ronciak is also looking to further improve NAPI.

He reiterated at the end of his presentation the need for a free standard measurement for network performance across platforms. He is also seeking help with finding new hardware features which could help solve some bottlenecks in network performance.

Later, back at the BOF

In the evening at a BOF session, Stephen Smalley of the US National Security Agency (NSA) gave an update on the status of the NSA’s SELinux kernel.

Smalley said that the last year has been a major year for SELinux. A year ago, SELinux was included in the Fedora Core 2 release, but was not enabled by default. Since then, he said, it has been included in both Fedora Core 3 and Fedora Core 4, and has shipped enabled rather than disabled.

SELinux can now scale to large multi-processor systems, said Smalley, and IBM is looking to evaluate SELinux for certifications allowing the US government to use it formally.

SELinux is exploring a multi-category security system allowing users to be more involved in the security policies of the system, Smalley explained.

A more in-depth look at SELinux can be had this winter at the SELinux Symposium in March 2006 in Baltimore, Maryland.
