Saturday, December 5, 2009


My girlfriend and I use SmugMug for all of our pictures. It's actually been amazingly helpful for organizing and storing all of our photos, lets us easily show our photos to everyone else, and is pretty cheap.

However, when we first started using it there weren't a lot of great clients for uploading pictures from Linux. There are a few command-line clients, but I found that they weren't well documented, and I didn't like the way they were written. In particular, they were written in Python (which is great!), but they did not use an object-oriented API. Because of this, they didn't feel very natural, and seemed hard to extend.

So I decided to implement my own little library for accessing the SmugMug API: SmugAPI. It provides an object-oriented Python API, with the ability to manipulate albums, images, categories, and sub-categories. It also comes with a command-line client for uploading files, smugtool. All of this is fairly well documented, and I even include unit tests. Finally, all of the code is under the GPLv2.

SmugAPI is available on GitHub. Let me know what you think, and if you run into bugs, missing features, or other problems, please report them!

Monday, October 5, 2009

Splitting patches with git

Continuing on my trend of learning new things with git, I've now learned how to split already-committed patches into multiple sub-patches. I'll explain that here, along with some other cool commands I learned along the way.

First, the concept for splitting patches is fairly well explained in the man page for rebase (man git-rebase, search for "SPLITTING COMMITS"). Basically, you start an interactive rebase (git rebase -i), then mark the commit you want to split with "edit". When the rebase hits that point, it will stop and allow you to modify things. The important point is that at an interactive rebase point, you can add as many (or as few) commits as you want before resuming the rebase. This is what makes splitting commits possible.

Now you should run "git reset HEAD^", which rewinds the branch to the state right *before* the current commit, while leaving the commit's changes in your working tree. It's equivalent to checking out the changeset right before this commit, and then applying the current commit by hand (with, say, patch -p1). With this in place, you can now use "git add -p" to add hunks of patches into particular commits. Again, the man page for "git-add" explains this pretty well, but the -p flag makes "git add" interactive and allows you to choose, hunk-by-hunk, which hunks you want in the current commit. It even has a really cool "s" option, which will split patch hunks into smaller hunks for you.

Once you've git add'ed all of the hunks for the current commit, commit it with "git commit -s". You can repeat the git-add/git-commit cycle as many times as you want, creating a number of different commits. Once you've split up your commit how you like it, you just run "git rebase --continue" to continue the rebase, and you are on your way!
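To make the flow concrete, here is a throwaway end-to-end run that splits the most recent commit into two. The repository contents and commit messages are made up, and GIT_SEQUENCE_EDITOR is used to mark the commit "edit" without opening an interactive editor:

```shell
# Set up a scratch repo with one commit that touches two files
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email "you@example.com" && git config user.name "You"
git commit -q --allow-empty -m "base"
echo one > a.txt && echo two > b.txt
git add a.txt b.txt && git commit -q -m "both changes in one commit"

# Start the interactive rebase, turning "pick" into "edit" non-interactively
GIT_SEQUENCE_EDITOR='sed -i s/^pick/edit/' git rebase -i HEAD~1

# Undo the commit, keeping its changes in the working tree
git reset -q HEAD^

# Re-commit the changes as two separate commits
git add a.txt && git commit -q -m "first half"
git add b.txt && git commit -q -m "second half"
git rebase --continue
```

After the final "git rebase --continue", git log shows two commits where there used to be one.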

UPDATE: fixed up bogus reference to git rebase HEAD^; it should have read git reset HEAD^

Sunday, February 8, 2009

FOSDEM 2009, Day 2

Notes for FOSDEM Day 2

Sunday, February 8

Solar Control with 1-wire open hardware
Speaker: Wookey
- Install solar thermal panels on the roof of the house to replace hot
water heater
- Solar panels get hot and do heat exchange, not electricity generation
- Roof panels cost around 1000 euros
- Use standard hot water tank
- Use thermosiphon
- Hot water is lighter than the cold water in the tank, so it rises up
a tube (without needing a pump), and displaces the cold water in the tank
- Can manage the system with a commercial controller, but costs 388 euros
- Instead, use a home-brew solution
- 500 MHz processor, 384MB memory
- I/O expansion bus with I2C, etc
- 1-wire sensors to do temperature detection
- Good to about 100 meters
- 1-wire bus is actually 3 wires - 1 data, 1 ground, 1 power
- 14 kb/s
- Multiplex 8 1-wire buses
- scan 13 devices/second
- measurement 94-750ms
- Hardware
- 5 wires - I2C data, I2CCiK, GND, +5V, IO0
- Sensors wrapped around pipes
- Sensors around tank
- Software
- Debian
- modprobe i2c-pxa, i2c-dev
- /dev/i2c-0, /dev/i2c-1
- 0 is general, 1 is for power
- I2C addresses are fixed when the board is soldered
- 8-bit addresses, but bottom bit specifies read/write (so 7 address bits)
- i2cdump from lm-sensors or i2c-tools
- modprobe pcf8574a
- One Wire FS
- FUSE filesystem
- OWFS daemon
- Perl binding, tcl binding, shell
- Does temperature
- # owdir -s 4304
- # owread -s 4304 /28/temperature
- Logging data
- First cut - manual rrdtool
- rrdtool create
- rrdupdate to fill in the data
- rrdtool graph
- OK, but pretty manual. Also does all processing on embedded
board, which is very slow (floating point on ARM)
- Munin
- Hides rrdtool details
- Remote graphing; just fetches data from embedded board, does
processing offline
- Plugin system
- Munin ends up working well, nice graphs
- Control
- Interesting optimization problem; when to turn pumps on and off.
Better to run the pump continuously, or turn on and off rapidly?
- Current algorithm is simple; turn on pumps when the heat exchange panel
is hotter than the tank
- Reliability
- Original system uptime 87 days
- owserver crashed once
- Survived disk full condition
- Some 1-wire problems
- Reading data not always reliable
- When the sensor has problems, goes to 85C and stays there
- Unfortunate because 85C is actually a valid reading
- Nominal sensor accuracy is 0.5 C
- In practice 2C difference
- Can reach 0.5C in ideal conditions, though
- Lesson: need to clamp sensors very well to what you are measuring
- Bigger Project
- 2 tanks, 2 overflow
- RJ45 sensors
- Open hardware design for digitemp sensors
- Future?
- Reliability during power failure situations
- Local User Interface (LED, LCD)
- More information about solar tank temp, bath status
- House sensors to control heating, cooling
- Inputs - buttons to say "bath mode", "leaving house"
- Other Related Software
- DIYzoning
- Misterhouse
- temploggerd
- WT app
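The simple pump control rule from the talk can be sketched in a few lines of shell. The temperature values here are hard-coded stand-ins; in the real system they would come from owread queries against the 1-wire sensors:

```shell
# Stand-in sensor readings, in degrees C (in the real system these come
# from OWFS, e.g. owread -s 4304 /28/temperature)
panel_temp=62
tank_temp=48

# Run the pump only while the collector panel is hotter than the tank,
# so heat moves into the tank rather than out of it
if [ "$panel_temp" -gt "$tank_temp" ]; then
  pump=on
else
  pump=off
fi
echo "pump $pump"
```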

Speaker: Simo Sorce
- Free Integrity, Policy, Audit
- Goal is to make it simpler to manage all of the above
- Using standard protocols
- Target - system administrators
- Identity management is the realm of proprietary vendors
- Not fully free
- Need an open source solution for security and freedom
- Identity management problem
- Need single source for identity
- Single sign-on/single password
- Single data store for auditing and reporting
- Single point of management
- Implementation problems
- Synchronization and integration
- Windows and Unix
- Distribution of data and credentials
- Distributing change
- Single point of failure
- Integrating interfaces
- FreeIPA components
- Directory is LDAP
- Storage mechanism to perform fine-grained access control
- Organize identity and allow group relationships
- Distribute information across clients
- Replicate information on multiple servers
- Avoid single point of failure
- Chose LDAP since it is standard, extensible, flexible
- Authentication is Kerberos
- Single Sign On with delegation - carry on your credentials from
machine to machine, service to service
- Tested standard, validated secure
- Somewhat extensible - can add other authentication devices
(smartcards), and new encryption algorithms
- Once you introduce kerberos, then you need DNS and NTP
- Client and server must be within 5 minutes of each other to prevent
spoofed/expired tickets
- Host tickets based on DNS name, so client must have one
- Implementation
- Fedora Directory Server
- MIT Kerberos
- Apache + mod_nss, mod_auth_krb, mod_proxy
- Python and Turbogears
- Custom FDS plugins and CLI tools
- Clients can use nss_ldap, pam_krb5
- Avoiding synchronization problems?
- Kerberos information is in LDAP as well as KDC
- Directory structure
- Very flat structure
- Splits accounts from kerberos information in the directory
- Allows users to change some of their own information (based on ACLs)
- Management interfaces all revolve around the directory
- Web UI goes through mod_nss, mod_auth_krb, mod_proxy, GUI, xmlrpc,
then finally to LDAP
- CLI goes through mod_nss, mod_auth_krb, xmlrpc, and then LDAP
- But, installation was still complex
- ipa-server-install makes it much simpler
- Multiple servers
- Replication on directory server
- Because kerberos data in the directory, don't need to replicate
kerberos (kprop)
- 2 simple commands
- ipa-replica-prepare
- ipa-replica-install
- Future
- Add Audit Server - can use AMQP
- Add Certificate Authority
- Add Policy
- Makes interaction much more complex
- Luckily, most of the complexity is hidden from clients (and
administrators) in the FreeIPA core
- However, still a bunch of complexity, so new client agent
- System Security Services Daemon (SSSD) + IPA plugin
- caching, offline operations, etc.
- Host Based Access Control in LDAP
- Roles in LDAP
- New UI with plugin system
- DNS integration
- Dynamic updates
- Integration with Certificate Authorities

Speaker: H. Peter Anvin
- Syslinux is a suite of bootloaders
- PXELINUX - network PXE booting
- EXTLINUX - ext2/ext3 - ext4 coming
- Only for x86/BIOS platform
- Core is in assembly
- Originally written for floppies, needed to be small
- Work underway to fix this
- Sophisticated menu systems
- Extensible via module API
- MEMDISK - disk emulator
- Emulates disk in memory
- Allows booting of legacy INT 13h OSes, mainly DOS
- Used for diagnostics
- Collaborate with etherboot
- Enhanced capabilities for network booting
- http, ftp, nfs, AoE, iSCSI (in addition to TFTP)
- isohybrid
- Allows ISOLINUX .iso to boot from USB stick
- x86 ancient
- Released in 1981; IBM AT 1986; PS/2 1987
- Most BIOS interfaces date from this era
- Boot from floppy and hard disk only - 510 bytes
- 1993 El Torito booting (CD)
- 2 modes
- Disk image on disk, boot disk image
- Native mode, access whole CD - not widely supported until late 90's
- 1997 PXE
- Original PXE were problematic
- Nowadays, works pretty well/standard
- USB drives
- Lots of bugs still
- Tricks can help
- Syslinux history
- 1994 original syslinux implementation
- Designed for boot floppies
- Small to fit, so assembly
- Take DOS OS, make Linux boot floppy
- Add online help support
- 1999 PXELINUX support
- PXE only allows 32K for the Network Boot Program (NBP)
- 2004 EXTLINUX - ext2/ext3 general purpose loader
- 2006 graphical menu system
- 2008 gPXELINUX, ISOLINUX "hybrid" support (iso that also works on
USB key)
- What's good?
- Designed for dynamic systems
- System discovery at boot time
- Keep to established PC principles
- Sophisticated user interface
- Problems
- Large core of assembly
- x86/BIOS only (because of assembly)
- Dynamic discovery comes at a price
- Can't read a kernel from another disk
- gPXE plus PXELINUX in one image
- gPXE contains an extended PXE interface for PXELINUX
- Can http, ftp, nfs, AoE, iSCSI, SFP (in addition to TFTP)
- Drop-in replacement for pxelinux.0
- Needs TFTP server for initial bootstrap
- But can be skipped if NIC ROM can be reflashed
- Syslinux module API (COM32)
- Small C library (klibc)
- Similar to normal userspace C code
- Main limitation is only sequential, readonly file access
- Common modules
- UI
- Complex menu system - does everything
- Simple menu system - what most people use
- Graphics library makes the same code work for graphics, text, or
serial consoles
- File format modules
- Loading new types of loadable binary objects
- Module describes where various bits go in memory, then syslinux
"shuffle" library puts everything in place
- Shuffle library computes a set of move operations to put things in
the right place in memory
- Example: Microsoft SDI format
- Boot WinPE with syslinux - Windows kernel + ramdisk
- 199 lines total; 139 non-blank/comment lines
- Policy modules
- Example policy:
- Boot kernel X on 64-bit machine
- Boot kernel Y on 32-bit with PAE
- Boot kernel Z otherwise
- Module is 129 lines long; 70 non-blank/comment lines
- Diagnostic modules
- Modules available to show hardware/BIOS state
- e.g. pcidump
- Move dumping data into libraries, and make available to modules
- Already code in Syslinux
- Probe PCI bus
- Map devices to modules
- Build an initramfs with the modules you need
- But...boot devices on USB, firewire? Harder to discover, because you
need a driver for the bus itself
- Ongoing work
- Lua interpreter to write policies
- readdir support - right now, no directories
- Move filesystem code out of assembly - needed to support advanced
filesystems like btrfs
- Switch core from assembly to make it portable; required for EFI
- Core components
- First stage loader, disk/network I/O, BIOS extender (protected mode),
shuffle system (rearranges RAM) need to remain in assembly
- Everything else (command-line, config parser, kernel loader/parser,
filesystem drivers) should be written in C
- Syslinux needs help
- Too large for one person side project
- Significant number of regular contributors
- Documentation help

Speaker: Theodore T'so
- What's good about ext3?
- Most widely use linux filesystem
- People trust it
- Diverse developer community
- important because distros need to understand it well enough to
be comfortable supporting it
- What's not good about ext3
- 16TB filesystem size limitation (32-bit block numbers)
- 32000 limit on subdirectories
- Second resolution timestamps
- Performance limitation
- Traditionally cared more about data integrity than performance
- Is ext4 a new filesystem?
- ext in ext2/3/4 stands for "extended"
- Collection of new features that you can individually enable
- ext4 driver supports all features
- But can have ext3 fs and mount as ext4, which works just fine
- 2.6.29 can mount an ext2 fs with ext4 driver (code from Google)
- Google doesn't care about journals, so can mount ext4 without journal
- ext4 fork was to make sure that ext3 remained stable
- Started with ext3, added new code
- e2fsprogs supports ext2, ext3, ext4
- New features
- Extents instead of indirect blocks - most important
- Delayed allocation
- Multiblock allocation
- Persistent allocation
- Subsecond timestamps
- Greater than 32000 subdirs
- NFSv4 version id's for caching (reliable caching)
- Store file sizes in units of the FS block size, rather than 512-byte sectors
- Allows huge files; 16TB files on 4k block filesystems
- ATA TRIM support - when deleting a file, can tell block device to use
for something else
- Journal and group descriptor checksums - reliably put which part of
inode table is in use (speeds up fsck)
- Ext2/3 indirect block map
- In the inode, room for 15 block pointers
- First 12 map direct blocks
- Files less than 12 blocks long (48k on a 4k fs): locations of blocks
stored directly in the inode
- Bigger files allocate indirect block
- Slot 12 holds address of indirect block
- 256 blocks
- Slot 13 is double indirect block
- 256 indirect blocks
- 256 direct blocks
- slot 14 is triple indirect block
- 256 double indirect
- 256 indirect block
- 256 direct blocks
- Inefficient for large files
- Long time to delete because has to read all indirect blocks and
free block pointers in all blocks
- Ext4
- Extents
- Extents are efficient way to represent large file
- Extent is a single descriptor for a range of contiguous blocks
- Logical 0 block, length 1000 blocks, physical 200
- 12 bytes ext4_extent structure (from ClusterFS)
- Address 1EB fs (48 bit physical block number)
- Address 16TB file size (32-bit logical block number)
- Max extent 128MB (16 bit extent length)
- Up to 3 extents stored in inode directly
- Vast majority of files (99%) live in the inode
- For greater than 3 extents, convert to B-tree
- Block allocator changes
- Extents works best if files are contiguous
- Delayed allocation and allocating multiple blocks at a time makes this
much more likely
- Block allocator looks at disk to try to find free space to fit number
of blocks we want to allocate
- Makes fs more resistant to fragmentation
- Responsible for most of ext4's performance improvements
- Problem: ext3 journal mode semantics means that data is written
before inode
- Avoids security concerns
- Application programmers came to depend on this
- ext4 does this differently, because of delayed allocation, we don't
push out to disk until page allocator (30 seconds)
- And staged, so it may take a while
- In laptop mode, 2-5 minutes to write to disk
- POSIX allows this
- Open question: fsync? fdatasync?
- Persistent pre-allocation
- Useful for databases and video
- Pre-allocate 1GB on disk ahead of time, contiguously
- Useful for package updates (rpm, deb), reduce fragmentation
- Useful for files grown by append (like logfiles); can pre-allocate
space, and then the logfile stays contiguous
- Available via posix_fallocate, but...
- On older FS's, will just write lots of blocks of 0 (very slow)
- Changes the i_size field, meaning that the file's size (as reported by
stat) looks much bigger than what it is actually using
- Need glibc direct access to Linux system call
- Avoid i_size change
- Fail on old FS's
- e2fsck Performance
- Not explicitly engineered
- But huge improvements
- Fewer extent tree blocks to read instead of indirect blocks
- Uninitialized block groups means don't have to read portions of the
inode table

FOSDEM 2009, Day 1

Since I live in Europe now, and especially since I live about 20 minutes by train from Brussels, I decided to go to FOSDEM 2009 this year. This was good for me; I actually haven't been to a conference since 2002, so it was interesting to see how this one was run. It was also good to see what is going on in the Free Software world outside of my little sphere, and maybe get me excited about some new projects. I'll write two posts with my notes, one from the first day (Saturday, February 7), and a second one about Sunday, February 8th. The notes are a little dry, so I might consider doing a third post with my overall thoughts.

Saturday, Feb 7

Relicensing SunRPC code
Speaker: Simon Phipps, Sun
Problems of success:
- Some old code in Linux
- Open Network Computing Remote Procedure Call - 29 years old
- Most liberal license 29 years ago
- Use the code, but no profit
- Debian noticed - asked Sun to re-license
- Difficult to re-license because it is so old
- As of a couple of weeks ago, got permission to re-license

Key Note
Speaker: Mark Surman, Mozilla
free. open. future?
- What is Mozilla for?
- Common: build Firefox, Thunderbird, Mozilla
- Mission statement
- guard the open nature of the Internet
- build open-source software
- promote innovation
- Roadmap for 2010
- openness and participation
- data safer and more useful
- mobile
- firefox
- Roadmap for 2060
- openness and participation
- ??
- How far can free and open go?
- 2003
- Web in danger
- IE 98% share - monopoly
- online apps only for IE
- freedoms matter
- now - IE < 70%
- letting consumers change their experience, lets users "hack"
- conceptual map of free software can go far
- What's next?
- 2009 - big for free software and mobile
- mobile battle
- how can open win?
- hardware (like OpenMoko)
- software
- web
- network
- cloud
- pricing
- permissions
- Conclusions
- Need better conceptual map for the mobile space and beyond
- Strong values, freedom beyond just code
- free software, that people want to use
- Users as hackers, anyone can bend anything
- What else?
- pick something in the map
- open source and education (build into university curriculum)
- Franklin street statement
- mapping freedom to web services

Speaker: Bdale Garbee
- Contributor to Debian since 1994
- Debian Project
- Association of individuals
- Why does it matter?
- Freedom
- Stable, functional community despite appearances
- External appearance sometimes skewed because of a vocal minority
- Large number of architectures and packages
- Uniform process for quality testing, bug reporting, open to
- Many derivative (downstream) distributions
- History
- Started by Ian Murdock
- 1993 - 0.01
- 1994 - 0.91
- 1995 - 0.93rc5
- 1997 - Debian Social Contract
- 1998 - 2.0 released
- Debian Linux Manifesto
- Developed in the spirit of Linux and GNU
- Promise to put distribution together and maintain it
- Design process is open to ensure high quality
- Linux is not a commercial product, and never will be one, but can
compete with commercial products
(this doesn't make much sense nowadays)
- Debian Social Contract
- 100% free
- Give back to the communities
- Don't hide problems
- Prioritize users and free software
- Public Bug Tracker
- First uniform way to report bugs
- Since 1994
- Open system, anyone can open or close bugs
- Email in, web out
- Policy Manual
- Set of standards
- Tools that assist in packaging
- Constitution
- Organizational Structure
- Division of power
- Individual developers have the most power
- Observations
- Values before Vision before Strategy before Objectives
- Internal social contract potentially as useful as an external one
- Future
- Lenny
- No radical changes
- Attract new contributors
- HP
- stays connected to Free Software through Debian

OpenWRT: UCI and beyond
Speaker: John Crispin and Felix Fietkau
- UCI: Unified Configuration Interface
- Designed originally for OpenWRT
- Simple configuration system to cover 90% of cases
- Human readable/writable
- Based on typed sections, option/value pairs, and lists
- First implementation in awk
- SNMP access available
- API for C, shell, Lua
- Used (and is requirement) for all OpenWRT base packages
- LuCI
- MVC (Model, View, Controller) webapp
- OpenWRT configuration interface
- Validation framework
- Data storage
- Configuration split by packages, stored in /etc/config
- Can access through UCI pointers
- e.g. uci get network.lan.ipaddr
- Changes stored, then committed/reverted
- If you make a mistake, can just reboot the appliance to get back old
- Other backends planned (databases, etc)
- Overlays supported (commit/revert example above is example)
- Shell access (uci command)
- C
- Simple wrapper
- API with limited functionality
- Regular C API
- Direct access to data structures
- libucimap
- Automatically converts configuration files to/from C data structure
- Lua binding
- Efficient scripting
- What's next?
- More application support
- Automatically generated configuration interfaces based on config files
- Remote backends
- Locking for simultaneous access

Speaker: Sven Krohlas, Ian Monroe, Lydia Pintscher
- Amarok 2.0
- Context information for music - central, moved to middle in Amarok 2
- Plasma based
- New playlist on right-side
- Lots of different layouts possible
- Lots of new drag-n-drop functionality
- Podcasts from BBC
- Jamendo integration - CC licensed music
- Audiobook integration
- NPR plugin
- Radio station from friends?
- Amarok 1.4 used DCOP (interprocess communication)
- Amarok 2.0 uses QtScript, a JavaScript implementation
- Qt bindings for JavaScript
- Script console
- Qt API, classes relevant to Amarok
- Javascript language is OK for development
- Integrating services into Amarok
- Scripting interface is powerful
- Lyric scripts
- Service
- Example
- Fetches lyrics - charts with only free music
- Small XML file describing data
- Spec file, tells icon, author, comment, service
- Code file
- Levels (display levels)
- Downloader class

Speaker: Max Spevack
- Personal
- 4.5 years at Red Hat
- Manage Red Hat's global community team
- Goal of Fedora?
- Increase open source development community
- Potential population is an inverted pyramid
- Top-level is users
- Next level will search and solve their problem (help themselves)
- Next level turns the user into a participant of a community (help others)
- The last gap is difficult - Fedora wants to focus there
- Fedora Four Foundations
- Freedom
- Strong commitment
- Red Hat building business around community
- All the code is licensed in an OSI approved license
- 2/3 packages are maintained by community
- Release Engineering processes to make community engaged
- Distribution composition - pungi has a nice API - easy to use
- First
- Innovative community
- Examples
- plymouth - uses KMS
- nouveau driver
- virtualization
- Pulseaudio
- Friends
- Fedora Infrastructure team
- Entirely volunteer driven
- Open
- System Admin team
- New contributors
- Fedora Scholarship
- Active in open source, starting at university
- Features
- Improve Fedora QA
- Getting stuff vs. building capacity to allow people to do things
- Too much time getting the distro done, not enough time maintaining
the community
- Fedora Test Days
- Fedora 11
- 20 second startup
- DeviceKit alpha
- ext4
- DeltaRPM
- MinGW cross-compile
- Python 2.6

Hacking with Modular Hardware: the BUG
Speaker: Ken Gilmer
- Bug labs
- Electronic building blocks for personalized hardware devices
- Examples
- GPS alarm clock (wake up based on location instead of time)
- Stereo camera system
- Hardware is all open
- Lower barriers to innovation
- Use Linux, Java, OSGi
- Base Unit
- Buttons - software programmable
- 2.6.27 kernel
- Basic interface
- Battery
- Modules available
- Accelerometer
- Touch screen LCD
- Wifi
- Audio
- von Hippel - All hardware interfaces exposed
- Poky Linux
- Includes GNOME Mobile
- Can put custom images
- OSGi
- Standard to disconnect services from implementation
- Specification for components to talk to each other
- Notifies applications when modules are added/removed
- Abstract services
- Different modules can provide the same service
- Dragonfly SDK
- Like Eclipse
- Website to publish applications
- OpenEmbedded tools for Eclipse
- BitBake Commander for building images

Intel's Graphics Projects for the Year
Speaker: Eric Anholt
- Lots of good progress, but lots of problems
- vblank broken
- No GEM support
- OpenGL 2.0, vertex shaders don't work
- High end applications are slow with GEM
- vblank
- Applications want to wait for vblank
- Copy back to front
- Copy part of back to front
- GL gives you
- Wait for n vblanks
- X gives you nothing
- Which head should we vblank?
- GL doesn't know which head to vblank either
- What happens when one of the heads turns off?
- Too slow
- Doing it right
- Schedule vblank swap ioctl
- Cancel vblank swap ioctl
- Mesa GLSL is really a proof of concept
- Written by a graduate student
- Mesa GLSL only implements part of GLSL
- Mesa GLSL lacks optimization
- Mesa GLSL IR is a poor mapping onto current hardware
- Written against 3 year old hardware
- New compiler needed
- In progress
- Implements all of GLSL
- New IR for modern hardware
- Designed for optimization
- Pull some Gallium ideas into Mesa (incrementally)
- Kernel Mode Setting
- Works today if you are lucky
- Slow performance in the console
- Detect hotplug interrupts
- Media
- MPEG2 support implemented today
- More codecs in progress
- Convert XVMC to GEM
- Memory Management
- GEM is good
- Performance issues on older hardware
- Flushing is painful for large applications
- Can do better by keeping objects pinned

Speaker: Raphael Pinson
- Configuration management
- Sitewide configuration
- Database
- Puppet
- Local configuration
- Editing configuration
- Most configuration files in Linux are text files
- Easy to edit with an editor
- Human edits can conflict with graphical tool editing
- Approaches to editing
- Keyhole - one-off, like sed, awk, etc.
- Greenfield - configuration from a database, centrally managed
- Templating - configuration from database, fill in machine specific
- Missing pieces
- Handle configuration data directly
- Policy delegation
- Remotable API
- Design goals
- Deal with configuration data in current place
- Expose an abstract tree view of configuration data
- Preserve "unimportant" data (comments, etc)
- Describe new formats easily and safely
- Language neutral implementation
- Focus on configuration editing, and not interpretation
- Bidirectional languages
- Transform a configuration file into a tree via a "lens"
- Modify tree
- Write the tree back to the configuration file via the "lens"
- init/close
- get/set
- match
- insert
- rm (a subtree)

Thursday, January 22, 2009

Stupid Git tricks

I've been using git somewhat for a couple of years now. Until recently, though, I was only a very casual user, and I didn't know the full power of git. I'm now maintaining a git tree at work, and I'm having to learn some of the more powerful features of git. This may be an ongoing thing as I learn more about it, but for now, here are some of the commands that I've had to learn, and that are extremely useful:
  1. git checkout -b
    This is really at the core of git, so it's almost not worth talking about, but I'll start here. This creates a new branch off of the branch you are currently working on, and changes over to that new branch. git is all about working with branches, so the recommended way to do things is to have many "topic" branches; one branch for each different topic you are working on or with. Then later on it's easy to merge branches together, push and pull them remotely, etc.
  2. git commit -a -s
    Commit any outstanding changes to the current branch. This is pretty self-explanatory, except for the fact that -s adds your Signed-off-by line automatically.
  3. git rebase -i HEAD~6
    This is somewhat of a baroque command, but it's so powerful, you'll wonder how you ever lived without it. A "rebase" tells git to check out some previous version (in this case, HEAD - 6 commits), then replay the commits on top of that. Where this gets really interesting, however, is with the -i flag; this is an interactive rebase. This allows you to do various operations to the individual commits before they are replayed. Your three options are pick, edit, or squash. "Pick" just means to take the commit as-is. "Edit" means that you want to edit the commit in some way before replaying it. This can be as simple as editing the commit message, or as complex as adding new code into the commit. "Squash" means to take this commit and merge it into the previous one, so they now look like one commit.
  4. git add --patch
    I actually haven't personally used this one yet, but it was pointed out to me by a co-worker. git add is the command you use when you want to add some changes to a changeset before it is committed. The "--patch" flag allows you to choose just specific hunks of the differences that git finds, so if you have some debugging or whatever left in your tree, you can just automatically throw it away. Very cool.
  5. git merge
    This is a command that lets you merge multiple branches into your current branch. It tries very hard to do automatic conflict resolution; if it has to give up, it leaves you in a place where you can fix up the conflict by hand, and then continue the merge.
  6. git pull remote_branch local_branch
    This command lets you pull *any* remote branch onto *any* local branch. That means someone can point you to their private repository, and you can pull their changes onto your branch locally. Very handy for combining trees.
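A tiny end-to-end run of the topic-branch and merge workflow from items 1 and 5 (the repository, file names, and commit messages are made up):

```shell
# Scratch repo with a single commit on the default branch
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email "you@example.com" && git config user.name "You"
echo base > file.txt && git add file.txt && git commit -q -m "initial"
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on git

# 1. create a topic branch and commit to it
git checkout -q -b topic
echo feature >> file.txt && git commit -q -a -m "add feature"

# 5. merge the topic branch back into the original branch
git checkout -q "$main"
git merge -q topic
tail -n1 file.txt
```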

Monday, January 19, 2009

KVM performance tools

I've recently been working on tracking down some performance problems with KVM guests. There are a few tools available, but in this entry I'll stick to the two I've found most useful recently: kvm_stat and kvmtrace.

Let me start with kvm_stat, since that one is far easier to work with. kvm_stat is basically a Python script that periodically collects statistics from the kernel about what KVM is up to. Unfortunately, it is not packaged in the Fedora kvm package (I should probably file a bug about this), but the good news is that it is very easy to get. You just need to clone the kvm-userspace git repository, and the script is at the top level of the directory.

To get kvm_stat to actually do something useful, you first have to mount debugfs. In all likelihood, your distro kernel has this turned on, so all you really have to do is:
mount -t debugfs none /sys/kernel/debug
If you are going to do this often enough, it's probably a good idea to add that to your /etc/fstab.
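For reference, a debugfs line in /etc/fstab looks like this:

```
debugfs  /sys/kernel/debug  debugfs  defaults  0  0
```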

Once you have debugfs mounted, you can now use kvm_stat. If you just run "kvm_stat", it periodically outputs columns of data, sort of similar to vmstat. I've found this kind of hard to look at, though. So what I've been doing instead is using the -l flag of kvm_stat, and piping the output to a text file:
kvm_stat -l >& /tmp/output
The -l flag puts kvm_stat into logging mode, which is harder to read on a second-to-second basis, but easier to read in a spreadsheet later. After I've collected data for a while (mostly during the tests I care about), I use OpenOffice to read that data into a spreadsheet (hint: use "space" as a delimiter, and tell it to "Merge Delimiters"). Now, it's fairly easy to see what your guest has been doing, from the host's POV. Note that these are cumulative numbers; if you have multiple guests running, this is all of the data from all of the guests.
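If a spreadsheet feels like overkill, the log can also be summarized from the shell. Here is a rough sketch using a made-up three-column sample in place of a real capture (real kvm_stat -l output has many more columns, but the first line is similarly a header naming each one, so the awk script finds the column by name):

```shell
# stand-in for a short kvm_stat -l capture (real logs are wider and longer)
cat > /tmp/output <<'EOF'
time exits io_exits irq_exits
10:00:01 1500 200 40
10:00:02 1700 250 35
10:00:03 1600 180 50
EOF
# locate the "exits" column from the header line, then average it
awk 'NR==1 { for (i=1; i<=NF; i++) if ($i=="exits") col=i; next }
     col   { sum += $col; n++ }
     END   { if (n) printf "avg exits/interval: %.0f\n", sum/n }' /tmp/output
```

The same pattern works for any of the columns described below; just change the header name being matched.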

There are quite a few fields that kvm_stat outputs; I'll talk about the ones I think are relevant:
  • exits - I *think* this is a combined count of all of the VMEXIT's that happened during this time period. Useful number to start with.
  • fpu_reload - The number of times a VMENTRY had to reload the FPU state (this only happens if your guest is using floating point)
  • halt_exit - This is the number of times that the guest exited due to calling "halt" (presumably because it had no work to do)
  • halt_wake - This is the number of times it was woken up from halt (it should be roughly equivalent to halt_exit)
  • host_state_reload - This is an interesting field. It counts the number of times we had to do a full reload of the host state (as opposed to the guest state). From what I can tell, this gets incremented mostly when a guest goes to read an MSR, or when we are first setting up MSR's.
  • insn_emulation - The number of instructions that the host emulated on behalf of the guest. Certain instructions (especially things like writes to MSR's, changes to page tables, etc) are trapped by the host, checked for validity, and emulated.
  • io_exits - The number of times the guest exited because it was writing to an I/O port
  • irq_exits - The number of times the guest exited because an external irq fired
  • irq_injections - The number of IRQ's "delivered" to the guest
  • mmio_exits - The number of times the guest exited for MMIO. Note that under KVM, mmio is much slower than a normal I/O exit (inb, outb), so this can make a significant difference
  • tlb_flush - The number of tlb_flush's that the guest performed.
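kvm_stat gets these numbers from per-counter files under /sys/kernel/debug/kvm, one file per statistic, so you can also peek at them directly. A rough sketch (with a mock directory as a fallback, so the loop can be tried on a machine without KVM or debugfs):

```shell
kvmdir=/sys/kernel/debug/kvm
# fall back to a mock directory with fake counters if KVM/debugfs is absent
if [ ! -d "$kvmdir" ]; then
    kvmdir=$(mktemp -d)
    echo 1600 > "$kvmdir/exits"
    echo 200  > "$kvmdir/io_exits"
fi
# print each counter name and its current cumulative value
for f in "$kvmdir"/*; do
    printf '%-20s %s\n' "$(basename "$f")" "$(cat "$f")"
done
```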

The other tool I've started to use is kvmtrace. This tool does generally the same thing as kvm_stat, but at a much finer granularity. From the output, you can see not only that the guest did a VMEXIT, but also that it did a VMEXIT because of an APIC_ACCESS (or whatever). This can be powerful, but it also generates a lot more data to sift through.

Using this tool is a little more complicated than the kvm_stat one. Luckily, it is packaged in the Fedora kvm RPM, so that part we get for free. To run this beast, you'll want to do something like:
kvmtrace -D outdir -o myguest
This tells kvmtrace that you want all output files to go to "outdir", named "myguest.kvmtrace.?". You'll get one file for each CPU on the system. That last point is actually quite important; generally, your best bet is to pin the guest to a particular CPU on the host, so that your results don't span multiple CPUs. Now, this is the raw, binary data for each CPU on the system. You next need to convert that into something a human can look at. For this job, there is kvmtrace_format. You can do all kinds of clever things with kvmtrace_format, but the easiest thing I've found so far is to use the "default" format file (which generates all events), and then dump the result out to a file. So, for instance, I ran:
kvmtrace_format user/formats < myguest.kvmtrace.0 > myguest-kvmtrace.out
Note that user/formats comes from the kvm-userspace git tree (again, it's not in the Fedora kvm package, which I should probably file a bug about). That ends up dumping all of the output to myguest-kvmtrace.out, which makes for a *huge* file. From here, I just did a bunch of processing with sed, grep, and awk to look for the things I care about.
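As one example of that post-processing, here is a way to tally which events dominate a formatted trace. The trace lines below are made-up stand-ins, and the event name is assumed to be the last whitespace-separated field; adjust the awk field to match whatever your format file actually produces:

```shell
# stand-in for a few formatted trace lines (real files are enormous)
cat > myguest-kvmtrace.out <<'EOF'
0.000123 cpu0 VMEXIT
0.000130 cpu0 APIC_ACCESS
0.000145 cpu0 VMEXIT
0.000150 cpu0 VMENTRY
EOF
# tally events, most frequent first (event name assumed to be the last field)
awk '{ print $NF }' myguest-kvmtrace.out | sort | uniq -c | sort -rn
```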

Friday, January 16, 2009

Remote Sane

Well, this is going to be my occasional blog on technical things. Hopefully I'll keep up with it, although I'm notoriously bad for doing so. In any case, here's the first one.

Recently Jen asked me to set up scanning at home. We have an HP PSC 1510 All-in-One Printer/Scanner/Copier. It's currently hooked to my main router via USB. Now, I could easily have done all of the scanning from the main router via the "scanimage" command. But I really wanted to be able to scan from the other computers on my network, including Jen's Windows laptop. Enter saned, a small daemon that allows remote clients to access your scanner. However, getting this working had its share of pitfalls.

Server Configuration

To start with, my main router is a Fedora 9 i386 box. The first order of business, of course, was to get scanning working locally. Luckily this is pretty easy nowadays; I just needed to install:


A quick "scanimage -L" shows that the scanner was detected, and a quick "scanimage > image.pnm" shows that it actually scans things. Great, the first part is over.

For the next part, I needed to set up saned. While saned is packaged with the sane-backends package, it is unfortunately not well integrated into the system. To get it working, I basically had to add it to xinetd, add the IP addresses I wanted to be able to scan from, add a new user, and finally make sure that when the device was plugged in, it got assigned to the appropriate user. Those steps are described below; note that I had a lot of help from a few other web pages describing saned setup.

The first order of business is to get xinetd to start up saned. This is accomplished by adding a new file /etc/xinetd.d/saned, which looks like this:
service sane-port
{
        socket_type = stream
        server      = /usr/sbin/saned
        protocol    = tcp
        user        = saned
        group       = saned
        wait        = no
        disable     = no
}
Then I did a quick restart of xinetd via "service xinetd restart". Next, we have to make sure that the "saned" user exists on the system; I did this with:
useradd -M -s /sbin/nologin -r -u 491 saned
That creates a new user with no home directory, no login shell, and UID 491. With that in place, I was able to test what I had so far with a quick "telnet localhost 6566". If this doesn't spit out any errors, then we at least have the daemon up and running.
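The telnet check can also be scripted. The sketch below probes the port with bash's /dev/tcp pseudo-device; since nothing in the check is actually saned-specific, a throwaway Python listener stands in for the daemon here so the probe has something to connect to:

```shell
# stand-in listener on saned's port (on the real box, saned itself is listening)
python3 -c 'import socket, time
s = socket.socket()
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 6566)); s.listen(1)
time.sleep(2)' &
sleep 1
host=127.0.0.1
# bash opens a TCP connection when you redirect to /dev/tcp/<host>/<port>
if timeout 2 bash -c "exec 3<>/dev/tcp/$host/6566" 2>/dev/null; then
    status=open
else
    status=closed
fi
echo "saned port is $status on $host"
wait
```

On the real server you would drop the Python listener and point host at the machine running saned.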

With that in place, the daemon is running, but the problem is that the permissions on the USB device won't allow the saned user to actually access it. This is where things get tricky; in order for this to work, we really want to make sure the USB device is owned by saned *every* time it gets plugged in (or powered up). In modern Linux, including Fedora, the way to do this is with a custom udev rule. In my case, I created the file /etc/udev/rules.d/10-saned.rules, and put the following in:
ACTION=="add", SUBSYSTEM=="usb", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", MODE="0664", OWNER="saned", GROUP="saned"
What this says is that for any add action on the usb subsystem, change the owner to saned and the group to saned. Now, this rule isn't as refined as it could be; in particular, I probably only want this rule to run when the device being plugged in is in fact my USB scanner.
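One way to refine it is to match on the vendor/product ID pair that lsusb reports for the scanner. The IDs below are placeholders (03f0 is HP's vendor ID; the product ID is made up), so substitute the values from your own lsusb output:

```
# /etc/udev/rules.d/10-saned.rules -- restricted to a single device.
# Take the ID pair from lsusb ("ID 03f0:xxxx"); 3f11 below is a placeholder.
ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="03f0", ATTR{idProduct}=="3f11", NAME="bus/usb/$env{BUSNUM}/$env{DEVNUM}", MODE="0664", OWNER="saned", GROUP="saned"
```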

After adding in the above udev rule, I just had to power off my scanner, wait a few seconds, and then power it back up. Once I did that, I had to check that the /dev device was owned by the right user. To do that, I first had to find out which USB device my printer shows up as. That's easily accomplished by a quick "lsusb", and then looking for the Bus and Device number. In my case, that was Bus 002 and Device 009, so I just did a quick "ls -l /dev/bus/usb/002/009", and ensured that that /dev node was owned by saned:saned.
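That bus/device-to-path translation can be scripted too. The lsusb line below is a made-up stand-in; on a real machine you would grep lsusb's output for your scanner's name instead:

```shell
# stand-in for one line of lsusb output (the ID and name are hypothetical)
line="Bus 002 Device 009: ID 03f0:3f11 Hewlett-Packard PSC 1500 series"
# fields: Bus <bus> Device <dev>: ... -- strip the trailing colon from <dev>
bus=$(echo "$line" | awk '{print $2}')
dev=$(echo "$line" | awk '{gsub(":","",$4); print $4}')
echo "/dev/bus/usb/$bus/$dev"
# on the real machine, check ownership with: ls -l "/dev/bus/usb/$bus/$dev"
```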

Finally, I had to specify which clients would be allowed to scan over the network, by adding their IP addresses to /etc/sane.d/saned.conf.

Client Configuration

Now that I had the server all set up, it was time to try out my remote clients. The first thing to try was obviously my Linux laptop. This was actually a breeze; I just had to add the internal name of my server to /etc/sane.d/net.conf, and then xsane finds the scanner when it starts up.
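For reference, the client-side configuration is just that one line (the server name below is a placeholder for your own machine's name or IP):

```
# /etc/sane.d/net.conf on each Linux client
scanserver.example.lan
```

After that, running "scanimage -L" on the client should list the remote scanner with a net: prefix.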

Windows was a bit harder, but not much. Basically I downloaded the SaneTwain bridge from its website. After unzipping the package, I was able to start up the example executable, add my saned machine as a remote source, and it all worked!