Open Source

Keeping applications open

Tuesday, August 23rd, 2005

Just an interesting post I made to OSNews in response to someone saying that IE “starts quickly” while Firefox “takes forever.”

Just to clear this up: the only reason IE starts faster on Windows is that IE is technically “always running.” The only thing that has to “start” is a window with an “IE control” in it.

I get the same behavior on Linux by running galeon -s when my X session starts. This runs Galeon in “server mode,” which means it’s always in memory, and when I run Galeon (on my laptop, I press ALT+F1 to run my browser), it starts in under half a second. If Firefox had a similar mode, it could offer you the same thing.

As for OpenOffice.org, it's true that the start time is relatively slow. I'm sure they'll get around to optimizing it.

Personally, I think the obsession people have with start times on Linux and Windows machines is due to a basic design flaw in most window managers. Applications should really only start up once; if you start an application multiple times in a day, you're essentially performing redundant computation. The program can sit in memory, and if it really is not used in a while, it will get paged out anyway by our modern virtual memory implementations.

In OS X, for example, you can get the same effect as "galeon -s" or IE's "preloading" simply by not quitting an application after all its windows are closed. This leaves the application running, and opening a new window is nearly instantaneous. (Strangely enough, many old Windows/Linux freaks are "annoyed" by this aspect of OS X, since in the Linux/Windows world up to now, closing all windows of an application has been equivalent to closing the application itself.)

Yay, ddrescue saves the day

Monday, August 22nd, 2005

It turns out GNU ddrescue did just the trick, and I was able to get 98% of the data off that dead hard drive. Now to set up a new Windows system for these poor bastards (why, oh why, can’t everyone run *nix?). I need a cup of coffee first.

GNU ddrescue and dd_rescue and dd_rhelp, what the?

Friday, August 19th, 2005

Wow. I hate when shit like this happens.

Apparently there are three tools out there that do the same job. First, there’s dd_rescue, the tool I was using earlier (which ships with Ubuntu in a Debian package called… ddrescue). Then, there’s dd_rhelp, a shell script which is a frontend to dd_rescue and which implements a rough algorithm to minimize the amount of time spent waiting on bad block reads.

Then, there’s GNU ddrescue, which is a C++ implementation of dd_rescue plus dd_rhelp.

I only just realized this and so now I’ve compiled a version of GNU ddrescue to pick up my recovery effort. It’ll probably help with one of the partitions that seems particularly messed up.

So far the nice thing about GNU ddrescue is that it seems faster, and more responsive. Plus, it has a real logging feature, such that if you enable it and then CTRL+C the app, you can restart it and it’ll automatically pick up where it left off.

UPDATE: wow, good thing I switched. GNU ddrescue is significantly faster just in terms of raw I/O performance. I jumped from 4GB of this partition being rescued (which took 30 minutes with dd_rescue) to 6GB in the last ten minutes. It seems at least 3x faster. I also like that the GNU info page describes the algorithmic approach in-depth.
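
For the curious, the resume-from-log idea can be sketched in a few lines of shell. This is only an illustration built on plain dd, not how GNU ddrescue works internally: the function name resumable_copy, the 4096-byte block size, the one-number log format, and the MAX_BLOCKS argument (which exists only to fake an interruption) are all my own assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of "restart and pick up where you left off".
# Not GNU ddrescue's algorithm or log format.

BS=4096   # block size in bytes (an arbitrary choice for this sketch)

# resumable_copy SRC DST LOG [MAX_BLOCKS]
# Copies SRC to DST one block at a time and writes the index of the next
# block to LOG after each block. If LOG already exists, copying resumes
# from the index stored in it. MAX_BLOCKS (optional) stops the run early
# with status 1, which simulates pressing CTRL+C.
resumable_copy() {
  src=$1
  dst=$2
  log=$3
  max=${4:-0}

  size=$(wc -c < "$src")
  total=$(( (size + BS - 1) / BS ))   # number of blocks, rounded up

  next=0
  [ -f "$log" ] && next=$(cat "$log")

  copied=0
  while [ "$next" -lt "$total" ]; do
    if [ "$max" -gt 0 ] && [ "$copied" -ge "$max" ]; then
      return 1   # simulated interruption
    fi
    # conv=notrunc keeps the blocks already written to DST
    dd if="$src" of="$dst" bs="$BS" skip="$next" seek="$next" \
       count=1 conv=notrunc 2>/dev/null
    next=$((next + 1))
    echo "$next" > "$log"
    copied=$((copied + 1))
  done
  return 0
}
```

Calling resumable_copy a second time with the same three arguments continues from the block recorded in the log instead of starting over. With the real tool none of this is needed: as far as I can tell, you just pass the log file as the optional third argument after the input and output files.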

Fried hard disk ruins weekend

Friday, August 19th, 2005

So, one of my employers ended up with a fried hard disk, for the second time in a row. The main reason is that the PC housing this HD sits in a corner with little to no airflow.

To recover the drive, I am taking a different approach from my last recovery effort, mainly by necessity. This disk is seriously damaged: it has lots of bad sectors, and its partitions are not readable by any NTFS driver, be it Microsoft’s or the open source one. That rules out simply using the wonderful R-Studio tool I used last time, since it won’t even see the drive properly within Windows and hangs all over the place.

What I needed to do was drop down a layer of abstraction: away from filesystems, and into blocks and sectors. Unfortunately, in the Windows world this is difficult, so I had to use my Linux laptop to make the jump.

I found a wonderful tool to help me out called dd_rescue, which is basically a dd with the added features of continuing on error, allowing one to specify a starting position in the in/out files, and the ability to run a copy in reverse. These features allow one to really work around bad sectors and even damaged disk hardware to get as much data as possible out.
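
To make those three features concrete, here is a rough sketch of the same ideas using nothing but plain dd in a loop. It is not dd_rescue's implementation, and the function names (copy_block, copy_forward, copy_reverse) and the 512-byte block size are assumptions of mine. The only dd operands used are the standard bs, skip, seek, count and conv=notrunc.

```shell
#!/bin/sh
# Hypothetical sketch: block-by-block copy that skips unreadable blocks,
# can start at any block, and can run backwards. Not dd_rescue itself.

BLOCK=512   # bytes per block (assumed sector size)

# copy_block SRC DST N
# Copy block N of SRC to the same offset in DST. If the read fails,
# report it and carry on instead of aborting the whole copy.
copy_block() {
  dd if="$1" of="$2" bs="$BLOCK" skip="$3" seek="$3" \
     count=1 conv=notrunc 2>/dev/null \
    || echo "block $3: read error, skipped" >&2
}

# copy_forward SRC DST START END
# Copy blocks START .. END-1 in ascending order.
copy_forward() {
  n=$3
  while [ "$n" -lt "$4" ]; do
    copy_block "$1" "$2" "$n"
    n=$((n + 1))
  done
}

# copy_reverse SRC DST START END
# Copy blocks END-1 down to START, approaching a bad area from behind.
copy_reverse() {
  n=$4
  while [ "$n" -gt "$3" ]; do
    n=$((n - 1))
    copy_block "$1" "$2" "$n"
  done
}
```

For example, copy_reverse src dst 12 24 followed by copy_forward src dst 0 12 fills in the same image from both ends. Because every block lands at its own offset in the output, the order in which blocks are rescued does not matter.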

Unfortunately, the use of this tool was hampered by my laptop’s relatively simple bus design. Apparently, if I put two devices on my USB bus (like the two HDs I was using for this process), the bus would slow to a crawl, and the copy would move along at an unbearable 100kB/sec. I tried using Firewire and USB together, but got only marginal improvements. What befuddles me is that in the end, the fastest combination I could come up with was reading from the Firewire enclosure with my laptop and writing to the Firewire enclosure of my desktop across the LAN using Samba. Very strange indeed. Now my performance is more like 6MB/sec, factoring in all the breaks dd_rescue takes when it encounters errors. I have 6GB of the more critical partition written, but it’ll probably take a couple of hours to have a big enough chunk that I can test R-Studio’s recovery of it.

The only reason I’m even writing about this is because I find it hilarious how many layers of abstraction I am breaking through to do a relatively low-level operation. Think about it:

  1. My broken IDE drive is converted to Firewire by a Firewire-IDE bridge.
  2. My Firewire PCMCIA adapter is allowing my notebook to take in that connection.
  3. The Linux kernel is allowing Firewire to be accessed via various ieee1394 ohci drivers.
  4. The Linux kernel is abstracting the Firewire disk as a SCSI disk, using emulation.
  5. The SCSI disk is being read by dd_rescue and written to a file, which exists in the path /mnt/smb/image/sdb5.
  6. That path seems local, but is actually a mount point. That mount point seems physical but is actually handled by a Samba driver.
  7. The writes by dd_rescue to that image file are being sent through the kernel’s TCP/IP stack, and flying through my switch, and being accepted by Windows XP’s network stack.
  8. Windows XP is writing that data to an NTFS drive, which is itself connected by a Firewire-IDE bridge (and therefore all the above steps’ equivalents for Windows apply).

I am surprised that, with that many layers, this copy is even working. I really should have just taken a machine apart and connected these drives directly by IDE, to save myself a few layers.

Microsoft’s anti-competitive behavior

Monday, August 8th, 2005

This /. article has responses from Microsoft Linux Lab manager Bill Hilf. I responded to this post from a Microsoft employee. My response follows.

N-way parallel mail retrieval with getmail and bash

Friday, August 5th, 2005

I wrote a pretty sweet script tonight. It parallelizes the getmail retrieval process, while still printing prefixes so I know which accounts download which messages. This means that instead of my mail fetching process taking sum(i1,…,in), where each i is the length of time one account's mail retrieval takes, my fetching process now takes max(i1,…,in), i.e. only as long as the slowest account.


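#!/bin/sh
# getmail command line; -Wignore silences Python warnings
GETMAIL='python2.3 -Wignore /usr/bin/getmail'
# Drop noise lines from getmail's output (anything matching these words)
unwanted() {
  grep -E -v '(Copyright|getmail|Simple)';
}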

echo "N-WAY GETMAIL RETRIEVER SCRIPT:"
$GETMAIL \
  --rcfile=/etc/getmail/account1 \
  2>&1 | sed -e \
  's/.*/account1................: &/g' \
  | unwanted &

$GETMAIL \
  --rcfile=/etc/getmail/account2 \
  2>&1 | sed -e \
  's/.*/account2................: &/g' \
  | unwanted &

...

$GETMAIL \
  --rcfile=/etc/getmail/accountN \
  2>&1 | sed -e \
  's/.*/accountN................: &/g' \
  | unwanted &

wait
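
If you want to see the sum-versus-max effect without touching a mail server, here is a tiny stand-in. Everything in it is made up for the demonstration: fetch just sleeps in place of a real getmail run, the account names and durations are arbitrary, and elapsed is a crude one-second-resolution timer.

```shell
#!/bin/sh
# Hypothetical demo: serial runs take the sum of the durations,
# background jobs plus "wait" take roughly the longest one.

# fetch NAME SECONDS: pretend to retrieve mail for NAME
fetch() {
  sleep "$2"
  echo "$1: done"
}

# One account after another: about 1 + 2 + 1 seconds
run_serial() {
  fetch account1 1
  fetch account2 2
  fetch account3 1
}

# All accounts at once: about max(1, 2, 1) seconds
run_parallel() {
  fetch account1 1 &
  fetch account2 2 &
  fetch account3 1 &
  wait   # block until every background job has finished
}

# elapsed CMD...: run CMD and print how many whole seconds it took
elapsed() {
  start=$(date +%s)
  "$@" > /dev/null
  end=$(date +%s)
  echo $((end - start))
}
```

Comparing elapsed run_serial with elapsed run_parallel should show roughly four seconds against two, give or take the one-second granularity of date +%s.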

PIDA: Python Integrated Development Application

Wednesday, August 3rd, 2005

PIDA 0.2.2 was released recently. This is truly a novel development in the Python/OSS world. What PIDA provides is a nice plugin system and the “makings” of an IDE. So, in a nice IDE you have a class browser, an integrated debugger, a profiler, maybe even a RAD-like GUI builder, an interpreter console, etc. The one piece that tends to be most controversial in every IDE is the text editor. This is one-part because UNIX people are really crazy about their text editors, but two-parts because text editors are very important programmer tools, and no one wants to learn a different text editor for every language one uses.

vim happens to be awesome for C programming, which is probably why a lot of UNIX hackers use it. But for Python, more advanced support would be nice. PIDA can run and connect to a vim server instance in order to allow you to have an “add-on IDE” for vim.

But even more interesting to me is the culebra plugin, which provides a code-completion-savvy GtkSourceView inheritor, which has the initial support for fancy Intellisense-like features.

I’ve already spoken to the developers of PIDA, and they said they would very much be interested in seeing Python Intellisense features brought to VIM. When I started thinking about different approaches to doing this, I realized that the whole OSS community could benefit from a general Python module that enhances the Python introspection features (and perhaps combines them with source code parsing) to make available nice productivity-enhancing features. I was thinking of calling this beast “Pyductivity.”

More on that later. For now, check out PIDA.

Exa: a new architecture for Xorg

Tuesday, August 2nd, 2005

This is exciting news. A Trolltech developer has modified KAA, the acceleration architecture used in Keith Packard’s experimental “kdrive” Xserver, to work with the traditional Xorg tree. He announced this new development in an e-mail that makes it clear it is extremely easy to get drivers to use Exa to gain Apple/Windows-like graphics performance.

I know that the unichrome project generally doesn’t bother itself with these very desktop-oriented features (its focus is more on MediaPCs, etc.), but I think this may be an excellent way for me to begin hacking the new modular Xorg tree I mentioned last time. If I added Exa support to the unichrome driver, would that mean transparency and full-on graphics acceleration for my X desktop, which is what I’ve long been waiting for? We’ll see.

Xorg goes modular, is now approachable

Tuesday, August 2nd, 2005

It seems that Xorg, as of the 7.0 release, has been split into monolithic and modular development trees. The modular model allows you to compile individual components related to Xorg separately from the whole X server, so that you don’t need to do a two-hour compilation just to work on this or that driver or this or that library.

This is good news for me. My biggest craving lately has been to put my C skills to use by diving into a big project that has effects on the Linux desktop, and Xorg is certainly the biggest in that sense. However, in the past I was always put off by the huge amount of grokking one needs to do just to understand the Xorg compilation process. After class is over, I’m gonna start diving in.

Mark Shuttleworth on Ubuntu and Debian

Monday, July 25th, 2005

I just watched a short talk on Ubuntu Linux given by Mark Shuttleworth. Ubuntu is the distro of Linux I have run for the last year or two. Quite an amazing talk, really. You can watch the whole thing here. He is quite a fascinating figure, but more fascinating is how clear his explanation of the Debian-Ubuntu relationship is.

If Ubuntu reaches the level that Mark hopes it will, it will truly be an amazing distribution. And I had never known about the other things Ubuntu works on, like Bazaar-NG, which allows distributed revision control (probably one of the coolest concepts in Open Source I’ve heard of in a while). Rosetta is another of those projects: a very cool web-based system for translating free software projects.

It’s an interesting time to be tracking desktop OSS indeed.