First Time Linux


The following is a more-or-less random collection of notes that I collected along the way, to remind myself how I did stuff.

File splitting

Tip: to split a file into transportable pieces (to fit on removable media or to email in manageable chunks), use the split command, eg: split -b 10m -d bigfilename chunk to give numbered chunks of 10 megabytes called chunk00, chunk01, chunk02 and so on. Then you can join them together again with cat chunk00 chunk01 chunk02 > bigfilename.
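To convince yourself that nothing gets lost on the way, here is a small round trip you can run in a scratch directory. It's only a sketch: the names bigfile, chunk and rejoined are made up for the demo, the sizes are tiny so it runs instantly, and the -d option assumes the GNU version of split.

```shell
# Demo: split a file into numbered chunks, rejoin them, and verify the result.
workdir=$(mktemp -d)          # scratch directory, so nothing real is touched
cd "$workdir" || exit 1

# Create a 25000-byte test file of random bytes (a stand-in for the big file).
head -c 25000 /dev/urandom > bigfile

# Split into 10 KiB pieces with numeric suffixes: chunk00, chunk01, chunk02.
split -b 10k -d bigfile chunk

# The shell expands chunk* in sorted order, so the pieces are joined correctly.
chunk_count=$(ls chunk* | wc -l)
cat chunk* > rejoined

# cmp -s prints nothing and returns 0 when the files are byte-for-byte identical.
if cmp -s bigfile rejoined; then
    split_result="identical"
else
    split_result="different"
fi
echo "$chunk_count chunks, rejoined file is $split_result"
```

Using cat chunk* saves typing out every chunk name, which matters once there are more than a handful of pieces.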

Audio extraction

Tip: to extract the audio from an avi file (eg just to take the sound from a movie made by a digital camera), you can use mplayer eg: mplayer -ao pcm -aofile outputfile.wav mymovie.avi to create a .wav file of the sound. Then you can use a tool like Audacity to select all or part of the file and export as Ogg Vorbis. If you want to encode as mp3, you'll need extra (patent-encumbered) software like lame.

Update: the command syntax has now changed slightly - use mplayer -ao pcm:file=outputfile.wav mymovie.avi instead.

Laptop touchpads

Tip: To disable 'tapping' on the touchpad, you have to manually edit the file /etc/X11/xorg.conf to include the line: Option "TouchpadOff" "2" in the InputDevice section for the touchpad - see documentation at /usr/share/doc/synaptics-0.14.1/README. Another recommended setting is Option "HorizScrollDelta" "0" which prevents Mozilla from navigating forwards and backwards with accidental brushes on the bottom edge of the touchpad. The option VertScrollDelta is a bit more useful, scrolling documents with vertical brushes on the right-hand edge of the touchpad, as this is a bit less likely to be accidental, and actually quite handy for web pages, console windows etc.


Tip: To go through a list of files replacing wordA with wordB, use sed:
for i in *.html; do sed -i -e "s/wordA/wordB/g" "$i"; done
This loops through all html files in the current directory executing a sed command (global regular expression replace, s/find/replace/g) on each file.

Update: There's an easier way to do it, which works as long as the file list isn't too long:
sed -i -e "s/wordA/wordB/g" *.html

Update2: If you want to do the same changes to all pages in the tree (including subdirectories), use the amazingly flexible find command:
find . -name "*.html" -exec sed -i -e "s/wordA/wordB/g" \{\} \;
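The end is a bit messy: it uses curly brackets {} to reference the file found by find, and the command is finished off with a semicolon which has to be escaped with \ so that the shell doesn't swallow it. The brackets are escaped here too, which is a harmless precaution (in bash they also work unescaped).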
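If you'd like to try the recursive replace somewhere safe first, this sketch builds a tiny throwaway tree and runs the same kind of find on it. The file names are invented for the demo, sed -i assumes GNU sed, and I've used the '{}' + ending instead of \; which hands many filenames to a single sed call.

```shell
# Demo: recursive search-and-replace in a throwaway directory tree.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
echo "wordA and more wordA" > "$workdir/index.html"
echo "wordA here too"       > "$workdir/sub/page.html"
echo "wordA untouched"      > "$workdir/notes.txt"   # not html, must stay as it is

# Quoting '{}' avoids the backslashes; the + passes all found files to one sed.
find "$workdir" -name "*.html" -exec sed -i -e "s/wordA/wordB/g" '{}' +

top=$(cat "$workdir/index.html")
sub=$(cat "$workdir/sub/page.html")
txt=$(cat "$workdir/notes.txt")
echo "$top | $sub | $txt"
rm -r "$workdir"
```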


Tip: To make apropos (or man -k) work, you need to build the references first, by running (as root) /usr/sbin/makewhatis .

Verifying file checksums

Tip: A handy way to check md5 checksums against what they're expected to be is to use the java md5 checker tool from the downloads section and run it against both the downloaded file and the downloaded md5 file, like this: java -cp md5.jar md5.Md5Check downloadedFile md5File. This is handy when your current operating system hasn't got a command like md5sum but it has got java.
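For comparison, on a system which does have md5sum the same check looks roughly like this. It's a sketch assuming the GNU coreutils md5sum; the file here is a fake "download" created just for the demo, and the names downloadedFile and md5File are simply the placeholders from the tip.

```shell
# Demo: verify a file against its md5 file, then show what a mismatch looks like.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "pretend this is a download" > downloadedFile

# Normally the md5 file comes from the download site; here we generate it.
md5sum downloadedFile > md5File

# -c re-computes the checksum of each listed file and compares it.
if md5sum -c md5File > /dev/null 2>&1; then
    intact_check="ok"
else
    intact_check="failed"
fi

# Damage the file and check again: now the exit status is non-zero.
echo "corruption" >> downloadedFile
if md5sum -c md5File > /dev/null 2>&1; then
    damaged_check="ok"
else
    damaged_check="failed"
fi
echo "intact: $intact_check, damaged: $damaged_check"
```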

Star menu

Tip: To change or reset the "Recently Used Applications" section of the KDE menu, go to "Configure Your Desktop" and look under LookNFeel - Panels - Menus. Here you can select whether the list shows most recent or most frequent, and the number of applications to show.

Removable storage

Tip: To change what happens when you plug in USB drives, or insert CDs etc, look in the star menu under System - Configuration - Hardware - Removable Storage. This dialog points to scripts which can be executed to launch applications, for example launching digikam when a camera is plugged in.

Modifying the bash prompt

Tip: To change the appearance of the console prompt, edit the file .bashrc in the home directory. For example, the line
PS1='\u \W $ '
removes the default host name and just displays the username and current directory before the $. You can also add colours to make the commands stand out, for example:
PS1='\[\033[1;33m\]\u \W $ \[\033[0m\]'
makes the prompt stand out in yellow type. For more info see Chapter 2 of the Bash How-To.

Ripping audio CDs using copy and paste

Tip: To convert your audio CDs into ogg or mp3 format, you can use a very handy feature of Konqueror and the KDE. Simply go to the "Services" tab of Konqueror (select Window -> Show Navigation Panel to reveal the tabs if they're not visible). Then go to the Audio CD Browser and you will see all the tracks on the CD in WAV format. If you've got internet access, these tracks should automatically be labelled with the artist and title, retrieved from the CDDB information on the internet (clever). But it gets better. There are also some directories shown, which aren't really on the CD but are "virtual" folders. Among them is an "Ogg Vorbis" directory containing an ogg file for each of the tracks, again named using CDDB. Simply copy and paste the file(s) you want, or drag and drop, and hey presto, on-the-fly reading, ripping and encoding into ogg format.

If you have the tool lame installed, and also liblame, both of which come from the plf libraries, there will also be an MP3 directory which contains mp3 files. Very simple, very clever. Apparently something to do with kioslaves and the audiocd:/ protocol, don't ask me.

To configure the encoding options, see "Configure your Desktop" from the star menu, and look under Sound -> Audio CDs - there are tabs for each of the installed encoders with custom settings.

Finding modified files

A quick way to find all modified files in any subdirectories is to use the find command, giving the number of days to look back. For example to find any files changed in the last 3 days, do:

find . -mtime -3

The . tells it to look in the current directory (and subdirectories), and the -mtime parameter tells it to look at the last modification time. The -3 means "up to 3 days ago", whereas just a 3 would only show those modified 3 days ago, not those modified since.
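To see the -3 in action without waiting three days, you can fake an old file with touch. A small sketch: the file names are made up, and the "10 days ago" form of touch -d assumes GNU touch.

```shell
# Demo: only the recently modified file is reported by find -mtime -3.
workdir=$(mktemp -d)
touch "$workdir/fresh.txt"                     # modified just now
touch -d "10 days ago" "$workdir/stale.txt"    # pretend it was modified 10 days ago

# -type f leaves out the directory itself, which was also just modified.
recent=$(find "$workdir" -type f -mtime -3)
echo "$recent"
```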

Deleting specific files

You can use the output of a find command to delete files meeting specific criteria. Maybe you want to delete items from the Trash which are more than a certain age, or maybe you want to clear out all files of a specific type from a directory tree. For example, to delete all the massive tif files (produced by hugin) from the Trash directory:

cd Desktop/Trash
find . -size +30M -name "*.tif" -exec rm \{\} \;

Again, this \{\} \; is a bit ugly but necessary to take the filename parameter from the find command and then complete the -exec expression. Basically, find searches for all files below the current directory which match both a size of greater than 30 MB and a filename ending in ".tif" - then it passes the found filenames to rm which deletes them.
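Because rm doesn't ask twice, it's worth running the find without the -exec part first, to preview what would be deleted. Here's a sketch of that two-step habit on some dummy files; the names and the smaller 30k limit are just for the demo.

```shell
# Demo: preview what find matches, then delete only those files.
workdir=$(mktemp -d)
head -c 40000 /dev/zero > "$workdir/big.tif"     # big tif: should be deleted
head -c 1000  /dev/zero > "$workdir/small.tif"   # small tif: should survive
head -c 40000 /dev/zero > "$workdir/big.jpg"     # big but not a tif: should survive

# Step 1: preview. With no -exec, find just prints the matching names.
find "$workdir" -size +30k -name "*.tif"

# Step 2: the same search again, this time passing each match to rm.
find "$workdir" -size +30k -name "*.tif" -exec rm '{}' \;

ls "$workdir"
```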

Uuencoding and uudecoding

When a binary file, like a picture or a sound file, is sent by email, it has to be converted into a text-only form so that it can be sent through the email systems. A long-standing way of encoding the binary data into an ASCII form is called "uuencoding" (these days email mostly uses the closely related base64 encoding, which the uuencode tool can also produce), and the email program should automatically do the encoding and decoding without you worrying about it. Occasionally, however, something goes a bit wrong and instead of a binary attachment you just get screenfuls of characters, like this:

Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="foto.jpg"


and so on. To understand how this works and reconstruct the original attachment, it's good to try to uuencode a file yourself and see what the file looks like. Choose a small, binary file (gif, jpg, png or something like that) and take a copy into a new directory. From there, uuencode it into a text file:

$ uuencode -m pic.jpg myphoto.jpg > pic.txt

This specifies base64 encoding (-m), to encode the file pic.jpg and specifies that in the text file it should give the name myphoto.jpg (different so that it doesn't overwrite our original when we decode). The output is then redirected (>) to a text file called pic.txt.

If we now look at the file created, we see a simple header, lots of lines of characters all 60 characters long, and then an ending line of four equals signs:

begin-base64 644 myphoto.jpg

Now we can try to decode this text file and create a new binary file, called myphoto.jpg, which should be the same as the original pic.jpg:

$ uudecode pic.txt
and we can check that it worked:
$ diff pic.jpg myphoto.jpg

If it doesn't say that the files differ, then it has worked. The files should be identical.

Using this as our guide, we can take the characters from the email, and save them as a text file with a suitable header line (like "begin-base64 644 attachment.jpg") and a suitable end line ("===="). Then simply run the uudecode command as before to recreate the binary file:

$ uudecode attachment.txt

And hey presto, you can now see the emailed picture!
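If uuencode isn't installed (I believe it usually comes from the sharutils package), the same round trip can be tried with the base64 tool instead. This is a swapped-in alternative rather than the method above: it assumes the base64 command from GNU coreutils, it writes no begin/end lines, and the file here is just random bytes standing in for a picture.

```shell
# Demo: base64-encode a binary file to text, decode it again, and compare.
workdir=$(mktemp -d)
head -c 5000 /dev/urandom > "$workdir/pic.jpg"   # stand-in for a real picture

# Encode to plain text (lines of base64 characters, no header or footer).
base64 "$workdir/pic.jpg" > "$workdir/pic.txt"

# Decode back to binary under a different name.
base64 -d "$workdir/pic.txt" > "$workdir/myphoto.jpg"

# cmp -s is silent and returns 0 when the two files are identical.
if cmp -s "$workdir/pic.jpg" "$workdir/myphoto.jpg"; then
    roundtrip="identical"
else
    roundtrip="different"
fi
echo "decoded copy is $roundtrip"
rm -r "$workdir"
```

With this tool, the characters saved from an email can be decoded directly, without adding a header or an ending line first.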

Grepping for text

A very useful search tool is grep, which searches text files for a given search term, and displays all the lines of the file in which it appears. Using the following:

$ grep "search term" *.html

looks in all the html files in the current directory for the phrase "search term", and shows you the context too - very useful. Further flags, however, make it much more useful: -i specifies a case-insensitive search, so it would also find occurrences of "Search Term" or "seARch teRM". Also, -R specifies a recursive search, to also search in all subdirectories and sub-subdirectories as well. And -w matches word boundaries, so it would not show files containing "search terms" or "search terminated". These flags can be combined, like this:

$ grep -iRw "search term" *

where we now specify a path of * rather than *.html to search through all directories.
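Here's a small sketch showing what the combined flags do and don't match. The files are invented for the demo, and I've added -l so that only the names of matching files are listed rather than the lines themselves.

```shell
# Demo: case-insensitive, recursive, whole-word grep on a throwaway tree.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
echo "An exact Search Term here" > "$workdir/a.html"       # matches thanks to -i
echo "search terms in plural"    > "$workdir/sub/b.html"   # rejected by -w
echo "nothing relevant"          > "$workdir/c.html"
cd "$workdir" || exit 1

# -i ignore case, -R recurse into subdirectories, -w whole words, -l names only.
matches=$(grep -iRlw "search term" . | sort)
echo "$matches"
```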

Executing commands on each file listed in a file

Sometimes you want to execute the same command on a set of files, and the filenames are listed in a file. Perhaps this file was generated by redirecting the output of a grep -l command, so the output file contains a list of all files which contain (or don't contain, with -L) the specified expression. Now you've got your list of suspects, how do you execute a command on each of them?

The answer is to pipe the output of cat into a bash while loop, using the read command to read each line of the file in turn. For example, to copy a dummy file over each of the found files:

$ cat filelist.txt | while read -r line; do cp -vf dummyfile.txt "$line"; done
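Filenames with spaces are the usual way for this kind of loop to go wrong, so here's a sketch which includes one. The names are made up for the demo, and instead of cat it feeds the list in with a redirect at the end of the loop, which does the same job with one process fewer.

```shell
# Demo: run a command on every file named in a list, one filename per line.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "dummy"   > dummyfile.txt
echo "old one" > "first file.txt"     # note the space in the name
echo "old two" > second.txt
printf '%s\n' "first file.txt" "second.txt" > filelist.txt

# IFS= and -r keep each line exactly as written; the quotes keep it in one piece.
while IFS= read -r line; do
    cp -f dummyfile.txt "$line"
done < filelist.txt

first=$(cat "first file.txt")
second=$(cat second.txt)
echo "$first / $second"
```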

Command line calculator

If you just want to add a couple of numbers together, and you're already at the command line, it would be a shame to fire up kcalc just for a single sum. Instead, you can use the expr command, although beware that it is very picky about spaces between the terms and about escaping things like *. Some examples:

$ expr 5 + 18
$ expr 15 \* 7
$ expr 13 / 5

This is unfortunately limited to integer arithmetic, and doing anything as complex as a power calculation (eg 2 to the power 7) requires a different format, using the bash builtin let:

$ let "x=2**7";echo $x

Other options include the extra tool bc or if you have perl installed you can use that:

$ perl -e "print 2/7"
0.285714285714286
$ perl -e "print sqrt(2)"
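Another way to do the integer sums is bash's own arithmetic expansion, $(( )), which doesn't mind about spaces and needs no escaping of *. A minimal sketch; using awk for the non-integer division is my own assumption here, not one of the tools mentioned above.

```shell
# Demo: integer arithmetic inside bash, plus one floating-point sum via awk.
sum=$((5 + 18))
product=$((15 * 7))      # no backslash needed before the *
quotient=$((13 / 5))     # integer division, the remainder is thrown away
power=$((2 ** 7))

# bash has no floating point, so hand the division to awk (C locale for the dot).
fraction=$(LC_ALL=C awk 'BEGIN { printf "%.4f", 2 / 7 }')

echo "$sum $product $quotient $power $fraction"
```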

Comparing file trees from Dos/Unix

It's often useful to be able to compare a tree of files (for example, two versions of a website or a code project) and kdiff3 is an excellent tool for this. However it does seem to stumble when one tree is from a Dos-based system (with dos-based line endings) and the other is from a unix system (with different line endings). If you cat two such files, they look identical, but if you ll them, they have different file sizes. If you diff them, they look completely different unless you use the --strip-trailing-cr command option. But kdiff3 doesn't seem to have such an option, so the whole tree looks inconsistent and you have to search through each file to see the message "File A and B have equal text, but are not binary equal". That's awkward.

In the absence of an option in the settings to ignore the differing line endings, I needed a way to convert all the line endings in one set of files. Using khexedit I could see that my first tree used a "0d 0a" pair (carriage return plus line feed) to end each line, whereas the second tree just used "0a". So I used sed to remove the "0d" characters from the first tree. Correction, I used sed on a copy of the first tree (of course!!).

In this case I was comparing a set of java files, so I used this:

$ find . -name "*.java" -exec sed -i 's/\r$//' \{\} \;

This removed all the "\r" characters at the end of all lines of all java files (see the 'sed' tips above for more on how this works). Then when I did a "Directory -> Rescan" in kdiff3, it managed to match all the identical files and show them green. That makes the really different files stand out much more clearly.
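The effect is easy to check on a pair of two-line test files. This is just a sketch: the file names are invented, and both sed -i and the \r escape assume GNU sed.

```shell
# Demo: strip the carriage returns from a Dos-style file and compare with a unix one.
workdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$workdir/dos.java"    # 0d 0a line endings
printf 'line one\nline two\n'     > "$workdir/unix.java"   # 0a line endings only

size_before=$(wc -c < "$workdir/dos.java")    # two bytes more than the unix file

# Remove a carriage return at the end of each line, editing the file in place.
sed -i 's/\r$//' "$workdir/dos.java"

if cmp -s "$workdir/dos.java" "$workdir/unix.java"; then
    endings_result="identical"
else
    endings_result="different"
fi
echo "was $size_before bytes, now $endings_result to the unix file"
rm -r "$workdir"
```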

Resizing images

To create thumbnails of images for the web, or to create smaller versions of photos suitable for emailing, we can use imagemagick:

$ find -iname "*.jpg" -exec convert -resize 800x800 \{\} \{\} \;

Note that this converts the images in place, so only run this command in a directory containing copies of the photos. The dimensions 800x800 are maximum sizes, so a portrait photo will get converted to eg 600x800 and a landscape photo to eg 800x600.

Resizing images (2)

Another task for image resizing (apart from shrinking photos to make thumbnails) is to make pictures bigger to see more detail. Except sometimes you don't want the result to be interpolated, you just want to see the pixels bigger. Strangely, the following doesn't work:

$ convert -resize 800% -interpolate nearestneighbor tiny.ppm notsotiny.png

But instead, you need to use a different operator (don't ask me why):

$ convert -scale 800% tiny.ppm notsotiny.png

Sharing files

We've already discussed sharing files between computers in the networking page. But what about the apparently simpler task of sharing files between users on the same machine? Should be easy, right? So, let's assume the files are somewhere under the directory /home/alice/, so user Bob can't see them. That's good. We don't want to share the whole of Alice's home directory, because then Bob would be able to see everything. Alice can't copy or move them to Bob's directory, because she hasn't got permissions to do that either. There's also the complication that the files are big and won't fit nicely on the root partition. So the superuser creates a new directory, called /home/share/. He does a chmod 777 /home/share/ so that everybody has read-write access to the directory. Great! So Alice can move her files there, Bob can read them, and everyone's happy. Except that after lunch, neither Alice nor Bob can even read this directory! What happened over lunch?

The answer is msec, which is a (Mandriva-specific?) security tool which periodically checks the system for funny goings-on. And one of the things this daemon checks for by default, is the permissions on directories under /home/. So after an hour goes by, the daemon quietly resets the permissions on /home/share so that neither Alice nor Bob can access it. And how to stop it? To do this you need to go into the Mandriva control centre, and under "Security" there's an item called "Tune permissions on system". In here you can't edit the rule which it follows for /home/*, but you can add a new rule especially for the directory /home/share, and specify the permissions you want to set. Now the periodic checks still run but this one directory has an exception which allows it to remain world-readable.

Creating booklet pdfs

It's often useful to print out a document in booklet format, which means fitting two pages onto each A4 sheet and rearranging the order so you can fold the sheets into a booklet. For example, a 4-page document will get ordered so that pages 1 and 4 are on one side of the paper, and pages 2 and 3 are on the other side. Nice. A lot of printer drivers do this for you too, but if you don't want to rely on that, you can generate a pdf in advance which you know is going to be how you want it.

Firstly, if you're using a non-KDE application like OpenOffice, you need to export it to pdf first. Now open the resulting pdf file in Kpdf. From here, you should be able to print it, but in the print dialog you can specify "Print to File (PDF)" as the printer. In the "Properties" of this Printer you can specify the number of pages per sheet, but we won't use this option. Instead, go to the "Filters" pane of the Properties dialog, click the funnel to add a filter, and select "Pamphlet printing" from the dropdown. You'll need to install the psutils package to enable this option. Now print the document, specify the filename, and you should have the pages nicely shrunk and rearranged, ready for printing.

If you don't have the psutils package installed, the options for multipage printing will be greyed out when you print from kpdf, so you won't be able to select 2 pages per sheet. Also when you select the "pamphlet printing" or "multipage printing" filters from the filter options, it will say something like "unavailable: requirements not satisfied". Unfortunately it doesn't say which requirements aren't satisfied - it turns out it actually needs a tool called psnup. And that is contained in the package psutils, so installing that package will magically make the multipage options appear.

Combining and editing pdfs

Another pdf-related note: you might want to combine several pdfs into one document - this makes printing easier and can save paper by running the documents after each other instead of as separate print jobs. Or maybe you'd like to delete one page of a pdf, or remove the password protection. The tool with all these tricks and many others is called pdftk and is a powerful command-line tool that's easy to use. Install pdftk and read the help using pdftk --help to get some examples.

Removing passwords from pdfs

If a pdf is protected with a user password, you have to enter the password each time you open the pdf. This can get annoying if there are several pdfs and you have to open them repeatedly. So it would be nice to be able to remove the password protection to make them more convenient to use. One way to do this is using pdftk, as described above (assuming you know the password already), using pdftk input.pdf input_pw password output output.pdf but this only works for some pdfs. In particular it can't deal with pdfs protected by AES encryption. So we need another method.

There is another java-based tool called Multivalent, which can deal with AES encryption, and this has a decrypt function which you can run from the command line. This brings another complication though - the password which you need to read a pdf isn't necessarily the same as the password you need to modify a pdf, and for Multivalent's decrypt tool to work, you need to know not just the user password but the OWNER password. So if you don't know the owner password, you can't remove the user password protection from the pdf.

Fortunately there is another way to remove the password protection, which in hindsight is rather obvious. Simply open the password-protected pdf using kpdf, which prompts for the user password to read it. Once it's open, you can then print the pdf, but select to print to a pdf file instead of to a printer. This new pdf file which you create has no password protection on it, so you can then use tools like pdftk to merge the pdf files or otherwise manipulate them for easier reading / printing.

Stopping Firefox / Iceweasel from calling Google

In Firefox or Iceweasel, the default behaviour when a URL is mistyped and not valid (like htp://) is to call Google and give it what you typed in. Now, that might not always be what you want. Maybe you just want to be told that it's an invalid url and you can correct it yourself. This is probably most likely to happen if you've set up keywords and then typed in the keyword wrongly in the url bar. Google doesn't need to know that. So how to fix it?

The setting to control this behaviour is hidden in the configuration page, so go to the url about:config and search for "keyword" - you'll see two entries. Firstly, keyword.enabled is a boolean flag, which is true if the feature is enabled, and false if it's disabled. Then, keyword.URL gives the URL to use if the flag is true (some url by default).

So to disable the feature, double-click the keyword.enabled setting to change it to false, and that's it! Now, when you mistype your keyword url, you'll just get a simple message "The URL is not valid" if it can't be parsed.

Exporting emails as txt from Thunderbird / Icedove

Surprisingly, there is no feature in Thunderbird to let you save your emails as a text file. Saving as text is a useful way to keep a simple backup of your emails which you can easily grep (along with other files) and can easily back up. Plus it saves a lot of space over Thunderbird's mbox format because you don't need to save all the attachments and you don't need all the verbose header information. Ok, so how can you save as text? Using a suitable plugin.

For Thunderbird 2 there was a plugin called "SmartSave", but this has apparently been discontinued and no longer works with the Thunderbird 3 series. Step in another plugin called "ImportExportTools" from Kaosmos, to fill the gap. Installation is simple as a regular user, just go to your Add-ons in Thunderbird and select the .xpi file. Then restart and you get a neat set of options in the right-click menu of the email folders. For what I wanted, it was simply "Export all messages in the folder" and then "Plain text format (one file)". Select the directory and done. It's a slight shame there's no customisable separator between the emails, and it would be nice if the little red "stop export" button disappeared when the export finishes, but no complaints, it does the trick.

Exporting emails as txt from Gmail

Gmail doesn't provide an option (that I can find) to export your emails in any format, but you can enable POP access and configure Thunderbird to access Gmail. Then you can use the trick above to export from Thunderbird to text file using a plugin.

The snag is that for some unknown reason, certain emails or their attachments cause Gmail's POP access to fall over, halting the download. The only (awkward) fix seems to be to work out which email is causing the problem (itself not an easy task), delete that email, and try accessing the emails from Thunderbird again until it works. Then you can restore the deleted emails in Gmail back to your Inbox so that they're not actually deleted.

So how do you find the emails which bother Gmail? The error message (something like "RETR command did not succeed", "Unable to retrieve") includes a message id like "rfc822msgid:<longalphanumericstring>" - and this longalphanumericstring identifies the email. Except this string isn't shown in the email so it's not obvious which one it is. The trick is to enter the whole string, including the rfc822 and the angled brackets, into the search box of Gmail and then the culprit will be shown. Select, delete and retry from Thunderbird. Wash, rinse, repeat, until you've got all your emails. Yes, it's a royal pain, and really seems to be a gmail problem. Why it can't just ignore the attachment or ignore that email, I've no idea.

And what next with the deleted emails? Well, if they're important, you can copy-and-paste the contents from the emails into your text file, but again that's a bit of a pain to get them in the right place and remove the additional cruft which appears in the browser window. Or you can just forget them. But if you don't want to lose them from Gmail then don't forget to undelete them!