New article – Python’s optparse for human beings
In this article I cover Python’s optparse module in depth. It presents the most useful recipes and serves as a handbook for parsing command line options in a Python program.
This article extends optparse‘s documentation. optparse is a Python module that allows your program to easily parse the command line options it receives. In addition, it takes care of some very common tasks, such as handling the -h command line option.
optparse is one of those modules that are an absolute must-have for almost every project. However, because we start new projects so seldom, it is difficult to remember all the recipes we come up with every time we add a new command line option. This article is an attempt to bring such recipes together.
optparse‘s official documentation lacks some very important information. In contrast to the official documentation, this document tries to give you a more hands-on kind of reference. This article is a cookbook of the things you do with optparse most often.
Every time we use optparse, we use it to do a number of things. Obviously we want our program to support command line options. What varies is the type of options we want to support. Some options don’t take additional arguments (boolean options). Others are mandatory. Some require one or more arguments. Finally, we may want options to depend on each other, i.e. we may want one option to require the presence of another.
These are the most common things we would like to implement, but there are more. Let’s build a list of the functionalities we may want to implement:
- boolean options
- mandatory options
- options with one or more arguments
- argument types other than string
- dependencies between options
- grouping options on the help screen
- customizing the help and version screens
However, before we dig into the implementation details of each of these bullets, let’s go over the basics of using optparse.
No matter what you do with optparse, it all starts with importing optparse and instantiating the OptionParser class.
import optparse

parser = optparse.OptionParser()
To add an option, we use OptionParser‘s method called add_option(). It accepts a large number of parameters. For now, we will see only the most basic ones; we will meet the more advanced parameters later in this article. Note that we should call add_option() for every option that we would like our program to support.
Obviously the most important parameter tells what command line option we would like to support. Let’s say that for the sake of this article we would like our little Python script to support the -n command line option. Also, we would like -n to have a longer sibling --new. --new and -n would have the same meaning, but one is shorter and the other longer and more verbose.
This is how we add these options.
parser.add_option('-n', '--new')
Note that by all means this is not enough to parse even the simplest command line options. This is only the basics.
One more thing that is very common when using optparse is to give a short help string that tells the user what a particular command line option does. optparse will print this string if the user runs your program with -h.
To specify such help string pass parameter named help to add_option(). Like this:
parser.add_option('-n', '--new', help='creates a new object')
Once we’ve added all the options we want our program to support, we should tell optparse that it is time to do actual parsing. This is how we do it.
(opts, args) = parser.parse_args()
The parse_args() method returns a tuple of objects. The first object, opts, contains the values of all the options we’ve received via the command line. We will learn how to use it later in the article.
args is a list of words containing everything that’s left after the options the parser recognizes. This is useful if you want your program to support an indefinite number of arguments, as in the cp or mv Unix commands, where you specify options first and then a long list of files to copy/move, followed by a destination directory.
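As a quick sketch of this split (the argument list here is hypothetical and passed explicitly instead of coming from the real command line — parse_args() accepts an optional list):

```python
import optparse

parser = optparse.OptionParser()
parser.add_option('-n', '--new', help='creates a new object')

# Hypothetical command line: an option first, then free arguments,
# the way cp or mv are usually invoked.
(opts, args) = parser.parse_args(['-n', 'thing', 'file1', 'file2'])

print(opts.new)   # 'thing' -- the value that followed -n
print(args)       # ['file1', 'file2'] -- everything left over
```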
This is how our script looks so far.
#!/usr/bin/python

import optparse

parser = optparse.OptionParser()
parser.add_option('-n', '--new', help='creates a new object')
(opts, args) = parser.parse_args()
Running it with the -h command line option produces the following result.
alex ~/works/args -> ./args.py -h
Usage: args.py [options]

Options:
  -h, --help         show this help message and exit
  -n NEW, --new=NEW  creates a new object
alex ~/works/args ->
args.py is the name I gave the script. Note this nice little help screen, and note that we didn’t do a thing to make it appear. This is where the power of optparse begins.
What we’ve seen until now are the common things you do to operate optparse. No matter what options you would like your program to support and what the relationships between them are, you have to instantiate OptionParser and call the parse_args() method.
However, the few methods we’ve seen so far are useless by themselves. We use them to do more specific things, which we will study in this section of the article.
I call such options boolean options because eventually we want some boolean variable to indicate whether optparse saw the option or not.
For the sake of demonstration, let’s say we want our script to support a -b option. When this option is specified, we want some variable to be True. This is how we do it.
parser.add_option('-b', help='boolean option', dest='bool', \
                  default=False, action='store_true')
(opts, args) = parser.parse_args()
Note the three new named parameters we’re passing to add_option(). dest specifies the name of the variable that will hold the True or False value once optparse runs into -b. default specifies the default value of the variable, in case optparse doesn’t find -b. Finally, action tells optparse what to do when it runs into -b. The store_true action tells optparse to place True into bool once it detects -b.
An additional action that we may want to use is called store_false. It tells optparse to set the value of bool to False. Note that when we use store_false, we’d better change the default value to something other than False; otherwise you won’t be able to tell whether -b was there or not.
Once parse_args() finishes, you can access the variable via opts.bool. You can drop the default value of the variable; in this case, if you don’t specify the -b command line option, the value of opts.bool will be None.
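For instance, a store_false option with a True default could look like this (the -q/verbose naming is made up for the illustration):

```python
import optparse

parser = optparse.OptionParser()
# With store_false the default must be something other than False,
# otherwise you cannot tell whether -q appeared or not.
parser.add_option('-q', help='be quiet', dest='verbose',
                  default=True, action='store_false')

# Hypothetical command line containing -q
(opts, args) = parser.parse_args(['-q'])
print(opts.verbose)   # False, because -q was given
```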
optparse itself doesn’t support mandatory options. As optparse‘s documentation states, mandatory options are considered bad practice and should be avoided. However, I think that in the end it is your choice to make, and modules such as optparse should give you the tools to do what you want to do. That said, this is what I usually do to have a mandatory option.
For a single option, just don’t set a default value for it and check whether opts.<variable name> is None after parse_args(). Like this:
parser.add_option('-m', help='mandatory option', dest='man', \
                  action='store_true')
(opts, args) = parser.parse_args()

if opts.man is None:
    print "A mandatory option is missing\n"
    parser.print_help()
    exit(-1)
Obviously, when a mandatory option is missing, we want to do something about it. This can be anything, of course, but most likely you want to report the mistake and print a help message. This is exactly what happens in the if block above.
But what if you have multiple mandatory options? Well, we can do exactly the same, but for several options. Let’s have a look.
parser.add_option('-m', help='mandatory option', dest='man', \
                  action='store_true')
parser.add_option('-p', help='mandatory option', dest='pan', \
                  action='store_true')
(opts, args) = parser.parse_args()

# Making sure all mandatory options appeared.
mandatories = ['man', 'pan']
for m in mandatories:
    if not opts.__dict__[m]:
        print "mandatory option is missing\n"
        parser.print_help()
        exit(-1)
Here we have two mandatory options: -m and -p. After parsing, I build a list of all mandatory options (represented by their destination names), run through the list and check whether any of them is missing. If so, the script prints an error message, prints the help message and exits.
With optparse we can parse options with an argument and even several arguments. This is how we do it.
parser.add_option('-s', help='arguments', dest='opt_args', \
                  action='store')
This example is somewhat different from what we’ve seen before. Here we use an action named store. This action tells optparse to store the option’s argument in the specified destination member of opts.
How about having an option with two or three arguments? It is doable as well.
parser.add_option('-M', help='multiple arguments', dest='multi', \
                  action='store', nargs=2)
As you can see, you can tell optparse how many arguments you want an option to accept. To do this, we pass the nargs parameter to add_option(). Its default value is 1; this is why we could omit it when we registered an option with a single argument. However, if we need two or more arguments, nargs is a must.
In this case parse_args() will place all the arguments in a tuple. I.e. after we run parse_args(), opts.multi will be a tuple containing all the arguments that the user has passed to our program.
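A quick sketch of what the tuple looks like (the argument list is made up for the demonstration):

```python
import optparse

parser = optparse.OptionParser()
parser.add_option('-M', help='multiple arguments', dest='multi',
                  action='store', nargs=2)

# Hypothetical invocation with two arguments after -M
(opts, args) = parser.parse_args(['-M', 'first', 'second'])

print(opts.multi)     # ('first', 'second')
(a, b) = opts.multi   # the tuple can be unpacked as usual
```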
By default, all members of the opts object are strings. I.e. when you specify a destination for your argument, its type is string by default. However, it doesn’t have to be this way. You can change the type of the destination variable by passing add_option() a parameter named type. Although you can always convert a string into anything you want yourself, you can ask optparse to do it for you. This is how.
parser.add_option('-s', help='arguments', dest='opt_args', \
                  action='store', type='int', default=10)
parser.add_option('-M', help='multiple arguments', dest='multi', \
                  action='store', nargs=2, type='string')
Here we’ve specified the int type for -s and the string type for -M. Note the default value of -s. It should be of the same type as the option itself.
Also note -M. Although optparse turns its two arguments into a tuple, it does not support different types for different arguments. So, when an option takes multiple arguments, they all share the same type.
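To see the conversion at work, here is a small sketch (the values are made up); note that the arguments arrive already converted to int:

```python
import optparse

parser = optparse.OptionParser()
parser.add_option('-s', help='arguments', dest='opt_args',
                  action='store', type='int', default=10)
parser.add_option('-M', help='multiple arguments', dest='multi',
                  action='store', nargs=2, type='int')

(opts, args) = parser.parse_args(['-s', '42', '-M', '1', '2'])

print(opts.opt_args)   # 42 -- an int, no manual conversion needed
print(opts.multi)      # (1, 2) -- both arguments share the int type
```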
As with mandatory options, optparse does not give you much help setting up dependencies between options. If you want some relationship between options, you have to implement it manually after parse_args() is over.
Luckily this is not a very difficult thing to do. Remember how we implemented mandatory options? We can do the same here.
parser.add_option('-A', help='option A', dest='a', action='store_true')
parser.add_option('-B', help='option B', dest='b', action='store_true')
(opts, args) = parser.parse_args()

if opts.b and not opts.a:
    print "Option B requires option A\n"
    parser.print_help()
    exit(-1)
Here, option B requires option A. If B is there but A is missing, the script produces an error message, prints the help screen and exits.
optparse has a neat feature that allows you to group options. You can create as many groups as you want. Each group has a short description that appears on the help screen. You can also use option groups to group options inside your program, making the code easier to understand and more readable. This is how you do it.
First, we create an option parser as we usually do.
parser = optparse.OptionParser()
Next we create a new options group.
group1 = optparse.OptionGroup(parser, 'Options group 1')
group2 = optparse.OptionGroup(parser, 'Options group 2')
Note that to create an option group, we have to specify the parser we will use and a short description of the options in the group. The description will later appear on the help screen.
Now we should add the actual options. This time however, we add them to groups instead of adding them to parser.
group1.add_option('-t', help='group 1 option', dest='t', \
                  action='store')
group2.add_option('-d', help='group 2 option', dest='d', \
                  action='store')
The add_option() method of the OptionGroup class is exactly the same as the add_option() method of the OptionParser class.
Finally, we add the groups to the parser and call parse_args().
parser.add_option_group(group1)
parser.add_option_group(group2)
(opts, args) = parser.parse_args()
Now, let’s have a look at the help screen.
alex ~/works/args -> ./args.py -h
Usage: args.py [options]

Options:
  -h, --help  show this help message and exit

  Options group 1:
    -t T      group 1 option

  Options group 2:
    -d D      group 2 option
alex ~/works/args ->
See how -t and -d stand out, each in its own group.
You can specify your own usage string, which is the first line of the help screen. You do this by passing the usage parameter to the OptionParser constructor.
parser = optparse.OptionParser(usage='Usage: %prog <options>')
If we run this code, this is how our help screen will look.
alex ~/works/args --> ./args.py -h
Usage: args.py <options>
.
.
.
As you can see, you can use the %prog mnemonic inside the usage string. It will be substituted with the name of the program.
Another thing you can do is specify the default value of an option in its help string. To do that, use the %default mnemonic inside the option’s help string. Like this:
parser.add_option('-w', default='hello', dest='t', \
                  action='store', help='this one has a default value [%default]')
This makes the option’s line on the help screen look like this:
-w T this one has a default value [hello]
parser.add_option('-s', help='single argument', dest='single', \
                  action='store')
parser.add_option('-M', help='multiple arguments', dest='multi', \
                  action='store', nargs=2)
Let’s have a look at how the help screen looks for the two options above.
-s SINGLE   single argument
-M MULTI    multiple arguments
Not a pretty sight. See how optparse described the parameters that each option receives. In fact, optparse takes this description from the name of the destination variable: to generate it, it converts the destination name to upper case.
This may somewhat work for an option that receives one argument, but it certainly will not work for an option that receives multiple arguments, such as -M. Luckily, there’s a solution to this problem. add_option() receives a parameter named metavar. It tells optparse how to describe the option’s argument on the help screen. So, instead of calling add_option() the way we did for -s and -M, we should call it this way:
parser.add_option('-s', help='single argument', dest='single', \
                  action='store', metavar='<ARG>')
parser.add_option('-M', help='multiple arguments', dest='multi', \
                  action='store', metavar='<ARG1> <ARG2>', nargs=2)
This makes the help message for these two options look like this:
-s <ARG>           single argument
-M <ARG1> <ARG2>   multiple arguments
Now this is much better.
You can add a description of what your program does. It will appear on the help screen, between the usage line and the options description. To add it, pass a description argument to the OptionParser constructor when creating the OptionParser object. Like this:
desc = """This is a description of %prog. optparse uses Python's
textwrap module to format the text, so don't bother adding new line
characters, as optparse will prettify your description in its own
way."""

parser = optparse.OptionParser(description=desc)
parser.add_option('-s', help='single argument', dest='single', \
                  action='store')
(opts, args) = parser.parse_args()
Running this little program with -h produces the following help screen.
alex ~/works/args --> ./args.py -h
Usage: args.py [options]

This is a description of args.py. optparse uses Python's textwrap
module to format the text, so don't bother adding new line characters,
as optparse will prettify your description in its own way.

Options:
  -h, --help  show this help message and exit
  -s SINGLE   single argument
alex ~/works/args -->
Note how optparse reformatted the description string. It uses Python’s textwrap module to format your description, producing nice lines of text at most 80 characters long.
Also note that, as with the usage parameter, you can use the %prog mnemonic here. It will be substituted with the name of your program.
The epilog appears after the options description. To specify an epilog, pass the epilog parameter to OptionParser‘s constructor. Note that, as with the description string, optparse will prettify the text with textwrap.
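A minimal sketch of an epilog (the text itself is just a placeholder):

```python
import optparse

epi = """This text is the epilog. It appears after the options
description, and optparse reflows it with textwrap, just like the
description string."""

parser = optparse.OptionParser(epilog=epi)
parser.add_option('-s', help='single argument', dest='single',
                  action='store')

# format_help() returns the same text that -h would print
print(parser.format_help())
```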
Like the help screen, optparse can also generate a version string, responding to the --version option. But unlike the help screen, where optparse did most of the job, here optparse merely prints what you’ve specified as the version string.
To specify the version string, pass the version parameter to OptionParser‘s constructor. You can use the %prog mnemonic inside the version string. Here’s an example.
parser = optparse.OptionParser(version='%prog version 1.0')
(opts, args) = parser.parse_args()
Running this script with the --version option produces the following result.
alex ~/works/args --> ./args.py --version
args.py version 1.0
alex ~/works/args -->
optparse produces the version string only if you run the program with the --version option. If you want your program to print its version with -v, you will have to add the -v option manually and then call OptionParser‘s print_version() method to produce the version string.
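One way to do that, sketched below, is to register -v as an ordinary boolean option and call print_version() yourself (in a real script you would exit right after printing; the command line here is passed explicitly for the demonstration):

```python
import optparse

parser = optparse.OptionParser(version='%prog version 1.0')
# --version is added automatically; the short -v we add ourselves.
parser.add_option('-v', help='print version and exit',
                  dest='show_version', action='store_true',
                  default=False)

# Hypothetical command line containing -v
(opts, args) = parser.parse_args(['-v'])

if opts.show_version:
    parser.print_version()   # prints the expanded version string
```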
There are a bunch of other things you can do with optparse. You can have your own actions, your own formatter (an object that formats the help screen), and callbacks that get called when optparse runs into a certain option. However, I think I’ve covered 99.9% of what you may need. In case I missed something, send me an email to alex@alexonlinux.com.
This article introduces signals in Linux to the reader. It explains the nature of signals, how to use them, and gives a few small usage examples.
Perhaps every engineer developing for Linux encounters this problem: what is the right way to terminate a program? What are the ways to receive notifications from the operating system about events that occur?
Traditional Unix systems have the answers ready. The answer to these questions is signals.
This article addresses these questions. Here, I’ll try to explain what signals are and their nature. We’ll talk about the right ways to handle signals, what signals to handle, and the pitfalls of signal handling in Linux in particular.
A signal is a notification, a message sent by either the operating system or some application to your program (or one of its threads).
Each signal is identified by a number from 1 to 31. Signals don’t carry any arguments and their names are mostly self-explanatory. For instance, SIGKILL, or signal number 9, tells the program that someone is trying to kill it.
In addition to their informative nature, signals also interrupt your program. I.e. to handle a signal, one of the threads in your program stops its execution and temporarily switches to the signal handler. Note that as of version 2.6 of the Linux kernel, most signals interrupt only one thread and not the entire application, as they once did. Moreover, a signal handler itself can be interrupted by some other signal.
Each signal can be in one of three states: pending (it has been sent but not yet handled), blocked (its delivery is postponed until it is unblocked), or delivered (it has reached its handler).
When manipulating signals and managing signal configuration, it is often easier to manage a so-called signal mask. It is a bit mask where each bit corresponds to a signal. There are 32 (actually 31, as 0 doesn’t count) different signals, so we can use a single 32-bit integer (unsigned int) to keep information about all of them. This is exactly what the operating system does. Moreover, signal masks are used as arguments in various system calls, so we will have to work with them.
The C library assigns default signal handlers. This means that even if you leave signals untouched, your program will process signals and respond to them according to the default behavior. I will describe the default signal behavior a little later in this article.
Signals, as their name implies, are used to signal something. There are several types of signals, each indicating something of its own. For instance, SIGINT tells your program that someone is trying to interrupt it with CTRL-C.
The purpose of each signal is a matter of semantics. I.e. you may want to decide what action is associated with each signal. You may decide that some signal causes your program to print something or draw something on the screen. It is up to you, most of the time. However, there is a common convention of what each signal should do. According to this convention, SIGINT is expected to cause your program to terminate itself. This is the default response to SIGINT, and it is in your interest to keep it this way. It is a question of usability: no one wants a program that cannot be interrupted.
Another way of using signals is to indicate that something bad has happened. For instance, when your program causes a segmentation fault, the operating system sends the SIGSEGV signal to your application.
Signals have several different usages. For instance, debuggers rely on signals to receive events about programs being debugged (read more about this in my article How Debugger Works). Signals are also one of the so-called IPC (Inter Process Communication) mechanisms. IPC is used, as the name implies, to allow processes to communicate with one another.
Another common use is when the user wishes our program to reinitialize itself, but not terminate. In this case, the user can send our program a signal from the terminal, using a program called kill. You may already be familiar with this program; it is used to kill processes. The truth is that it merely sends a signal. By default it sends signal 15, SIGTERM, but it can send just about any signal.
Let’s see the most common signals and their use.
In general, I think it is good advice to avoid changing the signal handlers for these signals. The default signal handler for them generates a core file. Later, you can use the core file to analyze the problem and perhaps find a solution. Overriding the signal handler for one of the exception signals will cause your program to hide the exception that caused the signal, and no core file will be generated. This is something you don’t want to do.
In case you still want to handle exception signals, read my How to handle SIGSEGV, but also generate a core dump article.
These two signals are special. You cannot change how your program handles these two.
SIGKILL, unlike SIGTERM, indicates abnormal termination of the program. You cannot change how your program handles it; it will always terminate your program. However, you can send this signal to other processes.
SIGKILL’s value is 9. This is why the kill -9 <pid> shell command is so effective: it sends the SIGKILL signal to the process.
SIGSTOP is used when debugging. When you debug your program, the operating system sends SIGSTOP to stop your program, for instance when it reaches a breakpoint. The operating system does not let you change its handler because doing so could make your program undebuggable.
There are several interfaces that allow you to register your own signal handler.
This is the oldest interface. It accepts two arguments: the first is a signal number (one of those SIGsomething constants) and the second is a pointer to a signal handler function. The signal handler function returns void and accepts a single integer argument representing the number of the signal being delivered. This way you can use the same signal handler function for several different signals.
Here is a short code snippet demonstrating how to use it.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

void sig_handler(int signum)
{
    printf("Received signal %d\n", signum);
}

int main()
{
    signal(SIGINT, sig_handler);
    sleep(10); // This is your chance to press CTRL-C
    return 0;
}
This nice and small application registers its own SIGINT signal handler. Try compiling this small program and see what happens when you run it and press CTRL-C.
Using signal() you can also restore the default signal handler for a certain signal, or tell the system that you would like to ignore a certain signal. To ignore the signal, specify SIG_IGN as the signal handler. To restore the default signal handler, specify SIG_DFL.
Although this seems to be everything you may need, it is better to avoid using signal(). There’s a portability problem with this interface: it behaves differently on different operating systems. There’s a newer system call that does everything signal() does and also gives slightly more information about the actual signal, its origin, etc.
sigaction() is another system call that manipulates signal handlers. It is much more advanced compared to good old signal(). Let us take a look at its declaration.
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact);
Its first argument specifies a signal number. The second and third arguments are pointers to a structure called sigaction. This structure specifies how the process should handle the given signal.
struct sigaction {
    void (*sa_handler)(int signum);
    void (*sa_sigaction)(int signum, siginfo_t *siginfo, void *uctx);
    sigset_t sa_mask;
    int sa_flags;
    void (*sa_restorer)(void);
};
sa_handler is a pointer to the signal handler routine. The routine accepts a single integer containing the number of the signal it handles and returns void; this is the same as a signal handler registered with signal(). In addition, sigaction() lets you have a more advanced signal handler routine. If needed, the sa_sigaction pointer should point to this advanced signal handler routine, which receives much more information about the origin of the signal.
To use the sa_sigaction routine, make sure to set the SA_SIGINFO flag in the sa_flags member of struct sigaction. Similarly to sa_handler, sa_sigaction receives an integer telling it what signal has been triggered. In addition, it receives a pointer to a structure called siginfo_t, which describes the origin of the signal. For instance, the si_pid member of siginfo_t holds the process ID of the process that sent the signal. There are several other fields that tell you lots of useful information about the signal. You can find all the details on sigaction‘s manual page (man sigaction).
The last argument received by the sa_sigaction handler is a pointer to ucontext_t. This type differs from architecture to architecture. My advice is to ignore this pointer, unless you are writing a new debugger.
One additional advantage of sigaction() over signal() is that it allows you to tell the operating system which signals should be blocked while the handler for the signal you are registering runs. I.e. it gives you full control over what signals can arrive while your program is handling another signal.
To do this, you should manipulate the sa_mask member of struct sigaction. Note that it is a sigset_t field; the sigset_t type represents a signal mask. To manipulate signal masks, use one of the following functions: sigemptyset() to clear a mask, sigfillset() to fill it with all signals, sigaddset() and sigdelset() to add or remove a single signal, and sigismember() to test whether a signal is in the mask.
To conclude, I would like to show a small program that demonstrates sigaction() in use. The program registers a signal handler for SIGTERM and then, when it receives the signal, prints some information about its origin.
To send the program a signal, I will use the Python interpreter.
Here is the program.
#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

struct sigaction act;

void sighandler(int signum, siginfo_t *info, void *ptr)
{
    printf("Received signal %d\n", signum);
    printf("Signal originates from process %lu\n",
           (unsigned long)info->si_pid);
}

int main()
{
    printf("I am %lu\n", (unsigned long)getpid());

    memset(&act, 0, sizeof(act));
    act.sa_sigaction = sighandler;
    act.sa_flags = SA_SIGINFO;

    sigaction(SIGTERM, &act, NULL);

    // Waiting for SIGTERM...
    sleep(100);

    return 0;
}
And this is what happens when we run it. First we start the program.
~/works/sigs --> ./a.out
I am 18074
While it was sleeping, I ran a Python shell and killed it.
~ --> python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import signal
>>> print os.getpid()
18075
>>> os.kill(18074, signal.SIGTERM)
>>>
Here’s the rest of the output of the program.
~/lab/sigs --> ./a.out
I am 18074
Received signal 15
Signal originates from process 18075
~/lab/sigs -->
We can see that the program recognized the process that killed it and printed its process ID.
I hope you found this article interesting. If you have questions, don’t hesitate to email me to alex@alexonlinux.com.
Just wanted to share with you: my web-site is currently in the middle of a crisis. Recent changes to the web-site caused its Google ranks to drop so dramatically that I barely see visitors.
I think adding another domain caused most of the problem. Having two domains pointing to the same web-site created so-called duplicate content, in Google’s vocabulary. It appears that Google has specific rules for adding a new domain to a web-site. Obviously I didn’t follow them, and now the number of visitors viewing this web-site has dropped to one third of what it used to be.
Apparently, having two domains pointing to the same web-site is OK, but you should have one primary domain and the rest pointing to it with a 301 redirect (some HTTP jargon). Unfortunately, I learned about this a little too late.
Anyway, I think I fixed everything that could have caused a problem. As far as I understood, it takes them a couple of days (weeks?) to notice changes, so the crisis should be over soon.
This whole damn thing is so new to me, so if you have a word of advice for a novice web-master such as myself, please spit it out.
This article explains how to back up and restore your Linux installation. One special thing about the method described in this article is that it allows you to preserve disk space: you compress and decompress the data on the fly, so you need much less disk space to create the backup and store it.
Backing up
Restoring a backed up disk
Restoring a backed up partition
Backing up a Linux installation and restoring it is perhaps one of the most fundamental tasks that every system administrator has to deal with.
Here are some important points about backup methods that we will discuss in this article.
If these bullets talk to you, then you’re reading the right article.
The actual command that does the backup is quite simple. However, before backing up we have to decide what to back up. Here are our options.
What is good for you depends on the structure of the partition table on the hard disk that contains the Linux installation.
One common configuration (and my favorite) is when you have only two partitions on the hard disk: one for the Linux installation and the other for swap. In this case it is probably wiser to back up the entire hard disk. Indeed, the disk space occupied by the swap partition will be wasted, but on the other hand restoring a Linux installation from a full hard disk backup is much easier. Easier means fewer commands to execute to restore the installation, and this usually translates into smaller chances of unsuccessful data restoration.
Another common configuration is when you have several partitions on the hard disk. In this case, we may want to back up only the partition that contains the Linux installation. The command that does the backup is nearly the same in this case, but restoring the data will be a bit more involved. The chances are that you will not have any problem whatsoever, so don’t let those couple of extra commands scare you.
The bottom line is that it depends on how much information you want to back up. If your hard disk, in addition to the Linux installation, is occupied by other data, it is probably wiser to back up only the Linux partition. In any case, I’ll demonstrate how to back up both the entire hard disk and a single partition.
The second thing you have to figure out is the device file that represents the hard disk or partition you want to back up. Usually, the mount command without any arguments will give you the answer. Have a look:
alex ~ -> mount
/dev/sda1 on / type ext3 (rw,relatime,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
/proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,nosuid,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
alex ~ ->
The mount command, without any arguments, lists the mounted file-systems and their corresponding device files. For each file-system, it prints <device file> on <directory name> type <fs type>. Since we are backing up a Linux installation, we are looking for the device file mounted on the root directory (/). In my case this is /dev/sda1.
Note that device files with a number at the end usually represent a single partition. If you want the device file of the entire hard disk, strip the number off. For instance, /dev/sda1 represents the first partition on /dev/sda.
In case you’re still not sure which device file is the correct one: device files that start with /dev/sd usually stand for SCSI and USB disks, while device files that start with /dev/hd stand for IDE disks. And remember, if you want to back up a Linux partition, use a device file whose name ends with a number. For the entire hard disk, use a device file whose name ends with a letter.
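If you prefer to script this check, the device-name logic above can be sketched in a few lines of Python. This is only an illustration, and it assumes mount’s usual “<device> on <dir> type <fs> (<opts>)” output format:

```python
import re

def root_device(mount_output):
    # Scan `mount`-style output for the file-system mounted on /.
    for line in mount_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "on" and fields[2] == "/":
            return fields[0]
    return None

def whole_disk(partition):
    # Strip the trailing partition number: /dev/sda1 -> /dev/sda.
    return re.sub(r"\d+$", "", partition)

sample = "/dev/sda1 on / type ext3 (rw,relatime,errors=remount-ro)"
print(root_device(sample))      # /dev/sda1
print(whole_disk("/dev/sda1"))  # /dev/sda
```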
Before you make the actual backup, you have to make sure that the hard disk or partition you are backing up is not in use by any programs. This is important because if the disk is in the middle of something while being backed up, you will inevitably get data corruption when you restore it. Depending on what the disk was doing, the corruption can be quite serious.
The best way to avoid corruption is to boot from a live Linux CD (Knoppix, or an Ubuntu Desktop installation CD). If you know that your hard disk is not heavily used, you may try to back up the live, mounted disk. In that case, try to bring down any services that may use the disk while it’s being backed up.
If you’re backing up a mounted hard disk or partition, run sync before doing the backup. This flushes data from memory buffers to the disks (Linux keeps portions of disk data in memory to speed up disk access), reducing the chance of corruption even further.
Prepare the media that will hold the backup. This may be an NFS mount or another hard disk. Just make sure it is accessible.
One of the issues with backups is that you usually need as much disk space for the backup as the size of the disk or partition you’re backing up. Obviously, you can compress the backup, but normally you would first create the backup and only then compress it.
Luckily, there’s a way to back up and compress the data on the fly, so that you only need as much free space as the compressed backup will occupy. This is how you do it:
alex ~ -> dd if=/dev/sda | bzip2 -c > /media/sdb/sda.bz2
Let’s try to understand how it works.
First we use dd to read data from /dev/sda. When you read from a device file, the Linux kernel returns the content of the actual disk. The same thing happens if you read from a device file that represents a partition; only this time, Linux returns the content of that partition.
Usually we run dd with at least two arguments: if specifies the input file and of specifies the output file. If you skip one of them, dd uses standard input or standard output instead. Hence the dd command we use here sends the contents of the device to standard output.
The output of dd is sent via a pipe to the bzip2 command. bzip2 is a compression program. When called with the -c command line switch, it compresses whatever it reads from its standard input and sends the result to its standard output. This is why we redirect bzip2’s output to the file /media/sdb/sda.bz2. This is the file that will contain the backup once the command finishes.
Note that this file should not be on the disk or partition that we are backing up. For that reason, I mounted an additional hard disk on the /media/sdb directory.
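The same read-compress-write-on-the-fly idea can be sketched in Python with the standard bz2 module. This is an illustration of the pipeline, not a replacement for dd; ordinary temporary files stand in for /dev/sda and the backup media:

```python
import bz2
import os
import tempfile

def backup(device_path, backup_path, block_size=1024 * 1024):
    # dd | bzip2 in one loop: read a block, compress it, append it to the
    # archive. No uncompressed copy of the disk ever touches the backup media.
    with open(device_path, "rb") as src, open(backup_path, "wb") as dst:
        compressor = bz2.BZ2Compressor()
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(compressor.compress(block))
        dst.write(compressor.flush())

def restore(backup_path, device_path, block_size=1024 * 1024):
    # The inverse pipeline, bzip2 -c -d | dd.
    with open(backup_path, "rb") as src, open(device_path, "wb") as dst:
        decompressor = bz2.BZ2Decompressor()
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(decompressor.decompress(block))

# Round trip on ordinary files standing in for the device and backup media.
tmp = tempfile.mkdtemp()
image = os.path.join(tmp, "disk.img")
archive = os.path.join(tmp, "disk.bz2")
restored = os.path.join(tmp, "restored.img")
with open(image, "wb") as f:
    f.write(os.urandom(4096) * 4)
backup(image, archive)
restore(archive, restored)
print(open(image, "rb").read() == open(restored, "rb").read())  # True
```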
First of all, you will need a live Linux CD; a Knoppix CD or an Ubuntu LiveCD should do the job. You need it because you need a running Linux to restore the backup, but you cannot use the Linux installed on your hard drive because, well, you are about to erase it with the restored backup.
When restoring a Linux installation from backup, your life will be much easier if you restore to the same hard disk and partition. Even if it is a different hard disk, try replacing the old one with the new one rather than adding it as an additional disk. The reason for this is device file allocation.
When the Linux kernel detects several hard disks in your computer, it allocates device files for them: /dev/sda for the first disk, /dev/sdb for the second, and so on. When you install Linux on one of them, the installation references itself by its device file. But here is the catch: the name of the device file that represents a certain hard disk is position dependent. That is, if you swap the first and second disks in your computer, the disk that was represented by sda will now be represented by sdb, and the disk that was sdb will become sda. Internally, however, the Linux installation on either of them would still refer to itself by the old name. As a result, you probably won’t be able to boot your system.
The bottom line is that you have to make sure that the hard disk you restore your Linux installation to is represented by the same device file as before. That is, if your Linux installation was on /dev/sda when you created the backup, make sure that now, when you restore it, you are restoring to the device that will be represented by /dev/sda when you boot your system.
Boot from your live Linux CD. Once there, mount the device that holds the backup. Figure out which device file will hold your Linux installation. Remember that if you add new disks, the device files that represent your old disks may shift.
Now to the actual command.
This is actually the simplest case.
First, we start with mounting the disk that contains the backup. In my case this is /dev/sdb.
knoppix@Microknoppix:~$ mount /media/sdb
Next we restore the data.
knoppix@Microknoppix:~$ bzip2 -c -d /media/sdb/sda.bz2 | dd of=/dev/sda
First of all, I’ve been using a Knoppix CD to restore the system. The first command mounts /dev/sdb, the disk that contains the backup. The second command is the one that actually restores the Linux installation.
As with the backup command, it doesn’t use extra disk space. It extracts the archive that contains the installation and writes it to the disk, all on the fly. You are already familiar with bzip2’s -c command line switch; it tells bzip2 to use standard input and output. The -d command line switch tells it to decompress the data. Because of -c, bzip2 sends its output to dd, which picks the data up and writes it to /dev/sda.
You can boot your restored system right after the above command finishes.
This is a little trickier. Things are very easy when restoring an entire hard disk: you don’t have to care about the partition table or the boot loader. Your backup contains everything you need for a happy, working Linux system. All you have to do is write the data to the disk and you are done.
When restoring a single partition, you have to take care of everything yourself. You have to create the partition table. You have to install the boot loader. However, despite these obvious drawbacks, backing up and then restoring a single partition has some very nice advantages over an entire-disk backup.
First, backing up a single partition requires less disk space. Second, you have the option to restore your installation to a larger partition. Let’s walk through an example session that demonstrates data restoration, partition resizing, and boot loader installation.
Once again, we start with mounting the disk that contains the backup.
knoppix@Microknoppix:~$ mount /media/sdb
Demonstrating how to create a partition table is slightly out of the scope of this article, so let’s assume that we already have a partition table ready. We will restore the backup to /dev/sda1.
knoppix@Microknoppix:~$ bzip2 -c -d /media/sdb/sda1.bz2 | dd of=/dev/sda1
Now the data is where it should be. Let’s see how we can resize the partition.
For the sake of this article, I’ve been experimenting with a small VMware-based virtual machine, so I use rather small disks. In this case, the partition I backed up is 8GB, but /dev/sda1 is 10GB. So after we’ve restored the data onto the partition, we have to resize the file-system on it to utilize the entire partition. Otherwise, it would think that it is still 8GB long.
knoppix@Microknoppix:~$ resize2fs /dev/sda1
resize2fs 1.41.3 (12-Oct-2008)
Please run 'e2fsck -f /dev/sda1' first.

knoppix@Microknoppix:~$ fsck -y -f /dev/sda1
. . .
knoppix@Microknoppix:~$ resize2fs /dev/sda1
resize2fs 1.41.3 (12-Oct-2008)
Resizing the filesystem on /dev/sda1 to 2409742 (4k) blocks.
The filesystem on /dev/sda1 is now 2409742 blocks long.
As you can see, resize2fs, the tool I’ve been using to resize the file-system, asked me to run e2fsck first. e2fsck is a tool that checks file-systems for errors and fixes them; it is similar to the chkdsk tool on Windows. Its output was rather long, so I skipped it. The errors it found are a result of my backing up a mounted file-system. Luckily, since I had stopped all processes that might access the disk when I backed it up, no data was corrupted and the errors found by e2fsck were superficial.
Once fsck was done, I could run resize2fs. Note that without a size argument, resize2fs grows a file-system to the maximum available size, which is exactly what we wanted.
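If you script your restores, the check-then-grow sequence above can be captured as data first and executed later. The sketch below only builds the argv lists and deliberately does not run them; run them with subprocess on a real, unmounted partition:

```python
def resize_commands(device):
    # Build, but do not run, the check-then-grow sequence shown above.
    # resize2fs refuses to touch a file-system that has not been checked,
    # so e2fsck comes first; resize2fs with no size argument grows the
    # file-system to fill the whole partition.
    return [
        ["e2fsck", "-f", "-y", device],
        ["resize2fs", device],
    ]

for cmd in resize_commands("/dev/sda1"):
    print(" ".join(cmd))
```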
There’s one thing left to do: install GRUB. In theory, with certain GRUB configurations you may skip this step and try to boot your restored Linux installation right away. However, based on my experience, you had better do this step and make sure everything works, no matter what configuration you have. So, here’s what we do.
We start by mounting the newly restored file-system. To be able to mount the partition, we first have to create a temporary mount point directory. So, this is what we do.
knoppix@Microknoppix:~$ mkdir /tmp/sda1
knoppix@Microknoppix:~$ sudo mount /dev/sda1 /tmp/sda1/
And now we can install GRUB. This is how we do it.
knoppix@Microknoppix:~$ sudo grub-install --root-directory=/tmp/sda1 /dev/sda
This command installs GRUB on /dev/sda. The --root-directory command line switch tells grub-install to use kernel images and configuration from the specified directory. We want grub-install to use the kernel images and configuration from the Linux installation we’ve just restored; this is why I specified /tmp/sda1 as the root directory.
Finally, we unmount the device and reboot. This is what we do:
knoppix@Microknoppix:~$ umount /tmp/sda1
knoppix@Microknoppix:~$ sudo reboot
That’s it. Now, if everything goes well, your system should boot into the restored Linux installation. Just remember to remove the live CD from the CD-ROM drive before you boot.
I can’t believe all the things you can do with Python. Obviously, whatever you can do with Python, you can do with a whole bunch of other programming languages out there. What I am really impressed with is the ease.
A few weeks ago I was looking for a way to programmatically send myself an SMS on certain occasions. I was (and still am) developing a system that checks for certain conditions on a server and alerts me when something is wrong.
Eventually, I came up with a Python script 40 lines long. Note that you have to pay for the messages, but that’s just a few pennies. I bet there’s a way to plug into ICQ’s protocol and send SMS for free, but I don’t think you could do that with only 40 lines of code.
I planned this as a short post, but it ended up quite long, so I am posting it the way I do the rest of my big articles.
This article explains and demonstrates how to create a new application that communicates over SSH.
As you know, you can use SSH for two things. First, there’s remote access: you can get to a command line on a remote computer. The second use is transferring files. The OpenSSH suite comes with a handy tool called scp which allows you to copy files to and from a remote computer over SSH. Files are transferred securely, without exposing their content to someone who may be sniffing your traffic. And of course there’s the WinSCP program that does the same on Windows.
For a long time I thought that this was it. OpenSSH comes with two major features: remote access (ssh) and file transfer (scp). Adding a custom feature would require a deep understanding of SSH and security, I thought. Apparently I was wrong. It turns out that creating a new application that uses SSH as a communication channel is as easy as pie. Read on.
One of the neat features of ssh is that it allows you to run commands on a remote computer. To do this, type the ssh command with its arguments as if you were connecting to the remote computer, and append the command you would like to run. Like this:
alex ~/works/ssh -> ssh alex@192.168.1.67 df
This runs df on a remote computer, 192.168.1.67 in our case. Here’s the important part: the output of the df command is transferred over SSH to our computer, and we see it as if we had run the command locally.
The same thing happens if we run a command that requires input on the remote computer. We type the input here, on our local computer, and ssh transmits it over the encrypted SSH channel to the remote computer and feeds it to the command running there, as if the input had come from a keyboard attached to the remote computer. Take a look at the example session that demonstrates this:
alex ~ -> ssh alex@192.168.1.67 'read var; echo ${var} > file.txt'
alex@192.168.1.67's password:
hello world
alex ~ -> ssh alex@192.168.1.67 ls
alex@192.168.1.67's password:
file.txt
alex ~ -> ssh alex@192.168.1.67 cat file.txt
alex@192.168.1.67's password:
hello world
alex ~ -> ssh alex@192.168.1.67 rm -f file.txt
alex@192.168.1.67's password:
alex ~/works/ssh ->
The first command reads a line of text and saves it in the shell variable var, then writes the content of the variable into a file named file.txt. The second command shows that file.txt was indeed created. The third command shows the content of the file: “hello world” in our case. Finally, the last command deletes the file.
Note that all these commands, their input, and their output are transferred over SSH, i.e. encrypted.
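This “remote command with wired-up stdin/stdout” idea can be sketched in Python with the subprocess module. To keep the demo self-contained, a local shell stands in for ssh; with a real host, the transport list would be something like ["ssh", "alex@192.168.1.67"]:

```python
import subprocess

def run_over(transport, command, input_bytes=b""):
    # Run `command` through `transport`, feeding it `input_bytes` on
    # standard input and returning whatever it writes to standard output.
    # With transport=["ssh", "user@host"] this is exactly the remote
    # execution trick described above.
    proc = subprocess.Popen(transport + [command],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate(input_bytes)
    return out

# Local shell standing in for ssh, so the demo needs no remote host:
print(run_over(["sh", "-c"], "tr a-z A-Z", b"hello world\n"))
# With a real remote host this would be, e.g.:
#   run_over(["ssh", "alex@192.168.1.67"], "cat > file.txt", b"hello world\n")
```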
Apparently, there’s nothing magical about scp. It does not encrypt files before transferring them; in fact, it knows nothing about encryption. This is how it works.
scp uses ssh to invoke an instance of itself on the remote computer. To let that instance know it is the remote one, it passes two undocumented command line switches, -f and -t. It then uses a special protocol to communicate with the remote instance of itself. The communication protocol and the actual data are all transferred over SSH: scp uses ssh’s input/output streams as a transport for sending data to and receiving data from the remote computer.
Simple, isn’t it? In fact, creating an application that runs on top of SSH and sends and receives encrypted data seems even easier than creating a TCP/IP client-server with sockets. Let’s try creating our own application.
Transferring files to a remote computer makes a very demonstrative example, so I will show an application that sends a file to a remote computer, where it is saved on the local disk. I’ve written it in Python because that is simpler for me, but you may write it in any programming language. The principles I use here, which I will explain later in the article, are universal.
As with scp, I’ve created only one program, with two modes of operation. When invoked without command line arguments, it assumes it is the local instance that should transmit the data. When invoked with the -r command line switch, it assumes it is the remote copy; it receives the data and saves it in a file.
Let’s see the code.
#!/usr/bin/python

import os
import sys
import popen2

def BeRemote():
    print "Being remote"
    f = open('/home/alex/works/ssh/data2.dat', 'w+b')
    for s in sys.stdin:
        f.write(s)
    f.close()

def BeLocal():
    print "Being local"
    child = popen2.Popen3('ssh alex@localhost ' +
        '/home/alex/works/ssh/ssh_client.py -r')
    f = open('data.dat', 'rb')
    for s in f:
        child.tochild.write(s)
    f.close()
    # Shutting down remote client.
    # By closing its stdin, we're causing it to exit
    # its main loop.
    child.tochild.close()
    child.wait()

if len(sys.argv) > 1 and sys.argv[1] == '-r':
    BeRemote()
else:
    BeLocal()
The code starts at line 28 by checking whether it has been invoked with command line switches. If it sees the -r switch, it runs the BeRemote() function; otherwise it runs BeLocal().
BeRemote() is a rather simple function. It creates a file named data2.dat in the directory /home/alex/works/ssh/ (this is where I’ve been developing this project on my computer). Then it enters a loop that writes everything it reads from standard input into the file (lines 10-11). It keeps saving data as long as its standard input is open. Once the input is closed, it closes the file and exits (line 12).
BeLocal() is a little more complex. First it spawns ssh, connecting to localhost as user alex. I know it is expected to connect to a remote computer, but for the sake of demonstration let’s pretend that localhost is a remote computer. It tells ssh to run the program named ssh_client.py with the -r command line switch. ssh_client.py is the name of the script itself, so in essence it runs itself; the -r switch tells that copy to run in remote, that is receiving, mode.
It uses the popen2 Python module to spawn the ssh process. This is similar to using the popen() function in C: it runs a process, opens a pipe, and redirects the process’s input and output through the pipe. Our program sits on the other end of the pipe and uses it to transmit data over SSH.
Next, the function opens the file we want to transmit (data.dat) and sends it through the pipe (lines 18-21).
The final step is more interesting. BeLocal() has to tell BeRemote() when it has reached the end of the file. This could be done with a special communication protocol between the remote and local instances of the program; this is what scp does. I did something simpler. Instead of notifying the remote side that it has reached end of file, I simply close my side of the pipe (line 25). This causes the remote side to see end of file and exit. Meanwhile, the local side waits for the remote side to exit (line 26) and then exits itself.
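The popen2 module is long deprecated; the same close-the-pipe-to-signal-EOF trick looks like this with the modern subprocess module. In this self-contained sketch, a local Python child stands in for the ssh-spawned remote instance:

```python
import subprocess
import sys

# A stand-in for the ssh-spawned remote instance: it reads its standard
# input until end of file, then reports how many bytes it received.
receiver = [sys.executable, "-c",
            "import sys; data = sys.stdin.buffer.read(); print(len(data))"]

child = subprocess.Popen(receiver,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
child.stdin.write(b"x" * 1000)  # "transfer the file"
child.stdin.close()             # closing the pipe is the end-of-file signal
out = child.stdout.read()       # the receiver saw EOF, reported, and exited
child.wait()                    # local side waits for the remote side
print(out)
```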
Let’s see our little Python script in action.
alex ~/works/ssh -> ls -la
total 3836
drwxr-xr-x 2 alex alex    4096 2009-03-22 22:36 ./
drwxr-xr-x 6 alex alex    4096 2009-03-22 00:46 ../
-rw-r--r-- 1 alex alex 4000000 2009-03-22 01:28 data.dat
-rwxr-xr-x 1 alex alex     578 2009-03-22 22:36 ssh_client.py*
alex ~/works/ssh -> ./ssh_client.py
Being local
alex@localhost's password:
alex ~/works/ssh -> ls -la
total 7836
drwxr-xr-x 2 alex alex    4096 2009-03-22 22:36 ./
drwxr-xr-x 6 alex alex    4096 2009-03-22 00:46 ../
-rw-r--r-- 1 alex alex 4000000 2009-03-22 22:36 data2.dat
-rw-r--r-- 1 alex alex 4000000 2009-03-22 01:28 data.dat
-rwxr-xr-x 1 alex alex     578 2009-03-22 22:36 ssh_client.py*
alex ~/works/ssh -> md5sum data.dat
2e02ecd84be565fa22216e8398ff9b63  data.dat
alex ~/works/ssh -> md5sum data2.dat
2e02ecd84be565fa22216e8398ff9b63  data2.dat
alex ~/works/ssh ->
First I run ls -la to show that I have only two files in the current directory: the data file data.dat and the script. Next I run ./ssh_client.py. It tells us that it is the local side and asks for a password. Actually, the password request comes from ssh. To avoid being asked for a password, you can use identity files, as I explained in my SSH crash course.
It then transfers the file and exits. Running ls -la again clearly shows that we now have a new file named data2.dat, the same size as data.dat. Finally, I confirmed with md5sum that it is indeed the same file.
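The md5sum check can also be done from Python with the standard hashlib module. A small sketch, using two identical temporary files as stand-ins for data.dat and the transferred data2.dat:

```python
import hashlib
import os
import tempfile

def md5_of(path, block_size=1024 * 1024):
    # Hash the file in blocks, like md5sum does, so even a very large
    # transferred file never has to fit in memory at once.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()

# Two identical files stand in for data.dat and its transferred copy.
tmp = tempfile.mkdtemp()
a = os.path.join(tmp, "data.dat")
b = os.path.join(tmp, "data2.dat")
payload = os.urandom(4096)
with open(a, "wb") as f:
    f.write(payload)
with open(b, "wb") as f:
    f.write(payload)
print(md5_of(a) == md5_of(b))  # True
```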
We saw how easy it is to create an encrypted communication channel using SSH. I hope you’ve found this article useful and interesting. If you have any questions, don’t hesitate to contact me at alex@alexonlinux.com.