Thursday, December 27, 2012

C++ : argc and argv

So far, all the programs we have written can be run with a single command. For example, if we compile an executable called myprog, we can run it from within the same directory with the following command at the GNU/Linux command line:
./myprog
However, what if you want to pass information from the command line to the program you are running? Consider a more complex program like GCC. To compile the hypothetical myprog executable, we type something like the following at the command line:
gcc -o myprog myprog.c
The character strings -o, myprog, and myprog.c are all arguments to the gcc command.
(Technically gcc is an argument as well, as we shall see.)
Command-line arguments are very useful. After all, C++ functions wouldn't be very useful if you couldn't ever pass arguments to them -- adding the ability to pass arguments to programs makes them that much more useful. In fact, all the arguments you pass on the command line end up as arguments to the main function in your program.
Up until now, the skeletons we have used for our C++ programs have looked something like this:

#include <iostream>
int main()
{
    return 0;
}

From now on, our examples may look a bit more like this:

#include <iostream>

int main (int argc, char *argv[])
{
    return 0;
}

As you can see, main now has arguments. The name argc stands for "argument count"; argc contains the number of arguments passed to the program. The name argv stands for "argument vector". A vector here simply means a one-dimensional array (it is not std::vector), and argv is a one-dimensional array of C-style strings, that is, of char pointers. Each string is one of the arguments that was passed to the program.
For example, the command line
gcc -o myprog myprog.c
would result in the following values internal to GCC:
argc = 4
argv[0] = gcc
argv[1] = -o
argv[2] = myprog
argv[3] = myprog.c
As you can see, the first argument (argv[0]) is the name by which the program was called, in this case gcc. Thus, when a program is started from the command line, it receives at least one argument, and argc is at least 1.
The following program accepts any number of command-line arguments and prints them out:

#include <iostream>

using namespace std;

int main (int argc, char *argv[])
{
    int count;

    // argv[0] is the name by which the program was called.
    cout << "This program was called with \"" << argv[0] << "\".\n";
    if (argc > 1)
    {
        // Print every remaining argument next to its index.
        for (count = 1; count < argc; count++)
        {
            cout << "argv[" << count << "] = " << argv[count] << "\n";
        }
    }
    else
    {
        cout << "The command had no other arguments.\n";
    }
    return 0;
}

If you name your executable fubar and call it with the command ./fubar a b c, it will print out the following text:
This program was called with "./fubar".
argv[1] = a
argv[2] = b
argv[3] = c

Now for an explanation of two expressions that show up in hand-written option parsing: (*++argv)[0] and *++argv[0].

Let's say your program is named prog, and you execute it with: prog -ab -c Hello World. You want to parse the arguments so that a, b and c are recognized as options, and Hello and World are the non-option arguments.
argv is of type char **; remember that an array parameter in a function is really a pointer.
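At program invocation, things look like this (sketched here in text):

argv ----------> [0] ---> "prog"
                 [1] ---> "-ab"
                 [2] ---> "-c"
                 [3] ---> "Hello"
                 [4] ---> "World"
                 [5] ---> NULL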

Here, argc is 5, and argv[argc] is NULL. At the beginning, argv[0] is a char * pointing to the string "prog".
In (*++argv)[0], the parentheses make ++ and * apply to argv before the subscript does, so argv is incremented first and then dereferenced. The effect of the increment is to move the argv pointer one element down the array, so that it points to the slot holding the first command-line argument. Dereferencing then yields a pointer to that argument, -ab. Finally, we take the first character of this string (the [0] in (*++argv)[0]) and test whether it is '-', because that denotes the start of an option.

For the second construct, we want to walk down the string pointed to by the current argv[0] pointer. So we treat argv[0] as a pointer, skip its first character (the '-' we just tested), and look at the remaining characters: ++(argv[0]) increments argv[0] to get a pointer to the first character after the '-', and dereferencing it gives us the value of that character. So we get *++(argv[0]). But since [] binds more tightly than ++ in C and C++, we can drop the parentheses and write the expression as *++argv[0]. We continue processing characters this way until we reach 0, the null character that terminates each argument string.
The expression c = *++argv[0] assigns the current option character to c, and the expression itself has the value of c. while(c) is shorthand for while(c != 0), so the line while(c = *++argv[0]) assigns the current option character to c and tests whether we have reached the end of the current command-line argument.
At the end of the outer loop, which stops at the first argument that does not begin with '-', argv points to the first non-option argument, Hello in this example.
