Starting C programming with Linux

 

Basic Checks

To start programming on Linux, you need a Linux installation with the development packages installed. If you are not familiar with installing Linux, I would recommend getting help from friends, teachers, or PLUG members.

The following example shows how to check that the development tools are present. You should get similar output.

$ which gcc
/usr/bin/gcc
$ which g++
/usr/bin/g++
$ which make
/usr/bin/make
$ which vi
/usr/bin/vi
$ which pico
/usr/bin/pico

If you have the tools mentioned above, you are all set to start. The last two, vi and pico, are text editors. You can use any text editor of your choice. However, if you don't know where to look, these two are very basic choices. Personally, I use kedit, which is bundled with KDE.

C/C++ Compiler

The C compiler on Linux is part of a compiler suite known as GCC (GNU Compiler Collection). This suite offers compilers for several languages. The following is a list of the typical ones.

• C
• C++
• Objective C
• Fortran

The C compiler program on Linux is named gcc and the C++ compiler is named g++. You can find out the version of the compiler using the --version option.

$ gcc --version
gcc (GCC) 3.3.4
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Command-line Options

The command-line options to the C/C++ compilers control their behavior. There are a large number of options available. The compiler documentation explains them in detail, but I will cover the commonly used ones.

A compiler option is introduced by '-', as in -s. This is different from DOS/Windows, where options are introduced by '/'. In our first program, you can see that we have already made use of the -o option.

Note that these options are case-sensitive.

-c
This option instructs the compiler to only compile the file and produce an object file, instead of creating a program. Creating a program is the default behavior.
This is typically used when a program or a library consists of more than one source file, or when the sources are spread across multiple modules. The object files produced have a .o extension rather than .obj as on DOS/Windows.

-o <name>
This option instructs the compiler to give the output file a specific name, overriding the default name.

When a source file is compiled into an object file, the extension changes to .o. For example, a source file hello.c produces an object file named hello.o. However, you can change the name of the object file using this option.

The following commands demonstrate this usage.

$ gcc -c -o hello1.o hello.c
$ ls -al hello1.o
-rw-r--r-- 1 shridhar users 844 2004-09-05 20:36 hello1.o

If this option is not specified, the program produced is always named a.out. This is different from DOS/Windows, where a source file hello.c results in a program named hello.exe. Hence on Linux, this option is almost always used when producing a program.

-O<n>
This option instructs the compiler to produce optimized programs. Here n denotes the level of optimization and is optional. The level ranges from 0 (no optimization, which is the default) to 3, and -O without a number means the same as -O1. The most commonly used optimization level is 2.

There are several options that control specific optimizations. This option is a convenient way of selecting a group of the most commonly used ones. More details on these options are available in the compiler documentation.

-g

This option instructs the compiler to produce a program with debug information included. Without this option, a debugger cannot relate the running program back to source lines and variable names, so the program cannot be debugged at source level.

If a program is created from more than one object file, all of the object files must be compiled with this flag so that the entire program can be debugged.

-s
This option instructs the compiler to remove all symbol table and relocation information from the program. This is used to reduce the size of the program file.

Combined with -O2, this option is typically used to build programs for production use. Note that it should not be used together with -g, as it removes the debugging information as well.

-I <directory name>

This option instructs the compiler to add the specified directory to the include search path. The compiler searches this directory when it looks for header files included by the program.

By default, the compiler searches the directory /usr/include, so it need not be specified. The path in an #include directive can be relative to a directory on the search path. Let us say a program has a line such as the following.

#include <sys/socket.h>

Then the compiler will match the file /usr/include/sys/socket.h. The compiler tries every directory on the include search path before it reports an error.

For C++, /usr/include/c++/<version> is an additional default include directory, where the standard library headers for C++ are located. Here version is the compiler version, so on my machine it translates to the directory /usr/include/c++/3.3.4/.

-L <directory name>
This option instructs the compiler to add the specified directory to the library search path. This option is actually used by the linker. However, since the linker is invoked via the compiler in most cases, the option is given to the compiler, which passes it on to the linker.

By default, the linker searches the directory /usr/lib. I will describe libraries in more detail later.

-l <library name>
This option instructs the compiler to link against the specified library. This option follows a specific naming convention: the library name given does not include the library file's prefix or suffix. For example, to link against the library libncurses.so, one has to specify -lncurses, since lib and .so are the standard library prefix and suffix.

Libraries

A library is like an executable program in that it contains compiled, machine-specific code. It differs from a program in that a library is a collection of reusable code. Libraries are not meant to be run like normal programs.

Using libraries in programs

To use a library with a program, one needs both its header files and the library itself. The header files are included in the code. The compiler option -I, explained above, tells the compiler where to find these headers. The compiler obtains function declarations from the header files. After compilation, the compiler produces object files which have empty slots for the functions/symbols declared in the library header files. Later, the linker fills in these slots.

To produce a program, the compiler invokes the linker with the appropriate library directories and libraries. The linker puts all the object files together into the final program. It creates a list of empty slots from all the object files, then searches for these symbols in the libraries specified. For each symbol found in a library, it marks that library as a dependency.

If it cannot find a symbol in any of the libraries specified, it reports an undefined symbol error (the GNU linker words this as "undefined reference to" followed by the symbol name). It means that the linker could not find any library which contains the symbol definition. The necessary libraries need to be specified for the program to link successfully.

Types of libraries

There are two types of libraries, shared and static. They differ in how the compiled code is reused by the programs.

Static Libraries
When a program is linked against a static library, the linker copies the symbol definitions, i.e. the code implementing the functions, into the resulting program. Hence the program does not need the library installed in order to run. This gives ease of installation at the cost of a bigger program size. Furthermore, to take advantage of a newer version of a library, the program must be relinked and reinstalled.

By convention, static libraries have a lib prefix and a .a extension. Thus libmyprog.a is the file name for the library myprog. While compiling/linking, only the library name is passed and not the complete file name, as the linker can locate the file from the library name.

Shared Libraries
On the other hand, when linking against a shared library, the linker marks the symbols from the shared library as external. When the program is run, the runtime linker searches the installed libraries for the necessary library and for the required symbol definitions in it. If either the library or a necessary symbol definition is not found, a runtime error is reported and program execution is aborted.

With shared libraries, the program size can be kept to a minimum. If more than one program uses the same library, only one copy of the library is loaded, saving memory at runtime. This is not possible with static libraries. If the installed library is upgraded, all the programs depending on it get the benefit of the newer version.

The standard prefix for shared libraries is lib and the file extension is .so. Thus a library myprog will have the file name libmyprog.so.

Published on September 12, 2007 at 1:17 pm
