In computer programming, particularly in the C and C++ programming languages, a header file or include file is a file, usually in the form of source code, that is automatically included in another source file by the compiler. Typically, header files are included via compiler directives at the beginning (or head) of the other source file.
A header file commonly contains forward declarations of classes, subroutines, variables, and other identifiers. Identifiers that need to be declared in more than one source file can be placed in one header file, which is then included whenever its contents are required.
In the C and C++ programming languages, standard library functions are traditionally declared in header files; see C standard library and C++ standard library for examples.
Motivation
In most modern computer programming languages, programs can be broken up into smaller components such as classes and subroutines, and those components can be distributed among many translation units (typically in the form of physical source files), which are compiled separately. Once a subroutine needs to be used somewhere other than the translation unit in which it is defined, forward declarations or function prototypes must be introduced. For example, a function defined in this way in one source file:
int add(int a, int b)
{
    return a + b;
}
may be declared (with a function prototype) and then referred to in a second source file, thus:
extern int add(int, int);
int triple(int x)
{
    return add(x, add(x, x));
}
However, this simplistic approach requires the programmer to maintain the declaration of add in two places: in the file containing its implementation and in the file where it is used. If the definition of the function ever changes, the programmer must remember to update every prototype scattered across the program as well. This is necessary because neither C nor C++ implementations are required to diagnose all violations of what C++ calls the one definition rule (ODR). Most implementations rely on the linker for this check, and the linker typically knows very little about the types used in a program. As a result, some ODR violations cannot be caught by the language implementation, and it is the programmer's responsibility to keep all declarations that cross translation unit boundaries consistent. Manually finding every declaration of the same external entity and verifying that they are compatible is tedious. Note that C does not define the term one definition rule; it is specific to C++. Nevertheless, if declarations of the same entity differ across C source files, the program is incorrect and its behavior is undefined, whatever the violated rule is called.
To see what an ODR violation looks like, consider the following (correct) example:
/* File print-heading.c */
#include <stdio.h>
void print_heading()
{
    printf("standard heading\n");
}
/* File main.c */
void print_heading();
int main()
{
    print_heading();
}
The translation unit represented by the "main.c" source file references the "print_heading()" function, which is defined in another translation unit (print-heading.c). Under the rules of C99, a function must be declared before its first use. To meet this requirement, "main.c" declares the function on its first line. This version of the program works correctly.
Later, the programmer who maintains the "print-heading" source file decides to make the function more flexible and support custom headings. This may be implemented like this:
/* File print-heading.c */
#include <stdio.h>
void print_heading(char const *heading)
{
    printf("%s\n", heading);
}
If the programmer forgets to update the declaration in "main.c", the results can be severe. The "print_heading()" function expects an argument and reads its value, but "main()" does not supply one. Executing this program results in undefined behavior: the application may print garbage, terminate unexpectedly, or do anything else the platform permits.
Why does this code compile and link successfully? The compiler relies on the declaration in "main.c" when compiling the "main.c" translation unit, and that declaration is consistent with the way the function is used. Later, when the linker combines the compiled "main.c" and "print-heading.c" translation units (in most implementations they are represented as "main.o" or "main.obj" files), it could in principle detect the inconsistency, but in C it does not. In C implementations, functions are referenced at the object and binary file level by name only; the name does not encode the return type or the argument list. The linker encounters a reference to "print_heading()" in "main.o" and finds a suitably named function in "print-heading.o". At this stage, all information about the function's argument types has been lost. How, then, can multiple declarations be managed successfully?
Header files provide the solution. A module's header file declares each function, object, and data type that is part of the public interface of the module — for example, in this case the header file would include only the declaration of add. Each source file that refers to add uses the #include directive to bring in the header file:
/* File add.h */
#ifndef ADD_H
#define ADD_H
int add(int, int);
#endif /* ADD_H */
/* File triple.c */
#include "add.h"
int triple(int x)
{
    return add(x, add(x, x));
}
This reduces the maintenance burden: when a definition is changed, only a single copy of the declaration must be updated (the one in the header file). The header file may also be included in the source file that contains the corresponding definitions, giving the compiler an opportunity to check the declaration and the definition for consistency.
/* File add.c */
#include "add.h"
int add(int a, int b)
{
    return a + b;
}
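As a rough sketch of this consistency check, the listing below shows approximately what the compiler sees for add.c once the #include directive has been expanded. It is an illustration written by hand, not actual preprocessor output, and the comments marking where the header text begins and ends are added only for orientation.

```c
/* --- text brought in by #include "add.h" --- */
#ifndef ADD_H          /* include guard: the header body is processed only */
#define ADD_H          /* once per translation unit, even if included twice */
int add(int, int);     /* the single shared declaration */
#endif /* ADD_H */
/* --- end of header text, remainder of add.c follows --- */

/* The prototype above is visible here, so the definition is checked
   against it. If the definition were changed to, for example,
   long add(long a, long b) without updating add.h, a conforming
   compiler would diagnose the mismatch instead of leaving it to
   surface as undefined behavior at run time. */
int add(int a, int b)
{
    return a + b;
}
```

Because triple.c includes the same header, the declaration it compiles against is the one that was checked here.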
Typically, header files are used to specify only interfaces, and usually provide at least a small amount of documentation explaining how to use the components declared in the file. As in this example, the implementations of subroutines are left in a separate source file, which continues to be compiled separately. (One common exception in C and C++ is inline functions, which are often included in header files because most implementations cannot properly expand inline functions without seeing their definitions at compile time.)
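One way such an inline function might be placed in a header is sketched below. The header name add_inline.h and the functions add_inline and triple_inline are hypothetical, and static inline is only one of several conventions in use; the listing shows the header text followed by a source file that would include it.

```c
/* --- add_inline.h (hypothetical header) --- */
#ifndef ADD_INLINE_H
#define ADD_INLINE_H

/* static inline: each translation unit that includes this header gets
   its own internal copy, so the full definition can live in the header
   without producing duplicate-symbol errors at link time. */
static inline int add_inline(int a, int b)
{
    return a + b;
}

#endif /* ADD_INLINE_H */

/* --- a source file that has included the hypothetical header --- */
/* The compiler can see the body of add_inline here and is free to
   expand the calls in place. */
int triple_inline(int x)
{
    return add_inline(x, add_inline(x, x));
}
```

The trade-off is that a change to the function body now requires recompiling every source file that includes the header.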
Alternatives
Header files are not the only solution to the problem of accessing identifiers declared in different files. They have the disadvantage that it may still be necessary to make changes in two places (a source file and a header file) whenever a definition changes. Some newer languages (such as Java) dispense with header files and instead use a naming scheme that allows the compiler to locate the source files associated with interfaces and class implementations. Such languages typically solve the ODR problem with two techniques. First, the compiler stores full type information in the compiled code, and this information remains accessible at run time while the program executes. Second, Java and other modern languages can verify the number and types of arguments at each method call. This is not free: it incurs space and execution-time overhead that is unacceptable for some time-critical applications.