Originally published on Jun 6, 2017
In Lightweight Arduino Library for ROHM Sensor Evaluation Kit, I introduced RohmMultiSensor, an Arduino library that allows you to easily interface with multiple sensors in the ROHM Sensor Evaluation Kit. One of the core features of this library is that the program size is noticeably reduced by compiling only the parts of the library that contain code specific to the sensors you want to use. This means that when you use fewer sensors, the overall program size and memory usage will be smaller. But how exactly does that happen? And what is really going on behind the scenes when you #include a library and then press the “Upload” button?
Almost everyone who has ever used Arduino has used a library. This is one of the reasons why Arduino programming is so easy for beginners: you don’t need to have a deep understanding of how a sensor works, because libraries will do most of the work for you. Dividing code into separate files is also a good programming practice. It is much easier to organize, debug and maintain several small files than one huge blob of code.
Arduino beginners should already be familiar with #include which “adds” the library to the main sketch. To understand how exactly this happens, we first have to take a quick look at how C/C++ source code is compiled into a program. Don’t worry, it sounds much more complicated than it actually is. Let’s take a look at how compilation works.
Let’s do a quick experiment first: start your Arduino IDE, open one of the example codes (e.g. “Blink”) and press the “Verify” button. Assuming there are no syntactic errors in the program, the console at the bottom should print out some information about the program size and memory. Well, you just successfully compiled C++ source code into a binary. While compiling, several things happened:
Now we have a basic idea of what really goes into compiling an Arduino sketch, but out of all the compilation stages described above, we’re going to focus on only the second one: the preprocessor.
In the above text, I mentioned that the preprocessor is essentially very simple: it just takes in text input, searches for some keywords, does some operations according to what it finds, and then outputs a different text. Even though it’s very simple, it’s also extremely powerful, because it allows you to do things that would otherwise be very complicated, if not impossible, in plain C/C++.
The preprocessor works by searching for lines that start with the hash sign (#) and have some text after it. Such a line is called a preprocessor directive, and it’s a sort of “command” for the preprocessor. The full list of all supported directives with detailed documentation can be found here:
https://gcc.gnu.org/onlinedocs/cpp/Index-of-Directives.html#Index-of-Directives.
In the following text, I will focus mainly on #include, #define and conditional directives, since these are the most useful on Arduino, but if you want to know more about some more “exotic” directives, like #assert or #pragma, this is the place to get official information.
The #include directive is probably the best-known preprocessor directive, not only amongst Arduino enthusiasts, but in C/C++ programming in general. The reason is simple: it’s used to include libraries. But how exactly does it happen? The exact syntax is as follows:
#include <file>
or
#include "file"
The difference between the two is subtle and mainly comes down to where exactly the preprocessor searches for the file. In the first case, the preprocessor searches only in directories specified by the IDE. In the second case, the preprocessor first looks through the folder containing the source, and only if the file isn’t there does it move on to the same directories it would search in the first case. Since the folder containing libraries is specified in the Arduino IDE, there’s no major difference between the two when including a library.
When the preprocessor finds the file, it simply copy-pastes the contents into the source code in place of the #include directive. However, if no such file can be found in any of the directories, it will raise a fatal error and stop the compilation.
It’s important to remember that the preprocessor only works with text. It doesn’t really understand what all those fancy letters and numbers mean. And most importantly, it does zero higher-level checks on what was included and how many times. Let’s take a look at what can happen if you use an incorrectly written library.
#include <ExampleLibrary.h>
void setup() {
}
#include <ExampleLibrary.h>
void loop() {
}
There’s really not much going on in the Arduino sketch. Notice that we’re including a file called “ExampleLibrary.h”, and that we’re including it twice.
//This is an example library
int a = 0;
//End of example library
And this is what’s inside “ExampleLibrary.h”. Again, not much going on, except for one integer variable. So what happens when we try to compile this Arduino sketch?
The error shows that the variable a is being defined twice, which causes the compilation to fail. This is what the source code looks like after the preprocessor is done.
//This is an example library
int a = 0;
//End of example library
void setup() {
}
//This is an example library
int a = 0;
//End of example library
void loop() {
}
It is obvious now that no library should get included more than once, but how do you achieve this without having to rely on the user? The standard solution is to wrap the entire library in the following construct:
#ifndef _EXAMPLE_LIBRARY_H
#define _EXAMPLE_LIBRARY_H
//This is an example library
int a = 0;
//End of example library
#endif
Now, when the library is included for the first time, the preprocessor checks whether there is something defined with the name “_EXAMPLE_LIBRARY_H”. Since nothing like that exists yet, it proceeds to the next line and defines a constant called “_EXAMPLE_LIBRARY_H”. Then, the library code is copied into the sketch.
When including the library for the second time, the preprocessor checks for a constant named “_EXAMPLE_LIBRARY_H” again. This time, however, the constant is already defined from the previous #include, so nothing is added to the sketch. Now, the compilation finishes successfully. The #ifndef and #endif are conditional directives, which will be discussed later on.
In the previous example, we used the #define directive to create a constant, which determined whether to include a library or not. In the official documentation, anything defined by the #define directive is called a macro, so I will stick to that terminology in this article. The syntax of this directive is as follows:
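To watch the guard work without creating a separate header file, the sketch below simply pastes the guarded contents twice, which is roughly what two #include lines would leave behind. The names EXAMPLE_LIBRARY_H and exampleValue are made up for this illustration.

```cpp
// First copy: EXAMPLE_LIBRARY_H is not defined yet, so this part is kept.
#ifndef EXAMPLE_LIBRARY_H
#define EXAMPLE_LIBRARY_H
int exampleValue = 0;
#endif

// Second copy: EXAMPLE_LIBRARY_H already exists, so the preprocessor
// drops everything up to the matching #endif.
#ifndef EXAMPLE_LIBRARY_H
#define EXAMPLE_LIBRARY_H
int exampleValue = 0;
#endif
```

Removing the #ifndef, #define and #endif lines from both copies should bring back the “defined twice” failure described earlier.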
#define macro_name macro_body
Most Arduino beginners are somewhat confused by macros. If I define a macro like the following:
#define X 10
what is the exact difference from declaring some variable like below?
int Y = 10;
Again, it all comes down to the fact that the preprocessor only works with text. When it finds the #define directive, it will go through the rest of the source code and replace all occurrences of “X” with “10”. This means that unlike a variable, a macro has no storage of its own and cannot be assigned a new value while the program is running. Also, you have to keep in mind that the preprocessor only searches the part of the source code that comes after the line with #define on it. Let’s see what happens when we try to use a macro before it is defined.
int Y = X;
#define X 10
int Z = X;
void setup() {
}
void loop() {
}
Compiling the above code will give us the following error:
The code after preprocessing will look like this:
int Y = X;
int Z = 10;
void setup() {
}
void loop() {
}
The first line contains X, which is interpreted as a variable. However, that variable was never declared, so the compilation stops.
Even though the most common use of the #define directive is to create named constants, it can do much more than just that. For example, let’s say you want to know which of two given numbers is smaller. You could write a function that will do just that.
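As a small illustration of the “rest of the source code” rule, the following sketch defines a macro, uses it, and then replaces it with #undef and a second #define. The names LED_PIN, firstPin and secondPin are hypothetical. Each use picks up whichever definition is in effect at that point in the text.

```cpp
#define LED_PIN 10
int firstPin = LED_PIN;   // the compiler sees: int firstPin = 10;

#undef LED_PIN            // forget the old definition
#define LED_PIN 13
int secondPin = LED_PIN;  // the compiler sees: int secondPin = 13;
```

Note that firstPin keeps the value 10. The substitution already happened in the text before the macro was redefined.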
int min(int a, int b) {
  if(a < b) {
    return(a);
  }
  return(b);
}
Or in a simpler way with a ternary operator:
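int min(int a, int b) {
  return((a < b) ? a : b);
}
However, each of these is a real function that the compiler may emit into the program, and calling it can add some overhead. A function-like macro avoids the separate function, because the preprocessor pastes the expression directly into every place where it is used. Whether this actually saves program space depends on how often the macro is used and on how the compiler optimizes the code.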
#ifndef MIN
#define MIN(A, B) (((A) < (B)) ? (A) : (B))
#endif
Now, each occurrence of “MIN(A, B)” will be replaced with “(((A) < (B)) ? (A) : (B))”, where “A” and “B” can each be a number or a variable. Notice that the #define is wrapped in the same protective construct, which prevents the user from defining the macro twice.
When creating macros, you have to keep in mind that once again, they are processed as text. That’s why in the definition above, pretty much everything is wrapped in parentheses. Try to guess the result of the following operation.
#ifndef MULTIPLY
#define MULTIPLY(A, B) A * B
#endif
//some code...
int result = MULTIPLY(2 - 0, 3);
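One consequence of this textual replacement is worth a quick sketch: an argument that appears twice in the macro body is also evaluated twice. The helper names readSensor, readCount and clampReading below are hypothetical and stand in for any function call with a side effect.

```cpp
#ifndef MIN
#define MIN(A, B) (((A) < (B)) ? (A) : (B))
#endif

// Hypothetical stand-in for a sensor read; counts how often it runs.
int readCount = 0;
int readSensor() {
  readCount++;
  return 5;
}

// Expands to: (((readSensor()) < (limit)) ? (readSensor()) : (limit))
int clampReading(int limit) {
  return MIN(readSensor(), limit);
}
```

Calling clampReading(10) returns 5, but readSensor() runs twice: once for the comparison and once more to produce the result. Storing the reading in a variable first and passing that variable to MIN avoids the second call.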
The value of result should be 6, since 2 – 0 is 2 and 2 * 3 is 6 right? What if I told you, that the result will be 2? The following is what actually gets compiled:
int result = 2 - 0 * 3;
Since multiplication takes precedence over subtraction, the result has to be 2, because 0 * 3 is 0 and 2 - 0 is still 2. The correct version will look like this:
#ifndef MULTIPLY
#define MULTIPLY(A, B) ((A) * (B))
#endif
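The two variants can be compared side by side. The macro names MULTIPLY_UNSAFE and MULTIPLY_SAFE are chosen only for this sketch, so that both definitions can live in the same file.

```cpp
#define MULTIPLY_UNSAFE(A, B) A * B
#define MULTIPLY_SAFE(A, B) ((A) * (B))

// Expands to: 2 - 0 * 3, which is 2 - (0 * 3)
int unsafeResult = MULTIPLY_UNSAFE(2 - 0, 3);

// Expands to: ((2 - 0) * (3))
int safeResult = MULTIPLY_SAFE(2 - 0, 3);
```

Here unsafeResult ends up as 2 and safeResult as 6, even though both lines look like the same multiplication.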
In the previous examples, I used the #ifndef directive, which allowed me to check whether a library was already included. Directives like this can be used to achieve something that would be impossible with the C/C++ language alone: conditional syntax. These directives have the following syntax:
#if expression
//compile this code
#elif different_expression
//compile this different code
#else
//compile this entirely different code
#endif
The most common way to use conditional syntax is to check whether a macro has been defined. To do that you can use several specialized directives:
#ifndef macro_name
//compile this code if macro_name does not exist
#endif
We’re already familiar with the above, since we used this directive to check whether a library was already included. You can also use this condition:
#ifdef macro_name
//compile this code if macro_name exists
#endif
The above is just a shorthand for #if defined(macro_name). The longer form is useful because it can test multiple macros in a single condition. Note that every condition has to end with an #endif directive, to specify which parts of the code are affected by the condition, and which ones are not.
Let’s take a look at a practical example. Suppose you have written a library and you want it to work correctly on both Arduino UNO and Arduino Mega. This seems like a good idea, right? Portable code is always easier to use than having to adapt an already working library for a different board. But what if, for example, your library uses the SPI bus? This bus is located on pins 11 to 13 on Arduino UNO, but on Mega, it is moved to pins 50 to 52.
How can you tell the compiler that it should use the correct pins, no matter what board you’re currently uploading to? You guessed it: conditional syntax! Based on what board you have selected in the Arduino IDE in the “Tools” > “Board” menu, the build tools will predefine different macros, allowing you to select parts of code that will only compile for a specific board! This is incredibly powerful because it allows you to do something like this:
#if defined(__AVR_ATmega168__) || defined(__AVR_ATmega328P__)
//this will compile for Arduino UNO, Pro and older boards
int _sck = 13;
int _miso = 12;
int _mosi = 11;
#elif defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__)
//this will compile for Arduino Mega
int _sck = 52;
int _miso = 50;
int _mosi = 51;
#endif
See the beauty of it? With just a few preprocessor directives, we have made a multi-platform, portable library! On a side note, this is exactly how the RohmMultiSensor library (from Lightweight Arduino Library for ROHM Sensor Evaluation Kit) knows which parts of code should be compiled for each selected sensor. If you take a peek inside the header file RohmMultiSensor.h, you will only see several #ifdef and some #include directives. Since all sensor-specific code is stored in separate .cpp files, it is easy to add new sensors to the library: just create another file, then create the same #ifdef, #include, #endif structure the other sensors use. Done!
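A minimal sketch of testing several macros in one condition might look like the following. The macro names USE_TEMPERATURE and USE_PRESSURE and the constant enabledSensors are assumptions made up for this example.

```cpp
#define USE_TEMPERATURE
#define USE_PRESSURE

#if defined(USE_TEMPERATURE) && defined(USE_PRESSURE)
// both macros exist
const int enabledSensors = 2;
#elif defined(USE_TEMPERATURE) || defined(USE_PRESSURE)
// exactly one of the two macros exists
const int enabledSensors = 1;
#else
// neither macro exists
const int enabledSensors = 0;
#endif
```

Only one of the three definitions of enabledSensors ever reaches the compiler, so there is no conflict between them. Commenting out one of the #define lines switches the result to 1.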
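The general pattern of compiling only the selected parts can be sketched in a single file. This is an illustration only and not the actual contents of RohmMultiSensor.h: the macros USE_ACCELEROMETER and USE_BAROMETER and both functions are hypothetical, and the function bodies are placeholders where a real library would pull in its sensor-specific code.

```cpp
// In the sketch: pick the sensors before the library code appears.
#define USE_ACCELEROMETER     // hypothetical switch
// #define USE_BAROMETER      // left out, so its code is never compiled

// In the library: one #ifdef ... #endif block per sensor.
#ifdef USE_ACCELEROMETER
int readAccelerometer() {
  return 1;  // placeholder for real driver code
}
#endif

#ifdef USE_BAROMETER
int readBarometer() {
  return 2;  // placeholder for real driver code
}
#endif
```

With only USE_ACCELEROMETER defined, readBarometer() does not exist at all as far as the compiler is concerned, so it cannot take up any program space.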
The last directives we’re going to cover are #warning and #error. Both of them are really self-explanatory, so here’s the syntax:
#warning "message"
and
#error "message"
When the preprocessor finds these directives, it will print the message into the Arduino IDE console. The difference between the two is that after #warning, compilation proceeds as normal, while #error stops the compilation altogether.
We can use this in our previous example:
#if defined(__AVR_ATmega168__) || defined(__AVR_ATmega328P__)
//this will compile for Arduino UNO, Pro and older boards
int _sck = 13;
int _miso = 12;
int _mosi = 11;
#elif defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__)
//this will compile for Arduino Mega
int _sck = 52;
int _miso = 50;
int _mosi = 51;
#else
#error "Unsupported board selected!"
#endif
This way, when the user tries to compile the library for some other Arduino board (e.g. Yún, LilyPad, etc.) the compilation will fail, rather than not defining the SPI pins at all.
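The same idea works for checking settings, because #if can also compare numeric values. The sketch below assumes a hypothetical BUFFER_SIZE setting with limits picked purely for illustration. With the value 64, neither message is triggered and the code compiles normally.

```cpp
#define BUFFER_SIZE 64  // hypothetical user setting

#if BUFFER_SIZE > 256
#error "BUFFER_SIZE must not exceed 256 bytes"
#elif BUFFER_SIZE < 16
#warning "BUFFER_SIZE is very small"
#endif

// Only reached by the compiler if no #error was hit above.
unsigned char buffer[BUFFER_SIZE];
```

Changing BUFFER_SIZE to 512 should stop the build with the first message, while a value such as 8 should only print the warning and carry on.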
This concludes our brief tour through the depths of C/C++ preprocessor. I hope the terms like compilation, preprocessor, or directive seem at least a bit less scary than they did before you read the article. Let me just sum up the most important points I tried to explain in this article: