Chapter 4. The NASM Preprocessor

Table of Contents

4.1. Single-Line Macros
4.1.1. The Normal Way: %define
4.1.2. Enhancing %define: %xdefine
4.1.3. Concatenating Single Line Macro Tokens: %+
4.1.4. Undefining macros: %undef
4.1.5. Preprocessor Variables: %assign
4.2. String Handling in Macros
4.2.1. String Length: %strlen
4.2.2. Sub-strings: %substr
4.3. Multi-Line Macros
4.3.1. Overloading Multi-Line Macros
4.3.2. Macro-Local Labels
4.3.3. Greedy Macro Parameters
4.3.4. Default Macro Parameters
4.3.5. %0: Macro Parameter Counter
4.3.6. %rotate: Rotating Macro Parameters
4.3.7. Concatenating Macro Parameters
4.3.8. Condition Codes as Macro Parameters
4.3.9. Disabling Listing Expansion
4.4. Conditional Assembly
4.4.1. %ifdef: Testing Single-Line Macro Existence
4.4.2. %ifmacro: Testing Multi-Line Macro Existence
4.4.3. %ifctx: Testing the Context Stack
4.4.4. %if: Testing Arbitrary Numeric Expressions
4.4.5. %ifidn and %ifidni: Testing Exact Text Identity
4.4.6. %ifid, %ifnum, %ifstr: Testing Token Types
4.4.7. %error: Reporting User-Defined Errors
4.5. Preprocessor Loops
4.6. Including Other Files
4.7. The Context Stack
4.7.1. %push and %pop: Creating and Removing Contexts
4.7.2. Context-Local Labels
4.7.3. Context-Local Single-Line Macros
4.7.4. %repl: Renaming a Context
4.7.5. Example Use of the Context Stack: Block IFs
4.8. Standard Macros
4.8.1. __YASM_MAJOR__, etc: Yasm Version
4.8.2. __FILE__ and __LINE__: File Name and Line Number
4.8.3. __YASM_OBJFMT__ and __OUTPUT_FORMAT__: Output Object Format Keyword
4.8.4. STRUC and ENDSTRUC: Declaring Structure Data Types
4.8.5. ISTRUC, AT and IEND: Declaring Instances of Structures
4.8.6. ALIGN and ALIGNB: Data Alignment

NASM contains a powerful macro processor, which supports conditional assembly, multi-level file inclusion, two forms of macro (single-line and multi-line), and a context stack mechanism for extra macro power. Preprocessor directives all begin with a % sign.

The preprocessor collapses all lines which end with a backslash (\) character into a single line. Thus:

%define THIS_VERY_LONG_MACRO_NAME_IS_DEFINED_TO \
          THIS_VALUE

will work like a single-line macro without the backslash-newline sequence.

4.1. Single-Line Macros

4.1.1. The Normal Way: %define

Single-line macros are defined using the %define preprocessor directive. The definitions work in a similar way to C; so you can do things like

%define ctrl    0x1F &
%define param(a,b) ((a)+(a)*(b))

        mov     byte [param(2,ebx)], ctrl 'D'

which will expand to

        mov     byte [(2)+(2)*(ebx)], 0x1F & 'D'

When the expansion of a single-line macro contains tokens which invoke another macro, the expansion is performed at invocation time, not at definition time. Thus the code

%define a(x)    1+b(x)
%define b(x)    2*x

        mov     ax,a(8)

will evaluate in the expected way to mov ax,1+2*8, even though the macro b wasn’t defined at the time of definition of a.

Macros defined with %define are case sensitive: after %define foo bar, only foo will expand to bar: Foo or FOO will not. By using %idefine instead of %define (the i stands for insensitive) you can define all the case variants of a macro at once, so that %idefine foo bar would cause foo, Foo, FOO, fOO and so on all to expand to bar.
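
As a brief illustration of the difference, the following sketch defines one macro each way; the names zero and one are made up for the example:

%define  zero   0
%idefine one    1

        mov     ax,zero         ; expands to mov ax,0
        mov     bx,ONE          ; expands to mov bx,1
        mov     cx,One          ; expands to mov cx,1
        mov     dx,ZERO         ; not expanded: ZERO is left as an ordinary symbol

The last line will assemble only if a symbol called ZERO is defined elsewhere, since the case-sensitive macro zero does not match it.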

There is a mechanism which detects when a macro call has occurred as a result of a previous expansion of the same macro, to guard against circular references and infinite loops. If this happens, the preprocessor will only expand the first occurrence of the macro. Hence, if you code

%define a(x)    1+a(x)

        mov     ax,a(3)

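the macro a(3) will expand once, becoming 1+a(3), and will then expand no further. This behaviour can be useful when a macro is meant to wrap a symbol of the same name, since the inner occurrence is left unexpanded for the assembler to resolve.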

You can overload single-line macros: if you write

%define foo(x)   1+x
%define foo(x,y) 1+x*y

the preprocessor will be able to handle both types of macro call, by counting the parameters you pass; so foo(3) will become 1+3 whereas foo(ebx,2) will become 1+ebx*2. However, if you define

%define foo bar

then no other definition of foo will be accepted: a macro with no parameters prohibits the definition of the same name as a macro with parameters, and vice versa.

This doesn’t prevent single-line macros being redefined: you can perfectly well define a macro with

%define foo bar

and then re-define it later in the same source file with

%define foo baz

Then everywhere the macro foo is invoked, it will be expanded according to the most recent definition. This is particularly useful when defining single-line macros with %assign (see Section 4.1.5).

You can pre-define single-line macros using the -D option on the Yasm command line: see Section 1.3.3.1.

4.1.2. Enhancing %define: %xdefine

To have a reference to an embedded single-line macro resolved at the time that it is embedded, as opposed to when the calling macro is expanded, you need a different mechanism to the one offered by %define. The solution is to use %xdefine, or its case-insensitive counterpart %xidefine.

Suppose you have the following code:

%define  isTrue  1
%define  isFalse isTrue
%define  isTrue  0

val1:    db      isFalse

%define  isTrue  1

val2:    db      isFalse

In this case, val1 is equal to 0, and val2 is equal to 1. This is because a single-line macro defined using %define is expanded only when it is called. As isFalse expands to isTrue, its expansion is whatever isTrue is defined as at that moment: 0 at the first call and 1 at the second.

If you want isFalse to expand to the value assigned to the embedded macro isTrue at the time that isFalse was defined, you need to change the above code to use %xdefine.

%xdefine isTrue  1
%xdefine isFalse isTrue
%xdefine isTrue  0

val1:    db      isFalse

%xdefine isTrue  1

val2:    db      isFalse

Now, each time that isFalse is called, it expands to 1, as that is what the embedded macro isTrue expanded to at the time that isFalse was defined.
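
Strictly, only the macro whose body has to be frozen needs %xdefine. The following variant is a minimal sketch of that idea, reusing the names from the example above and leaving isTrue as an ordinary %define:

%define  isTrue  1
%xdefine isFalse isTrue         ; isTrue is expanded here, so isFalse is defined as 1
%define  isTrue  0

val1:    db      isFalse        ; expands to db 1

%define  isTrue  1

val2:    db      isFalse        ; expands to db 1

Later redefinitions of isTrue no longer affect isFalse, because its definition contains the number 1 rather than a reference to isTrue.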

4.1.3. Concatenating Single Line Macro Tokens: %+

Individual tokens in single-line macros can be concatenated, to produce longer tokens for later processing. This can be useful when several names share a common prefix and differ only in their final part.

As an example, consider the following:

%define BDASTART 400h                ; Start of BIOS data area

struc   tBIOSDA                      ; its structure
        .COM1addr       RESW    1
        .COM2addr       RESW    1
        ; ..and so on
endstruc

Now, if we need to access the elements of tBIOSDA in different places, we can end up with:

        mov     ax,BDASTART + tBIOSDA.COM1addr
        mov     bx,BDASTART + tBIOSDA.COM2addr

This will become pretty ugly (and tedious) if used in many places, and can be reduced in size significantly by using the following macro:

; Macro to access BIOS variables by their names (from tBIOSDA):

%define BDA(x)  BDASTART + tBIOSDA. %+ x

Now the above code can be written as:

        mov     ax,BDA(COM1addr)
        mov     bx,BDA(COM2addr)

Using this feature, we can simplify references to a lot of macros (and, in turn, reduce typing errors).
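
The same technique can build label names from a common prefix. The sketch below is illustrative only: the macro VEC and the vec_ prefix are invented for the example, and it assumes that the macro parameter is substituted before %+ joins the tokens, as in the BDA macro above:

%define VEC(name)  vec_ %+ name

VEC(timer):     dw      0               ; defines the label vec_timer
VEC(keyboard):  dw      0               ; defines the label vec_keyboard

        mov     ax,[VEC(timer)]         ; expands to mov ax,[vec_timer]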

4.1.4. Undefining macros: %undef

Single-line macros can be removed with the %undef command. For example, the following sequence:

%define foo bar
%undef  foo

        mov     eax, foo

will expand to the instruction mov eax, foo, since after %undef the macro foo is no longer defined.
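
One practical use of %undef follows from the rule in Section 4.1.1 that a parameterless macro and a macro with parameters cannot share a name. The sketch below assumes that %undef removes the name entirely, so that it can then be reused in the other form:

%define foo     bar
%undef  foo                     ; foo no longer exists as a macro
%define foo(x)  1+x             ; now accepted, since nothing else is named foo

        mov     eax,foo(2)      ; expands to mov eax,1+2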

Macros that would otherwise be pre-defined can be undefined using the -U option on the Yasm command line: see Section 1.3.3.5.

4.1.5. Preprocessor Variables: %assign

An alternative way to define single-line macros is by means of the %assign command (and its case-insensitive counterpart %iassign, which differs from %assign in exactly the same way that %idefine differs from %define).

%assign is used to define single-line macros which take no parameters and have a numeric value. This value can be specified in the form of an expression, and it will be evaluated once, when the %assign directive is processed.
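
The difference from %define shows up when the macro is used inside a larger expression. In this sketch (the names sum_d and sum_a are made up for the example), %define substitutes the text 1+2, whereas %assign substitutes the already-computed value 3:

%define sum_d   1+2
%assign sum_a   1+2

        mov     ax,sum_d*3      ; expands to mov ax,1+2*3, which loads 7
        mov     bx,sum_a*3      ; expands to mov bx,3*3, which loads 9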

Like %define, macros defined using %assign can be re-defined later, so you can do things like

%assign i i+1

to increment the numeric value of a macro.

%assign is useful for controlling the termination of %rep preprocessor loops: see Section 4.5 for an example of this.

The expression passed to %assign is a critical expression (see Section 3.8), and must also evaluate to a pure number (rather than a relocatable reference such as a code or data address, or anything involving a register).
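
As a rough illustration of this restriction (bufsize, buffer and bufend are names invented for the example), an expression built purely from constants is acceptable, while one that refers to a label is not:

%assign bufsize 4*16                    ; fine: a pure number, evaluated to 64

buffer: times   bufsize db 0            ; reserves 64 zero bytes

; %assign bufend buffer+bufsize         ; would be rejected: buffer is an address,
;                                       ; not a pure number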