Linking assembler code

Procedures and functions written in assembly language can be linked with Delphi programs or units using the $L compiler directive. The assembly language source file must be assembled into an object file (extension .OBJ) using an assembler like Turbo Assembler. Multiple object files can be linked with a program or unit through multiple $L directives.

Procedures and functions written in assembly language must be declared as external in the Object Pascal program or unit. For example,

function LoCase(Ch: Char): Char; external;

In the corresponding assembly language source file, all procedures and functions must be placed in a segment named CODE or CSEG, or in a segment whose name ends in _TEXT. The names of the external procedures and functions must appear in PUBLIC directives.

You must ensure that an assembly language procedure or function matches its Object Pascal definition with respect to call model (near or far), number of parameters, types of parameters, and result type.

An assembly language source file can declare initialized variables in a segment named CONST or in a segment whose name ends in _DATA. It can declare uninitialized variables in a segment named DATA or DSEG, or in a segment whose name ends in _BSS. Such variables are private to the assembly language source file and can't be referenced from the Object Pascal program or unit. However, they reside in the same segment as the Object Pascal globals, and can be accessed through the DS segment register.

All procedures, functions, and variables declared in the Object Pascal program or unit, and the ones declared in the interface section of the used units, can be referenced from the assembly language source file through EXTRN directives.

Again, it's up to you to supply the correct type in the EXTRN definition.

When an object file appears in a $L directive, Delphi converts the file from the Intel relocatable object module format (.OBJ) to its own internal relocatable format. This conversion is possible only if certain rules are observed:

  • All procedures and functions must be placed in a segment named CODE or CSEG, or in a segment with a name that ends in _TEXT. All initialized private variables must be placed in a segment named CONST, or in a segment with a name that ends in _DATA. All uninitialized private variables must be placed in a segment named DATA or DSEG, or in a segment with a name that ends in _BSS. All other segments are ignored, and so are GROUP directives. The segment definitions can specify BYTE or WORD alignment, but when linked, code segments are always byte aligned, and data segments are always word aligned. The segment definitions can optionally specify PUBLIC and a class name, both of which are ignored.

  • Delphi ignores any data for segments other than the code segment (CODE, CSEG, or xxxx_TEXT) and the initialized data segment (CONST or xxxx_DATA). So, when declaring variables in the uninitialized data segment (DATA, DSEG, or xxxx_BSS), always use a question mark (?) to specify the value, for instance:

    Count   DW      ?
    Buffer  DB      128 DUP(?)

  • Byte-sized references to EXTRN symbols aren't allowed. For example, this means that the assembly language HIGH and LOW operators can't be used with EXTRN symbols.
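The segment-naming rule in the first item above can be summarized as a small classifier. This is only a sketch of the rule as stated, not Delphi's converter: TSegmentKind, EndsWith, and ClassifySegment are hypothetical names, and the sketch assumes segment names are already in uppercase.

```pascal
program SegmentKinds;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  { Hypothetical classification of a segment name found in an .OBJ file. }
  TSegmentKind = (skCode, skInitData, skUninitData, skIgnored);

{ True if S ends with Suffix. }
function EndsWith(const S, Suffix: string): Boolean;
begin
  Result := (Length(S) >= Length(Suffix)) and
    (Copy(S, Length(S) - Length(Suffix) + 1, Length(Suffix)) = Suffix);
end;

{ Applies the naming rule; assumes SegName is in uppercase. }
function ClassifySegment(const SegName: string): TSegmentKind;
begin
  if (SegName = 'CODE') or (SegName = 'CSEG') or EndsWith(SegName, '_TEXT') then
    Result := skCode
  else if (SegName = 'CONST') or EndsWith(SegName, '_DATA') then
    Result := skInitData
  else if (SegName = 'DATA') or (SegName = 'DSEG') or EndsWith(SegName, '_BSS') then
    Result := skUninitData
  else
    Result := skIgnored;   { all other segments are ignored }
end;

begin
  Writeln(Ord(ClassifySegment('STRS_TEXT')));  { 0 = code }
  Writeln(Ord(ClassifySegment('CONST')));      { 1 = initialized data }
  Writeln(Ord(ClassifySegment('DATA')));       { 2 = uninitialized data }
  Writeln(Ord(ClassifySegment('STACK')));      { 3 = ignored }
end.
```

Note that DATA falls in the uninitialized group, while a name such as MY_DATA falls in the initialized group, which is easy to mix up when laying out an assembly module.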

Turbo Assembler and Delphi

Turbo Assembler (TASM) makes it easy to program routines in assembly language and interface them into your Delphi programs. Turbo Assembler provides simplified segmentation and language support for Object Pascal programmers.

The .MODEL directive specifies the memory model for an assembler module that uses simplified segmentation. For linking with Object Pascal programs, the .MODEL syntax looks like this:

.MODEL xxxx, PASCAL

xxxx is the memory model (usually this is large).

Specifying the language PASCAL in the .MODEL directive tells Turbo Assembler that the arguments were pushed onto the stack from left to right, in the order they were encountered in the source statement that called the procedure.

The PROC directive lets you define your parameters in the same order as they are defined in your Object Pascal program. If you are defining a function that returns a string, notice that the PROC directive has a RETURNS option that lets you access the temporary string pointer on the stack without affecting the number of parameter bytes added to the RET statement.
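As a rough illustration of the left-to-right push order, the sketch below computes where each parameter lands relative to BP and how many bytes RET must remove. It is a model, not compiler or assembler code: ParamOffset, RetBytes, and FirstParamOffset are hypothetical names. It assumes a far routine with a PUSH BP / MOV BP,SP prologue (2 bytes of saved BP plus a 4-byte return address sit below the parameters) and byte-sized parameters that each occupy one word on the stack.

```pascal
program FrameOffsets;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

const
  { Assumption: far call, saved BP at [BP+0], return address at [BP+2]. }
  FirstParamOffset = 6;

{ Sizes lists the stack bytes of each parameter in declaration order.
  Returns the BP-relative offset of parameter number Index (0-based). }
function ParamOffset(const Sizes: array of Integer; Index: Integer): Integer;
var
  I: Integer;
begin
  Result := FirstParamOffset;
  { Parameters declared after Index were pushed later and lie below it. }
  for I := Index + 1 to High(Sizes) do
    Inc(Result, Sizes[I]);
end;

{ Total parameter bytes, which is the operand of the RET instruction. }
function RetBytes(const Sizes: array of Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := Low(Sizes) to High(Sizes) do
    Inc(Result, Sizes[I]);
end;

begin
  { A far routine taking (I, J: Char): one word per parameter. }
  Writeln('I at [BP + ', ParamOffset([2, 2], 0), ']');  { I at [BP + 8] }
  Writeln('J at [BP + ', ParamOffset([2, 2], 1), ']');  { J at [BP + 6] }
  Writeln('RET ', RetBytes([2, 2]));                    { RET 4 }
end.
```

Under these assumptions, the temporary string pointer of a string function is pushed before the parameters and so sits just above them (at [BP + 10] here). It is not counted in the RET operand.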

Here's an example coded to use the .MODEL and PROC directives:

        .MODEL  LARGE, PASCAL
        .CODE
MyProc  PROC    FAR I : BYTE, J : BYTE RETURNS Result : DWORD
        PUBLIC  MyProc
        LES     DI, Result      ;get address of temporary string
        MOV     AL, I           ;get first parameter I
        MOV     BL, J           ;get second parameter J
        .
        .
        .
        RET

The Object Pascal function definition would look like this:

function MyProc(I, J: Char): string; external;

For more information about interfacing Turbo Assembler with Delphi, refer to the Turbo Assembler User's Guide.

Examples of assembly language routines

The following code is an example of a unit that implements two assembly language string-handling routines. The UpperCase function converts all characters in a string to uppercase, and the StringOf function returns a string of characters of a specified length.

unit Stringer;

interface

function UpperCase(S: String): String;
function StringOf(Ch: Char; Count: Byte): String;

implementation

{$L STRS}

function UpperCase; external;
function StringOf; external;

end.

The assembly language file that implements the UpperCase and StringOf routines is shown next. It must be assembled into a file called STRS.OBJ before the Stringer unit can be compiled. Note that the routines use the far call model because they are declared in the interface section of the unit. This example uses standard segmentation:

CODE       SEGMENT BYTE PUBLIC

           ASSUME  CS:CODE

           PUBLIC  UpperCase, StringOf     ;Make them known

; function UpperCase(S: String): String

UpperRes   EQU     DWORD PTR [BP + 10]
UpperStr   EQU     DWORD PTR [BP + 6]

UpperCase  PROC    FAR
           PUSH    BP                      ;Save BP
           MOV     BP, SP                  ;Set up stack frame
           PUSH    DS                      ;Save DS
           LDS     SI, UpperStr            ;Load string address
           LES     DI, UpperRes            ;Load result address
           CLD                             ;Forward string-ops
           LODSB                           ;Load string length
           STOSB                           ;Copy to result
           MOV     CL, AL                  ;String length to CX
           XOR     CH, CH
           JCXZ    U3                      ;Skip if empty string
U1:        LODSB                           ;Load character
           CMP     AL, 'a'                 ;Skip if not 'a'..'z'
           JB      U2
           CMP     AL, 'z'
           JA      U2
           SUB     AL, 'a'-'A'             ;Convert to uppercase
U2:        STOSB                           ;Store in result
           LOOP    U1                      ;Loop for all characters
U3:        POP     DS                      ;Restore DS
           POP     BP                      ;Restore BP
           RET     4                       ;Remove parameter and return
UpperCase  ENDP

; function StringOf(Ch: Char; Count: Byte): String

StrOfRes   EQU     DWORD PTR [BP + 10]
StrOfChar  EQU     BYTE PTR [BP + 8]
StrOfCount EQU     BYTE PTR [BP + 6]

StringOf   PROC    FAR
           PUSH    BP                      ;Save BP
           MOV     BP, SP                  ;Set up stack frame
           LES     DI, StrOfRes            ;Load result address
           MOV     AL, StrOfCount          ;Load count
           CLD                             ;Forward string-ops
           STOSB                           ;Store length
           MOV     CL, AL                  ;Count to CX
           XOR     CH, CH
           MOV     AL, StrOfChar           ;Load character
           REP     STOSB                   ;Store string of characters
           POP     BP                      ;Restore BP
           RET     4                       ;Remove parameters and return
StringOf   ENDP

CODE       ENDS

           END

To assemble the example and compile the unit, use the following commands:

TASM STRS

BPC stringer
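For comparison, the two assembly routines can be mirrored in plain Pascal. The sketch below is a reference for checking behavior, not a replacement for STRS.ASM: UpperCaseRef and StringOfRef are hypothetical names, and ShortString and AnsiChar are used on the assumption that the code is compiled with a compiler whose default string type is no longer the length-prefixed string the assembly code operates on.

```pascal
program StringerRef;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

{ Mirrors UpperCase: copy the length byte, then each character,
  converting 'a'..'z' to uppercase along the way. }
function UpperCaseRef(const S: ShortString): ShortString;
var
  I: Integer;
begin
  Result[0] := S[0];                       { LODSB / STOSB of the length }
  for I := 1 to Length(S) do
    if (S[I] >= 'a') and (S[I] <= 'z') then
      Result[I] := AnsiChar(Ord(S[I]) - (Ord('a') - Ord('A')))
    else
      Result[I] := S[I];
end;

{ Mirrors StringOf: store the length byte, then Count copies of Ch. }
function StringOfRef(Ch: AnsiChar; Count: Byte): ShortString;
begin
  Result[0] := AnsiChar(Count);            { STOSB of the length }
  FillChar(Result[1], Count, Ch);          { REP STOSB }
end;

begin
  Writeln(UpperCaseRef('Hello, world'));   { HELLO, WORLD }
  Writeln(StringOfRef('*', 5));            { ***** }
end.
```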

Assembly language methods

Method implementations written in assembly language can be linked with Delphi programs using the $L compiler directive and the external reserved word. The declaration of an external method in an object type is no different than that of a normal method; however, the implementation of the method lists only the method header followed by the reserved word external.

In an assembly language source text, an @ is used instead of a period (.) to write qualified identifiers (the period already has a different meaning in assembly language and can't be part of an identifier). For example, the Object Pascal identifier Rect.Init is written as Rect@Init in assembly language. The @ syntax can be used to declare both PUBLIC and EXTRN identifiers.

Inline machine code

For very short assembly language subroutines, Delphi's inline statements and directives are very convenient. They let you insert machine code instructions directly into the program or unit text instead of through an object file.

Inline statements

An inline statement consists of the reserved word inline followed by one or more inline elements, separated by slashes and enclosed in parentheses:

inline (10/$2345/Count + 1/Data - Offset);

Here's the syntax of an inline statement:

[Figure 1: syntax diagram of an inline statement]

Each inline element consists of an optional size specifier, < or >, and a constant or a variable identifier, followed by zero or more offset specifiers (see the syntax that follows). An offset specifier consists of a + or a - followed by a constant.

[Figure 2: syntax diagram of an inline element]

Each inline element generates 1 byte or 1 word of code. The value is computed from the value of the first constant or the offset of the variable identifier, to which is added or subtracted the value of each of the constants that follow it.

An inline element generates 1 byte of code if it consists of constants only and if its value is within the 8-bit range (0..255). If the value is outside the 8-bit range or if the inline element refers to a variable, 1 word of code is generated (least-significant byte first).

The < and > operators can be used to override the automatic size selection described earlier. If an inline element starts with a < operator, only the least-significant byte of the value is coded, even if it's a 16-bit value. If an inline element starts with a > operator, a word is always coded, even though the most-significant byte is 0. For example, the statement

inline (<$1234/>$44);

generates 3 bytes of code: $34, $44, $00.
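The size-selection rule can be modeled in a few lines. The sketch below is an illustration of the rule as described, not the compiler's implementation: EncodeElement and HexByte are hypothetical names, and the element's value is assumed to have been computed already.

```pascal
program InlineSize;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

{ Formats one byte as two hexadecimal digits. }
function HexByte(B: Byte): string;
const
  Digits: array[0..15] of Char = '0123456789ABCDEF';
begin
  Result := Digits[B shr 4] + Digits[B and 15];
end;

{ Value      : computed value of the inline element
  IsVariable : True if the element refers to a variable identifier
  SizeOp     : '<' forces a byte, '>' forces a word, ' ' selects automatically
  Returns the generated bytes as text, least-significant byte first. }
function EncodeElement(Value: Word; IsVariable: Boolean; SizeOp: Char): string;
var
  AsWord: Boolean;
begin
  case SizeOp of
    '<': AsWord := False;
    '>': AsWord := True;
  else
    { Automatic: a word for variables and for values outside 0..255. }
    AsWord := IsVariable or (Value > 255);
  end;
  Result := '$' + HexByte(Lo(Value));
  if AsWord then
    Result := Result + ' $' + HexByte(Hi(Value));
end;

begin
  Writeln(EncodeElement($1234, False, '<'));  { $34 }
  Writeln(EncodeElement($44, False, '>'));    { $44 $00 }
  Writeln(EncodeElement($2345, False, ' '));  { $45 $23 }
  Writeln(EncodeElement(10, False, ' '));     { $0A }
end.
```

The first two calls reproduce the 3 bytes of the example above: $34, then $44 and $00.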

The value of a variable identifier in an inline element is the offset address of the variable within its base segment. The base segment of global variables--variables declared at the outermost level in a program or a unit--and typed constants is the data segment, which is accessible through the DS register. The base segment of local variables--variables declared within the current subprogram--is the stack segment. In this case the variable offset is relative to the BP register, which automatically causes the stack segment to be selected. Registers BP, SP, SS, and DS must be preserved by inline statements; all other registers can be modified.

The following example of an inline statement generates machine code for storing a specified number of words of data in a specified variable. When called, procedure FillWord stores Count words of the value Data in memory, starting at the first byte occupied by Dest.

procedure FillWord(var Dest; Count, Data: Word);
begin
  inline (
    $C4/$BE/Dest/     { LES DI,Dest[BP]  }
    $8B/$8E/Count/    { MOV CX,Count[BP] }
    $8B/$86/Data/     { MOV AX,Data[BP]  }
    $FC/              { CLD              }
    $F3/$AB);         { REP STOSW        }
end;

Inline statements can be freely mixed with other statements throughout the statement part of a block.
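To check what FillWord is meant to do, it can help to compare it with a plain Pascal equivalent. The sketch below is such a reference, under the assumption that Dest refers to at least Count words of writable memory. FillWordPas, TWords, and PWords are hypothetical names.

```pascal
program FillWordRef;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  TWords = array[0..65534] of Word;
  PWords = ^TWords;

{ Stores Count words of the value Data, starting at the first byte of Dest.
  The loop plays the role of REP STOSW with CX = Count and AX = Data. }
procedure FillWordPas(var Dest; Count, Data: Word);
var
  P: PWords;
  I: Integer;
begin
  P := @Dest;
  for I := 0 to Integer(Count) - 1 do
    P^[I] := Data;
end;

var
  Buf: array[1..4] of Word;
  I: Integer;
begin
  FillWordPas(Buf, 4, 7);
  for I := 1 to 4 do
    Writeln(Buf[I]);          { prints 7 four times }
end.
```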

Inline directives

With inline directives, you can write procedures and functions that expand into a given sequence of machine code instructions whenever they are called. These are comparable to macros in assembly language. The syntax for an inline directive is the same as that of an inline statement:

[Figure 3: syntax diagram of an inline directive]

When a normal procedure or function is called (including one that contains inline statements), the compiler generates code that pushes the parameters (if any) onto the stack, and then generates a CALL instruction to call the procedure or function. However, when you call an inline procedure or function, the compiler generates code from the inline directive instead of the CALL. Here's a short example of two inline procedures:

procedure DisableInterrupts; inline ($FA); { CLI }

procedure EnableInterrupts; inline ($FB); { STI }

When DisableInterrupts is called, it generates 1 byte of code--a CLI instruction.

Procedures and functions declared with inline directives can have parameters; however, the parameters can't be referred to symbolically in the inline directive (other variables can, though). Also, because such procedures and functions are in fact macros, there is no automatic entry and exit code, nor should there be any return instruction.

The following function multiplies two Integer values, producing a Longint result:

function LongMul(X, Y: Integer): Longint;
inline (
  $5A/       { POP DX  ;Pop Y }
  $58/       { POP AX  ;Pop X }
  $F7/$EA);  { IMUL DX ;DX : AX = X * Y }

Note the lack of entry and exit code and the missing return instruction. These aren't required, because the 4 bytes are inserted into the instruction stream when LongMul is called.
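The effect of the three instructions can be traced with a small model. This is a sketch for following the pop order, not code the compiler generates: LongMulRef is a hypothetical name, SmallInt stands in for the 16-bit Integer, and a two-element array stands in for the top of the stack.

```pascal
program LongMulModel;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

function LongMulRef(X, Y: SmallInt): Longint;
var
  Stack: array[0..1] of SmallInt;
  AX, DX: SmallInt;
begin
  { The caller pushes the parameters left to right: X first, then Y. }
  Stack[0] := X;
  Stack[1] := Y;                          { Y is on top of the stack }
  DX := Stack[1];                         { $5A      POP DX  (Y) }
  AX := Stack[0];                         { $58      POP AX  (X) }
  Result := Longint(AX) * Longint(DX);    { $F7/$EA  IMUL DX, result in DX:AX }
end;

begin
  Writeln(LongMulRef(300, 200));          { 60000 }
  Writeln(LongMulRef(-3, 7));             { -21 }
end.
```

The first product, 60000, does not fit in a signed 16-bit value, which is why the function returns the 32-bit result held in DX:AX.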

Inline directives are intended for very short procedures and functions only (fewer than 10 bytes of code).

Because of the macro-like nature of inline procedures and functions, they can't be used as arguments to the @ operator and the Addr, Ofs, and Seg functions.
