DesktopLinuxAsm - Tips




Coding Tips

The following coding tips focus on reducing code size.
In some cases larger code fragments will run faster,
but in general the smaller code will be faster.  This
is constantly debated, and often the only way to be sure
is to measure execution speed.  It may be that the
processor cache can hold short code and run it faster
than long code sequences that have faster instructions.  

1. setting registers

One of the most common operations in assembly is to move
a constant value into a register.  Often the value moved
is zero.  Here is the obvious instruction to load zero:

    B800000000                mov	eax,0

The nasm generated code (on left) says this instruction is
five bytes long and uses the operation code "B8". Another
way to do this is:

    31C0                      xor	eax,eax

This is two bytes long, but modifies the flag states.

Another common value to load into registers is -1.  Once
again the typical way to do this is:

    B8ffffffff                mov	eax,-1

Another way to generate -1 using one less byte is:

    31C0                      xor	eax,eax
    F7D0                      not	eax

An even shorter way to generate -1 is:

    83C8FF                    or	eax,byte -1

If we want small constants from -128 to 127 (the range of a
sign-extended byte) the following macro does it in only 3 bytes:


                             %macro _mov 2
                                push	byte %2
                                pop	%1
                             %endmacro
                                      
    _mov	eax,2  ;example of _mov macro usage
     6A02                <1>  push byte %2
     58                  <1>  pop %1

Note: To keep code simple, use macro names that clearly
describe the operation performed.  This makes reading the
source code easier, but beware, debuggers that disassemble
the code only see the push and pop. 
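
The "push byte" form used by the _mov macro stores its operand as one
signed byte, and the processor sign-extends that byte to 32 bits when it
is pushed, so only values from -128 to 127 can be loaded this way.  The C
sketch below models that extension.  The function name imm8_to_reg is
made up for this illustration and is not part of nasm or AsmLib.

```c
#include <stdint.h>

/* Model of "push byte imm8 / pop reg": the 8-bit immediate is
   sign-extended to 32 bits before it reaches the register.
   imm8_to_reg is a hypothetical name used only for this sketch. */
int32_t imm8_to_reg(uint8_t imm8)
{
    /* If bit 7 is set the byte stands for a negative value. */
    return (imm8 & 0x80) ? (int32_t)imm8 - 256 : (int32_t)imm8;
}
```

For example, "_mov eax,-1" assembles to 6A FF / 58 and leaves -1 in eax,
while a value such as 200 does not fit in the signed byte and would not
load as 200.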

2. checking a register value

A very common operation is to check if a register is zero.  The
obvious way to do this is:

    83F800                    cmp	eax,byte 0
    740B                      je	match

A better way to do this and save one byte is:

    09C0                      or	eax,eax
    7410                      jz	match

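An even better way (depending upon the design) is to keep
the value in ecx and use the "jecxz" or "loop" instruction
to avoid the test.  "jecxz" jumps if ecx is zero; "loop"
decrements ecx and jumps if the result is not zero: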

    E312                      jecxz	match

    E210                      loop	match

The loop instruction is considered slow and avoided by
some programmers.  So, if speed is important, do some
code timing.  I've not found the use of "loop" to be
slow.
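
One detail worth keeping in mind when combining the two instructions:
"loop" decrements ecx before testing it, so entering a loop with ecx = 0
wraps the counter around instead of skipping the body.  A "jecxz" placed
in front of the loop guards that case.  The C sketch below models the
pair.  The name guarded_loop_count is hypothetical and exists only for
this illustration.

```c
#include <stdint.h>

/* Model of:
       jecxz done      ; skip the loop when ecx is already zero
   top:
       (loop body)
       loop  top       ; ecx = ecx - 1, jump to top if ecx != 0
   done:
   Returns how many times the loop body runs.
   guarded_loop_count is a hypothetical name for this sketch. */
uint32_t guarded_loop_count(uint32_t ecx)
{
    uint32_t runs = 0;

    if (ecx == 0)           /* jecxz done */
        return runs;
    do {
        runs++;             /* loop body */
        ecx--;              /* loop: decrement first ... */
    } while (ecx != 0);     /* ... then jump back if not zero */
    return runs;
}
```

With the guard in place a count of zero runs the body zero times and a
count of n runs it n times.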

The "dec" and "inc" instructions are one byte long in
32 bit code and provide an alternative way to check
for registers with "1" or "-1".  Here are the traditional
test for 1 and the "dec" test.


    83F801                    cmp	eax,byte 1
    740B                      je	match

The "dec" test (modifies the register)

    48                        dec	eax		;set zero flag if eax=1
    7408                      jz	match		;jmp if eax was = 1

To check for -1 use:


    40                        inc	eax		;set zero flag if eax=-1
    7400                      jz	match


3. Register math

The LEA instruction (load effective address) can be useful to
multiply registers and add values.  Simple multiplies are usually
better done with "MUL" or one of the shift instructions.

    8D0400                    lea eax,[eax*2]       ;eax * 2
    8D0440                    lea eax,[eax+eax*2]   ;eax * 3
    8D048500000000            lea eax,[eax*4]       ;eax * 4
    8D0480                    lea eax,[eax*4+eax]   ;eax * 5
    8D04C500000000            lea eax,[eax*8]       ;eax * 8
    8D04C0                    lea eax,[eax*8+eax]   ;eax * 9

If we want to add in a constant value or register then use
of LEA becomes a good choice.

    8D8418F4010000            lea eax,[eax+ebx+500]

The best way to do a multiply is using shift and adds as follows
(ebx is used as a scratch register):

    D1E0                      shl eax,1             ;eax * 2

    89C3                      mov ebx,eax
    D1E0                      shl eax,1
    01D8                      add eax,ebx           ;eax * 3

    C1E002                    shl eax,2             ;eax * 4

    89C3                      mov ebx,eax
    C1E002                    shl eax,2
    01D8                      add eax,ebx           ;eax * 5


4. Avoiding branches

Programs execute a lot faster if they do not have to jump
to a new location very often.  This suggests we use decisions
that do not involve the conditional jump instructions.  There
are several techniques to do this using the "XOR" instruction.
Here is one example:

choose register value without branch

    if (eax != 0) eax = ebx; else eax = ecx;

    3D01000000                cmp eax,1     ;carry set only if eax = 0
    19C0                      sbb eax,eax   ;eax = -1 if it was 0, else 0
    21C1                      and ecx,eax   ;keep ecx only if eax was 0
    35FFFFFFFF                xor eax,-1    ;invert the mask
    21D8                      and eax,ebx   ;keep ebx only if eax was not 0
    09C8                      or eax,ecx    ;combine the two cases

The disadvantage of the above code is complexity.  It makes
reading code more difficult and in most cases isn't necessary.


5. Creating registers

Programs run a lot faster if all data is kept in registers.
This isn't a problem for simple loops, but what if we run out
of registers?  Our options are:

 1. Free up a register by pushing it on the stack.
 2. Free up a register by moving it to memory.
 3. Use a special CPU register that is dedicated to other purposes.
 4. Split a register into two or more registers.

Options 3 and 4 are seldom used, but option 4 has promise.  The
general registers eax, ebx, ecx, edx can be split into byte or
word registers, and ebp, esi, edi can be split into word
registers.  If our application only needs 16 bits for some
registers we can define over 14 word registers.

splitting a register into two 16bit registers

    0FC8                      bswap eax     ; work with "ax #1"
    0FC8                      bswap eax     ; work with "ax #2"
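
To see why two "bswap" instructions give two independent 16-bit slots,
the C sketch below models the register as a 32-bit value.  Each bswap
reverses the four bytes, so the half that was out of reach moves into ax
(byte-reversed) and moves back unchanged on the next bswap.  The helper
names bswap32, set_ax and get_ax are invented for this illustration.

```c
#include <stdint.h>

/* Model of the x86 "bswap" instruction: reverse the four bytes. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* Model of "mov ax,value": replace only the low 16 bits. */
uint32_t set_ax(uint32_t eax, uint16_t value)
{
    return (eax & 0xFFFF0000u) | value;
}

/* Model of reading ax: the low 16 bits of the register. */
uint16_t get_ax(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}
```

Storing 0x1234 in the first view, swapping, storing 0xBEEF, and swapping
back leaves 0x1234 readable in ax again.  Each value is only meaningful
in the view it was written in, because the idle half is held
byte-reversed.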


6. setting flags

- mov instructions do not set flags
- the lea instruction does not set flags
- to set flags for a register use "or eax,eax"
- the direction flag is initially clear (the "cld" state);
  most AsmLib functions assume this.


7. looping

- loops are most efficient if the loop back test is at end.
- the "loop" instruction works with a count in "ecx"
- the "jecxz" is often a good way to create loops using "ecx" as a flag


8. divide error

Divide by zero or division that will overflow is an error
and can be detected by the following code.

    3B15[87000000]            cmp edx,[divisor]
    730A                      jnb error         ;quotient will not fit in eax
    F735[87000000]            div dword [divisor]


9. Set register to state of carry flag

The following code sets eax to -1 if eax is below ebx (unsigned
compare) and to zero otherwise.  This is useful in setting a
flag without using a conditional jmp.

    39D8                      cmp eax,ebx   ;carry set if eax < ebx
    19C0                      sbb eax,eax   ;eax=result (0 or -1)


10. Set edx to 0 or -1

Often we know what is in eax and need a constant in edx.  The
following code will set edx.

    B805000000                mov eax,5
    99                        cdq           ;set edx to 0 if eax positive

    B8FFFFFFFF                mov eax,-1
    99                        cdq           ;set edx to -1 if eax negative

If we wanted to clear both eax and edx, the shortest code is:

    31C0                      xor eax,eax
    99                        cdq


11. Coding Style

The use of structures allows variables to be kept on the stack,
and they are essential in describing data records.  The following
code defines a structure and sets up a stack frame to hold the
structure:

    struc animal
    .dog resd 1
    .cat resd 1
    animal_struc_size:
    endstruc

    start:
    81EC08000000              sub esp,animal_struc_size       ;make room on stack
    C7042401000000            mov [esp+animal.dog],dword 1    ;initialize dog
    C744240402000000          mov [esp+animal.cat],dword 2    ;initialize cat
                              (program body here)
    81C408000000              add esp,animal_struc_size       ;destroy struc on stack
    C3                        ret


12. Avoiding spaghetti

Complexity and spaghetti code increase if we:

 1. jump back often
 2. use a lot of pushes and pops
 3. fail to document register states with comments
 4. use large blocks of code rather than small blocks
    with inputs and outputs identified.


13. Converting C to Asm

It is easy to convert most "C" programs to nasm assembler
using the AsmSrc program, but best results are obtained if
debug information was provided by the compiler.  Here are
some tips to convert "C" programs.

- After generating source, strip the library information off.
- Add a _start label at entry and make it a global.
- Compile the program and fix any compile errors.
- Test the program and get it working using Asmbug.
- Create a structure describing the stack frame.
- Replace all the "ebp+xx" references with structure references.


Debugging Tips

1. debuggers

Debuggers come in many styles.  Some show the source files in
the code window.  Others disassemble the code in memory.  The
display can be either a scrolling text window or separate gui
windows.

    debugger  window  code window shows
    --------  ------  -----------------
    gdb       text    source
    ald       text    disassembly
    debug     text    disassembly
    kdbg      gui     source
    insight   gui     source
    AsmBug    gui     hybrid

If you are debugging a program with windows, the "text"
debuggers need to be on a separate terminal.  Otherwise the
display will be a mess.

Most source debuggers assume a HLL is in use and expect
library code to be appended to the start of programs.  For this
reason you need to execute the program and run until your
code is reached.  Some debuggers will fail if you try to do
a "step" before running to the start of your program.

2. debug functions

AsmLib has many log functions to write data to a file called
"log".  They are useful in finding difficult bugs or to create
a simple trace of program activity.  The following functions
are available:

    log_hex, log_num, log_process, log_signals, log_str

If these logs are not enough, there are: hex_dump_file and
log functions that write to stdout.
