Why would you want to learn assembly? I don't know, but if you'd like to (and really quickly), this is probably good reading.

So, processors run something called machine code: binary-encoded instructions, each identified by an opcode. Of course, not all processors are the same, but, to preserve some measure of sanity, there's something called x86, an instruction set architecture that specifies which instructions a compatible processor understands and how they're encoded. If you want to, you can find a whole listing of opcodes for x86 right here.

Of course, no one really wants to write raw machine code, so there's assembly. Every assembly instruction (usually) corresponds to one machine instruction.

To take these instructions and turn them into machine code, there's an assembler, which is very much like a compiler, except it's called an assembler (there's more to it than that, but that's the general idea).
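To make the "one assembly instruction, one machine instruction" idea concrete, here's a small sketch in NASM syntax (NASM is the assembler this post settles on below) with the bytes each line assembles to written in the comments. The file name encodings.s is made up for illustration, the byte values assume ordinary 32-bit or 64-bit x86 code, and there's no need to worry yet about what the instructions do.

```nasm
; encodings.s (hypothetical file name): meant to be assembled and inspected
; rather than run for any useful effect.
section .text
    global _start

_start:
    mov eax, 4      ; B8 04 00 00 00 (opcode B8, then 4 as a 32-bit little-endian value)
    mov ebx, eax    ; 89 C3          (opcode 89, then one byte naming the two registers)
    jmp _start      ; a jump opcode followed by an offset back to _start
```

If you want to check the bytes yourself, NASM's -l option writes a listing file that shows each source line next to its encoding, for example: nasm -f elf -l encodings.lst encodings.s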

There are two main assemblers for *nix: the GNU Assembler (GAS) and the Netwide Assembler (NASM).

The GNU Assembler was built mainly as the back end that assembles GCC's output, and it defaults to AT&T syntax, which isn't really pleasant for writing code in by hand, so we'll use NASM.

Installing nasm on Ubuntu/Debian is a breeze:

sudo apt-get install nasm

Now, we're ready to start writing some assembly.

section .text
    global _start

_start:
    jmp _start

Save this into a file called first.s or something, and fire up nasm:

nasm -f elf first.s

Or, on 64-bit systems:

nasm -f elf64 first.s

Either way, you'll get an ELF object file called first.o, which you can then link:

ld first.o -o first

Run the code:

./first

And you should have something that just runs continuously (press Ctrl+C to stop it).

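What have we just done? Analyzing the code, we first tell NASM to make the _start label global, so that the linker can find it and use it as the program's entry point. A label isn't really a "function"; it's just a name for an address in the code. Then, at that label, we use the jmp instruction to jump straight back to _start. It's not recursion, since nothing is called and nothing returns; it's an infinite loop!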

Something a little more interesting:

section .text
    global _start

_start:
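    mov eax, 4      ; copy the value 4 into EAX
    mov ebx, eax    ; copy EAX into EBX, so both now hold 4

    ; exit cleanly so execution doesn't run off the end of the code
    ; (assumes Linux and the int 0x80 system call interface)
    mov eax, 1      ; 1 is the sys_exit system call number
    int 0x80        ; exit, using EBX (4) as the exit status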

This uses registers, which are small, very fast pieces of storage inside the processor (and extremely volatile: their contents get overwritten all the time). We use the registers EAX and EBX. First, we use the mov instruction.

mov is possibly the worst name you could come up with for what it does; it doesn't actually move anything, it copies it. So, the number 4 is copied into the register EAX, and the contents of EAX are then copied into EBX, leaving EAX unchanged.
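If you'd like to see the copy behaviour for yourself, here's a minimal sketch. It assumes Linux and the 32-bit int 0x80 system call interface (which 64-bit kernels usually still accept, though that isn't guaranteed everywhere), and it borrows two things this post hasn't covered: the add instruction, and the exit system call, which hands whatever is in EBX back to the shell as the exit status. The file name copy.s is made up.

```nasm
; copy.s (hypothetical file name): shows that mov leaves its source intact
section .text
    global _start

_start:
    mov eax, 3      ; EAX = 3
    mov ebx, eax    ; EBX = 3, and EAX still holds 3 (a copy, not a move)
    add ebx, eax    ; EBX = 3 + 3 = 6, which only works if EAX kept its value
    mov eax, 1      ; 1 is the sys_exit system call number
    int 0x80        ; exit, using EBX (6) as the exit status
```

Assemble and link it the same way as first.s, run it, and then ask the shell for the exit status with echo $?; under the assumptions above it should print 6. If mov really moved the value out of EAX, the add would have nothing to add and you'd see 3 instead.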

Assembly gets much more interesting as you learn more instructions; you can find a full listing here (courtesy of Intel).

Since assembly is so simple, it's very easy to get started, but it's rather difficult to do anything useful; if you think strings are difficult in C, you have no idea what you've gotten yourself into.

Enjoy!
