cs24-20fa Assembly: Why? and How?

Introduction to Computing Systems (Fall 2020)

Assembly is fairly close to the machine. Why should I bother learning assembly if I’m just going to program in Python or JavaScript?

Why Bother?

In 2020, it might seem kind of silly to learn assembly, because it really only shows up in very low-level systems. We argue that it’s still worth learning for the following reasons:

From C to Binary

To give you a brief preview of where we’re headed, we will go through the process of taking C code and lowering it all the way to machine code. The toy program we will work with is identity.c:

identity.c

int identity(int x) {
    return x;
}

The .s extension is the standard way of indicating that a file contains assembly code. To get clang to output an assembly file, you give the -S switch as shown here:

Terminal

blank@compute-cpu2:~$ clang -S identity.c
blank@compute-cpu2:~$ cat identity.s
identity:
    movl %edi, %eax
    retq

The next step is taking the assembly file and turning it into an object file (or .o file), which is a partial executable. You can use object files to compile modules separately and then link them together later. To get an object file from an assembly file, you use the as command:

Terminal

blank@compute-cpu2:~$ as identity.s -o identity.o
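To make "separately compile and link later" a bit more concrete, here is a minimal sketch of a second module that uses identity while seeing only its declaration. The function name identity_twice is hypothetical and not part of the course code. The definition of identity is repeated at the bottom only so the sketch compiles on its own; in a real multi-file build it would be omitted and supplied by identity.o at link time.

```c
/* Declaration only: this tells the compiler identity's signature.
 * It is enough to compile the caller below; the actual machine code
 * for identity can come from a different object file. */
int identity(int x);

/* Hypothetical caller (not from the course materials). */
int identity_twice(int x) {
    return identity(identity(x));
}

/* Repeated from identity.c so that this sketch is self-contained.
 * In a separate-compilation build, delete this definition and let
 * the linker resolve the call against identity.o instead. */
int identity(int x) {
    return x;
}
```

If the caller lived in its own file (say, a hypothetical caller.c without the repeated definition), clang -c caller.c would produce caller.o, and the linker could later combine it with identity.o and a file that defines main.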

And Back…

When reverse-engineering a binary, it is super useful to be able to go back to the assembly from a binary or object file. To do this, you can use the objdump tool:

Terminal

blank@compute-cpu2:~$ objdump -d identity.o
identity.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <identity>:
    0:  89 f8        mov    %edi,%eax
    2:  c3           retq
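The bytes 89 f8 in the listing are the machine encoding of the mov instruction. As a rough illustration of what a disassembler does with them, the sketch below splits the second byte (the ModRM byte) into its three bit fields and maps the register numbers to names. The struct and function names are made up for this example, and the sketch assumes the simple case shown here: opcode 0x89 (store a 32-bit register into a register or memory operand), no prefixes, and register-to-register mode.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7-6), reg (bits 5-3), rm (bits 2-0). */
struct modrm {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

/* Split a ModRM byte into its three fields. */
struct modrm decode_modrm(uint8_t byte) {
    struct modrm m;
    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* Name of a 32-bit register given its 3-bit encoding (no REX prefix). */
const char *reg32_name(unsigned index) {
    static const char *const names[8] = {
        "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"
    };
    return names[index & 0x7];
}
```

For the byte 0xf8, decode_modrm gives mod = 3 (both operands are registers), reg = 7 (%edi), and rm = 0 (%eax). With opcode 0x89 the reg field is the source and the rm field is the destination, which matches the mov %edi,%eax that objdump prints.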

Additionally, we’ve written a wrapper for objdump called cs24-dasm, which takes a binary and disassembles a single function. You can invoke it as follows:

Terminal

blank@compute-cpu2:~$ cs24-dasm identity.o identity
0000000000000000 <identity>:
    0:  89 f8        mov    %edi,%eax
    2:  c3           retq

For this course, you’ll find cs24-dasm to be more useful than directly using objdump.