# cs24-20fa Assembly: Why? and How?

## Introduction to Computing Systems (Fall 2020)

Assembly is fairly close to the machine. Why should I bother learning assembly if I’m just going to program in Python or JavaScript?

# Why Bother?

In 2020, it seems kind of silly to learn assembly, because it really only shows up in very low-level systems. We argue that it’s still worth learning for the following reasons:

• Almost all assembly is written by compilers or in the operating system, but who writes those? [N.B. in a week or so, you will build your own small compiler]
• The high-level language model occasionally breaks down and you have to read the assembly to understand the machine’s behavior! [N.B. in a few weeks, you will write some exploits that rely on understanding assembly]
• It’s important to understand the types of optimizations a compiler is capable of making, and those it isn’t! [N.B. when you write the compiler, you will also write some optimizations]
• Software is generally distributed in binary form; if you want to reverse engineer or security audit software, it’s going to be assembly!

# From C to Binary

To give you a brief preview of where we’re headed, we will go through the process of taking C code and lowering it all the way to machine code. The toy program we will work with is identity.c:

identity.c

int identity(int x) {
    return x;
}


The first step is compiling the C code into an assembly file. The .s extension is the standard way of indicating that a file contains assembly code. To get clang to output an assembly file, pass the -S switch as shown here:

Terminal

blank@compute-cpu2:~$ clang -S identity.c
blank@compute-cpu2:~$ cat identity.s
identity:
    movl %edi, %eax
    retq

The next step is taking the assembly file and turning it into an object file (a .o file), which is a partial executable. You can use object files to compile modules separately and then link them together later. To get an object file from an assembly file, you use the as command:

Terminal

blank@compute-cpu2:~$ as identity.s -o identity.o

# And Back…

When reverse-engineering a binary, it is super useful to be able to go back to the assembly from a binary or object file. To do this, you can use the objdump tool:

Terminal

blank@compute-cpu2:~$ objdump -d identity.o
identity.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <identity>:
0:  89 f8        mov    %edi,%eax
2:  c3           retq

Additionally, we’ve written a wrapper for objdump called cs24-dasm, which takes a binary and disassembles a single function. You can invoke it as follows:

Terminal

blank@compute-cpu2:~$ cs24-dasm identity.o identity
0000000000000000 <identity>:
0:  89 f8        mov    %edi,%eax
2:  c3           retq

For this course, you’ll find cs24-dasm to be more useful than directly using objdump.