What is Assembly Language?

Inside the DS5000, instructions are stored as numbers, usually written in hex, which are awkward to look at and extremely difficult to decipher. An assembler is a program that lets you write instructions in a more or less English form that is much easier to read and understand, and then converts, or assembles, them into those hex numbers.

The program is written with a text editor (I'll furnish you with one), saved, and then assembled by the assembler (I'll furnish you with one). The result is the file you download to the DS5000. Here is an example of the problem of adding 2 plus 2: [This file will be called prob01.asm]

       mov   r0,#2           ;move a 2 into register r0
       mov   a,#2            ;move a 2 into the accumulator
       add   a,r0            ;add r0 to the accumulator

The first line moves a 2 into register r0. The second moves a 2 into the accumulator. This is all the data we need for the program. The third line adds r0 to the accumulator and stores the result back into the accumulator, destroying the 2 that was originally in it. The accumulator has a 4 in it now and r0 still has a 2 in it.

Assembly language follows some rules that I will describe as they come up. With most instructions, especially those involving data transfer, the instruction is first, followed by at least 1 space, then the destination followed by a comma, and then the source. The destination is where the result of the instruction will end up and the source is where the data is coming from.
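
To make the destination-then-source order concrete, here is a small sketch. It is not one of the numbered problem files, and r1 is simply a register chosen for illustration. The two lines differ only in the order of their operands:

       mov   a,r1            ;destination is a, source is r1: copy r1 into the accumulator
       mov   r1,a            ;destination is r1, source is a: copy the accumulator into r1

In both cases the source is left unchanged and only the destination is overwritten.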

Next we will read a switch, and light an led if the switch is pressed. Bit 0 of p1 will be the switch. When the switch is closed or pressed, bit 0 will be a 1, and if the switch is open or not pressed, bit 0 will be a 0.

Bit 0 of p0 will be the led. If bit 0 is a 0 the led is off and if bit 0 is a 1, the led will be on. All the other bits of both p0 and p1 will be ignored and assumed to be all 0's, for the sake of discussion. [This is prob02.asm]

start: mov a,p1        ;read the switch
       mov p0,a        ;write to the led
       sjmp start      ;go to start

The first line has something new. It's called a label. In this case it is start:. A label is a way of telling the assembler that this line has a name that can be referred to later to get back to it. All labels are followed by the symbol : , which tells the assembler that this is a label. Also a comment can be added to the line to remind you of what that line does. A comment is always started with a ; which tells the assembler to ignore all that follows on that line because it is a comment.

In the first line we also read the switch by reading p1 (the source) and putting it into the accumulator (the destination).

The next line writes the accumulator, which has the switch in it (the source) to p0, which has the led attached (the destination).

The last line jumps back to start. This completes the loop of reading the switch and writing to the led.

I hope you see that if the switch is closed, the first line will result in a 1 in the accumulator. In the second line, writing a 1 to p0 will light the led. If the switch is open in line 1, there will be a 0 in the accumulator which, when written to p0, will turn off the led. This loop would continue endlessly, reading the switch and writing the led, until the micro is turned off or a new program is loaded. If the switch is pressed, the led will be lit; otherwise it won't.

This particular problem could have been solved with just a switch connected to an led, like a light is connected to a wall switch in your house. But with a micro in the loop, much more could be done. We could have a clock that also turns on and off the led based on time. Or we could monitor the temperature and turn the led on and off based on what temperature it is. Or we could monitor several switches and turn the led on and off based on a combination of switches.

In the above example we assumed that the other bits of p0 and p1 were all zeros. But in reality, each of these bits could have a function assigned to them. Then we would need to look only at bit 0 in p0 and bit 0 in p1. This further complicates the problem.

In assembly we can assign a name to a bit and refer to it by that name, instead of as a bit in a port. This is done with an equate directive. Directives are assembler commands that don't result in program code but instead direct the assembler to take some action. All directives start with a period.

In the DS5000 there are many bit locations. 128 of them were discussed in the previous lesson. There are others like all the bits in the accumulator, all the bits in the p0, p1, p2, and p3 ports. Each one is assigned an address by the DS5000 and can't be changed. The first 128 are the 16 bytes of internal ram following the first 32 bytes. Bit 0 of p0 is address 128 (80h) and bit 0 of p1 is address 144 (90h). Now let's look at what this looks like in assembly. [This is prob03.asm]

       .equ  switch,144      ;p1 bit 0 is now called switch
       .equ  led,128         ;p0 bit 0 is now called led
start: mov   c,switch        ;get the state of the switch and put in carry
       mov   led,c           ;mov the carry flag to the led
       sjmp  start           ;jump to start

This has the same result as the previous program, but doesn't assume anything about the other bits in p0 and p1. Also the equate only has to be made once at the start of the program, and thereafter the name or label is used instead of the bit number. This makes things much simpler for the programmer. Also the carry flag is used with bit instructions like the accumulator is used for byte instructions. All bit moves must be through the carry flag. All equates must be defined before they are used in a program. This holds true for labels also. Another advantage of naming bits with an equate is that if, later in the design process, you decide to use a different bit for the led or the switch, only the equate has to be changed, not the program itself.
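
The earlier idea of lighting the led from a combination of switches can be sketched with these same bit techniques. This is only an illustration under some assumptions: a second switch, here called sw2, is assumed to be wired to bit 1 of p1 (bit address 145, or 91h), and anl is the 8051 logical AND instruction, which these lessons have not covered yet.

       .equ  switch,144      ;p1 bit 0 is called switch
       .equ  sw2,145         ;p1 bit 1 is called sw2 (assumed wiring)
       .equ  led,128         ;p0 bit 0 is called led
start: mov   c,switch        ;get the first switch and put it in carry
       anl   c,sw2           ;and the carry with the second switch
       mov   led,c           ;led is on only if both switches are pressed
       sjmp  start           ;jump to start

As before, everything passes through the carry flag, and the other bits of p0 and p1 are left alone.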

Another way to do the 2 plus 2 problem uses a similar technique, naming bytes instead of bits. This is done with the reserve storage directive, .rs. Here is how this would look in assembly: [This is prob04.asm]

first: .rs   1                ;reserve a byte called first
secnd: .rs   1                ;reserve a byte called secnd
       mov   first,#2         ;put a 2 into first
       mov   secnd,#2         ;put a 2 into secnd
       mov   a,first          ;mov first into accumulator
       add   a,secnd          ;add secnd to accumulator

In the first line we see the .rs directive. The first part is the name or label, the second is the directive, and the third is the amount of storage to be assigned to the name. In this case we only needed 1 byte for each variable, so only 1 was reserved for each. You will also notice that everything lines up vertically. This is only for readability, for the programmer's sake. The rule that the assembler uses is that at least 1 space must be between each part of the line. The names can also be much longer, but I prefer short names so I don't have to type so much. Plus, by using short names, more space is left for the comment field. Comments are very important. When you initially write a program, the tendency is not to write much in the comment field because you're in a hurry. But if you have to come back to it a few weeks later, it's much easier to understand what you've written if you've taken the time to write good comments. Good comments also help in debugging.

In this example, we reserved two bytes and stored a 2 in each one. Then we got the first one (first) and put it into the accumulator. Then we added the second (secnd) to the accumulator. Now the accumulator has a 4 in it, and first and secnd still have a 2 in them. This method also has the advantage that you don't have to worry about which byte is first and which byte is secnd, because the assembler takes care of it for you. The difference between a .rs and a .equ is that with the .equ you tell the assembler exactly which bit you want to name, whereas with the .rs, the order of the reserve storage directives determines which byte is which. It really doesn't matter which is which, because you refer to them by name instead of by actual location.

In the DS5000, there are several ways of addressing variables. One is the register addresses, r0, r1, r2, and so on. Another is by direct addressing. This uses the actual address in internal ram where the variable is. Using a name with the .rs directive is a form of direct addressing. Here the assembler assigns the actual address of the named variable by which order the .rs's are in. If the variable first was actual location 0 then the variable secnd would be actual location 1. Location and address mean the same thing. When the assembler assembles the program, the names are dropped (to protect the innocent) and the actual addresses are used.

Another form of addressing is called immediate addressing. This is the form used in the original 2 plus 2 problem. It is indicated by the # symbol in front of the number. This tells the assembler that the number is in the instruction, not somewhere else. Also this method was used in the above example where the variables first and secnd were loaded with a 2. The 2 was in the instruction not somewhere else. But when we started getting the numbers to add we used direct addressing, using the names. Here the 2's were in locations and not in the instruction. Here are those instructions and what type of addressing was used in each:

first: .rs   1                ;this is a directive not an instruction
secnd: .rs   1                ;so is this one
       mov   first,#2         ;direct,immediate
       mov   secnd,#2         ;direct,immediate
       mov   a,first          ;implied,direct
       add   a,secnd          ;implied,direct

Here's the original 2 plus 2 problem and the addressing modes of each instruction:

       mov   r0,#2           ;register,immediate
       mov   a,#2            ;implied,immediate
       add   a,r0            ;implied,register

Notice that each instruction can have two different addressing modes, one for the destination and one for the source. Also notice that the a is an implied address. Register refers strictly to the registers r0, r1, r2, and so on, even though I've previously referred to the accumulator as a register. This comes partly from my experience with the Zilog Z-80 microprocessor, where the accumulator is a register. In the DS5000 the accumulator is in the special function registers, but the instruction implies that the accumulator will be used as either the source or the destination, depending on the instruction. Implied is a method of addressing that reduces the number of bytes any particular instruction assembles into.

To digress just a little here, an instruction like add a,r0 is a one byte instruction. In other words, this instruction would end up inside the DS5000 as one byte. Part of the byte is the opcode and the other part says which register is used. The reason for this is that a prime concern in programming a micro is how many bytes the program will actually take up inside the micro after it's been assembled. The idea is to cram as much as possible into as few bytes as possible. This is why implied addressing is used. It limits your choices in the use of the instruction, since you always have to use the accumulator as either the source or the destination, but it shrinks the size of the instruction so that more instructions can fit inside the micro. This is a choice made by the maker of the micro, and is not up for discussion. It's a trade-off of flexibility vs. size. That's why you'll see lots of instructions that use the accumulator. This is the best way to describe implied addressing.

In the case of an instruction like add a,secnd, two bytes are assembled. The first byte says that this is an add instruction and that the accumulator is implied as the destination. The second byte is the direct address of the source variable, secnd. This is transparent to the programmer because we are using an assembler, but the underlying results are noteworthy when trying to cram the most into the DS5000. Well, enough of that. We will probably get into this again later.
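
For the curious, here is roughly what the two forms of add assemble into. The opcode values are the standard 8051 ones. The address 31h for secnd is only an assumption for illustration, since the real address depends on where the .rs directives fall.

       add   a,r0            ;assembles to 1 byte:  28h
       add   a,secnd         ;assembles to 2 bytes: 25h 31h (if secnd is at 31h)

In the one byte form, the low three bits of the opcode select the register, so add a,r1 is 29h, add a,r2 is 2Ah, and so on up to add a,r7 at 2Fh.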

Another form of addressing variables is called register indirect or just plain indirect addressing. This is a little more complicated. Here the address is held in a register, either r0 or r1. The following is another example of the 2 plus 2 problem using register indirect addressing. [This is prob05.asm]

buffer: .rs 2             ;reserve two locations for the data
        mov r0,#buffer    ;set r0 to the start of the buffer
        mov @r0,#2        ;put a 2 into the first location of the buffer
        inc r0            ;increment r0 to point to the second byte
        mov @r0,#2        ;put a 2 into the second location of the buffer
        mov r0,#buffer    ;set r0 to the start of the buffer
        mov a,@r0         ;get the first 2
        inc r0            ;step to the second 2
        add a,@r0         ;add the second 2 to the first 2

Line 1 is again a reserve storage directive, but this time we are reserving two locations, one for each 2 in the problem. Line 2 sets r0 to point to the first location of the buffer we've created with the .rs directive. As stated before, the # means immediate, but in this case the assembler sees that the first character after the # is not a number. Instead it finds the label buffer, gets the direct address of where buffer is in the internal ram, and puts that address in r0, immediately. That's a mouthful, but that's what happens on that one line!

Line 3 stores a 2 in the first location of buffer. The @ symbol tells the assembler that the following register holds the address (indirect) of where to put the 2 into. So after this instruction, the first byte of buffer has a 2 in it. Line 4 increments or steps r0 to the second location in buffer. If r0 had the address of the first byte, then incrementing it by 1 now results in the address of the second byte being in r0.

Line 5 does the same thing that line 3 did, except that the 2 is stored in the second byte of buffer. Line 6 does the same thing that line 2 did, getting r0 to point to the first location in buffer. Line 7 moves the first byte of buffer (the first 2) into the accumulator. Line 8 steps r0 to the address of the second 2 in buffer. Line 9 adds the second 2 to the first 2 and stores the result in the accumulator.

Repeating what we did for the other example programs, here are the addressing modes of each line:

buffer: .rs 2             ;this is a directive, not an instruction
        mov r0,#buffer    ;register,immediate
        mov @r0,#2        ;indirect,immediate
        inc r0            ;this is arithmetic, adding 1 to r0
        mov @r0,#2        ;indirect,immediate
        mov r0,#buffer    ;register,immediate
        mov a,@r0         ;implied,indirect
        inc r0            ;arithmetic, adding 1 to r0
        add a,@r0         ;implied,indirect
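
Register indirect addressing earns its keep when the same operation has to be repeated over a run of bytes, because the address in r0 can be stepped inside a loop. The following is a sketch only, not one of the numbered problem files. The buffer size of 8, the counter register r2, and the label fill are all choices made for illustration, and djnz (decrement and jump if not zero) is an 8051 instruction that these lessons have not introduced yet.

buffer: .rs 8             ;reserve 8 locations (size chosen for illustration)
        mov r0,#buffer    ;set r0 to the start of the buffer
        mov r2,#8         ;r2 counts the locations left to fill
fill:   mov @r0,#2        ;put a 2 into the location r0 points to
        inc r0            ;step r0 to the next location
        djnz r2,fill      ;subtract 1 from r2 and loop until it reaches 0

The loop body is the same three instructions no matter how large the buffer is. Doing the same job with direct addressing would take one mov for every byte.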

Here is another example of the 2 plus 2 problem done still a different way: [prob06.asm]

buffer: .rs 2             ;reserve 2 bytes for the data
        mov buffer,#2     ;put a 2 in the first location in buffer
        mov buffer+1,#2   ;put a 2 in the second location in buffer
        mov a,buffer      ;move the first 2 into the accumulator
        add a,buffer+1    ;add the second 2 to the first 2

Line 1 reserves 2 bytes and names them buffer. Line 2 puts a 2 into the first location of buffer. Line 3 puts a 2 in the second location of buffer. Here the destination is a direct address (buffer, or buffer+1 for the byte after it) and the source is immediate. Line 4 moves the first 2 into the accumulator. Line 5 adds the second 2 to the first 2 and stores the result in the accumulator. There are still other ways to solve this problem. In fact, if a problem were given to 100 programmers, there would probably be 100 different programs, all with the same results.

Lastly, I want to explain something else about the assembler. Since there are two distinct memory areas in the DS5000, there must be a way to tell the assembler which one is being referred to at any particular place in the source file. The source file is what the above program, or any program that has been written, is referred to as. It is the source for the assembler, or the file that is going to be read by the assembler to generate the object file (the object of the assembler). The object file is the file that will be downloaded to the DS5000. They are two different files. One you've written with a text editor (the source file) and the other is created by the assembler (the object file) when you assemble the source file. You use an assembler with the object in mind of generating a file to download to the micro, hence the name, object file.

I've left out some directives from the previous programs, for simplicity's sake, that I need to mention now. One is the .org directive. It is the originate or origin directive. This tells the assembler at what address the first byte of assembled code is to be placed inside the DS5000. It is the origin of the program, or the beginning. Here's how this would look for the register indirect example program (prob05.asm):

        .org h'0000
        mov r0,#buffer    ;set r0 to the start of the buffer
        mov @r0,#2        ;put a 2 into the first location of the buffer
        inc r0            ;increment r0 to point to the second byte
        mov @r0,#2        ;put a 2 into the second location of the buffer
        mov r0,#buffer    ;set r0 to the start of the buffer
        mov a,@r0         ;get the first 2
        inc r0            ;step to the second 2
        add a,@r0         ;add the second 2 to the first 2

        .segment .memory
        .org h'30
buffer: .rs 2

The first part is the same, with the exception of the .org. This tells the assembler that the first byte of code will be assembled at address 0000h, in this case. Later in the program is another directive called .segment. This tells the assembler that a different area of memory, named .memory, will be used now, and switches it over to that area. This represents the internal ram. The .org following this tells the assembler that the first .rs is located at address 30h. So the address of buffer is 30h. This is the value, or address, that is loaded into r0 when the mov r0,#buffer is executed. An origin of 30h is used rather than 00h because internal ram locations 00h through 07h are also the registers r0 through r7 (in the default register bank), so a buffer starting at 00h would share its first byte with r0 itself. Location 30h is above both the register banks and the bit addressable bytes.

Well we've covered quite a lot in this lesson, and I hope you've gotten most of it. If not, I would suggest re-reading it until you do. I would also suggest that you print out all of these lessons so you can refer to them later. In the next lesson we will actually be assembling these programs and running them in the simulator for a closer look at what happens inside the DS5000. This is where it really gets good.

My home page is http://www.hkrmicro.com/personal/index.html .
