What Does the Program Counter (EIP Register in 80x86) Point To?
Extended Instruction Pointer
Buffer Overflow
In Hack Proofing Your Network (Second Edition), 2002
Finding a Jump Point
Next, we need to write out where we want the EIP to go. As mentioned earlier, there are numerous ways to get the EIP to point to our code. Typically, I put a debugging breakpoint at the end of the function that returns, so I can see what the state of the registers is right before the vulnerable function's ret instruction. In examining the registers in this case:
EAX = 00000001 EBX = 7FFDF000
ECX = 00423AF8 EDX = 00000000
ESI = 00000000 EDI = 0012FF80
ESP = 0012FF30 EBP = 90909090
We notice that the ESP points right into the stack, right after where the saved EIP should be. After this ret, the ESP will move up 4 bytes and what is there should be moved to the EIP. Also, control should continue from there. This means that if we can get the contents of the ESP register into the EIP, we can execute code at that point. Also notice how in the function epilogue, the saved EBP was restored, but this time with our 0x90 string instead of its original contents.
So now we examine the memory space of the attacked program for useful pieces of code that would allow us to get the EIP register to point to the ESP. Since we have already written findjmp, we'll use that to find an effective place to get our ESP into the EIP. To do this effectively, we need to see what DLLs are imported into our attacked program and examine those loaded DLLs for potentially vulnerable pieces of code. To do this, we could use the depends.exe program that ships with Visual Studio, or the dumpbin.exe utility that will allow you to examine a program's imports.
In this case, we will use dumpbin for simplicity, since it can quickly tell us what we need. We will use the command line:
This shows that the only linked DLL loaded directly is kernel32.dll. Kernel32.dll also has dependencies, but for now, we will just use that to find a jump point.
Next, we load findjmp, looking in kernel32.dll for places that can redirect us to the ESP. We run it as follows:
findjmp kernel32.dll ESP
And it tells us:
So we can overwrite the saved EIP on the stack with 0x77E8250A and when the ret hits, it will put the address of a call ESP into the EIP. The processor will execute this instruction, which will redirect processor control back to our stack, where our payload will be waiting.
In the exploit code, we define this address as follows:
and then write it in our exploit buffer after our 12 byte filler like so:
memcpy(writeme+12,&EIP,4); //overwrite EIP here
Read full chapter
URL:
https://www.sciencedirect.com/science/article/pii/B9781928994701500112
System Exploitation
Aditya K Sood, Richard Enbody, in Targeted Cyber Attacks, 2014
4.6.6 Heap Spraying
Heap spraying is a stage of browser exploitation where a payload is placed in a browser's heap. This technique exploits the fact that it is possible to predict heap locations (addresses). The idea is to fill chunks of heap memory with payload before taking control of the Extended Instruction Pointer (EIP). The heap is allocated in the form of blocks and the JavaScript engine stores the allocated strings to new blocks. A specific size of memory is allocated to JavaScript strings containing a NOP sled (also known as a NOP ramp) and shellcode (payload), and in most cases the specific address range points to a NOP sled. NOP stands for No operation. It is an assembly instruction (x86 programming) which does not perform any operation when placed in the code. A NOP sled is a collection of NOP instructions placed in memory to delay the execution in scenarios where the target address is unknown. The instruction pointer moves forward instruction-by-instruction until it reaches the target code. When the return address pointer is overwritten with an address controlled by the attacker, the pointer lands on the NOP sled, leading to the execution of the attacker-supplied payload. Basically, the heap exploitation takes the following steps:
- First, create what is known as a nop_sled (NOP sled), a block of NOP instructions with a Unicode encoding, which is an industry standard for representing strings that is understood by the software application (browser, etc.). The "\x90" represents the NOP instruction and the Unicode encoding of the NOP instruction is "%u90". The nop_sled is appended to the payload and written to the heap in the form of JavaScript strings mapping to a new block of memory. Spraying the heap by filling chunks of memory with payload results in payload at predictable addresses.
- Next, a browser's vulnerability in a component (such as a plug-in) is exploited to alter the execution flow to jump into the heap. A standard buffer overflow is used to overwrite the EIP. It is usually possible to predict an appropriate EIP value that will land execution within the NOPs, which will "execute" until the payload (usually shellcode) is encountered.
- The shellcode then spawns a process to download and execute malware. By downloading within a spawned process, the malware can be hidden from the user (and the browser).
A simple structure of a heap spray exploit is shown in Listing 4.3 that covers the details discussed above.
Listing 4.3. Heap spraying example in action.
Read full chapter
URL:
https://www.sciencedirect.com/science/article/pii/B9780128006047000048
Web applications and services
Jeremy Faircloth , in Penetration Tester's Open Source Toolkit (4th Edition), 2017
Stack-based overflows
A stack is simply a last in, first out (LIFO) abstract data type. Data is pushed onto a stack or popped off it (Fig. 5.2).
Figure 5.2. A simple stack
The simple stack shown in Fig. 5.2 has [A] at the bottom and [B] at the top. Now, let's push something onto the stack using a PUSH C command (Fig. 5.3).
Figure 5.3. PUSH C
Let's push another for good measure: PUSH D (Fig. 5.4).
Figure 5.4. PUSH D
Now let's see the effects of a POP command. POP effectively removes an element from the stack (Fig. 5.5).
Figure 5.5. POP removing one element from the stack
Notice that [D] has been removed from the stack. Let's do it again for good measure (Fig. 5.6).
Figure 5.6. POP removing another element from the stack
Notice that [C] has been removed from the stack.
Stacks are used in modern computing as a method for passing arguments to a function and they are used to reference local function variables. On x86 processors, the stack is said to be inverted, meaning that the stack grows downward (Fig. 5.7).
Figure 5.7. Inverted stack
When a function is called, its arguments are pushed onto the stack. The calling function's current address is also pushed onto the stack so that the function can return to the correct location once the function is complete. This is referred to as the saved Extended Instruction Pointer (EIP) or simply the Instruction Pointer (IP). The address of the base pointer is also then saved onto the stack.
Look at the following snippet of code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int foo()
{
char buffer[8]; /* Point 2 */
strcpy(buffer, "AAAAAAAAAA"); /* deliberately copies 10 A's into an 8-byte buffer */
/* Point 3 */
return 0;
}
int main(int argc, char **argv)
{
foo(); /* Point 1 */
return 1; /* address 0x08801234 */
}
During execution, the stack frame is set up at Point 1. The address of the next instruction after Point 1 is noted and saved on the stack with the previous value of the 32-bit Base Pointer (EBP). This is illustrated in Fig. 5.8.
Figure 5.8. Saved EIP
Next, space is reserved on the stack for the buffer char array (eight characters) as shown in Fig. 5.9.
Figure 5.9. Buffer pushed onto the stack
Now, let's examine whether the strcpy function was used to copy 8 A's as specified in our defined buffer or 10 A's as defined in the actual string (Fig. 5.10).
Figure 5.10. Too many A's
On the left of Fig. 5.10 is an illustration of what the stack would have looked like had we performed a strcpy of six A's into the buffer. The example on the right shows the start of a problem. In this instance, the extra A's have overrun the space reserved for buffer[8] and have begun to overwrite the previously stored [EBP]. Let's see what happens if we copy 13 A's and 20 A's, respectively. This is illustrated in Fig. 5.11.
Figure 5.11. Stack overflow
In Fig. 5.11, we can see that the old EIP value was completely overwritten when 20 characters were sent to the 8 character buffer. Technically, 16 characters would have done the trick in this case. This means that once the foo() function was finished, the processor tried to resume execution at the address A A A A (0x41414141). A classic stack overflow attack aims at overflowing a buffer on the stack to replace the saved EIP value with the address of the attacker's choosing. The goal would be to have the attacker's code available somewhere in memory and, using a stack overflow, cause that memory location to be the next instruction executed.
Read full chapter
URL:
https://www.sciencedirect.com/science/article/pii/B9780128021491000051
Embedded Processor Architecture
Peter Barry , Patrick Crowley , in Modern Embedded Computing, 2012
Branch and Control Flow Instructions
We clearly need instructions to control the flow of a program's execution. The branch and control flow instructions fall into two primary categories. The first is unconditional changes of program flow to a new program counter address. This occurs when a jump or call instruction is encountered. The second category of branch or control flow instructions is conditional branches or conditional execution of an instruction. The conditional execution of an instruction is dictated by the contents of bits within the EFLAGS register, or for some instructions the value in the ECX register.
Jump operations transfer control to a different point in the program stream without recording any return information. The destination operand specifies the address of the instruction we wish to execute next. The operand can be an immediate value, a register, or a memory location. Intel processors have several different jump modes that have evolved over time, but a number of modes are no longer used. The near jump is a jump within the current code segment. As we mentioned earlier, the current code segment often spans the entire linear memory range (such as zero to 4 GB). So all jumps are effectively within the current code segment. The target operand specifies either an absolute offset (that is, an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register). A near jump to a relative offset of 8 bits is referred to as a short jump. The CS register is not changed on near and short jumps. An absolute offset is specified indirectly in a general-purpose register or a memory location. Absolute offsets are loaded directly into the EIP register. A relative offset is generally specified as a label in assembly code, but at the machine code level it is encoded as a signed 8-, 16-, or 32-bit immediate value. This value is added to the value in the EIP register. (Here, the EIP register contains the address of the instruction following the JMP instruction.) Although this looks complicated, in practice the near jump is a simple branch with flexibility in specifying the target destination address. Intel processors also include FAR jumps, which allow the program to jump to a different code segment, jump through a call gate with privilege checks, or perform a task switch (task in the IA-32 processor context). Table 5.9 shows the different instructions with examples.
Table 5.9. Program Flow: No Saved Return State
Instruction Mnemonic | Example | Description |
---|---|---|
JMP | JMP target_label | Jumps unconditionally to the destination address operand |
JZ | JZ target_label | Jumps conditionally to the destination operand if the EFLAG.Zero bit is set |
JNZ | JNZ target_label | Jumps conditionally to the destination operand if the EFLAG.Zero bit is not set |
LOOP | MOV ECX,5 LoopStart: XXX YYY LOOP LoopStart | Decrements the contents of the ECX register, and then tests the register for the loop-termination condition. If the count in the ECX register is non-zero, program control is transferred to the instruction address specified by the destination operand |
Call gates are sometimes used to support calls to operating system services; for instance, this is a configuration available in VxWorks for real-time tasks when calling operating system services. However, operating system calls are more usually provided via the software interrupt call.
Calling subroutines, functions, or procedures requires the return address to be saved before control is transferred to the new address; otherwise, there is no way for the processor to get back from the call. The CALL (call procedure) and RET (return from procedure) instructions allow a jump from one procedure (or subroutine) to another and a subsequent jump back (return) to the calling procedure. The CALL instruction transfers program control from the current procedure (the calling procedure) to another procedure (the called procedure). To allow a subsequent return to the calling procedure, the CALL instruction saves the current contents of the EIP register on the stack before jumping to the called procedure. The EIP register (prior to transferring program control) contains the address of the instruction following the CALL instruction. When this address is pushed on the stack, it is referred to as the return instruction pointer or return address. The address of the called procedure (the address of the first instruction in the procedure being jumped to) is specified in a CALL instruction in the same way it is in a JMP instruction (described above).
Most processors provide an instruction to allow a program to explicitly raise a specified interrupt. The INT instruction can raise any of the processor's interrupts or exceptions by encoding the vector number of the interrupt or exception in the instruction, which in turn causes the handler routine for the interrupt or exception to be called. This is typically used by user space programs to call operating system services. Table 5.10 shows the instructions that affect the program flow.
Table 5.10. Program Flow with Saved Return State
Instruction Mnemonic | Example | Description |
---|---|---|
CALL | CALL target_label | Saves the return address on the stack and jumps to subroutine |
RET | RET | Returns to the instruction after the previous call |
INT x | INT 13h | Calls software interrupt 13h |
IRET | IRET | Returns from the interrupt handler |
On Intel platforms there is quite a lot of history associated with the INT calls. Legacy (non-EFI) BIOS supports a number of INT calls to provide support to operating systems. An example of a well-known INT call is the E820. This is an interrupt call that the operating system can use to get a report of the memory map. The data are obtained by calling INT 15h while setting the AX register to E820h. For embedded programmers, there is an ever-decreasing dependence on the INT service provided by the BIOS. The Linux kernel reports the memory map reported by the BIOS in the dmesg logs at startup.
The BIOS environment is transitioning from a traditional legacy BIOS, which was first developed a few decades ago, to a more modern codebase. The newer codebase is known as Unified Extensible Firmware Interface (UEFI). At the time of writing, many products are transitioning from this legacy BIOS to EFI. More information on this topic can be found in Chapter 6.
Read full chapter
URL:
https://www.sciencedirect.com/science/article/pii/B9780123914903000059
Exploitation scripting
Jason Andress , Ryan Linn , in Coding for Penetration Testers (Second Edition), 2017
Setting Up Debugging
Next, we need to set up debugging so that we can see what happens to our program. Inside a terminal we are going to execute the command "gdb --quiet ./vuln" which will execute our program "vuln" in the GNU Debugger without printing the startup banner. As in Fig. 9.1, typing in "r" at the prompt will run our program. The r is short for run, and either command will work but brevity is often easier when debugging.
Figure 9.1. Launching the GNU debugger.
After we run the program, we want to inspect the state of the program. By hitting "CTRL+c" we will break the execution of the program. We can see that the program received a "SIGINT" signal, which means it was sent an interrupt signal which was caused by pressing "CTRL+c". Once the program is stopped we issue the command "i r". This command is short for "info registers" which will show us the values of each of the registers. Knowing generally what these values look like before we exploit a program is useful so that we can see what looks different during an exploit.
These registers are important because they indicate the current state of the application. We see the general purpose registers first: EAX, ECX, EDX, and EBX. These registers are short for Extended Accumulator Pair, Extended Counter Pair, Extended Data Pair, and Extended Base Pair. The reason these are called pairs is because each register is 32 bit, and it can be split into subregisters that are the low portion of the 32 bit value. So for example, EAX can be segmented into the 16 bit register of AX, and then the AX can be split into two 8 bit registers AH and AL, the high and low portions of the 16 bit register.
We can see the state of the stack and the current running command. ESP is the Extended Stack Pointer, the address of the current top of the stack. EBP is the Extended Base Pointer, and points at the base of the current stack frame so that the application knows the range of the stack space. ESI and EDI are used for string copy. EIP stands for Extended Instruction Pointer and is used to track the address of the current instruction running within the application.
Note
In this chapter, we just begin to scratch the surface of the basics of exploit writing. A better understanding of the registers and basic assembly language knowledge would be helpful to best understand this subject. A number of books are devoted solely to exploit writing, so there is no way to become an expert in this one chapter. To learn more about exploit writing, find a book that has a comfortable reading style and work through that manual. To learn more about up-to-date techniques, read the articles written by the Corelan Team on both the basics of exploit writing and new techniques. You can find these excellent articles at https://www.corelan.be/index.php/articles/.
Next, we want to see what the program is doing. To do this, we type in "x/5i $eip". This tells the program to examine (x) the first 5 instructions pointed at by EIP. In Fig. 9.2 we can see that here the program is executing a "pop" instruction. There are different ways to format the examine command, so for example 10h will show the next 10 hex values, 3s will show the next 3 strings, 5t will show the next 5 binary values, and 16c will show the next 16 characters. We will be looking at this more as we get through our exploit.
Figure 9.2. Viewing the running instructions with GDB.
The next command we issue is the "backtrace" command. This shows how we got to where we are. We can see here that we are currently in the __kernel_vsyscall function which was called from accept which was called from the main function in vuln.c at line 73. We see the line number and some additional information because we compiled in the gdb information at compile time. Using the commands "up" and "down", we can get more information about the functions and inspect the registers.
(gdb) up
#1 0xf7ef2221 in accept () from /lib/i386-linux-gnu/i686/cmov/libc.so.6
(gdb) up
#2 0x08048816 in main () at vuln.c:73
73 c = accept(fd, (struct sockaddr*)&client_addr, &addrlen);
We can see here that we got to the current function through the accept call which is in libc.so.6. When we do the up command, it also adjusts the registers for us, so we can see the registers that were set when the function was called. Going all the way up we see the line that was called in vuln.c and we see that it is the accept line from our source code. We can do these sorts of navigations to see how a program is behaving as well as to determine the state going into a function if a function is misbehaving.
Note
GDB is an incredibly powerful and complex beast. There are tons of resources on using GDB and after you've brushed up on some of the basics, you will be able to get much more out of the debugger. Being able to set breakpoints efficiently, step through instructions one at a time, and watch registers change at each step drastically helps with both debugging applications and helping to write exploits. We are just going to scratch the surface in this chapter, so if you want to get good at GDB there will be some additional reading required.
Read full chapter
URL:
https://www.sciencedirect.com/science/article/pii/B9780128054727000097
Architecture
Sarah L. Harris , David Harris , in Digital Design and Computer Architecture, 2022
6.8 Another Perspective: x86 Architecture
Almost all personal computers today use x86 architecture microprocessors. x86, also called IA-32, is a 32-bit architecture originally developed by Intel. AMD also sells x86-compatible microprocessors.
The x86 architecture has a long and convoluted history dating back to 1978, when Intel announced the 16-bit 8086 microprocessor. IBM selected the 8086 and its cousin, the 8088, for IBM's first personal computers. In 1985, Intel introduced the 32-bit 80386 microprocessor, which was backward compatible with the 8086, so it could run software developed for earlier PCs. Processor architectures compatible with the 80386 are called x86 processors. The Pentium, Core, and Athlon processors are well known x86 processors.
Various groups at Intel and AMD over many years have shoehorned more instructions and capabilities into the antiquated architecture. The result is far less elegant than RISC-V. However, software compatibility is far more important than technical elegance, so x86 has been the de facto PC standard for more than two decades. More than 100 million x86 processors are sold every year. This huge market justifies more than $5 billion of research and development annually to continue improving the processors.
x86 is an example of a complex instruction set computer (CISC) architecture. In contrast to RISC architectures such as RISC-V, each CISC instruction can do more work. Programs for CISC architectures usually require fewer instructions. The instruction encodings were selected to be more compact to save memory when RAM was far more expensive than it is today; instructions are of variable length and are often less than 32 bits. The trade-off is that complicated instructions are more difficult to decode and tend to execute more slowly.
This section introduces the x86 architecture. The goal is not to make you into an x86 assembly language programmer but rather to illustrate some of the similarities and differences between x86 and RISC-V. We think it is interesting to see how x86 works. However, none of the material in this section is needed to understand the rest of the book. Major differences between x86 and RISC-V (RV32I) are summarized in Table 6.9.
Table 6.9. Major differences between RISC-V (RV32I) and x86
Feature | RISC-V | x86 |
---|---|---|
# of registers | 32 general-purpose | 8, some restrictions on purpose |
# of operands | 3 (2 sources, 1 destination) | 2 (1 source, 1 source/destination) |
Operand locations | Registers or immediates | Registers, immediates, or memory |
Operand size | 32 bits | 8, 16, or 32 bits |
Condition flags | No | Yes |
Instruction types | Simple | Simple and complicated |
Instruction encoding | Fixed: 4 bytes | Variable: 1–15 bytes |
6.8.1 x86 Registers
The 8086 microprocessor provided eight 16-bit registers. It could separately access the upper and lower eight bits of some of these registers. When the 32-bit 80386 was introduced, the registers were extended to 32 bits. These registers are called EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI. For backward compatibility, the bottom 16 bits and some of the bottom 8-bit portions are also usable, as shown in Figure 6.36.
Figure 6.36. x86 registers
The eight registers are almost, but not quite, general purpose. Certain instructions cannot use certain registers. Other instructions always put their results in certain registers. Like sp in RISC-V, ESP is normally reserved for the stack pointer.
The x86 program counter is called the EIP (the extended instruction pointer). Like the RISC-V PC, it advances from one instruction to the next or can be changed with branch and function call instructions.
6.8.2 x86 Operands
RISC-V instructions always act on registers or immediates. Explicit load and store instructions are needed to move data between memory and the registers. In contrast, x86 instructions may operate on registers, immediates, or memory. This partially compensates for the small set of registers.
RISC-V instructions generally specify three operands: two sources and one destination. x86 instructions specify only two operands. The first is a source. The second is both a source and the destination. Hence, x86 instructions always overwrite one of their sources with the result. Table 6.10 lists the combinations of operand locations in x86. All combinations are possible except memory to memory.
Tabular array 6.10. Operand locations
Source/ Destination | Source | Example | Meaning |
---|---|---|---|
register | register | add EAX, EBX | EAX <− EAX + EBX |
register | immediate | add EAX, 42 | EAX <− EAX + 42 |
register | memory | add EAX, [20] | EAX <− EAX + Mem[20] |
memory | register | add [20], EAX | Mem[20] <− Mem[20] + EAX |
memory | immediate | add [20], 42 | Mem[20] <− Mem[20] + 42 |
Like RISC-V (RV32I), x86 has a 32-bit memory space that is byte-addressable. However, unlike RISC-V, x86 supports a wider variety of memory indexing modes. Memory locations are specified with any combination of a base register, displacement, and a scaled index register. Table 6.11 illustrates these combinations. The displacement can be an 8-, 16-, or 32-bit value. The scale multiplying the index register can be 1, 2, 4, or 8. The base + displacement mode is equivalent to the RISC-V base addressing mode for loads and stores, but RISC-V instructions do not allow for scaling. x86 also provides a scaled index. In x86, the scaled index provides an easy way to access arrays or structures of 2-, 4-, or 8-byte elements without having to issue a sequence of instructions to generate the address. While RISC-V always acts on 32-bit words, x86 instructions can operate on 8-, 16-, or 32-bit data. Table 6.12 illustrates these variations.
Table 6.11. Memory addressing modes
Example | Meaning | Comment |
---|---|---|
add EAX, [20] | EAX <− EAX + Mem[20] | displacement |
add EAX, [ESP] | EAX <− EAX + Mem[ESP] | base addressing |
add EAX, [EDX+40] | EAX <− EAX + Mem[EDX+40] | base + displacement |
add EAX, [60+EDI*4] | EAX <− EAX + Mem[60+EDI*4] | displacement + scaled index |
add EAX, [EDX+80+EDI*2] | EAX <− EAX + Mem[EDX+80+EDI*2] | base + displacement + scaled index |
Table 6.12. Instructions acting on 8-, 16-, or 32-bit data
Example | Meaning | Data Size |
---|---|---|
add AH, BL | AH <− AH + BL | 8-bit |
add AX, −1 | AX <− AX + 0xFFFF | 16-bit |
add EAX, EDX | EAX <− EAX + EDX | 32-bit |
6.8.3 Status Flags
x86, like many CISC architectures, uses condition flags (also called status flags) to make decisions about branches and to keep track of carries and arithmetic overflow. x86 uses a 32-bit register, called EFLAGS, that stores the status flags. Some of the bits of the EFLAGS register are given in Table 6.13. Other bits are used by the operating system. The architectural state of an x86 processor includes EFLAGS as well as the eight registers and the EIP.
Table 6.13. Selected EFLAGS
Name | Meaning |
---|---|
CF (Carry Flag) | Carry out generated by last arithmetic operation. Indicates overflow in unsigned arithmetic. Also used for propagating the carry between words in multiple-precision arithmetic |
ZF (Zero Flag) | Result of last operation was zero |
SF (Sign Flag) | Result of last operation was negative (msb = 1) |
OF (Overflow Flag) | Overflow of two's complement arithmetic |
6.8.4 x86 Instructions
x86 has a larger set of instructions than RISC-V. Table 6.14 describes some of the general-purpose instructions. x86 also has instructions for floating-point arithmetic and for arithmetic on multiple short data elements packed into a longer word. D indicates the destination (a register or memory location), and S indicates the source (a register, memory location, or immediate).
Table 6.14. Selected x86 instructions
Instruction | Meaning | Function |
---|---|---|
ADD/SUB | add/subtract | D = D + S / D = D − S |
ADC | add with carry | D = D + S + CF |
INC/DEC | increment/decrement | D = D + 1 / D = D − 1 |
CMP | compare | set flags based on D − S |
NEG | negate | D = −D |
AND/OR/XOR | logical AND/OR/XOR | D = D op S |
NOT | logical NOT | D = ~D |
IMUL/MUL | signed/unsigned multiply | EDX:EAX = EAX × D |
IDIV/DIV | signed/unsigned divide | EDX:EAX / D; EAX = quotient; EDX = remainder |
SAR/SHR | arithmetic/logical shift right | D = D >>> S / D = D >> S |
SAL/SHL | left shift | D = D << S |
ROR/ROL | rotate right/left | rotate D by S |
RCR/RCL | rotate right/left with carry | rotate CF and D by S |
BT | bit test | CF = D[S] (the Sth bit of D) |
BTR/BTS | bit test and reset/set | CF = D[S]; D[S] = 0 / 1 |
TEST | set flags based on masked bits | set flags based on D AND S |
MOV | move | D = S |
PUSH | push onto stack | ESP = ESP − 4; Mem[ESP] = S |
POP | pop off stack | D = MEM[ESP]; ESP = ESP + 4 |
CLC, STC | clear/set carry flag | CF = 0 / 1 |
JMP | unconditional jump | relative jump: EIP = EIP + S; absolute jump: EIP = S |
Jcc | conditional jump | if (flag) EIP = EIP + S |
LOOP | loop | ECX = ECX − 1; if (ECX ≠ 0) EIP = EIP + S |
CALL | function call | ESP = ESP − 4; MEM[ESP] = EIP; EIP = S |
RET | function return | EIP = MEM[ESP]; ESP = ESP + 4 |
Note that some instructions e'er human activity on specific registers. For instance, 32×32-chip multiplication ever takes one of the sources from EAX and ever puts the 64-bit result in EDX and EAX. LOOP always stores the loop counter in ECX. PUSH, POP, CALL, and RET use the stack pointer, ESP.
Conditional jumps check the flags and branch if the appropriate condition is met. They come in many flavors. For example, JZ jumps if the zero flag (ZF) is 1. JNZ jumps if the zero flag is 0. The jumps usually follow an instruction, such as the compare instruction (CMP), that sets the flags. Table 6.15 lists some of the conditional jumps and how they depend on the flags set by a prior compare operation. Unlike RISC-V, conditional jumps (called conditional branches in RISC-V) usually require two instructions instead of one.
Table 6.15. Selected branch conditions
Instruction | Meaning | Function after CMP D, S |
---|---|---|
JZ/JE | jump if ZF = 1 | jump if D = S |
JNZ/JNE | jump if ZF = 0 | jump if D ≠ S |
JGE | jump if SF = OF | jump if D ≥ S |
JG | jump if SF = OF and ZF = 0 | jump if D > S |
JLE | jump if SF ≠ OF or ZF = 1 | jump if D ≤ S |
JL | jump if SF ≠ OF | jump if D < S |
JC/JB | jump if CF = 1 | |
JNC | jump if CF = 0 | |
JO | jump if OF = 1 | |
JNO | jump if OF = 0 | |
JS | jump if SF = 1 | |
JNS | jump if SF = 0 | |
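The relationship between CMP and the conditional jumps can be checked with a short sketch. The helper names `cmp_flags`, `CONDITIONS`, and `taken` are hypothetical; the flag formulas assume 32-bit operands and follow the standard definitions of the zero, sign, carry, and overflow flags for a subtraction.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(d, s):
    """Flags that CMP D, S would set for 32-bit operands (computes D - S, discards the result)."""
    d &= MASK32
    s &= MASK32
    result = (d - s) & MASK32
    zf = int(result == 0)
    sf = result >> 31            # sign bit of the 32-bit result
    cf = int(d < s)              # unsigned borrow
    # Signed overflow: D and S have different signs and the result's sign differs from D's.
    of = (((d ^ s) & (d ^ result)) >> 31) & 1
    return {"ZF": zf, "SF": sf, "CF": cf, "OF": of}

# Conditions from the Meaning column of Table 6.15.
CONDITIONS = {
    "JZ":  lambda f: f["ZF"] == 1,
    "JNZ": lambda f: f["ZF"] == 0,
    "JGE": lambda f: f["SF"] == f["OF"],
    "JG":  lambda f: f["SF"] == f["OF"] and f["ZF"] == 0,
    "JLE": lambda f: f["SF"] != f["OF"] or f["ZF"] == 1,
    "JL":  lambda f: f["SF"] != f["OF"],
    "JB":  lambda f: f["CF"] == 1,
}

def taken(jcc, d, s):
    """Would the conditional jump be taken immediately after CMP d, s?"""
    return CONDITIONS[jcc](cmp_flags(d, s))

print(taken("JL", -1, 1))  # True: as signed values, -1 < 1
print(taken("JB", -1, 1))  # False: as unsigned values, 0xFFFFFFFF > 1
```

The last two lines show why x86 has separate signed (JL, JG) and unsigned (JB) jumps: the same bit patterns compare differently depending on the interpretation.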
6.8.5 x86 Instruction Encoding
The x86 instruction encodings are truly messy, a legacy of decades of piecemeal changes. Unlike RISC-V, whose instructions are uniformly 32 bits (or 16 bits for compressed instructions), x86 instructions vary from 1 to 15 bytes, as shown in Figure 6.37. The opcode may be 1, 2, or 3 bytes. It is followed by four optional fields: ModR/M, SIB, Displacement, and Immediate. ModR/M specifies an addressing mode. SIB specifies the scale, index, and base registers in certain addressing modes. Displacement indicates a 1-, 2-, or 4-byte displacement in certain addressing modes. Immediate is a 1-, 2-, or 4-byte constant for instructions using an immediate as the source operand. Moreover, an instruction can be preceded by up to four optional byte-long prefixes that modify its behavior.
Figure 6.37. x86 instruction encodings
The ModR/M byte uses the 2-bit Mod and 3-bit R/M fields to specify the addressing mode for one of the operands. The operand can come from one of the eight registers or from one of 24 memory addressing modes. Due to artifacts in the encodings, the ESP and EBP registers are not available for use as the base or index register in certain addressing modes. The Reg field specifies the register used as the other operand. For certain instructions that do not require a second operand, the Reg field is used to specify three more bits of the opcode.
In addressing modes using a scaled index register, the SIB byte specifies the index register and the scale (1, 2, 4, or 8). If both a base and index are used, the SIB byte also specifies the base register.
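To make the field layout concrete, here is a small sketch that splits ModR/M and SIB bytes into their bit fields. The function names are hypothetical. The example bytes come from the common encoding of `MOV EAX, [EAX+ECX*4]` as `8B 04 88`, using the usual register numbering (EAX = 0, ECX = 1, and so on).

```python
def split_modrm(byte):
    """Split a ModR/M byte into its 2-bit Mod, 3-bit Reg, and 3-bit R/M fields."""
    return {
        "mod": (byte >> 6) & 0b11,
        "reg": (byte >> 3) & 0b111,
        "rm": byte & 0b111,
    }

def split_sib(byte):
    """Split a SIB byte into scale factor (1, 2, 4, or 8), index register, and base register."""
    return {
        "scale": 1 << ((byte >> 6) & 0b11),
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }

# ModR/M = 0x04: Mod = 00, Reg = 000 (EAX), R/M = 100 (a SIB byte follows)
print(split_modrm(0x04))  # {'mod': 0, 'reg': 0, 'rm': 4}
# SIB = 0x88: scale = 4, index = 001 (ECX), base = 000 (EAX)
print(split_sib(0x88))    # {'scale': 4, 'index': 1, 'base': 0}
```

The R/M value 100 would otherwise name ESP; reusing it to mean "SIB byte follows" is one of the encoding artifacts that restricts how ESP can be used in addressing modes.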
RISC-V fully specifies the instruction in the op, funct3, and funct7 fields of the instruction. x86 uses a variable number of bits to specify different instructions. It uses fewer bits to specify more common instructions, decreasing the average length of the instructions. Some instructions even have multiple opcodes. For example, ADD AL, imm8 performs an 8-bit add of an immediate to AL. It is represented with the 1-byte opcode, 0x04, followed by a 1-byte immediate. The A register (AL, AX, or EAX) is called the accumulator. On the other hand, ADD D, imm8 performs an 8-bit add of an immediate to an arbitrary destination, D (memory or a register). It is represented with the 1-byte opcode 0x80 followed by one or more bytes specifying D, followed by a 1-byte immediate. Many instructions have shortened encodings when the destination is the accumulator.
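The two ADD encodings can be compared with a small encoder sketch. The function name and the `prefer_short` flag are hypothetical, and only register destinations are handled. For the general form, the ModR/M byte uses Mod = 11 (register operand) and a Reg field of 000, which acts as the opcode extension selecting ADD under opcode 0x80.

```python
# Standard numbering of the 8-bit registers in the R/M field.
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}

def encode_add_reg8_imm8(reg, imm, prefer_short=True):
    """Encode ADD reg8, imm8 for a register destination (illustrative sketch only)."""
    imm &= 0xFF
    if reg == "AL" and prefer_short:
        # Short accumulator form: opcode 0x04, then the immediate.
        return bytes([0x04, imm])
    # General form: opcode 0x80, ModR/M (Mod=11, Reg=000 for ADD, R/M=register), immediate.
    modrm = (0b11 << 6) | (0b000 << 3) | REG8[reg]
    return bytes([0x80, modrm, imm])

print(encode_add_reg8_imm8("AL", 5).hex())                      # 0405
print(encode_add_reg8_imm8("BL", 5).hex())                      # 80c305
print(encode_add_reg8_imm8("AL", 5, prefer_short=False).hex())  # 80c005
```

The first and third results perform the same operation, which illustrates the point that some instructions have multiple encodings and that the accumulator form saves a byte.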
In the original 8086, the opcode specified whether the instruction acted on 8- or 16-bit operands. When the 80386 introduced 32-bit operands, no new opcodes were available to specify the 32-bit form. Instead, the same opcode was used for both 16- and 32-bit forms. An additional bit in the code segment descriptor used by the OS specified which form the processor should choose. The bit is set to 0 for backward compatibility with 8086 programs, defaulting the opcode to 16-bit operands. It is set to 1 for programs to default to 32-bit operands. Moreover, the programmer can specify prefixes to change the form for a particular instruction. If the prefix 0x66 appears before the opcode, the alternative size operand is used (16 bits in 32-bit mode, or 32 bits in 16-bit mode).
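The effect of the default-size bit and the 0x66 prefix can be sketched with a decoder for a single instruction. The function names are hypothetical, and the decoder handles only opcode 0xB8 (move an immediate into the accumulator), which takes a 16- or 32-bit immediate depending on the operand size.

```python
def operand_size(default_32bit, has_0x66_prefix):
    """Operand size in bits, given the code segment default and whether 0x66 is present."""
    default, alternative = (32, 16) if default_32bit else (16, 32)
    return alternative if has_0x66_prefix else default

def decode_mov_acc_imm(code, default_32bit=True):
    """Decode opcode 0xB8 (MOV AX/EAX, imm) with an optional 0x66 prefix. Sketch only."""
    has_prefix = code[0] == 0x66
    body = code[1:] if has_prefix else code
    if body[0] != 0xB8:
        raise ValueError("this sketch only decodes opcode 0xB8")
    size = operand_size(default_32bit, has_prefix)
    # The immediate is stored little-endian and is size/8 bytes long.
    imm = int.from_bytes(body[1:1 + size // 8], "little")
    return ("EAX" if size == 32 else "AX", imm)

# Same opcode, different operand sizes in 32-bit mode:
print(decode_mov_acc_imm(bytes.fromhex("b878563412")))  # ('EAX', 305419896), i.e. 0x12345678
print(decode_mov_acc_imm(bytes.fromhex("66b83412")))    # ('AX', 4660), i.e. 0x1234
```

Note that the prefix changes how many immediate bytes follow the opcode, so a decoder cannot even find the start of the next instruction without tracking it.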
6.8.6 Other x86 Peculiarities
The 80286 introduced segmentation to divide memory into segments of up to 64 KB in length. When the OS enables segmentation, addresses are computed relative to the beginning of the segment. The processor checks for addresses that go beyond the end of the segment and indicates an error, thus preventing programs from accessing memory outside their own segment. Segmentation proved to be a hassle for programmers and is not used in modern versions of the Windows operating system.
x86 contains string instructions that act on entire strings of bytes or words. The operations include moving, comparing, or scanning for a specific value. In modern processors, these instructions are usually slower than performing the equivalent operation with a series of simpler instructions, so they are best avoided.
As mentioned earlier, the 0x66 prefix is used to choose between 16- and 32-bit operand sizes. Other prefixes include ones used to lock the bus (to control access to shared variables in a multiprocessor system), to predict whether a branch will be taken or not, and to repeat the instruction during a string move.
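The segment-relative address computation and bounds check can be sketched as follows. This is a simplified model under stated assumptions: the names `SegmentViolation` and `translate` are hypothetical, a segment is described only by a base address and a limit, and the limit is taken to be the highest valid offset within the segment.

```python
class SegmentViolation(Exception):
    """Raised when an offset falls beyond the end of the segment (hypothetical name)."""

def translate(base, limit, offset):
    """Return base + offset, or raise if the offset exceeds the segment limit."""
    if offset > limit:
        raise SegmentViolation(f"offset {offset:#x} exceeds limit {limit:#x}")
    return base + offset

# A 64 KB segment starting at 0x10000: valid offsets are 0x0000 through 0xFFFF.
print(hex(translate(0x10000, 0xFFFF, 0x1234)))  # 0x11234

try:
    translate(0x10000, 0xFFFF, 0x10000)
except SegmentViolation as err:
    print("fault:", err)
```

The check happens on every access, which is what keeps a program from reading or writing memory outside its own segment.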
Intel and Hewlett-Packard jointly developed a new 64-bit architecture called IA-64 in the mid-1990's. It was designed from a clean slate, bypassing the convoluted history of x86, taking advantage of 20 years of new research in computer architecture, and providing a 64-bit address space. However, the first IA-64 chip was too late to market and never became a commercial success. Most computers needing the large address space now use the 64-bit extensions of x86.
The bane of any architecture is to run out of memory capacity. With 32-bit addresses, x86 can access 4 GB of memory. This was far more than the largest computers had in 1985. However, by the early 2000's, it had become limiting. In 2003, AMD extended the address space and register sizes to 64 bits, calling the enhanced architecture AMD64. AMD64 has a compatibility mode that allows it to run 32-bit programs unmodified while the OS takes advantage of the bigger address space. In 2004, Intel gave in and adopted the 64-bit extensions, renaming them Extended Memory 64 Technology (EM64T). With 64-bit addresses, computers can access 16 exabytes (16 billion GB) of memory.
For those interested in examining x86 architecture in more detail, the x86 Intel Architecture Software Developer's Manual is freely available on Intel's website.
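The capacity figures follow directly from the address width, as this short calculation shows. It assumes byte-addressable memory and binary units (1 GB taken as 2^30 bytes and 1 exabyte as 2^60 bytes); the function name is hypothetical.

```python
GB = 2**30  # bytes per gigabyte, binary convention
EB = 2**60  # bytes per exabyte, binary convention

def addressable_bytes(address_bits):
    """Number of distinct byte addresses available with the given address width."""
    return 2**address_bits

print(addressable_bytes(32) // GB)  # 4
print(addressable_bytes(64) // EB)  # 16
print(addressable_bytes(64) // GB)  # 17179869184
```

The last figure, roughly 17 billion GB in binary units, corresponds to the "16 billion GB" quoted above.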
6.8.7 The Big Picture
This section has given a taste of some of the differences between the RISC-V architecture and the x86 CISC architecture. x86 tends to have shorter programs because a complex instruction is equivalent to a series of simple RISC-V instructions and because the instructions are encoded to minimize memory usage. However, the x86 architecture is a hodgepodge of features accumulated over the years, some of which are no longer useful but must be kept for compatibility with old programs. It has too few registers, and the instructions are difficult to decode. Merely explaining the instruction set is difficult. Despite all these failings, x86 is firmly entrenched as the dominant computer architecture for PCs because the value of software compatibility is so great and because the huge market justifies the effort required to build fast x86 microprocessors.
Read full chapter: https://www.sciencedirect.com/science/article/pii/B9780128200643000064
Source: https://www.sciencedirect.com/topics/computer-science/extended-instruction-pointer
Posted by: yeagereimstand.blogspot.com