Bypassing segmentation faults

serpretetsky

I'm doing some basic assembly language programming for x86-64 using gcc on Ubuntu 64-bit.

Anyway, I've noticed that whenever I try to access memory I'm not supposed to, my program breaks with a segmentation fault.

That by itself is not the problem, though. The problem is that I want some way to check whether a memory address is accessible to me without breaking out of my program and giving control back to the OS.

This is purely educational; I'm just trying to learn how all of this works.

For example: I want to create a loop that simply goes through every single memory address and prints the contents out to me. I realize that I won't have access to most addresses, but I don't want my program to stop; I'd rather it just printed an error and continued.

How can I do this? (Something I can do in assembly, or some option in the OS where I can tell it to simply ignore seg faults and give my program some error code?)

I'm very much a noob to assembly.

Also, something I'm confused about: what's the point of separating the text segment (code) from the data segment (data)? I accidentally put one of my strings under .text and it doesn't seem to make any difference from what I can tell. Is .text always read-only or something?
 
I'm not aware of a way to do this; it's really a lot more complicated than you think. Each program gets its own virtual memory map; accessing addresses outside this map is just plain invalid, they're not mapped to physical memory at all. The OS probably provides a mechanism via syscalls or a special device (e.g., /dev/mem) to access physical memory as a privileged user, but in any modern OS, a standard application will see its own virtual memory space and anything else won't be visible. If you want to examine your own applications, there should be debugging hooks you can pull in as well, but these will have to match the runtime you're using.

.text will normally be read only, and .data will normally be no-execute. It's security and sanity checking.

About all you can do to bypass it as a normal user would be to fork off processes or threads that communicate back to the main process which detects if they die, but I don't think this would give you what you're looking for. You'd see your program, some kernel space where you can jump to syscalls, and any libraries you've pulled in, and that's about it.
 
You can't trap segmentation faults. It's not complicated at all -- the system is in an undefined state after you stomp on someone else's memory. You can set up a signal handler, but that can't do much -- least of all, continue execution.

.text being read-only contributes to system efficiency. Since the pages are read-only, the OS knows it can perform a couple of optimizations. One is that since the data won't change, other programs running the same code can share those pages. Another is that the memory, if swapped out, can be read back from the executable image instead of the system's paging file.
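The closest you can get is longjmping out of the handler to abandon the access entirely, rather than continuing it. A rough Linux sketch of that trick (educational only; probe_read is a made-up helper, and whether the program is in a sane state afterwards is another matter):

Code:
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf probe_env;

/* On SIGSEGV, jump back out of the faulting access instead of
   letting the default action kill the process. */
static void segv_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if *addr could be read, 0 if the read faulted. */
static int probe_read(const volatile char *addr)
{
    if (sigsetjmp(probe_env, 1) == 0) {
        (void)*addr;    /* may fault */
        return 1;
    }
    return 0;           /* arrived here via the handler */
}

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = segv_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, NULL);

    int x = 42;
    printf("&x readable?   %d\n", probe_read((char *)&x)); /* 1 */
    printf("NULL readable? %d\n", probe_read(NULL));       /* 0 */
    return 0;
}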
 
Thank you for those quick responses. A couple of questions:
Each program gets its own virtual memory map; accessing addresses outside this map is just plain invalid, they're not mapped to physical memory at all
Does this mean that when my stack pointer is pointing to 0x7FFF0180dbc0, this does not correlate to my actual memory (i.e., I have 2 GB of memory, so it starts at 0x0 and goes to 0x80000000...)? ...lol, I just answered my own question while typing that. That obviously doesn't make sense: 0x7fff0180dbc0 is far out of range of 0x80000000, and even out of range of the limit of a 32-bit system.

Edit: wait. So why does the OS remap those addresses? What's the point? Why doesn't the OS just tell programs the exact addresses they can or can't access? Also, does this imply that every time I access memory, on top of the fact that memory is slow, there's also a delay for the OS to check that I have access to that memory, or is it hardwired somehow?

.text will normally be read only, and .data will normally be no-execute. It's security and sanity checking.
Does that mean my instruction pointer can never point to an address in the .data segment? I suppose I could test this by attempting to jump to an address in the .data segment.
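Something like this, maybe (a quick experiment assuming gcc on x86-64 Linux; casting a data array to a function pointer is technically undefined behavior, but it should demonstrate the point):

Code:
/* A single x86-64 "ret" instruction placed in .data. */
static unsigned char code[] = { 0xc3 };

int main(void)
{
    /* Undefined behavior, for experimentation only. */
    void (*fn)(void) = (void (*)(void))code;
    fn();   /* expect SIGSEGV here if .data is mapped no-execute */
    return 0;
}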
.text being read-only contributes to system efficiency ... the OS knows it can perform a couple of optimizations ... [one of them being] that the memory, if swapped out, can be read from the executable image instead of the system's paging file.
Wait, you mean executable code is never written to a swap partition or pagefile, but instead, if the program needs to be moved back into memory, it is just loaded back from the original executable?
(I don't think I understood you correctly, because that would imply that deleting the file while it belonged to a running process, but was paged out to the drive, would cause the program to crash once it started executing again.)
 
Wait, you mean executable code is never written to a swap partition or pagefile, but instead, if the program needs to be moved back into memory, it is just loaded back from the original executable?
(I don't think I understood you correctly, because that would imply that deleting the file while it belonged to a running process, but was paged out to the drive, would cause the program to crash once it started executing again.)

It depends on the OS. Certainly, that's the way it works on Windows. On Linux/Unix, things might be a little different. Since the OS also controls the file system, it's possible that the FS will know the file is locked and not allow the deletion. Or it will unlink the file but not actually delete it until the program has stopped running (or until the next reboot, or something else).

Removable media is also an issue. Windows linkers, given a specific option, set a bit in the executable header so that the loader will commit the pages to the swap file. This way, even if the media is ejected, the pages can be swapped back in.
 
Edit: wait. So why does the OS remap those addresses? What's the point? Why doesn't the OS just tell programs the exact addresses they can or can't access? Also, does this imply that every time I access memory, on top of the fact that memory is slow, there's also a delay for the OS to check that I have access to that memory, or is it hardwired somehow?
Pieces of data get paged in and out all the time. When they're reloaded after being paged out, they're not guaranteed to be in the same physical address. Thus, physical addressing of data would cause huge problems.
 
Thank you for the responses.
Pieces of data get paged in and out all the time. When they're reloaded after being paged out, they're not guaranteed to be in the same physical address. Thus, physical addressing of data would cause huge problems.
Yeah, that makes sense. I suppose the OS doesn't do much to check whether a program has access to an address, like I said below. That's all handled by protected-mode processors, huh? If that's true, what conditions have to occur in the processor for it to throw a segmentation fault, and in what form (certain flags have to be on, and it inserts an int?)

Also, does this imply that every time I access memory, on top of the fact that memory is slow, there's also a delay for the OS to check that I have access to that memory, or is it hardwired somehow?
 
(I don't think I understood you correctly, because that would imply that deleting the file while it belonged to a running process, but was paged out to the drive, would cause the program to crash once it started executing again.)

In Windows you are typically prevented from deleting the file in the first place, which causes no end of annoyance for users. One thing that constantly drives me nuts and still hasn't been fixed...

On the *nix side, the OS keeps track of open file handles, and while it unlinks the file from the directory tree, the data and inode structure remains on disk until the last file handle is closed. This also lets the OS deal properly with files that are moved or replaced while being accessed.
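You can watch that behavior with a few lines of C (nothing exotic, just POSIX calls): create a file, unlink it, and keep reading through the still-open descriptor:

Code:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("demo.txt", O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    write(fd, "still here\n", 11);
    unlink("demo.txt");         /* removes the name from the directory... */

    lseek(fd, 0, SEEK_SET);
    char buf[32];
    ssize_t n = read(fd, buf, sizeof buf);  /* ...but the data remains */
    if (n > 0)
        fwrite(buf, 1, (size_t)n, stdout);

    close(fd);                  /* last handle closed: now the inode is freed */
    return 0;
}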

Edit: wait. So why does the OS remap those addresses? What's the point?
There are a whole bunch of reasons. Lots of them based on protecting processes from each other, and access control to hardware. Some major practical reasons, like dealing well with dynamic memory allocation. With linear physical addressing you'd end up with major fragmentation quickly and would have to return a range of addresses or something for any large allocation request - a nightmare for development. It makes the concept of 'swapping' possible; without virtual memory when you run out of physical memory, well you're screwed. It's difficult to implement a modern multitasking OS without a proper MMU and virtual memory.

If that's true, what conditions have to occur in the processor for it to throw a segmentation fault, and in what form (certain flags have to be on, and it inserts an int?)
If the address isn't mapped in the page table, or you don't have the access to it you're trying to use, the processor will throw an exception. The OS/runtime can trap it and do what it wants, but doing anything other than kill the program isn't usually the right thing to do. A segfault indicates a major problem.
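On Linux you can see the tail end of this chain from userspace: the kernel turns the processor exception into a SIGSEGV, and a handler registered with SA_SIGINFO is handed the faulting address. A minimal sketch:

Code:
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* SA_SIGINFO handlers receive the faulting address in si_addr.
   (fprintf isn't strictly async-signal-safe; fine for a demo.) */
static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    fprintf(stderr, "fault at address %p\n", si->si_addr);
    _exit(1);   /* returning would just re-run the faulting instruction */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    *(volatile int *)0 = 1;     /* deliberate wild write */
    return 0;
}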
 
There must be some kernel calls that allow you to test if a certain address has been malloc'ed, and by what PID.
 
Thank you for those quick responses. A couple of questions:

Does this mean that when my stack pointer is pointing to 0x7FFF0180dbc0, this does not correlate to my actual memory (i.e., I have 2 GB of memory, so it starts at 0x0 and goes to 0x80000000...)? ...lol, I just answered my own question while typing that. That obviously doesn't make sense: 0x7fff0180dbc0 is far out of range of 0x80000000, and even out of range of the limit of a 32-bit system.

Edit: wait. So why does the OS remap those addresses? What's the point? Why doesn't the OS just tell programs the exact addresses they can or can't access? Also, does this imply that every time I access memory, on top of the fact that memory is slow, there's also a delay for the OS to check that I have access to that memory, or is it hardwired somehow?


Does that mean my instruction pointer can never point to an address in the .data segment? I suppose I could test this by attempting to jump to an address in the .data segment.

Wait, you mean executable code is never written to a swap partition or pagefile, but instead, if the program needs to be moved back into memory, it is just loaded back from the original executable?
(I don't think I understood you correctly, because that would imply that deleting the file while it belonged to a running process, but was paged out to the drive, would cause the program to crash once it started executing again.)

Memory mapping doesn't work that way. Your OS will map different segments of virtual memory to different segments of physical memory; you can't just assume addresses run from 0 up to however much memory you have. And yes, a virtual address you provide does not necessarily map to a physical address until the OS maps that memory for you in some low-level memory allocation call.
 
In Windows you are typically prevented from deleting the file in the first place, which causes no end of annoyance for users. One thing that constantly drives me nuts and still hasn't been fixed...

On the *nix side, the OS keeps track of open file handles, and while it unlinks the file from the directory tree, the data and inode structure remains on disk until the last file handle is closed. This also lets the OS deal properly with files that are moved or replaced while being accessed.
Tell me: in this scenario, how do Linux systems deal with the opportunity for mismatched code? The reason that Windows doesn't allow the in-flight replacement trick is that it causes complexity. I'm wondering how (and if, really) Linux systems have that completely sorted.

There must be some kernel calls that allow you to test if a certain address has been malloc'ed, and by what PID.
malloc() is a C runtime function, not a Unix system call. So, no: there isn't such a function defined by the standard. Some implementations provide functions like malloc_list() or malloc_dump() to show what's been allocated, but those are wrappers over the standard functions and aren't widely available.

If you're starting at a lower level, you can poke around in the kernel's memory management functions to find access to the list of pages, the mapped pages, and so on. But now you have to bridge the gap between the OS and the runtime library implementation, which varies from implementation to implementation.
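That said, on Linux specifically the mincore() syscall is worth knowing about: it can at least tell you whether a page in your own address space is currently mapped (though nothing about who allocated it, and certainly not a PID). A sketch:

Code:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

/* Returns 1 if the page containing addr is mapped in this process,
   0 if not. Says nothing about who allocated it or how. */
static int page_is_mapped(const void *addr)
{
    long page = sysconf(_SC_PAGESIZE);
    void *base = (void *)((uintptr_t)addr & ~((uintptr_t)page - 1));
    unsigned char vec;
    /* mincore() fails with ENOMEM when the range is unmapped. */
    return mincore(base, (size_t)page, &vec) == 0;
}

int main(void)
{
    char *p = malloc(16);
    printf("heap page mapped? %d\n", page_is_mapped(p));    /* 1 */
    printf("page 0 mapped?    %d\n", page_is_mapped(NULL)); /* 0 */
    free(p);
    return 0;
}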
 
Tell me: in this scenario, how do Linux systems deal with the opportunity for mismatched code? The reason that Windows doesn't allow the in-flight replacement trick is that it causes complexity. I'm wondering how (and if, really) Linux systems have that completely sorted.
You're right; as far as I know this issue isn't dealt with. I'm not sure it needs to be, though: if you're updating shared libraries, it's easy enough to find out which applications have linked them in and restart those, which you have to do in the Windows case anyway to get them unlocked in the first place. I also don't think it's very common for an application in its fully running state to randomly pull in new shared libraries where it might run into version problems; generally I think you'll have everything you need pulled in within the first few uses of the application, assuming it's long-lived. So even if you did do in-place replacement, it wouldn't cause the issues outlined in the article. Plus, ABI compatibility is generally preserved between versions, so even if there were the potential for issues, you'd be unlikely to hit any.

You might be able to fabricate a pathological executable that will crash if you do this kind of in-flight replacement, but I think for the vast majority of actual code it won't cause any issues, and where it might it's not hard to force the related processes to restart. This issue will only come up under very specific use and I don't think one avoidable-in-other-ways case warrants the 'lock all files that are open' approach, but it is a choice to be made.
 
So... more noob questions.

How do system drivers work? I mean, they need direct access to actual physical memory addresses, right (which may or may not correlate to RAM)? Or do they also have their own set of remapped addresses?

If the address isn't mapped in the page table, or you don't have the access to it you're trying to use, the processor will throw an exception. The OS/runtime can trap it and do what it wants, but doing anything other than kill the program isn't usually the right thing to do. A segfault indicates a major problem.
But what actually needs to happen at the machine level? How does the processor know to throw an exception when all it's doing is executing machine instructions? I guess I'm asking: how does it differentiate instructions that may have access to memory from instructions that are restricted?

Also, what form does this "exception" take? Is it an instruction that is inserted into the code stream that jumps the program somewhere else?

I'm sorry I'm asking all these questions; not because asking them is bad, but because looking at it, it seems like I just didn't do enough research. But I've honestly never been able to understand this. I still have a hard time grasping all of it, including things like "protected mode" in a processor.
 
So... more noob questions.

How do system drivers work? I mean, they need direct access to actual physical memory addresses, right (which may or may not correlate to RAM)? Or do they also have their own set of remapped addresses?
They run in the same context as the kernel itself, where they have privileges to map anything they want. Exactly how this happens (resource allocation and mapping) is going to be very OS dependent.


But what actually needs to happen at the machine level? How does the processor know to throw an exception when all it's doing is executing machine instructions? I guess I'm asking: how does it differentiate instructions that may have access to memory from instructions that are restricted?
A table of virtual pages->physical base addresses is maintained in the processor TLB. On a context switch to a new process or thread, the OS must update the TLB to keep it accurate. This table can also store information such as whether a page is writable or executable and some other miscellany, depending on the system architecture. When a process performs a load or write to an address, that address is looked up in the TLB. If a match is found and the requested access is allowed, the program proceeds. If it fails to match or there is no permission, a page fault occurs.
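If it helps to see the idea in code, here's a toy software model of that lookup. This is purely illustrative (all the names and the layout are invented); real hardware does this in parallel in the MMU and falls back to a page-table walk on a miss:

Code:
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12                 /* 4 KiB pages */
#define PAGE_MASK   ((1u << PAGE_SHIFT) - 1)
#define TLB_ENTRIES 64

/* Hypothetical TLB entry: one cached page translation. */
struct tlb_entry {
    uint64_t vpn;                      /* virtual page number */
    uint64_t pfn;                      /* physical frame number */
    bool valid, writable;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Translate a virtual address for a write access. Returns true and
   fills *phys on a hit with permission; false models the fault path
   (TLB miss -> page-table walk, or protection violation). */
static bool translate_write(uint64_t vaddr, uint64_t *phys)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            if (!tlb[i].writable)
                return false;                   /* protection fault */
            *phys = (tlb[i].pfn << PAGE_SHIFT) | (vaddr & PAGE_MASK);
            return true;                        /* access proceeds */
        }
    }
    return false;                               /* miss: walk or fault */
}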

Also, what form does this "exception" take? Is it an instruction that is inserted into the code stream that jumps the program somewhere else?
A page fault is an interrupt, and the OS will have loaded an interrupt vector with the address of its page fault handler ahead of time. There may be multiple interrupts depending on the type of failure (no mapping, invalid access, etc.). Page faults aren't necessarily bad; they could just mean that the page has been flushed out to disk and needs to be reloaded before the program can continue. However, the OS is aware of the memory map on the system, and if the requested address isn't valid, it will terminate the application.
 
A table of virtual pages->physical base addresses is maintained in the processor TLB. On a context switch to a new process or thread, the OS must update the TLB to keep it accurate. This table can also store information such as whether a page is writable or executable and some other miscellany, depending on the system architecture. When a process performs a load or write to an address, that address is looked up in the TLB. If a match is found and the requested access is allowed, the program proceeds. If it fails to match or there is no permission, a page fault occurs.

Fascinating. Is there a way for my program to access the TLB to see what addresses are mapped where?

Also, how much of this is relevant to 16-bit real mode? Does the CPU ignore pretty much all of this in real mode?
 
I also don't think it's very common for an application in its fully running state to randomly pull in new shared libraries where it might run into version problems; generally I think you'll have everything you need pulled in within the first few uses of the application, assuming it's long-lived
Did you read the article mikeblas linked? As Raymond points out, that's hardly the biggest issue. The biggest to me would be Inter-Process Communication. Example:

Program A starts, loads version 1.0 of A.DLL.
A.DLL is replaced with version 1.1.
Program B starts, loads version 1.1 of A.DLL.
Program B attempts to talk to Program A (Raymond uses the common example of drag-drop), assuming it's using the same version of A.DLL.

At this point, bad things happen.
But what actually needs to happen at the machine level? How does the processor know to throw an exception when all it's doing is executing machine instructions? I guess I'm asking: how does it differentiate instructions that may have access to memory from instructions that are restricted?

Also, what form does this "exception" take? Is it an instruction that is inserted into the code stream that jumps the program somewhere else?
On the machine level, it starts getting more complicated - this is approaching the realm of a 400-level college course. If you really want to learn how processors work, I highly recommend the Patterson/Hennessy text - it's what I learned from in college and is pretty much the definitive book on the subject.
 
At this point, bad things happen.
On the machine level, it starts getting more complicated - this is approaching the realm of a 400-level college course. If you really want to learn how processors work, I highly recommend the Patterson/Hennessy text - it's what I learned from in college and is pretty much the definitive book on the subject.
Thank you. I think at this point I'll probably just try to calm down and continue at the same pace as my assembly class. If it's that complicated, I should probably get back to basics :). I'll check that book out in the future, though.
 
Did you read the article mikeblas linked? As Raymond points out, that's hardly the biggest issue. The biggest to me would be Inter-Process Communication. Example:

Program A starts, loads version 1.0 of A.DLL.
A.DLL is replaced with version 1.1.
Program B starts, loads version 1.1 of A.DLL.
Program B attempts to talk to Program A (Raymond uses the common example of drag-drop), assuming it's using the same version of A.DLL.

This is why you abstract your interfaces and avoid changing them between versions... The majority of IPC on a *nix system occurs via UNIX sockets or TCP/IP, with well-defined interfaces that don't often change. Even when RPC or shared memory are used, I'm not sure this would cause any kind of serious issue. In fact, I can't think of a single situation where you'd have such tight coupling between two discrete components that don't link the same shared library; and if they do link it, they can be restarted as already described. If a real, serious issue like this is exposed by a library update, it should be obvious. The system scaffolding can handle these cases when they crop up: for example, most distributions will automatically restart daemons that use certain parts of libc when it is updated, since there is an internal ABI between libc and libnss that is not guaranteed to be compatible across versions. Note that this is only really necessary because restarting everything that links libc would mean restarting the whole system, so finer-grained control is desired.

Certainly this doesn't seem to me to be a good reason to prevent users wholesale from making changes to any open file. However, if this type of RPC against a tightly coupled interface that can't be commonly linked is common on Windows, it may be a more legitimate issue there. Certainly this style of programming is not particularly common among UNIX applications, though with Mono and the like we may see more of these kinds of issues. Either way, I'm not a fan of the OS protecting me from myself or 'my' developers.

Also, how much of this is relevant to 16-bit real mode? Does the CPU ignore pretty much all of this in real mode?
None. In real mode (as the name implies), all addresses are real physical ones and no virtual memory mapping is available. It has its own odd quirks though. x86 is not a pretty architecture by any measure.

Fascinating. Is there a way for my program to access the TLB to see what addresses are mapped where?
I doubt it without some debug instrumentation in the kernel. If you take a systems/OS class you'll probably spend enough time in the kernel that you'd be able to get what you're looking for. In mine we wrote some drivers, a memory allocator, added some syscalls and devices, modified the process scheduler...it was a fun class.
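One thing you can do from userspace on Linux, though: the kernel will show you your own virtual mappings (not the TLB or the physical layout) through /proc/self/maps:

Code:
#include <stdio.h>

/* Dump this process's virtual memory map (Linux-specific). Each line
   shows an address range, permissions, and the backing file if any. */
int main(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    int c;
    while ((c = fgetc(f)) != EOF)
        putchar(c);
    fclose(f);
    return 0;
}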
 
You might be able to fabricate a pathological executable that will crash if you do this kind of in-flight replacement, but I think for the vast majority of actual code it won't cause any issues, and where it might it's not hard to force the related processes to restart. This issue will only come up under very specific use and I don't think one avoidable-in-other-ways case warrants the 'lock all files that are open' approach, but it is a choice to be made.

Not really. In Windows, it happens with all the system DLLs. There's a "common controls" DLL, for example, which includes the generic file open dialog that all applications provide. If a patch changes that while program A is running and program B is not, then different versions of the DLL are loaded by different processes.

I used a tangible example that doesn't do much communication, but the problem still exists: when program A loaded, it asked the library how big a certain structure was. It then cached that value (either numerically, or by allocating the memory with that size). On a subsequent call, if the size has changed ...

Examples that do communication exist throughout Windows; Kernel32.DLL is really just a DLL, and any DLL can decide to handle window messages or callbacks.

Good abstraction doesn't guarantee binary compatibility, and certainly doesn't guarantee semantic compatibility.

In the real world -- commercial and consumer applications -- the OS has to protect itself from users. It also has to protect itself from developers who think they know better.
But what actually needs to happen at the machine level? How does the processor know to throw an exception when all it's doing is executing machine instructions? I guess I'm asking: how does it differentiate instructions that may have access to memory from instructions that are restricted?

Also, what form does this "exception" take? Is it an instruction that is inserted into the code stream that jumps the program somewhere else?

Kind of. It depends on both the OS and the processor architecture, but there are a few different ways to skin the cat.

On Intel machines, a program tries to access protected memory. The processor sees that, and throws an interrupt. The OS handles the interrupt and decides what to do. Generally, it's going to figure out what process caused the interrupt and terminate it. On Windows, that starts a chain of events: it asks the application if it has a dump handler, calls it, collects a minidump, lets the user optionally send it to Microsoft for analysis, and so on.

Exceptions might involve language-based exceptions (like in SQL or C++), where the developer can raise a trap and then handle it themselves. Usually this is done by recording information on the program stack. If an exception is thrown, the code handling it starts by looking at the stack to see if there is a developer-provided handler to call. If not, it unwinds the stack context, destroying local objects, and moves on to the next stack frame until it finds either the end of the stack or a handler.

If a handler is found, it is called and it executes. If no handler is found, the application calls the OS and tells it that it died an abnormal death. The OS can do whatever it wants. You can do a simple experiment to see how your OS behaves:

Code:
int main()
{
   throw 51; // throws!
   return 0; // never reached
}

When you step through this in your debugger, in assembly, you should see the throw quickly run out of contexts to unwind, then go through the runtimes to abend the program by notifying the OS.

Some platforms, like Windows, intermingle these mechanisms. You can tell the Microsoft compilers, for example (via the /EHa switch), that you want "catch" to be able to catch system exceptions (like general protection faults) and call your handler. That gives a regular try/throw/catch in C++ the same semantics as the OS-level exceptions.
 
Good abstraction doesn't guarantee binary compatibility, and certainly doesn't guarantee semantic compatibility.
Well yes, of course it doesn't. But good abstraction makes maintaining ABI compatibility pretty easy. It has to be, since individual libraries are changed all the time on a *nix system without rebuilding the applications that link to them. As I said, changes that break this compatibility are fairly rare and require many applications to be rebuilt. In these situations it is typical to see a major version number change and to keep both libraries on hand during a transition period until all applications can be updated to link to the new, incompatible version.

There are some exceptions to this, but the linker does versioning, and if the developer indicates that the API/ABI of the library has changed, it will deal with this gracefully. If the requested library has been removed, the application will get the typical missing-library error. If it's been moved and the linker has been reconfigured to be aware of its new location, it will work as expected, pulling in the old version automatically.

There's some light documentation on this here: http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html
 
You're describing features of a linker, not features of a good abstraction. As such, I'm not sure how you're making the point that good abstraction makes ABI-compatibility easy. The major part of ABI compatibility isn't the abstraction; it's noticing the subtle changes that require a break in the ABI interface.

I think that the versioning you describe ends up meaning that most applications just install the version of the library they expect, and use it. The libraries don't change; the versions just stack up. Shared objects, then, end up not really being shared effectively.
 
You're describing features of a linker, not features of a good abstraction. As such, I'm not sure how you're making the point that good abstraction makes ABI-compatibility easy. The major part of ABI compatibility isn't the abstraction; it's noticing the subtle changes that require a break in the ABI interface.
No, I'm making both points. Using good abstractions means breaking ABI compatibility isn't often necessary, which allows the library versioning to be effective without getting a billion different versions of every library on the system. At most you'll have two, in almost every case.

I think that the versioning you describe ends up meaning that most applications just install the version of the library they expect, and use it. The libraries don't change; the versions just stack up. Shared objects, then, end up not really being shared effectively.

Applications don't generally install libraries. They are separate entities, installed and maintained separately, which is why having a stable ABI is important. The versioning I describe means the linker can either fail with an error, forcing the user to restart the application (as would happen if the library weren't installed), or automatically link the older version if it's retained. But as I outlined in my earlier point, that is quite rarely necessary and it's standard practice to upgrade libraries independent of the applications that link them. If an ABI break is required, it's a 'big deal', and will be handled as such - usually a new major version number, and most distributions will hold off on using it until their next major version.
 
But what actually needs to happen at the machine level? How does the processor know to throw an exception when all it's doing is executing machine instructions? I guess I'm asking: how does it differentiate instructions that may have access to memory from instructions that are restricted?
A processor is actually doing much more than just executing instructions.

There are circuits inside the processor which determine whether the memory access being requested is valid. It will check the location the instruction wants to address against a variety of rules which have been built into the circuitry (for example, addressing memory which it does not own).

Depending on the exception, the processor might interrupt the regular execution of instructions, and jump to another section of memory where it will resume execution and the failure will be handled.

Preempting execution like this serves many purposes, such as allowing multiple processes to run simultaneously: the processor will switch execution to the OS scheduler at timed intervals, at which point the OS can decide what to execute next.
 