What is the "FS"/"GS" register intended for?

assembly x86 cpu-architecture cpu-registers memory-segmentation


Solution 1

There is what they were intended for, and what they are used for by Windows and Linux.

The original intention behind the segment registers was to allow a program to access many different (large) segments of memory that were intended to be independent and part of a persistent virtual store. The idea was taken from the 1966 Multics operating system, that treated files as simply addressable memory segments. No BS "Open file, write record, close file", just "Store this value into that virtual data segment" with dirty page flushing.

Our current 2010 operating systems are a giant step backwards, which is why they are called "Eunuchs". You can only address your process space's single segment, giving a so-called "flat (IMHO dull) address space". The segment registers on the x86-32 machine can still be used for real segment registers, but nobody has bothered (Andy Grove, former Intel president, had a rather famous public fit last century when he figured out after all those Intel engineers spent energy and his money to implement this feature, that nobody was going to use it. Go, Andy!)

AMD in going to 64 bits decided they didn't care if they eliminated Multics as a choice (that's the charitable interpretation; the uncharitable one is they were clueless about Multics) and so disabled the general capability of segment registers in 64 bit mode. There was still a need for threads to access thread local store, and each thread needed a pointer ... somewhere in the immediately accessible thread state (e.g., in the registers) ... to thread local store. Since Windows and Linux both used FS and GS (thanks Nick for the clarification) for this purpose in the 32 bit version, AMD decided to let the 64 bit segment registers (GS and FS) be used essentially only for this purpose (I think you can make them point anywhere in your process space; I don't know if the application code can load them or not). Intel in their panic to not lose market share to AMD on 64 bits, and Andy being retired, decided to just copy AMD's scheme.

It would have been architecturally prettier IMHO to make each thread's memory map have an absolute virtual address (e.g, 0-FFF say) that was its thread local storage (no [segment] register pointer needed!); I did this in an 8 bit OS back in the 1970s and it was extremely handy, like having another big stack of registers to work in.

So, the segment registers are now kind of like your appendix. They serve a vestigial purpose. To our collective loss.

Those that don't know history aren't doomed to repeat it; they're doomed to doing something dumber.

Solution 2

The registers FS and GS are segment registers. They have no processor-defined purpose; instead, they are given a purpose by the OSes that run on the processor. FS and GS are commonly used by OS kernels to access thread-specific memory. In Windows, the GS register is used to manage thread-specific memory in the 64-bit versions, where it points to operating-system-defined structures (32-bit Windows uses FS for this). The Linux kernel uses GS to access CPU-specific memory.

Solution 3

FS is used to point to the thread information block (TIB) in 32-bit Windows processes.

One typical example is structured exception handling (SEH), which stores a pointer to the current exception registration record, and through it the handler callback function, at FS:[0x00].

GS is commonly used as a pointer to thread-local storage (TLS). One example that you might have seen before is stack canary protection (StackGuard); in GCC output for 32-bit Linux you might see something like this:

mov    eax,gs:0x14
mov    DWORD PTR [ebp-0xc],eax
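
The same FS/GS-relative addressing backs ordinary thread-local variables in C. The following is only a minimal sketch, assuming a C11 compiler and POSIX threads; the names `counter`, `worker` and `tls_copies_are_independent` are made up for illustration. On x86 Linux, GCC typically compiles accesses to such a variable into FS-relative (64-bit) or GS-relative (32-bit) memory operands, which you can check in the generated assembly.

```c
#include <pthread.h>
#include <stddef.h>

/* One copy of this variable exists per thread. On x86 the compiler
   typically reaches it through an FS- or GS-relative address. */
static _Thread_local int counter = 0;

/* Hypothetical worker: increments its own thread's copy n times
   (n is passed in through arg) and writes the final value back. */
static void *worker(void *arg)
{
    int *slot = arg;
    int n = *slot;
    for (int i = 0; i < n; i++)
        counter++;
    *slot = counter;
    return NULL;
}

/* Runs two workers next to the calling thread. Returns 1 if every
   thread saw only its own increments, 0 otherwise. */
int tls_copies_are_independent(void)
{
    pthread_t a, b;
    int ra = 3, rb = 5;

    counter = 100; /* the calling thread's private copy */
    if (pthread_create(&a, NULL, worker, &ra) != 0)
        return 0;
    if (pthread_create(&b, NULL, worker, &rb) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return ra == 3 && rb == 5 && counter == 100;
}
```

Each new thread starts from the initializer value 0, so the workers end at 3 and 5 while the calling thread still sees 100. Older toolchains may need `-pthread` when compiling.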

Solution 4

TL;DR;

What is the “FS”/“GS” register intended for?

Simply to access data beyond the default data segment (DS). Exactly like ES.

The Long Read:

So I know what the following registers and their uses are supposed to be:

[...]

Well, almost, but DS is not 'some' Data Segment, but the default one, where all operations take place by default (*1). This is where all default variables are located - essentially data and bss. It is, in a way, part of the reason why x86 code is rather compact: all essential data, which is what is most often accessed, (plus code and stack) is within 16 bit shorthand distance.

ES is used to access everything else (*2), everything beyond the 64 KiB of DS. Like the text of a word processor, the cells of a spreadsheet, or the picture data of a graphics program and so on. Unlike often assumed, this data doesn't get as much accessed, so needing a prefix hurts less than using longer address fields.

Similarly, it's only a minor annoyance that DS and ES might have to be loaded (and reloaded) when doing string operations - this at least is offset by one of the best character handling instruction sets of its time.

What really hurts is when user data exceeds 64 KiB and operations have to be carried out on it. While some operations are simply done on a single data item at a time (think A=A*2), most require two (A=A*B) or three data items (A=B*C). If these items reside in different segments, ES will be reloaded several times per operation, adding quite some overhead.
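
The 64 KiB reach of a single segment register follows from how the 8086 forms addresses in real mode: the 16-bit segment value is multiplied by 16 and added to a 16-bit offset. A small sketch of that arithmetic, with function names chosen here purely for illustration:

```c
#include <stdint.h>

/* Real-mode 8086 address formation: the 16-bit segment value is
   shifted left by 4 bits (multiplied by 16) and added to the 16-bit
   offset. The 8086 has 20 address lines, so the sum wraps at 1 MiB. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* Hypothetical helper: can this physical address be reached through
   the given segment value without reloading the segment register?
   (Ignores the wrap-around at 1 MiB to keep the sketch short.) */
int reachable_from(uint16_t segment, uint32_t physical)
{
    uint32_t base = (uint32_t)segment << 4;
    return physical >= base && physical - base <= 0xFFFFu;
}
```

With a segment value of 0x1000, everything from 0x10000 through 0x1FFFF is reachable; the byte at 0x20000 is not, so touching it means loading a different value into a segment register first.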

In the beginning, with small programs from the 8 bit world (*3) and equally small data sets, it wasn't a big deal, but it soon became a major performance bottleneck - and more so a true pain in the ass for programmers (and compilers). With the 386 Intel finally delivered relief by adding two more segment registers, so any series of unary, binary or ternary operations, with elements spread out in memory, could take place without reloading ES all the time.

For programming (at least in assembly) and compiler design, this was quite a gain. Of course, there could have been even more, but with three the bottleneck was basically gone, so no need to overdo it.

Naming wise the letters F/G are simply alphabetic continuations after E. At least from the point of CPU design nothing is associated.

*1 - The usage of ES for string destination is an exception, simply because two segment registers are needed. Without it, the string instructions wouldn't be very useful, or would always need a segment prefix, which would kill one of their surprising features: (non repetitive) string instructions deliver extreme performance thanks to their single byte encoding.

*2 - So in hindsight 'Everything Else Segment' would have been a way better naming than 'Extra Segment'.

*3 - It's always important to keep in mind that the 8086 was only meant as a stop gap measure until the 8800 was finished and mainly intended for the embedded world to keep 8080/85 customers on board.

Solution 5

According to the Intel Manual, in 64-bit mode these registers are intended to be used as additional base registers in some linear address calculations. I pulled this from section 3.7.4.1 (pg. 86 in the 4 volume set). Usually when the CPU is in this mode, linear address is the same as effective address, because segmentation is often not used in this mode.
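
As a rough model of that calculation (a sketch only; the type and function names are invented here and are not part of any Intel-defined interface): in 64-bit mode the CS, DS, ES and SS bases are treated as zero, so only an FS or GS override changes the result.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; not an Intel-defined API. */
enum seg { SEG_CS, SEG_DS, SEG_ES, SEG_SS, SEG_FS, SEG_GS };

/* The only two segment bases that still matter in 64-bit mode. */
struct seg_bases {
    uint64_t fs_base;
    uint64_t gs_base;
};

/* In 64-bit mode the CS, DS, ES and SS bases are treated as zero, so
   the linear address equals the effective address. An FS or GS
   override adds that register's base to the effective address. */
uint64_t linear_address(const struct seg_bases *b, enum seg s,
                        uint64_t effective)
{
    switch (s) {
    case SEG_FS: return b->fs_base + effective;
    case SEG_GS: return b->gs_base + effective;
    default:     return effective;
    }
}
```

The base values themselves are set up by the operating system, which is why the meaning of an FS- or GS-relative access depends on the OS rather than on the CPU.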

So in this flat address space, FS & GS play a role in addressing not just local data but certain operating system data structures (pg 2793, section 3.2.4). Thus these registers were intended to be used by the operating system, however its particular designers determine.

There is some interesting trickery when using overrides in both 32 & 64-bit modes but this involves privileged software.

From the perspective of "original intentions," that's tough to say other than they are just extra registers. When the CPU is in real address mode, this is like the processor is running as a high speed 8086 and these registers have to be explicitly accessed by a program. For the sake of true 8086 emulation you'd run the CPU in virtual-8086 mode and these registers would not be used.



Author by

user541686

Updated on September 17, 2021

Comments

user541686 over 2 years
So I know what the following registers and their uses are supposed to be:
- CS = Code Segment (used for IP)
- DS = Data Segment (used for MOV)
- ES = Destination Segment (used for MOVS, etc.)
- SS = Stack Segment (used for SP)
But what are the following registers intended to be used for?
- FS = "File Segment"?
- GS = ???
Note: I'm not asking about any particular operating system -- I'm asking about what they were intended to be used for by the CPU, if anything.
user541686 almost 12 years

lol. Until NaCl came along I guess. :) +1 great explanation, didn't know about the first paragraph
Ira Baxter almost 12 years

I just looked up NaCL; you mean Google's Chrome sandbox? What's the relationship to segmented VM?
user541686 almost 12 years

Yup, I mean Chrome's Native Client (which is sandboxed). It uses segmentation, which kind of made it a bummer that x64 doesn't support it.
supercat over 10 years

Were they intended to be used for OS-defined purposes, or to facilitate code which needs to do something like *dest++ = lookup[*src++]; which would otherwise be rather awkward if dest, lookup, and src were at three unrelated locations.
supercat over 10 years

The 8086's segmenting was brilliant. I know of no better scheme for allowing a machine with a given register size to access an effective address space 16 times as big. Even today 8086-style segmentation (but with a 32-bit segment register shifted and added to a 32-bit offset) could be better than linear addressing, allowing 32-bit object references to access 64GB of addressing space. With some slight tweaks, the useful range of 32-bit object references could be pushed out even further.
Ira Baxter over 10 years

@supercat: A simpler, more brilliant scheme that would have let them address 65536 times as much storage would have been to treat the segment registers as a full upper 16 bit extension of the lower 16 bits, which is in essence what the 286, 386 and Multics did.
supercat over 10 years

@IraBaxter: The problem with that approach is that 80286-style segments have a sufficiently high overhead that one ends up having to store many objects in each segment, and thus store both segment and offset on every pointer. By contrast, if one is willing to round memory allocations up to multiples of 16 bytes, 8086-style segmentation allows one to use the segment alone as a means of identifying an object. Rounding allocations up to 16 bytes might have been slightly irksome in 1980, but would represent a win today if it reduced the size of each object reference from 8 bytes to four.
ahoka over 10 years

"It would have been architecturally prettier IMHO to make each thread's memory map have an absolute virtual address (e.g, 0-FFF say) that was its thread local storage (no [segment] register pointer needed!)" Can you elaborate on this?
Ira Baxter over 10 years

It mostly speaks for itself. Each thread needs storage of its own. Having that storage at a known, constant address makes it convenient to access (Windows TLS storage takes several machine instructions to access if you cheat, and dozens if you call the official GetTLS API). Setting aside virtual page zero would achieve this; the OS would have to set a map page for each thread when it scheduled the thread; that's presumably cheap. ...
Ira Baxter over 10 years

... I did a version of this for 8 bit 6800 microprocessors. There was no memory map, so instead scheduler saved/restored 8 bytes of page zero context (literally at location 0) per thread. With only 3 registers available, having 8 bytes of always-available, thread specific space simplified assembly language code tremendously and actually gave us faster code since page zero accesses were cheap.
diogovk over 9 years

One thing that is not clear to me is, since those registers are not used in modern operating systems, would they be free for general use?
Ira Baxter over 9 years

Those registers are used in modern operating systems. They're mostly dedicated to point to information about task control blocks, at least in the two major OSes now available for x86 chips. And, since they are no longer "general purpose" even for their original intent, you can't use them for much. Better to pretend on x86-64 systems that they simply don't exist until you need the information they let you access in the thread control blocks.
Olorin about 9 years

I'm glad to see another soul who thinks this whole flat memory fetish is overrated. In fact I wouldn't be surprised if at some point they reintroduce it back into 64-bit mode. Think about it : with all the security scandals going around these days, every compiler needs to check for all kinds of attacks, like buffer overflows, stack overflow attacks, ... This can be done with very little overhead by the CPU if segmentation is used (for example set the limit of your top-down stack segment to the limit of your code segment).
Olorin about 9 years

And now that I think of it: a lot of the discussions are about how inefficient segmentation is, but in 32 bit (and possibly 64 bit) mode you hardly need to manage segments at all, just set the right limits and access rights. Intel COULD have decided to optimize their CPU's for this, like add a descriptor cache or so, but they didn't. Also many people forget that paging also slows down the system by about 3% (4K pages), and that's not counting management of the page tables.
Ira Baxter about 9 years

And with lots of bits on the CPUs, one can have lots of segment registers live at once acting as a fine local cache. Yes, I think the flat address space was a truly dumb idea. Multics handled the majority of its security issues by attaching what amount to capabilities to the segments.
Nedko over 8 years

On Windows FS is indeed for thread specific storage. See documented map of the block pointed by FS here en.wikipedia.org/wiki/Win32_Thread_Information_Block
code_dredd over 8 years

The appendix analogy is really bad based on outdated science; it's related to the immune system, so definitely not "vestigial". It detracts from the actual post. Other than that, it's a good response.
Peter Cordes almost 8 years

Using the actual zero page sounds like a bad idea, because it's easier to debug when NULL-pointer dereferences fault. (And it's highly convenient + sometimes incorrectly assumed for the bit-representation of NULL to be all-zeros). The same idea would work with the top few page(s) of virtual address space, or always leaving the low 64k unmapped and putting TLS after that.
Peter Cordes almost 8 years

Also note that this scheme would require context-switches between two threads of the same process to modify the page tables, but in the current scheme all threads share the same page table. TLS is fairly rarely used, I think. A tiny bit of extra code-size / perf overhead in code that does use it is probably preferable to extra context-switch overhead. Of course, the TLS area could be architecturally supported somehow to avoid TLB invalidation.
Ira Baxter almost 8 years

Yes, you could remove the lower most (zero page) and upper most (FF'd page) from the address space to catch null/small integer values used as pointers, and use some other place in the address space. Yes, the TLB switch for the TLS variables might be expensive on a context switch, but if you are right and TLS isn't often accessed, you could delay the TLB load until the first TLS access after the switch (or at least until some available unbusy bus cycles). If you have a LOT of parallel grains (my PARLANSE language does), you will find higher use for TLS (at least, that's our experience).
Nick over 7 years

I'd just like to throw out there that you're a bit incorrect about "Since Windows and Linux both used FS for [thread local storage] purpose in the 32 bit version". Linux used/uses GS in 32-bit environments. In fact, Windows/Linux toggled their entries to be opposites in both 32 and 64 bit environments: Linux32:GS, Linux64:FS, Win32:FS, Win64:GS. And they say "opposites attract" - more like "opposites attack".
Ira Baxter over 7 years

@Nick: thanks for the clarification. I'll take your word as gospel on the details; I've modified the text. However, I don't think this detail mattered much for this discussion. The main point is the original valuable ideas/implementations behind these registers has been irretrievably lost, and we are all paying for it, long term, IMHO.
Martin over 7 years

I might be doing thread necromancy here, but isn't the problem with Multics and segmentation in general that it's all insanely complex? Even today we still have Intel manuals rambling endlessly about segmentation when nobody has used it since time immemorial. I can praise people who're able to solve technical problems with limited resources, but the worship of complexity that I'm seeing here is just baffling.
Ira Baxter over 7 years

An OS is an insanely complex artifact if you include all the services it provides. Multics provided extremely good security; modern systems seem pretty bad at this because we gave up that complexity. If you don't use the features, yes, the description of them all seems pretty pointless and wasted. Instead we worry about viruses, data theft and soon probably real computer controlled industrial damage (remember Stuxnet?). Seems like a bad trade to me.
Bodo Thiesen over 7 years

I believe, segmentation died with the introduction of the 80386 CPU which kept segment registers 16 bit wide, thus effectively preventing the concept of "every malloc returns a new segment" (including free of charge valgrind, and if you ban out local variable arrays into temporarily malloced off-stack memory blocks handled by the compiler directly, then we would have never had to deal with buffer overflows at all).
Bodo Thiesen over 7 years

And of course: The 80386 allowed a segment to be up to 4GB in size but only offered a 4GB linear address space as well, so preventing two (or more) non-overlapping 4GB segments to exist at the same time. (With 6 segment registers, we would have needed at least 24GB of linear address space).
Ira Baxter over 7 years

@BodoThiesen: Yes, there weren't enough unique segment IDs to make malloc-returns-segment practical. But there were plenty to provide access to DLLs, files, thread control blocks, and other entities that required access controls or needed sharing across processes (including modest numbers of "shared memory" regions). That's still incredibly practical as the original Multics system showed. And using segments-per-malloc doesn't really solve the buffer overflow problem; consider a malloc'd struct that contains a string: you can access off the end of the string inside the struct.
Ira Baxter over 7 years

@BodoThiesen: I don't know where you got "need 24Gb address space" from. The point of full segmentation is that segments are mapped in page size pieces anywhere in the physical space convenient, including "not present" and including an upper bound on segment size. So smaller segments take correspondingly less space. Even so, arbitrary pages in the segment can be individually paged, so even with 6 active segments that really do contain 4Gb of data, the pages are treated like VM. That may page a lot if you only have 1Gb of RAM, but that's no different than present OSes.
Ira Baxter over 7 years

@Martin: I think I responded a bit wrongly. Your complexity complaint seems to be about the hardware machinery needed to support segments. Yes, that's more complex than flat page maps. But the payoff is the kernel of the OS, which provides the foundation services and ensures that access rights are not violated, can be much smaller, and you can build the "rest" of the OS services on top of that. Multics ran reliably in a machine with very small memory by today's standards. With modern OSes you have to build a complex kernel; you can't even boot with less than a GB these days for some.
Ira Baxter over 7 years

@Martin: ... and you sure as hell can't trust it.
Bodo Thiesen over 7 years

@IraBaxter: structs containing a string - a valid argument. I didn't write it but I am actually banning all static arrays whatsoever. So your string inside a struct would be a char * for me pointing to another malloced block. About the 24GB: If we have 4GB of linear address space, then we only have 4GB of addressable pages. All segments (loaded into registers) need to be present, and if they contain in sum more than 4GB, you're lost, because you can't map 8GB addresses uniquely to 4GB linear address space. 4GB * 6 = 24GB. Physical mem may be much smaller, even today I "only" have 16GB RAM.
Ira Baxter over 7 years

@BodoThiesen: It isn't "strings in structs"; it is any two entities in a struct, a language that has pointers that can be advanced (e.g, *p++), and a program with a pointer to the first entity in struct that abusively advances it. Now you have overwrite of the 2nd entity in spite of it being a completely different type than the first. Are you planning to outlaw C? Good luck selling that to the world. ...
Ira Baxter over 7 years

@BodoThiesen: I still don't understand your claim of outrageous storage demand for segments. If you have 6 live segments each with 4 GB of highly/randomly accessed locations, you will need 6*4GB = 24Gb of active RAM to back it up if you don't want continuous page faults. However, that's not different in any way from having 24Gb of locations in a linear address space. If your accesses are not completely random, then a sparse assignment of the individual segment pages to available RAM gives the classic advantage of virtual memory. ....
Ira Baxter over 7 years

@BodoThiesen: You seem to think that if a segment is logically of size X, you must map the whole segment to physical memory as a monolith. That wasn't true of Multics, and isn't true of 386 segments. Segments are mapped using pages of the same size as the flat address space you now see in Windows/Linux (which are in fact using that exact same paging mechanism hardware). The page map from your virtual/segment space to physical memory is only partial. (You get a page fault when you touch a page marked as "not present" in the map).
Ira Baxter over 7 years

PS: if you want less complicated "segment" hardware that protects individual arrays, you should check out the Burroughs 5500 series of stack machines, which did precisely this. See en.wikipedia.org/wiki/Burroughs_large_systems The rest of the world didn't think that good protection was worth it and bought Unisys and IBM mainframes with no protection at all (early 60s and 70s; even IBM didn't have VM on mainframes until ~1975).
Bodo Thiesen over 7 years

@IraBaxter: A segment descriptor contains a base address of 24bit (80286) or 32bit (80386+) and a limit field of 16bit (80286) or 20+1bit (80386+, yielding byte granularity with G=0 or page granularity with G=1). So, you can map a 4GB segment anywhere into the 4GB linear address space (which is then mapped to physical addresses or page faults in a second step) but you can only map one 4GB segment in this way. The second one will reuse all the linear addresses of the first segment. Paging only decides, where to map those linear addresses to physical addresses in a second step.
Bodo Thiesen over 7 years

So, it is perfectly possible to have a 4GB segment on a system with only 1MB RAM and everything will work fine, but you can't have an additional non-overlapping segment of any size at the same time even with 4GB of RAM and unlimited disk space. The paging unit maps 4GB to 4GB (plus page faults - e.g. swap) and all segments share the bottleneck of the first 4GB.
supercat about 7 years

@BodoThiesen: Sorry for necro-posting, but I was glad to see someone else who appreciates what segmentation could be. What I'd have liked to see on the 80386, however, would have been 32-bit segment registers that combine a small selector with a scalable offset. That would have made it practical to have every object start at offset 0 of some "segment" (thus allowing segments alone to be used as object identifiers) with no overhead beyond aligning allocations to the next segment boundary. If the selector could select among zones with different scale factors...
supercat about 7 years

...then one could have a "small objects" segment with a scale factor of 16, a "medium objects" segment with a scale factor of 256, and a "large objects" segment with a scale factor of 4096. If the 32-bit segment included 24 offset bits, then code which allocated one selector value (out of 256) for each of the three categories could accommodate 256MB of small objects (16-byte aligned), 4GB of medium objects (256-byte aligned), or 64GB of large objects (4KB aligned), all using 32-bit object identifiers.
E.T over 6 years

It's not just on Windows. GS is also used for the TLS on OS X. GS is also used by 64bit kernels to keep track of system structures during context switches. The OS will use SWAPGS to that effect.
kralyk over 6 years

This answer has a lot of what it presents as "old wisdom", but in fact is really just nostalgia not much based on anything factual. The access to files via memory segments might've been nice back in the time of Multics, but today it's much better solved on a higher level (ie memory-mapped files provided by the OS). The thread local space being at an absolute address would be a security problem as that would hinder ASLR.
Ira Baxter over 6 years

@kralyk: your response appears to imply that you have "new wisdom". File access by mapping files into the address space works, but your program has to go through work to set up and tear down that mapping. Direct segment access requires only the load instruction that accesses it; no set up or tear down, so file-in-VM-window just looks like extra, pointless code. ....
Ira Baxter over 6 years

... Regarding the thread storage address: every engineering decision is a tradeoff. ASLR is a response to poor security in general. We shouldn't design systems to support a specific security solution hack; we should design them in a way that makes it easy to build software. (You didn't remark that Multics segments prevent ASLR as a solution, period. That's OK; Multics was far more secure than Windows and most other systems have ever achieved).
kralyk over 6 years

@IraBaxter Yes, memmapped files requires some setup, which is in accordance with the flexibility it provides, ie. if the equivalent were to be provided by the ISA, it would need to be set up / configured as well. And updating / extending that functionality would be far more difficult compared to when it is provided by the OS. Regarding ASLR, I'm not sure why you think it is a hack. It is no more a hack than file permissions or passwords are - it makes sure only those who should have an access have that access.
Ira Baxter over 6 years

@kralyk: You're defending an unnecessarily complex mechanism by claiming you need to add it to the ISA, and you can add more goo to the OS "for extended functionality" (uh, specifically, what extended functionality?). The whole beauty of segments is their simplicity in use, their security in practice, and the fact that you don't need "extended functionality". ASLR is cute but added as an afterthought; it is hardly a complete security solution by itself. What matters is that you have enough. Multics proved to be more secure than the then and mostly the presently available OSes.
kralyk over 6 years

@IraBaxter The flexibility was a bit of an umbrella term for all the configuration you can do with today's mem mapping. If you want a specific example: filesystems. I don't even know how you would implement memmapped files on the ISA level given how layered and varied filesystems are and how they depend on the OS. A Multics-like solution would probably end up being a lot more complex than the API provided by the OS, which is fairly simple. As for Multics' security, that's apples to oranges, really. Multics has never been even close to being exposed to the dangers and scrutiny modern OSes are.
Ira Baxter over 6 years

@kralyk: Multics had full and complete filesystem. I don't understand, how you don't understand, that with segments you don't need any of that memory mapping nonsense, or any of the "flexibility" (that you didn't specify). Multics doesn't need "a solution" for this at all, so it cannot be more complex wrt this topic. Regarding security: Multics: rings of protection, tasks done at the appropriate level of security, owner/user access rights, all executable code read protected, buffer overruns only on data in the segment if it happens at all... I think it would stand up pretty well.
kralyk over 6 years

@IraBaxter How many filesystems did Multics support? Linux supports dozens of them and memory-mapping works on many of them AFAIK. That's the flexibility, amongst other things. Besides, if you really wanted to get rid of that file opening or mapping "nonsense" (as you call it), you could write a library that would abstract that away, it wouldn't even be very difficult. Regarding security, read or write protection of exec code is moot in presence of techniques like ROP. Besides, there have been papers describing serious security bugs in Multics. Security is hard, there are no silver bullets.
Ira Baxter over 6 years

@kralyk: I don't know how many. I'd guess they implemented several over its lifetime. The API of all of them would have been segments; the change would have been invisible to application programs so the same "flexibility" would have been there. Yes, they might have had a ROP problem; and, like any other OS, any particular system application may have had exploitable bugs. Its track record was extremely good. You seem stuck on Linux and memory mapping, OK, stay stuck. I see the world differently than you; addressing it directly is beautiful and efficient. Let's leave it at that.
kralyk over 6 years

@IraBaxter Linux was just an example, other OSes typically work in a similar manner though. I'm not sure what the point of insisting on Multics is since you mostly can't use it nowadays in practice anyway. If you want to use files without having to deal with opening/closing or setting up mmap, write yourself a library, like I said earlier, that's by far the simplest solution and doesn't involve any hardware or ISA changes. Otherwise the discussion is hypothetical and mostly apples to oranges (since Multics' ecosystem is historical and mostly incomparable to that of today's OSes).
Ira Baxter over 6 years

@kralyk: Alas, you are right about the absence of Multics. But then, that's my point: Eunuchs.
Richard Hodges over 6 years

Thanks for the amusing, no-holds-barred treatment of segmented vs flat memory :) Having also written code on 6809 (with and without paged memory), 6502, z80, 68k and 80[123]?86, my perspective is that segmented memory is a horror show and I'm glad it was consigned to the dustbin of history. The use of FS and GS for efficient access of thread_local data is a happy unintended consequence of an historical error.
i486 about 6 years

I have used FS and GS registers in a Win16 application with parts in assembly. With 64K segments you have to re-load ES register many times and FS/GS were very useful.
Michael Petch over 5 years

This doesn't actually answer the question. The question states Note: I'm not asking about any particular operating system -- I'm asking about what they were intended to be used for by the CPU, if anything.
zerocool over 5 years

@MichaelPetch ya i know i just want to add this as good info for those who read this q/s in SO
user541686 almost 4 years

Wow, thank you for explaining all this! This explains a lot and makes so much sense! +1
Atticus Stonestrom almost 4 years

Sorry to be resurrecting an ancient post, but do you have a link to the Andy Grove tantrum? Sounds interesting! In all cases thanks for a fantastic answer here
Ira Baxter almost 4 years

@AtticusStonestrom: Hmm. Haven't had much luck locating Andy Grove's tantrum. I'll keep looking. I did find this history of Multics which is pretty interesting, including several almost completed attempts to migrate Multics to 486 machines which had "reasonable" (Multics-compatible) segment registers: multicians.org/history.html No hints of attempts to resurrect on x86-64.
Atticus Stonestrom almost 4 years

@IraBaxter No worries at all, thank you so much for the reference!! Looks very very interesting
tuket almost 3 years

"In windows, the GS register is used to manage thread-specific memory"... isn't it FS ?
Johan Boulé over 2 years

@tuket their 32-bit os uses fs and their 64-bit os uses gs. linux did the opposite move.