Strings and Garbage Collection


Solution 1

That depends. Literal strings are interned by default, so even if your application no longer references one, it will not be collected, because it is still referenced by the internal interning structure. Other strings are just like any other managed object: as soon as they are no longer referenced by your application, they are eligible for garbage collection.

More about interning here in this question: Where do Java and .NET string literals reside?
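The difference between literals and runtime strings can be checked with string.IsInterned and string.Intern, which are real framework methods. The sketch below is only an illustration: the class name InterningDemo, the helper IsInPool and the "token-" prefix are made up for the example, and whether a given literal shows up in the pool can depend on how the assembly was compiled, so no output is claimed for that line.

```csharp
using System;

public static class InterningDemo
{
    // Hypothetical helper: true when a string with this value is already
    // in the runtime's intern pool.
    public static bool IsInPool(string s) => string.IsInterned(s) != null;

    public static void Main()
    {
        string literal = "hello"; // compile-time literal
        // Built at run time; the GUID makes the value unique, so nothing
        // with this value can already be in the pool.
        string runtime = "token-" + Guid.NewGuid().ToString("N");

        Console.WriteLine(IsInPool(literal)); // normally True for a literal
        Console.WriteLine(IsInPool(runtime)); // False: never interned

        // Interning by hand pins the value for the life of the process,
        // which is the opposite of what you want for a secret.
        string.Intern(runtime);
        Console.WriteLine(IsInPool(runtime)); // True after string.Intern
    }
}
```

The practical point is that only the runtime string stays eligible for collection, and only as long as nobody interns it.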

Solution 2

If you need to protect a string and be able to dispose of it when you want, use the System.Security.SecureString class.

Protect sensitive data with .NET 2.0's SecureString class
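A minimal sketch of how SecureString is typically filled and disposed, assuming the secret arrives as a char array rather than a string. The class name SecureStringDemo and the helper FromChars are invented for the example; AppendChar, MakeReadOnly and Dispose are the real members. Note the comment further down that Microsoft no longer recommends SecureString, so treat this as an illustration of the API rather than a recommendation.

```csharp
using System;
using System.Security;

public static class SecureStringDemo
{
    // Hypothetical helper: copies the characters into a SecureString one at
    // a time, then wipes the caller's array so no plain copy is left in it.
    public static SecureString FromChars(char[] chars)
    {
        var secure = new SecureString();
        foreach (char c in chars)
        {
            secure.AppendChar(c);
        }
        Array.Clear(chars, 0, chars.Length);
        secure.MakeReadOnly(); // no further edits allowed
        return secure;
    }

    public static void Main()
    {
        char[] input = { 's', '3', 'c', 'r', '3', 't' };

        // Dispose (via using) releases the protected buffer deterministically
        // instead of waiting for the garbage collector.
        using (SecureString secret = FromChars(input))
        {
            Console.WriteLine(secret.Length); // prints 6
        }
    }
}
```

The secret never passes through a System.String here, which is the only way the class helps at all.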

Solution 3

I wrote a little extension method for the string class for situations like this. It's probably the only sure way of ensuring the string itself is unreadable until it is collected. Obviously it only works on dynamically generated strings, not literals.

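// Requires compiling with unsafe code enabled (/unsafe or AllowUnsafeBlocks).
public static class StringExtensions
{
  // Overwrites the string's characters in place with '\0'.
  // Never call this on a literal or interned string.
  public static unsafe void Clear(this string s)
  {
    // Pin the string so the GC cannot move it while we write to it.
    fixed (char* ptr = s)
    {
      for (int i = 0; i < s.Length; i++)
      {
        ptr[i] = '\0';
      }
    }
  }
}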

Solution 4

I will answer this question from a security perspective.

If you want to destroy a string for security reasons, then it is probably because you don't want anyone snooping on your secret information, and you expect they might scan the memory, or find it in a page file or something if the computer is stolen or otherwise compromised.

The problem is that once a System.String is created in a managed application, there is not really a lot you can do about it. There may be some sneaky way of doing some unsafe reflection and overwriting the bytes, but I can't imagine that such things would be reliable.

The trick is to never put the info in a string at all.

I had this issue one time with a system that I developed for some company laptops. The hard drives were not encrypted, and I knew that if someone took a laptop, then they could easily scan it for sensitive info. I wanted to protect a password from such attacks.

The way I dealt with it is this: I put the password in a byte array by capturing key press events on the textbox control. The textbox never contained anything but asterisks and single characters. The password never existed as a string at any time. I then hashed the byte array and zeroed the original. The hash was then XORed with a random hard-coded key, and this was used to encrypt all the sensitive data.

After everything was encrypted, then the key was zeroed out.
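The hash, zero and XOR steps can be sketched roughly as follows. This is not the original code: the answer does not say which hash was used, so SHA-256 is an assumption, and the class name PasswordKeyDemo, the method DeriveKey and the Mask bytes are hypothetical placeholders.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordKeyDemo
{
    // Hypothetical stand-in for the "random hard-coded key" in the answer.
    static readonly byte[] Mask = { 0x5A, 0xC3, 0x19, 0xE7 };

    // Hashes the password bytes (SHA-256 is an assumption), zeroes the
    // caller's array, then XORs the hash with the hard-coded mask.
    public static byte[] DeriveKey(byte[] passwordBytes)
    {
        byte[] key;
        using (SHA256 sha = SHA256.Create())
        {
            key = sha.ComputeHash(passwordBytes);
        }

        // Wipe the plain password as soon as the hash exists.
        Array.Clear(passwordBytes, 0, passwordBytes.Length);

        for (int i = 0; i < key.Length; i++)
        {
            key[i] ^= Mask[i % Mask.Length];
        }
        return key;
    }

    public static void Main()
    {
        // In the answer these bytes came from key press events, never a string.
        byte[] password = { (byte)'h', (byte)'u', (byte)'n', (byte)'t', (byte)'e', (byte)'r' };

        byte[] key = DeriveKey(password);
        Console.WriteLine(key.Length); // prints 32 (SHA-256 output size)

        // ... encrypt the sensitive data with key here ...

        Array.Clear(key, 0, key.Length); // zero the key once encryption is done
    }
}
```

Because everything is a byte array, each buffer can be overwritten the moment it is no longer needed, which is not possible with an immutable string.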

Naturally, some of the data might exist in the page file as plaintext, and it's also possible that the final key could be inspected as well. But nobody was going to steal the password, dang it!

Solution 5

It is down to the garbage collector to handle this for you. You can force it to run a clean-up by calling GC.Collect(). From the docs:

Use this method to try to reclaim all memory that is inaccessible.

All objects, regardless of how long they have been in memory, are considered for collection; however, objects that are referenced in managed code are not collected. Use this method to force the system to try to reclaim the maximum amount of available memory.

That's the closest you'll get, methinks!
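For what it's worth, a forced collection looks roughly like the sketch below. GC.Collect, GC.WaitForPendingFinalizers, GC.CollectionCount and WeakReference are real APIs; the class GcDemo and the helpers ForceCollect and MakeWeak are made up for the example. Whether the weak reference is reported dead depends on the build and the JIT, so no output is claimed for it, and a collection only reclaims the object: it does not promise that the old characters are wiped from RAM.

```csharp
using System;

public static class GcDemo
{
    // Hypothetical helper: forces a blocking collection and returns how many
    // generation-0 collections happened while it ran.
    public static int ForceCollect()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return GC.CollectionCount(0) - before;
    }

    // Creates the string in a separate method so no local in Main keeps it alive.
    static WeakReference MakeWeak() => new WeakReference(new string('x', 16));

    public static void Main()
    {
        WeakReference weak = MakeWeak();

        int collections = ForceCollect();
        Console.WriteLine(collections >= 1); // True: a collection was forced

        // Usually False after the collection, but this is not guaranteed.
        Console.WriteLine(weak.IsAlive);
    }
}
```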

Author by

Kyle Rosendo

Updated on July 30, 2022

Comments

  • Kyle Rosendo
    Kyle Rosendo almost 2 years

    I have heard conflicting stories on this topic and am looking for a little bit of clarity.

    How would one dispose of a string object immediately, or at the very least clear traces of it?

  • Kyle Rosendo
    Kyle Rosendo about 14 years
    Is there any way to make it that certain string literals are not to be interned?
  • Brian Rasmussen
    Brian Rasmussen about 14 years
    @Kyle: Only literal strings are interned automatically. So if you have string s = "hello"; it will be interned, whereas any string you create at runtime will not be interned unless you do so yourself.
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    Hehe, yes I understand that. I'm asking if there is any way to not have those strings interned?
  • user6170001
    user6170001 about 14 years
    That's an unfortunate example: as per Brian's answer, because "dispose me!" is a literal in the assembly code, it will be interned and never garbage collected. You're right as far as strings constructed at runtime go though.
  • user6170001
    user6170001 about 14 years
    Kyle: no. Literal strings are part of your assembly, and your assembly is never garbage collected. But does your assembly really contain so many string literals that they're causing memory pressure? Or if this is about security ("clearing traces"), remember that users can inspect your assembly, including its literal strings, without ever executing it -- more easily in fact than examining strings constructed at runtime!
  • user6170001
    user6170001 about 14 years
    There's no deterministic way to clear all traces of a character array from memory either, is there?
  • Ian Mercer
    Ian Mercer about 14 years
    Might be worth adding that forcing GC is rarely the 'right' thing to do ... and that unless you can explain how GC works and what the LOH is you probably shouldn't be messing with it!
  • Josh
    Josh about 14 years
    -1 to anybody suggesting the use of GC.Collect to "dispose" of strings. But itowlson is right on about the interning.
  • Brian Rasmussen
    Brian Rasmussen about 14 years
    @Kyle: Sorry about that. Please see itowlson's comment.
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    Sure, all good and well, but the problem comes with parameters and the like. They will be in plain text before they're secured, which makes doing it this way pointless (or of very little point). The memory collection side is great, but expensive, isn't it?
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    @itowlson - Thanks! @Brian, np :) Do you know of any method that can "safely" store a string embedded in an application without SecureString'ing a 1100 char string one char at a time? I say "Safely" as it is not heavy security, more like a bonus.
  • user6170001
    user6170001 about 14 years
    Yes and no. It depends on how safe is "safely." Whatever option you choose, a knowledgeable bad guy will be able to see 1100 characters of gibberish; and if you have to include the decoding code alongside the "secure" string, then a knowledgeable bad guy can disassemble your code and decipher the "gibberish." So it depends on how much knowledge and motivation you're trying to defend against, and how bad it is if the bad guy succeeds. If all you want to do is discourage the casual eye, then encryption with a hardwired key might suffice -- but this would definitely NOT be heavy security!
  • Aviad P.
    Aviad P. about 14 years
    I think there is, just set all the array element values to 0.
  • Aviad P.
    Aviad P. about 14 years
    Oh, and of course there's no deterministic way to clear all traces of a character array, but if you use that character array to represent a string, then there's a deterministic way to clear all traces of that string from memory by using a character array and setting all its elements to 0.
  • Aviad P.
    Aviad P. about 14 years
    But I guess you're right, if the array has been moved by the garbage collector at some point, then an old version of it (with non-0 values) might be lying around in memory somewhere.
  • Niall Connaughton
    Niall Connaughton about 14 years
    If the strings are large, and the concatenation is calculated frequently, caching is a good idea. Joining large strings that are thrown away quickly can cause fragmentation of the Large Object Heap, which isn't compacted during collection. Of course, things like StringBuilders are also useful in these situations to reduce impact on the heap.
  • CG.
    CG. about 14 years
    Well, why couldn't one use a SecureString as a parameter? I've never had a need to try it, but it seems like it should work and be secure. If anyone stores anything that needs to be secure in a non-secure variable, you can bet a simple debugger can get the contents as plain text. Look at a simple textbox used as a password field.
  • CG.
    CG. about 14 years
    -cont All one needs is to look at the raw contents using the Windows API and all the *'s in the world are worthless. Now if that textbox was an inherited user control that replaces the backing string variable with a SecureString instance, and that instance was passed to a safe calling function, then what else would need to be done?
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    What I was getting at is more a question of interoperability. If I have a WebService written in say, PHP, that would be useless then as a SecureString would need to load that in as a plain string (encryption aside here).
  • Ricky
    Ricky about 14 years
    Oh, this really surprises me! Thx
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    Nicely done, however what would you do if the information is passed as a string via a Web Service for instance (all encrypted of course)?
  • Jeffrey L Whitledge
    Jeffrey L Whitledge about 14 years
    @Kyle Rozendo - If the encryption is transparent to the application (like SSL for example), then there's probably nothing you can do, but if you are doing the encryption yourself (using the System.Security.Cryptography namespace for example) then it's all done as byte arrays anyway, so there's still no need for generating strings. Of course, once you've shown it to the user, then all bets are off.
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    @itowlson - Absolutely, these are deterrents more than anything else. So, if I load a string into a SecureString just to keep it from prying eyes, this would be considered as good enough as a deterrent?
  • user6170001
    user6170001 about 14 years
    You've said that you don't need "heavy security, more like a bonus," so I'd say that encrypting the embedded string would probably be fine as long as you don't make the decrypt key too obvious to the casual Reflectorista. On the other hand, you really only need to bother with SecureString if you want to make sure that the decrypted string really does get cleared. An attacker too lazy to have a go at the 1100 characters of gibberish is probably also too lazy to grovel through the pagefile or slap a debugger on you, so it's not clear to me if SecureString is buying you anything.
  • user6170001
    user6170001 about 14 years
    Incidentally, one possible way around the "obvious attack target" of 1100 characters of gibberish is to use steganography -- e.g. embed those characters as, say, a bitmap resource rather than an encrypted string. This is security through obscurity and won't fool a determined attacker, but if you just wish to deter the idly curious, it might present a less obvious thing for them to poke at!
  • Kyle Rosendo
    Kyle Rosendo about 14 years
    @itowlson - Thanks a ton. I was looking into this just the other day and it is definitely going to be done from my point of view ;) Thanks, I'm accepting this answer due to the original answer and following comments.
  • Basic
    Basic over 8 years
    The reason SecureString has operators for working a byte at a time is because that's how it's supposed to be used. Read from the user/file/stream/etc into the SecureString (byte by byte), then write to crypto provider/stream a byte at a time. The full string should never be in memory. At worst you're talking about lots of single bytes with no ordering.
  • Wai Ha Lee
    Wai Ha Lee over 8 years
    You can stop the array from being moved by the garbage collector by using the fixed statement.
  • Roger Hill
    Roger Hill over 7 years
    Your technique is clever, but I think that any keylogger would have easily bypassed it by storing all the keystrokes from the keyboard.
  • Jeffrey L Whitledge
    Jeffrey L Whitledge over 7 years
    @MadTigger - Sure, or someone could mount a small camera on the ceiling and record all the keystrokes being pressed, etc., etc. These laptops were not holding national security secrets or anything like that, fortunately. My main concern was data theft after the laptops were stolen, and I think I got 90% the way there, which was probably good enough in this case.
  • Taedrin
    Taedrin about 5 years
    Note that this won't clean up any spurious copies of your string that were made when the garbage collector compacts the heap or when the operating system moves pages of your process's virtual memory around in RAM.
  • cb88
    cb88 over 4 years
    Also, even if you GC... that doesn't mean the data isn't still sitting in RAM, if someone freezes your RAM and dumps it...
  • Bip901
    Bip901 about 3 years
    Microsoft discourages using SecureString. It is not considered secure anymore.