Thursday, July 12, 2007

Tool Testing Methodology, Memory

In my last post, I described what you'd need to do to set up a system in order to test the effects of the tools we'd use on a system during IR activities. I posted this as a way of filling in a gap left by the ACPO Guidelines, which say that we need to "profile" the "forensic footprint" of our tools. That post described tools we'd need to use to discover the footprints within the file system and Registry. I invite you, the reader, to comment on other tools that may be used, as well as provide your thoughts regarding how to use them...after all, the ACPO Guidelines also state that the person using these tools must be competent, and what better way to get there than through discussion and an exchange of ideas?

One thing we haven't discussed...and that doesn't seem to get a great deal of discussion elsewhere...is the effect of the tools we use on memory. One big question that gets asked is, what "impact" do our tools have on memory? This is important to understand, and I think one of the main drivers behind it is the idea that when IR activities are first introduced in a court of law, claims will be made that the responder overwrote or deleted potentially exculpatory data during the response process. So...understanding the effect of our tools will make us competent in their use, and we'll be able to address those (and other) issues.

When a process is created (see Windows Internals, by Russinovich and Solomon, for the details, or go here), the EXE file is loaded into memory...the EXE is opened and a section object is created, followed by a process object and a thread object. So, memory pages (the default size is 4K) are "consumed". Now, almost all EXEs (and I say "almost" because I haven't seen every EXE file) include an import table in their PE header, which describes all of the dynamic link libraries (DLLs) that the EXE accesses. MS provides API functions via DLLs, and EXEs access those DLLs rather than the author rewriting all of that code from scratch. So...if a necessary DLL isn't already in memory, it has to be located and loaded...which, in turn, means that more memory pages are "consumed".
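Since the import table is what tells the loader which DLLs to map in, you can see that dependency list for yourself by parsing the PE headers. Here's a minimal sketch in Python, assuming the third-party pefile module is installed; the notepad.exe path is just an example.

    import pefile  # third-party; pip install pefile

    def list_imports(exe_path):
        pe = pefile.PE(exe_path)
        for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
            dll = entry.dll.decode("ascii", errors="replace")
            names = [imp.name.decode("ascii", errors="replace")
                     for imp in entry.imports if imp.name]
            print(dll, "-", len(names), "imported functions")

    list_imports(r"C:\Windows\notepad.exe")

Every DLL listed in that output has to be resident in memory before the process can run, which is exactly where the additional page consumption comes from.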

So, knowing that these memory pages are used/written to, what is the possibility that important 'evidence' is overwritten? Well, for one thing, the memory manager will not overwrite pages that are actively being used. If it did, stuff would randomly disappear and stop working. For example, your copy of a document might disappear because you loaded Solitaire and a 4K page was randomly overwritten. We wouldn't like this, would we? Of course not! So, the memory manager will allocate to a process only those memory pages that are not currently in use.

For an example of this, let's take a look at Forensic Discovery, by Dan Farmer and Wietse Venema...specifically, chapter 8, section 17:

As the size of the memory filling process grows, it accelerates the memory decay of cached files and of terminated anonymous process memory, and eventually the system will start to cannibalize memory from running processes, moving their writable pages to the swap space. At least, that's what we expected. Unfortunately even repeat runs of this program as root only changed about 3/4 of the main memory of various computers we tested the program on. Not only did it not consume all anonymous memory but it didn't have much of an effect on the kernel and file caches.

Now, keep in mind that the tests that were run were on *nix systems, but the concept is the same for Windows systems (note: previously in the chapter, tests run on Windows XP systems were described, as well).

So this illustrates my point...when a new process is loaded, memory that is actively being used does not get overwritten. If an application (Word, Excel, Notepad) is active in memory, and there's a document open in that application, that information won't be overwritten...at worst, the pages not currently being used will be swapped out to the pagefile. If a Trojan is active in memory, the memory pages used by the process, as well as the information specific to the process and thread(s) themselves, will not be overwritten. The flip side is that what does get "consumed" are memory pages that have been freed for use by the memory manager; research has shown that the contents of RAM can survive a reboot, and that even after a new process (or several processes) has been loaded and run, information about exited processes and threads still persists. So, pages used by previous processes may be overwritten, as will pages that contained information about threads, and even pages that had not been previously allocated.

When we recover the contents of physical memory (ie, RAM), one of the useful things about our current tools is that we can locate a process and then, by walking the page directory and table entries, locate the memory pages used by that process. By extracting and assembling these pages, we can then search them for strings, and anything we locate as "evidence" will have context; we'll be able to associate a particular piece of information (ie, a string) with a specific process. The thing about pages that have been freed when a process has exited is that we may not be able to associate such a page with a specific process; we may not be able to develop context for anything we find in that particular page.
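As a rough sketch of that extract-and-assemble step (in Python, with the page offsets as purely hypothetical input...in practice they'd come from whatever parser walks the page directory and table entries for you), pulling one process's 4K pages out of a raw RAM dump might look something like this:

    PAGE_SIZE = 4096  # default x86 page size

    def extract_process_pages(dump_path, page_offsets):
        """Read the given physical page offsets from a raw memory dump
        and return them reassembled as one buffer."""
        pages = []
        with open(dump_path, "rb") as dump:
            for offset in sorted(page_offsets):
                dump.seek(offset)
                pages.append(dump.read(PAGE_SIZE))
        return b"".join(pages)

    # Hypothetical example: offsets attributed to one process by a PDE/PTE walker
    # proc_mem = extract_process_pages("ram.img", [0x1000, 0x5000, 0x2A000])

Anything found in that reassembled buffer can be tied back to the process whose page tables pointed at those offsets.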

Think of it this way...if I dump the contents of memory and run strings.exe against it, I will get a lot of strings...but what context will they have? I won't be able to associate any of the strings I locate in that memory dump with a specific process, using just strings.exe. However, if I parse out the process information, reassembling the EXE files and memory used by each process, and then run strings.exe on the results, I will have a considerable amount of context...not only will I know which process was using the specific memory pages, but I will have timestamps associated with the processes and threads, etc.
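To make the contrast concrete, here's a small sketch (Python again, with a simple regex standing in for strings.exe) that runs the same printable-string scan two ways: once over the raw dump, where hits have no owner, and once over per-process buffers like the one built above, where every hit is tied to a process name/PID. The process_buffers dictionary is a hypothetical input.

    import re

    STRING_RE = re.compile(rb"[\x20-\x7e]{4,}")  # runs of 4+ printable ASCII chars

    def find_strings(buf):
        return [m.group().decode("ascii") for m in STRING_RE.finditer(buf)]

    def strings_with_context(process_buffers):
        # process_buffers: dict of "name (PID)" -> reassembled process memory
        return {proc: find_strings(buf) for proc, buf in process_buffers.items()}

    # find_strings(open("ram.img", "rb").read())          -> strings, no context
    # strings_with_context({"cmd.exe (1234)": proc_mem})  -> strings per process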

Thoughts? I just made all this up, just now. ;-) Am I off base, crazy, a raving lunatic?

3 comments:

Anonymous said...

Concerning the admissibility of RAM-dump evidence, I don't think that it's very different from offering evidence from a live acquisition of non-volatile media. Neither will exist in its acquired state after the mission. While the memory chips may become totally devoid of data, you never know whether the hard drive may go south or the machine may be lost. So, in either case, you have an image, for what it's worth.

This goes to your point of tool validation. If I gather RAM with an accepted, validated tool, I have a high probability that what I've gathered will be admitted. Tool criteria include whether the data the tool has acquired actually existed. Perhaps we can set up a test and load memory with known data, then see whether a tool gathers that data. I'll defer to you on how such a test can best be accomplished in a forensically acceptable manner. Obviously, we should use a few tools for comparison. An issue is that the playing field is different in every test.
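One way such a test might be run (a sketch only, not a validated protocol...the marker string and timings are made up): plant a unique marker in a live process's memory on the test box, acquire RAM with the tool under test, then check the resulting image for that marker.

    import time

    MARKER = b"RAM-TOOL-TEST-MARKER-8c1f42"  # arbitrary, easily searched-for tag

    def plant_marker(hold_seconds=600):
        buf = MARKER * 1024           # keep a referenced copy resident in this process
        print("Marker planted; run the acquisition tool now.")
        time.sleep(hold_seconds)      # hold the process (and its pages) alive
        return buf

    def marker_in_dump(dump_path, chunk=1 << 20):
        carry = b""
        with open(dump_path, "rb") as dump:
            while True:
                block = dump.read(chunk)
                if not block:
                    return False
                if MARKER in carry + block:
                    return True
                carry = block[-(len(MARKER) - 1):]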

It's important to demonstrate that we didn't introduce (into RAM) the data offered into evidence. In the case of index.dat records, it shouldn't be too difficult. Likewise for text from a document file. It may be more difficult to set forth what we should infer from the data. Also, we will have overwritten something, but nobody can say what it was. Exculpatory evidence? Maybe. Proving a negative is difficult at best.

Should an attempt at RAM acquisition be a routine practice? From what folks have discovered, maybe it should be part of every case. What's the best way to do that? (I'm asking; I don't know.) Linux aside, should I boot a machine with a clone of the original drive? Put the chips in another machine? The latter may be a problem with compatibility, plus I'd be nervous about what a different machine will throw in RAM. Someone needs to develop an R/O hardware device for DIMMs/SIMMs, etc.

The files/registry data created or modified are easy to measure. I guess we'd also see the RAM data created by our tool in the resulting dump file, although I guess it's possible that it could be faulted to the page file. Yes? Interesting stuff!

H. Carvey said...

Tool criteria include whether the data the tool has acquired actually existed.

I think this is an important issue, but not one that will be answered technically. The reason is that acquiring volatile data is not something that is inherently reproducible. It's not like acquiring an image from a hard drive, where another examiner can do the same thing and, ideally, get the same result (MD5, etc.).
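For comparison's sake, here's the kind of verification that works for the static case but not for RAM (a Python sketch; the file names are placeholders): hashing a disk image acquired twice from a write-blocked drive should produce matching values, while two back-to-back RAM acquisitions from a live box won't.

    import hashlib

    def md5_of(path, chunk=1 << 20):
        h = hashlib.md5()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
        return h.hexdigest()

    # md5_of("disk.dd") is repeatable for a static image;
    # md5_of("ram1.img") vs md5_of("ram2.img") from the same live system will differ.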

It's important to demonstrate that we didn't introduce (into RAM) the data offered into evidence.

Good point...any thoughts on how to do this? ;-)

Should an attempt at RAM acquisition be a routine practice?

Not necessarily. The ACPO Guidelines, for example, mention the need for competence. This should be part of the responder's checklist, along with things like justification for live acquisition over post-mortem, etc.

Anonymous said...

...where another examiner can do the same thing, and ideally get the same result

Except, of course, in a live acquisition when we image a running drive. That situation is analogous to acquiring RAM. We're back to your point of tool reliability and competence.

any thoughts on how to do this?

To begin with, you have to consider the case and decide whether you want to acquire RAM, much as you said in your last paragraph. I think some things are obvious. Assume that I use a tested tool and sterile media. I find text or index.dat records. I think I can say that I didn't put them in memory. However, I did write the memory associated with, for example, the X-Ways Capture process. That, I'll admit.

Sometimes, we beat ourselves up too much! It's kind of like, "Who was at the keyboard?" of an unprotected system, absent any reliable artifacts or evidence. If pushed, you can always respond, "It could have just as easily been Martians," if the judge has a sense of humor. :-)