Terry Brugger's Graduate School Work
I'm now officially a Doctor of Philosophy (2009)
in Computer Science from the
University of California at Davis.
I started my coursework in the fall of 2000, was officially admitted
to the program in the fall of 2001, and finished my coursework in
the spring of 2004.
The way the PhD program works in CS at UC Davis is you do your
coursework, hopefully pulling off A's for the courses in the four
core areas (Theory, Systems, Applications, and Architecture),
which thankfully I did, because the alternative is to take the
preliminary exam for each area (and I don't test well). You then
write up some preliminary research and a dissertation proposal as
a paper, which you present to your qualifying committee, who will
grill you over it. And I mean grill you (more on that in a moment).
Eventually, they pass you, you do the research you said you would
in the proposal, put it together into a dissertation, and have
your dissertation committee sign off on it, pass go (I originally
said that to be funny, but it turns out that the steps between
having your committee sign off and finishing are nontrivial)
and collect your PhD.
Notice that I didn't say anything about a dissertation defense.
That's right -- no dissertation defense. Instead, you do your
defense up front, in the form of your qualifying examination. The
qualifying examination is as grueling as, if not more grueling than,
most dissertation defenses. I think this is great, because it
keeps students from traveling down a long path on a dissertation
only to have their committees tell them that their research is
fundamentally flawed. What's more, it keeps committees from
rejecting the research just because they don't like the findings;
for example, discovering that an idea is a bad approach to a
problem is a valid (and rather common) finding for research, but
some committees don't like that and send students back to the
drawing board. The approach at Davis ensures that students don't
waste time with bad approaches to research.
Shortly after starting grad school, I knew that I wanted to research
data mining approaches to network intrusion detection. This was
prompted by a project at work where they came to me and said,
"We've got all this connection log data from a firewall -- we want
a tool that will look at it and flag the suspicious connections."
I figured that there were probably some off-the-shelf tools we could
grab and, with a little integration work, kick this project out.
Well, there weren't any. Okay, so I figured that some academics
probably had solved the problem, and we just needed to build a
system that implemented their techniques. Indeed, there was a little
research, but it was obvious that it was far from a solved problem.
In fact, at the time (late 2000), it was a very hot research area.
Dissertation city, here I come!
After spending a few years surveying the field, I developed a
presentation, which I gave as a seminar for the UC Davis Seclab and
our College Cyber Defenders (CCD) program at LLNL. That presentation
was a dry run for my Qualification Examination.
The presentation is based on my original dissertation proposal.
The feedback from my committee was that the survey was excellent;
however, they had some reservations about the proposal itself.
Based on that feedback, I extracted the survey portion of the paper.
Now then, having actually survived the Qualifying Examination proper,
allow me to offer some Advice to others taking the UC Davis CS
Qualifying Exam.
So, I got through the exam with a "Conditional Pass". Instead of
waiting for the chair of the committee to get back to me with
the changes they wanted before continuing, I forged ahead. The
first step was to baseline against Snort
with the DARPA IDS Eval
data. What I found was that the data was very good at modeling
attacks that signature-based IDSs, such as Snort, wouldn't easily
detect; however, there was no basis for the background traffic
generated in the dataset. In other words, the data could tell you
if your IDS had a good true-positive rate for non-signature
detectable attacks; however, it was useless for evaluating the
false-positive rate of the system. These results were written up as
a paper, which I submitted to a number of conferences, where it was
consistently rejected because reviewers felt the results were nothing
new -- it's been
commonly accepted in the network intrusion detection community for years
now that the DARPA IDEval data is flawed. I think the research was
interesting in that it showed that the dataset did have at least one
redeeming quality. After it became clear that no one wanted to publish
the paper, I released it as a UC Davis CS Department Tech Report.
To give credit where credit is due: Jedidiah, one of my College
Cyber Defenders students, put together the actual scripts for the
assessment -- quite impressive for someone just out of high school.
Jed's now in the CS program at UC Berkeley.
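For the curious, here's a rough Python sketch of the kind of bookkeeping such an assessment involves. This is not Jed's actual code; the file names and column layout are hypothetical stand-ins for the IDEval truth labels and Snort's alert output.

    # Illustrative only: compare Snort's alerts against the dataset's attack
    # truth labels. File names and columns are hypothetical.
    import csv

    KEY = ("src_ip", "dst_ip", "dst_port", "start_time")

    def load_keys(path):
        """Load a set of connection identifiers from a CSV with the KEY columns."""
        with open(path, newline="") as f:
            return {tuple(row[k] for k in KEY) for row in csv.DictReader(f)}

    attacks = load_keys("ideval_attack_truth.csv")  # labeled attack connections
    alerts = load_keys("snort_alerts.csv")          # connections Snort alerted on

    detected = attacks & alerts
    tp_rate = len(detected) / len(attacks) if attacks else 0.0
    print(f"Snort detected {len(detected)} of {len(attacks)} labeled attacks "
          f"({tp_rate:.1%})")

    # Alerts on background traffic look like false positives, but since the
    # dataset's background traffic has no basis in real networks, that number
    # wouldn't mean much -- which was the point of the assessment.
    print(f"{len(alerts - attacks)} alerts fell on background traffic")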
Despite these problems with the DARPA dataset, I still see it widely
used in KDD research. As a result, I wrote the following:
Some people have written me and asked if there are any other datasets
for doing data mining for network intrusion detection (none that are
known to the wider network security community), and what they
can do if they want to apply data mining methods to intrusion
detection. Here's how I responded to one person:
- Don't use network intrusion as an application area. Yes, it's
harsh, but it's the most honest answer I have.
- Mix in real network data as Mahoney & Chan did; problem is,
that only corrects some of the known flaws in the data.
Personally, I wouldn't put much stock in any results from this
approach.
- Grab some real network data and run it through a signature-based
NIDS like Snort or Bro (or both) to identify known attacks. Treat
anything they did not alert on as unknown (as opposed to assuming
it's normal). The real data mining task becomes malicious use
detection generalization (as opposed to either strict signature
matching or anomaly detection): by training on the known attacks in
the training portion of the data, can the method identify variants
or different types of attacks in the test data that were not present
in the training portion (in addition to the attacks that were present
in both)? For such a task, you couldn't reliably report the false
positive rate, only the true positive rate, because the method might
find an attack that Snort didn't -- again, that's why the non-alerted
connections are unknown, not normal. (A minimal sketch of this setup
appears after this list.)
- Grab some real network data and start labeling. The problem is that
oftentimes we can't infer the intent of a given connection. One
option here would be to score each connection with how suspicious it
is, say from 1 (you're sure it's benign) to 10 (known network
exploit). At this point, you can apply data mining methods to see
how they compare to the human analyst's assessment. The results of
the data mining may also cause you to reconsider some of the labels.
Making such a labeled dataset available to the data mining & network
intrusion communities (after appropriate anonymization) would be a
huge boon to both.
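Here's the sketch promised in the third option above: a minimal Python example, under the assumption that per-connection features have already been extracted. The file name, feature columns, and choice of classifier are all hypothetical; the point is the labeling and evaluation setup, not the particular learner.

    # Use Snort alerts as "known attack" labels, train on the earlier portion
    # of the data, and report only how many known attacks in the later portion
    # are caught. Everything concrete here (file, columns, model) is made up.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    conns = pd.read_csv("connections_with_features.csv")   # one row per connection
    conns["known_attack"] = conns["snort_alerted"] == 1    # Snort-derived label

    split = int(len(conns) * 0.7)                          # chronological split
    train, test = conns.iloc[:split], conns.iloc[split:]

    features = ["duration", "src_bytes", "dst_bytes", "num_packets"]
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(train[features], train["known_attack"])
    pred = clf.predict(test[features])

    # Only the true-positive rate over Snort-labeled attacks is reliable; a
    # "positive" on a non-alerted connection might be an attack Snort missed,
    # so it cannot be counted as a false positive.
    mask = test["known_attack"].to_numpy()
    tp_rate = pred[mask].mean() if mask.any() else float("nan")
    print(f"Caught {tp_rate:.1%} of known attacks in the test portion")
    print(f"Flagged {int(pred[~mask].sum())} non-alerted connections (unknown)")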
By the time I finished the DARPA data assessment, I still hadn't met
the requirements to pass the qualifying examination. I did, though,
understand why my committee had serious reservations about my proposal:
I couldn't effectively test my data mining methods for network intrusion
detection without data. Now, you have to understand that one cannot
use real network data to test IDSs, because one does not know the
intent behind every connection. While many connections may appear
obviously benign or malicious, there are still many that could be
either malicious attacks or benign misconfigurations. I think the
chair of my committee, who is not only
notorious for being overworked and nonresponsive, but is also one of the
nicest gentlemen you'll ever meet, just didn't have it in his heart
to tell me that there's no way my proposal would ever work, for lack
of good test data.
So, the network intrusion detection community needs a better test
dataset. Well, the Lincoln Labs DARPA project to create the IDEval
dataset generated numerous PhD and MS theses, so surely this was
a dissertation-worthy endeavor. A month or so down that road, I
said, "Okay, how do I validate that the traffic I'm generating looks
like real network data?" This was, after all, the central problem
with the DARPA IDEval dataset.
Nothing.
There were no established methods to do this. Long story short
(I know, too late): a new dissertation proposal, which includes
my assessment of the DARPA data using Snort to
establish the need for such a methodology. By the time I got that
proposal done, the chair of my committee was serving as an area
manager or some such for NSF. So I volunteered
for a project at work that needed to send people to DC, whereupon
I kicked off a job and headed over to NSF headquarters. Allow me to
note that security at NSF is as good as at any other government
facility I've visited, so I wasn't able to just go camp outside my chair's
door. Calling from the lobby did actually get me a response though,
so I was able to set up an appointment for the following day, where
I presented my new proposal. Given a couple more weeks to read it and
take care of the paperwork, he finally signed off on it and I advanced
to candidacy. I'm pretty sure that two years between the qual exam
and advancing to candidacy is a record.
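To give a flavor of what "validating that generated traffic looks like real traffic" can mean in practice, here's a minimal sketch comparing one distributional property of a synthetic trace against a real one. The file names and the choice of metric are purely illustrative; an actual validation methodology examines many properties, not just one.

    # Compare the distribution of connection sizes in a real trace vs. a
    # synthetic one with a two-sample Kolmogorov-Smirnov test. Hypothetical
    # inputs: CSVs with one row per connection and a total_bytes column.
    import pandas as pd
    from scipy.stats import ks_2samp

    real = pd.read_csv("real_connections.csv")["total_bytes"]
    synthetic = pd.read_csv("synthetic_connections.csv")["total_bytes"]

    stat, p_value = ks_2samp(real, synthetic)
    print(f"KS statistic {stat:.3f}, p-value {p_value:.3g}")
    # A large statistic (tiny p-value) means the synthetic connection sizes are
    # distributed differently from the real ones -- one small piece of the kind
    # of validation the DARPA-style background traffic never received.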
Three years of pounding hard on this research, and I'm done!
Here's my recommended version:
It turns out that to file the dissertation electronically, it must be under
100MiB, so I had to change all of my high-resolution vector graphics
to bitmaps (because I really didn't want to print
722 pages). Here's the high-resolution version:
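As an aside for anyone fighting the same size limit: if the figures come from a plotting library like matplotlib (mine were not necessarily produced that way), the vector-versus-bitmap tradeoff is just a matter of the output call, as in this toy example:

    # A dense vector plot stores every point and can balloon a PDF; rendering
    # the same figure as a bitmap caps its size regardless of the data volume.
    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(0, 10, 100_000)
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x) + 0.1 * np.random.randn(x.size), linewidth=0.2)
    ax.set_xlabel("time")
    ax.set_ylabel("metric")

    fig.savefig("figure.pdf")            # vector: grows with the number of points
    fig.savefig("figure.png", dpi=200)   # bitmap: fixed resolution, bounded size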
And if, for some reason, you want to see the official version --
which has more of the graphs downsampled into bitmaps -- here it is:
I've frequently found myself having to defend the need for such a work.
Indeed, before I started down this path, I figured it was a solved
problem. In the hopes of raising awareness of the need for
research in this area (and, let's face it, to hopefully get some
funding), I've put this together:
I've also discovered that many connection metrics I've seen used
over the years are somewhat ambiguously defined. I'm hoping that
the networking community can agree on more concrete definitions
for these metrics, and in order to spur such work, I've put
together:
And to go with it, a set of proposed definitions for connection metrics:
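To illustrate the kind of ambiguity I mean, here's a toy example (the packet trace and both candidate definitions are invented for illustration) where two perfectly reasonable readings of "connection duration" disagree by several seconds:

    # Two plausible definitions of "connection duration" applied to the same
    # (made-up) TCP connection.
    packets = [
        {"t": 0.000, "flags": "SYN"},
        {"t": 0.120, "flags": "SYN/ACK"},
        {"t": 0.121, "flags": "ACK"},        # handshake complete
        {"t": 0.300, "flags": "PSH/ACK"},    # first data
        {"t": 4.800, "flags": "PSH/ACK"},    # last data
        {"t": 9.900, "flags": "FIN/ACK"},    # teardown after an idle period
        {"t": 9.901, "flags": "ACK"},
    ]

    # Definition A: first packet to last packet of the connection.
    duration_a = packets[-1]["t"] - packets[0]["t"]

    # Definition B: handshake completion to last data packet.
    data = [p for p in packets if "PSH" in p["flags"]]
    duration_b = data[-1]["t"] - packets[2]["t"]

    print(f"Definition A: {duration_a:.3f} s, Definition B: {duration_b:.3f} s")
    # Two tools reporting "duration" for the same connection can disagree by
    # seconds, which is exactly why agreed-upon definitions are needed.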
As noted, the three papers above are drafts, and I welcome any and all
feedback on them. I already tried publishing the last one as an RFC;
however, it was rejected -- apparently engaging the IETF working
groups (WGs) or Area Directors is more than "recommended". So
besides releasing it as a Technical Report (it's really much too dry
to be a conference or journal paper), I might try the RFC route again if
I can engage the proper people.
"Zow" Terry Brugger
Last modified: Sun Jan 7 2007