As noted in the National Intelligence Council’s Global Trends Report, the future security landscape is increasingly complex and uncertain. One of the more promising ways to manage this complexity and cut through its opacity lies in new paradigms of data collection and analysis. Concepts such as Big Data Analysis and Activity Based Intelligence may be critical to gaining timely awareness of risks and opportunities and to drawing true insight from a complicated context.
To better understand these issues and their impact, OTH sat down with Dr. Jon Kimminau, a Senior Executive at the Pentagon and forward thinker on matters of intelligence collection and analysis. Since the conversation was extraordinarily rich, we have divided it into a series of posts that will be published over the next three weeks. This is the first part of a three-part series.
Over the Horizon (OTH): Can you describe for us some of the key concepts and capabilities behind Big Data and how they fit into the future of intelligence analysis?
Dr. Jon Kimminau (JK): Okay, this is a huge question that relates to a vision we set forth for the future of Air Force Intelligence, Surveillance, and Reconnaissance (ISR) back in 2013. After a few months of working hard on it, what we came up with was that first we had to recognize our resource limitations. We aren’t going to get more people, and we aren’t necessarily going to get more resources. But at the same time, we saw that ISR requirements were continuing to climb, as they have every year. The demand for more and better ISR was continuing to rise, and so the question was: how are we going to meet that without additional resources or people? The only answer we could come up with that we all believed in was that we had to change the way we do analysis. Analysis is the engine of what we produce, so changing it would involve the whole infrastructure: from what we collect, to how we process it, to how we analyze it, to how we connect to the operators, platforms, and staffs that need that information.
So in devising this transformation we said, “Okay, first let’s start with the collection,” and the collection and processing in particular, because we all recognize we have a stovepiped system. Every single type of intelligence is stovepiped, and even within each there are further stovepipes for particular types of collection. In just one of those pipes, you go from the collection, to the data being transmitted, to usually a large number of people helping with the processing, to producing some sort of database and reports. Think of all these pipes spilling into repositories into which analysts then reach to do their analysis.
I describe that because, if that’s the way things are now, we need to look at our investment in people. The rule of thumb we’ve been using is that about 80% of our people, including those we call “analysts,” are invested in those pipes on the input side, and only about 20% are on the analysis side. If we can break down the stovepipes, use more automation, and move toward digitization in a cloud – some people like to say “a lake” – for all of the data, we believe we can flip that paradigm so that about 20% are invested on the processing side and 80% are able to actually work on the data, do analysis, and produce more and better products, with better tools of course.
At the center of this is where the Big Data construct comes in. If you think of breaking down the stovepipes so that we’re data focused, with the cloud where it all gets dumped and structured so that analysts can operate on it, then that brings us to the tools they need to do this, and that’s where the data analytics part comes in.
So if that’s the vision of the future that we started in 2013, what did people start coming to talk to us about? Well, I call it two waves. The first was kind of the big companies – the Microsofts and Amazons and Googles and IBM’s would talk to us about how to make that data lake. They had different structures and approaches to make a cloud or lake and could bring you that infrastructure to build it. Well, that’s all good and necessary. We know we’re going to need that kind of thing but that doesn’t quite reach the data analytics part yet.
Then came the second wave of people, and this is where I started to get frustrated. We’re talking maybe the 2014-2015 timeframe, and still today somewhat. We’d get vendors or labs, you name it; people would come in and start talking to us about “look at this slick tool I have that we can give your operator, and look at what they can do out there with it.”
Well, the tool is really neat, but it raises some questions. Among them: at the staff level, for instance, we have no idea what tools are already being used out there, because they’re being proliferated in different places. Second, we don’t know where in particular, across the whole spectrum of what we do for analysis, we actually need tools, because nobody’s been trying to take that inventory.
Third come the questions I wish people would ask if they knew more about data analytics. This led me to search for a framework for talking about Big Data analytics – for talking about these questions and understanding the full scope of things, rather than looking at tools you can give an individual analyst or at just the infrastructure you need for the data.
And I found a framework through two big projects that have been going on across the Intelligence Community in the area of data analytics. One is called Activity Based Intelligence and the other is Object Based Production. Focusing on Activity Based Intelligence: the Director of National Intelligence (DNI), Director Clapper, charters three of what they call major issue studies each year, and in 2015 they delivered a major issue study on Activity Based Intelligence. It started because the DNI was basically asking: “Okay, I have an agency and a couple of others who’ve been pursuing this idea they call Activity Based Intelligence. Is this something for the future of the whole Intelligence Community?”
The study’s answer was yes, but I won’t go into all of its findings, because the thing I’m coming to is part of how they structured it. The study produced a framework and said this is how you have to think about the whole thing. That framework, with a slight expansion, colors and organizes my thoughts about where we are in data analytics and where I see everybody working. I call it the Data Analytics Framework.
You have to think conceptually that there are four sets of activities that must take place in data analysis writ large. The first is called Big Data triage. Big Data triage runs all the way from what I collect, to how I access it, ingest it, organize it, and structure it, because for tools to work, they have to work on some sort of structured data. So Big Data triage is all the activity of bringing the data in and structuring it in a way that it can then be used by analytic tools.
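The triage idea described above can be sketched as normalizing records from stovepiped sources into one shared structure that downstream tools can all operate on. This is a minimal illustration in Python; the source names and field names are hypothetical, not drawn from any actual program of record.

```python
def triage(raw, source):
    """Map a source-specific record into a shared schema so downstream
    analytic tools see one structure regardless of the collection pipe.
    Sources and field names here are purely illustrative."""
    if source == "sigint":
        return {"when": raw["intercept_time"], "where": raw["emitter_loc"],
                "what": raw["signal_type"], "source": source}
    if source == "geoint":
        return {"when": raw["capture_time"], "where": raw["scene_center"],
                "what": raw["observation"], "source": source}
    raise ValueError(f"unknown source: {source}")

# Two differently shaped inputs land in the same schema:
rec = triage({"intercept_time": "2015-03-01T12:00Z",
              "emitter_loc": (34.1, 44.2),
              "signal_type": "radar"}, "sigint")
```

The point is the flip in investment Dr. Kimminau describes: once the mapping is automated, analysts work against one structure instead of staffing each pipe.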
The second set of activities goes by the label forensic analysis, or forensic network analysis. This is where you think of an analyst applying a tool to the data to, say, look at a geographic box and look for particular activities, trying to identify patterns that yield some sort of relevant information about what’s going on. Forensic is a great word for it because this whole set of activities is about looking at the data you have, looking backwards and pulling from it. In the Big Data approach this can be years and years of data, and you sort and filter based on the types of questions you’re asking.
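That sort-and-filter step can be sketched very simply: pull every record inside a geographic box and time window, then tally activity types to surface candidate patterns. The record structure and activity labels below are hypothetical, assumed only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from collections import Counter

@dataclass
class ActivityRecord:
    lat: float
    lon: float
    timestamp: datetime
    activity_type: str

def forensic_filter(records, lat_min, lat_max, lon_min, lon_max, start, end):
    """Keep records inside a geographic box and time window, then
    tally activity types as a first cut at pattern identification."""
    hits = [r for r in records
            if lat_min <= r.lat <= lat_max
            and lon_min <= r.lon <= lon_max
            and start <= r.timestamp <= end]
    return Counter(r.activity_type for r in hits)
```

In practice the "years and years of data" would live in the cloud or lake and the filter would be a query against it, but the looking-backwards logic is the same.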
The third set of activities is called “activity forecasting,” or it could be called “predictive analytics.” The idea here is that we want our analysts, writ large, to be able to do more than just look back. We want them to be able to anticipate things that are about to happen, to be able to alert folks that, hey, this type of activity appears to be happening in this area. The only way you can get there is with sophisticated tools, the kind that don’t just pop out of nowhere. You have to have people who can sit down and model and say, as an example: what does a mobile missile activity, or a mobile missile event, look like? What are the parts, what are the observables, what’s the sequence, what kinds of things come together before we call it this kind of event? With that model, you can then build the tools that enable analysts to look at streaming data coming in and identify that this kind of activity might be happening.
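The modeling step can be illustrated in miniature: encode an event as an ordered sequence of observables, then score how much of that sequence an incoming stream has matched, in order. The model contents and the 50% alert threshold below are invented for illustration, not from the interview.

```python
def match_progress(model, observations):
    """Return the fraction of the model's observable sequence that the
    observation stream has matched, in order (0.0 to 1.0)."""
    idx = 0
    for obs in observations:
        if idx < len(model) and obs == model[idx]:
            idx += 1
    return idx / len(model)

# A toy event model: the observables, in sequence, that together
# would suggest a mobile missile event (labels are hypothetical).
MOBILE_MISSILE_MODEL = ["transporter_movement", "site_preparation",
                        "fueling_activity", "security_cordon"]

stream = ["transporter_movement", "routine_traffic", "site_preparation"]
progress = match_progress(MOBILE_MISSILE_MODEL, stream)
if progress >= 0.5:  # illustrative tipping point for an analyst alert
    alert = f"possible mobile-missile event: {progress:.0%} of model observed"
```

Real forecasting tools would weigh timing, geography, and uncertainty rather than exact string matches, but the dependency is the point: no model, no forecast.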
It’s important to realize that point when you consider all of the tools that the vendors and labs bring in. If the tool is anything beyond simple descriptives – what I call a screwdriver type of statistical analysis that tells you that you’ve got X many of these data points – if you want something more than that, something that starts to tell you an event is going on or that this looks like a certain type of activity, then you have to have that modeling behind it. And it takes a different set of analysts and a different set of tools to build those models.
And that leads finally to the fourth activity, which is collaborative analytics. This is the whole idea of the user environment itself: the idea that we want our platforms linked to analytics in real time, and we want our command and control, our operators in operations centers, to be able to interact with that analytics too – kind of like the internet of things idea. I stress in this fourth column (reference the graphic above) that the users of data analytics aren’t just users of information. They aren’t just taking information out of some black box; they are participants in the data analytics. Think of it like when you shop on Amazon. You are actually contributing to the analytics that go on behind that screen for Amazon to do it more efficiently. That’s why you get things like “oh, you asked for this, but do you realize that people who were looking for this were also looking for x, y, or z,” or “did you realize that people asking for this rated it 2 stars out of 5, and here’s their feedback.” Those kinds of things come from your participation. Our operators and platforms are collaborators in data analytics, they’re not just consumers.
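The Amazon analogy is essentially item co-occurrence: every user session feeds counts of which items are looked at together, and those counts drive the "people who looked at this also looked at" suggestions. A minimal sketch, with made-up session data standing in for operator interactions:

```python
from collections import Counter
from itertools import combinations

def co_occurrence(sessions):
    """Count how often each pair of items appears in the same session.
    Every session a user contributes improves these counts - that is
    the 'participant, not just consumer' idea."""
    pairs = Counter()
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            pairs[(a, b)] += 1
    return pairs

def also_looked_at(pairs, item, top=3):
    """Items most often seen alongside `item`, ranked by co-occurrence."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [i for i, _ in scores.most_common(top)]
```

In the ISR setting the "sessions" would be operator queries and platform tasking rather than shopping carts, but the mechanism is the same: the consumers' own activity is an input to the analytics.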
Jon “Doc” Kimminau is the Air Force Analysis Mission Technical Advisor for the Deputy Chief of Staff, Intelligence, Surveillance and Reconnaissance. He is a Defense Intelligence Senior Leader (DISL) serving as the principal advisor on analytic tradecraft, substantive intelligence capabilities, acquisition of analysis technology, human capital and standards. Previously, he served nearly 30 years on active duty as an Air Force intelligence officer. Dr. Kimminau holds a Master’s in Public Policy from the Kennedy School of Government, Harvard University, a Master’s in Airpower Art and Science from the School of Advanced Airpower Studies (SAAS), and a PhD in Political Science from the Ohio State University.
This interview was conducted by Sean Atkins, Editor-in-Chief of Over the Horizon, on 14 December 2016.