Corroboration VS Correlation
The signal-to-noise ratio in Cyber Security is overwhelming because of all the new devices entering and leaving the network. When you add in the Internet of Things (IoT), the numbers become staggering. The average traffic flowing through a network is twelve Terabytes every 24 hours!
That is roughly 139 Megabytes every second, around the clock. This means you would have to generate billions of hypotheses just to form a single hypothesis about what is happening in your network. This is untenable for a cyber professional, so they rely on single-point Cyber Security Appliances.
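As a quick sanity check on the volume figure, here is a minimal sketch that converts twelve Terabytes per day into an average per-second rate. It assumes decimal units (1 TB = 10^12 bytes) and perfectly even traffic, which real networks never have; the function name is illustrative only.

```python
# Convert a daily traffic volume into an average per-second rate.
# Assumptions: decimal units (1 TB = 10**12 bytes) and evenly spread traffic.
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_bytes_per_second(terabytes_per_day: float) -> float:
    """Average bytes per second for a given number of terabytes per day."""
    return terabytes_per_day * TERABYTE / SECONDS_PER_DAY


rate = average_rate_bytes_per_second(12)
print(f"{rate / 10**6:.1f} MB/s")        # average megabytes per second
print(f"{rate * 8 / 10**9:.2f} Gbit/s")  # the same rate in gigabits per second
```

Peak rates will be far higher than this average, which only makes the hypothesis problem worse.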
Unfortunately, Cyber Security Appliances do not surface and amplify the right signals at a measurable level, nor do they mitigate or transfer risk.
In fact, in many conversations, CIOs and CISOs will comment that the signal to noise ratio among vendors is the same. All seem to have similar messages. As a result of the sheer volume of vendor buzzword compliance, they’ve become skeptical. To the IT professional, nothing is different, and as a result, they suffer from appliance fatigue.
How do we, as Cyber Security Experts, remove the complexity and begin a common-sense, problem-solving approach to Cyber Security? Is there a simpler, more elegant solution? The answer is found in Occam’s (Ockham’s) razor.
As an approach to problem-solving, Occam’s razor essentially states that “simpler solutions are more likely to be correct than complex ones.” When presented with competing hypotheses to solve a problem, one should select the solution with the fewest assumptions.
It is as simple as:
Determine how many assumptions and conditions are necessary for each explanation to be correct.
If an explanation requires extra assumptions or conditions, demand evidence commensurate with the strength of each claim.
Extraordinary claims require extraordinary proof.
In summary, how does one deconstruct the complexity of Cyber Security? How does one increase the weight of evidence by winnowing out the noise to prove or disprove assumptions and isolate and amplify the signal that matters?
“Stop Correlating and Start Corroborating weights of evidence using four analytic zones to attrit or reduce the number of non-events (false positives).”
As verbs, the difference between corroborating and correlating is reasonably straightforward:
Corroborate is to confirm or support something with additional evidence; to attest or vouch for.
Correlate is to compare things and bring them into a relationship having corresponding or similar characteristics.
Cyber defenders need to corroborate rather than correlate in order to establish context and cause. Causation is the relationship between cause and effect. So, when a cause results in an effect, that’s causation. However, most solutions use correlation to find the events that might matter, and that is flawed.
When these solutions do correlate, they do so in bulk, with no clear logical definitions or micro-perimeters of control, and without context. The flaws manifest as too many non-events (false positives) being presented as alerts. Correlation does not imply causation.
Just because you can see a connection or a mutual relationship between two variables, it doesn’t necessarily mean that one causes the other. In other words, just because two trends seem to fluctuate in tandem, that doesn’t prove that they are meaningfully related to one another. While that sounds nice enough on paper, it’s easy to forget when a provocative marketing buzzword gets your attention and, in the end, there is no there there.
Correlation is a statistical technique that tells us how strongly a pair of variables are linearly related and change together. It says only that a change in the value of one variable tends to accompany a change in the other, not that one causes the other. The problem is that humans too often insert bias and conclude that one variable makes the other happen. Here is another simple example:
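One way to make this concrete, offered as an illustrative sketch with made-up numbers rather than data from any real network: two series that are both driven by a third factor (here, a hypothetical count of active users per hour) correlate almost perfectly, yet neither causes the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical hourly counts: the hidden driver is how many users are active.
active_users = [10, 40, 120, 300, 280, 150, 60, 20]

# Both series depend only on active_users, not on each other.
dns_queries = [50 * u + 200 for u in active_users]
failed_logins = [0.02 * u + 1 for u in active_users]

r = pearson(dns_queries, failed_logins)
print(f"correlation between DNS queries and failed logins: {r:.3f}")
```

A tool that alerts on this correlation alone would flag every busy hour. The coefficient is high only because both series track the same underlying load.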
Corroboration is a simple concept: instead of accepting the expert opinion of one of your security appliances, create a consensus using all of your security appliances as well as the devices not normally involved in cyber defense. Using a common baseline like raw packet capture (PCAP) as ground truth, corroboration enables cyber defenders to look at the value of the information and make a more informed decision.
By separating the PCAP into four simple segments (Endpoint, Network Segment, Perimeter and Ecosystem), cyber defenders can build context from the data within each zone and between zones. The key is to recognize that data is not information.
Data + Value = Information.
Therefore, if we can assign a value to the data and then use that extracted value to build a weight of evidence, the results will be better than those from correlation alone.
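To picture how "Data + Value = Information" might turn into a weight of evidence, here is a minimal sketch. The zone names come from the four segments above; the `Observation` type, the 0-to-1 value scale, and the rule of taking the strongest observation per zone are all assumptions made for illustration, not a description of any product's scoring.

```python
from dataclasses import dataclass

# The four analytic zones named in the text.
ZONES = ("endpoint", "network_segment", "perimeter", "ecosystem")


@dataclass(frozen=True)
class Observation:
    zone: str        # which zone reported it
    indicator: str   # what was seen (hypothetical label)
    value: float     # assumed value of the data, on a 0-to-1 scale


def weight_of_evidence(observations, indicator):
    """Sum the strongest value per zone for one indicator.

    Returns (total weight, sorted list of zones that corroborate it).
    """
    best_per_zone = {}
    for obs in observations:
        if obs.indicator == indicator and obs.zone in ZONES:
            current = best_per_zone.get(obs.zone, 0.0)
            best_per_zone[obs.zone] = max(current, obs.value)
    return sum(best_per_zone.values()), sorted(best_per_zone)


observations = [
    Observation("endpoint", "beacon:10.0.0.7", 0.5),
    Observation("endpoint", "beacon:10.0.0.7", 0.25),   # weaker duplicate
    Observation("perimeter", "beacon:10.0.0.7", 0.75),
    Observation("perimeter", "scan:10.0.0.9", 0.5),     # different indicator
]

weight, zones = weight_of_evidence(observations, "beacon:10.0.0.7")
print(weight, zones)
```

Counting each zone once means a single noisy appliance cannot inflate the score by repeating itself; the weight grows only when independent zones agree.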
A critical step in solving the corroboration challenge is building a calibration training set of data. A great way to build training sets is to establish declarative boundaries. For example, @RISK uses ecosystem, perimeter, network segments and assets as declarative boundaries. Next comes deep learning and pattern-space exploration, cataloging discoveries of both expected and unexpected patterns.
The critical idea of corroboration is the ability to build a cataloged library of pattern sets for each declarative boundary, so that machine learning can determine how trustworthy a source is and document its pedigree. Estimating the probability that a source provides the correct answer then helps to build a forecast. However, designing an effective and efficient corroboration algorithm poses several challenges. The first is how to derive the trustworthiness of a source; the second is, given the reliability of a source, how to evaluate the probability that its answer is the correct one.
One way to solve this problem is to build a calibration model that estimates the trustworthiness of the sources by declarative boundary, since an accurate estimate is the basis for computing the probability that an answer is, in fact, the correct answer.
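A calibration model of this kind can be sketched very simply. The example below assumes a labeled calibration set of (source, boundary, was-it-correct) records and estimates trust as a smoothed hit rate per source and boundary. The source names, the record layout, and the choice of Laplace (add-one) smoothing are illustrative assumptions, not the method any particular product uses.

```python
from collections import defaultdict


def estimate_trust(calibration):
    """Estimate trust per (source, boundary) from labeled calibration records.

    calibration: iterable of (source, boundary, was_correct) tuples.
    Uses Laplace smoothing so a source with few records is pulled toward 0.5.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for source, boundary, was_correct in calibration:
        key = (source, boundary)
        total[key] += 1
        hits[key] += int(was_correct)
    return {key: (hits[key] + 1) / (total[key] + 2) for key in total}


# Hypothetical calibration set: an IDS judged on perimeter traffic and an
# EDR agent judged on endpoint events.
calibration = (
    [("ids", "perimeter", True)] * 8
    + [("ids", "perimeter", False)] * 2
    + [("edr", "endpoint", True), ("edr", "endpoint", False)]
)

trust = estimate_trust(calibration)
print(trust)
```

Keeping the boundary in the key matters: the same appliance can be reliable at the perimeter and unreliable when asked about a network segment it barely sees.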
Estimating the trust scores for the sources is achieved through Big Data comparison of distributions, frequencies, and divergence. Corroboration also solves the problem of computing a score for candidate threat actors on the network. It calculates the likelihood that a candidate is the correct one by using a simple Intersection over Union (also known as the Jaccard coefficient), which gauges the similarity and diversity of the contributing data sets.
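Intersection over Union is easy to show in a few lines. The sketch below compares the sets of hosts flagged by two sources; the source names and addresses are hypothetical, and treating two empty sets as having similarity 0.0 is a convention chosen here for illustration.

```python
def jaccard(a, b):
    """Intersection over Union of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: nothing reported, nothing shared
    return len(a & b) / len(union)


# Hypothetical hosts flagged by two different sources.
ids_hosts = {"10.0.0.7", "10.0.0.9", "10.0.0.12"}
edr_hosts = {"10.0.0.7", "10.0.0.12", "10.0.0.30"}

similarity = jaccard(ids_hosts, edr_hosts)
print(similarity)  # 2 shared hosts out of 4 distinct hosts -> 0.5
```

A score near 1 means the two sources are reporting the same things; a score near 0 means they are seeing different parts of the picture, which is where partial answers have to be combined.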
In more simple terms, Cyber is a Big Data Challenge and building a corroborative solution has to solve some fundamental problems. In essence, the biggest obstacle to solve using Big Data is how to fill in the blanks when each source in the network only provides partial answers.
How can multiple sources agree on a single answer, and how can Machine Learning validate the correctness of the answer?
How can a solution combine partial answers from multiple sources to construct a final answer?
How can a solution evaluate the quality of the answer?
How can a solution efficiently derive the correct answer?
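The questions above can be explored with a small sketch of trust-weighted voting, in which each source fills in the fields it knows and disagreements are settled by the combined trust behind each value. The source names, field names, and trust scores are hypothetical, and weighted voting is just one simple way to combine partial answers, not necessarily the approach a production system would take.

```python
from collections import defaultdict


def combine_partial_answers(reports, trust):
    """Merge partial reports into one answer per field.

    reports: {source: {field: value}}, each source may know only some fields.
    trust:   {source: trust score}, unknown sources count as 0.0.
    Returns {field: (winning value, share of trust backing it)}.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for source, fields in reports.items():
        for field, value in fields.items():
            votes[field][value] += trust.get(source, 0.0)

    combined = {}
    for field, candidates in votes.items():
        winner = max(candidates, key=candidates.get)
        total = sum(candidates.values())
        quality = candidates[winner] / total if total else 0.0
        combined[field] = (winner, quality)
    return combined


# Hypothetical partial views of the same event.
reports = {
    "firewall": {"src_ip": "10.0.0.7", "dst_port": 443},
    "edr": {"src_ip": "10.0.0.7", "process": "powershell.exe"},
    "netflow": {"src_ip": "10.0.0.7", "dst_port": 8443},
}
trust = {"firewall": 0.75, "edr": 0.5, "netflow": 0.25}

combined = combine_partial_answers(reports, trust)
for field, (value, quality) in combined.items():
    print(field, value, quality)
```

The share of trust behind the winning value doubles as a rough quality score: fields every source agrees on score 1.0, while contested fields score lower and can be routed to an analyst.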
It starts with Big Data Analytics. Big Data Cybersecurity Analytics is essential to ensuring a healthy cybersecurity posture through the use of a corroborative approach.
Organizations are 2.25X more likely to identify a security incident within hours or minutes when they are a heavy user of big data cybersecurity analytics.
Eighty-one percent of respondents say demand for big data for cybersecurity analytics has significantly increased over the past 12 months.
Heavy users of big data analytics have a higher level of confidence in their ability to detect cyber incidents than light users.
Concerning 11 common cyber threats, the most significant gaps between heavy and light users concern the organization’s ability to detect advanced malware/ransomware, compromised devices (e.g., credential theft), zero-day attacks and malicious insiders.
The smallest gaps in detection between heavy and light users concern denial of service, web-based attacks and spear phishing/social engineering.
Big Data analytics powers Corroboration because it employs supervised machine learning, deep learning, and artificial intelligence. As the surveys show, Corroboration using the power of Big Data offers a different approach to keeping businesses and processes secure in the face of cybersecurity breaches and hacking.
By employing the power of Big Data, you can improve your data-management techniques and create a single pane of glass while delivering Left of Bang Cyber Situational Awareness. Corroboration enhances visibility through machine-speed detections and cognitively driven digital forensic investigation.
Big Data analytics, powered by machine learning and artificial intelligence, holds the promise that businesses and processes will remain secure from cybersecurity breaches and hacking. By deploying a Big Data solution, organizations will improve data-management techniques and enhance existing cyberthreat-detection mechanisms. Big Data neither sleeps nor rests, which makes it a perfect capability for 24/7 monitoring and for improving a security stance to bulletproof the business.
Mixed in with periodic penetration testing, Big Data pattern learning ensures constant tuning of analytics. Corroboration will bulletproof your business. More importantly, it powers Network Consensus in a unified, preemptive and proactive cyber security solution.