The literature review is the next step after the research problem is specified. It compares the proposed research with similar work by other researchers and identifies what distinguishes the proposed research from that work. During the literature review, the researcher also needs to formulate a hypothesis.
The hypothesis helps the researcher to focus and acts as guidance during the research work. The functions of a hypothesis are to delimit …
Data collection gathers the specific data to be analyzed in order to achieve the objective of the research. The analysis then interprets the data to give it meaning. Finally, conclusions are drawn from the analysis, recommendations are given, and the research is written up.
The research framework in Figure 3.2 is based on Singh (2006). The data are collected from the network through the router. The network data are then stored on the storage server and passed to the next stage once enough data have been collected. During the pre-processing stage, the data are processed by extracting the relevant features and visualizing them statistically. Finally, the statistical results are analyzed to summarize the data and find patterns in them, and are then evaluated. The flow chart for this process is shown in Figure 3.3.
The clients connect to the Internet and generate network traffic. This traffic is captured using the port mirroring technique: network traffic that goes through the router is mirrored to a storage server. The storage server stores the DNS traffic and the NetFlow traffic from the network. Both kinds of data are needed because they will be correlated and …
The dynamic threshold is also visualized in the line chart and is then used by the anomaly detector. Finally, the line chart with the dynamic threshold is analyzed in the next stage. After the dynamic threshold has been calculated, implemented, and visualized, the statistical data are analyzed and evaluated by comparing the NetFlow statistics against the DNS statistics.
The mean and standard deviation of the feature are needed for the dynamic threshold formula. Formula 3.3 calculates the mean of a feature (x) during the time bin (t), and Formula 3.4 calculates the standard deviation of a feature (x) during the time bin (t).
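As a minimal illustration of these two statistics, the Python sketch below computes the mean and standard deviation of one feature within a single time bin. It is a sketch under stated assumptions, not the implementation used in this research: the sample values are hypothetical, the population form of the standard deviation is assumed because the source does not state which form Formula 3.4 uses, and the band built from the two statistics is only the generic "mean plus or minus a constant multiple of the standard deviation" form with a hypothetical multiplier, not a reproduction of the DNS or NetFlow threshold formulas.

```python
from statistics import mean, pstdev


def bin_stats(values):
    """Return (mean, standard deviation) of a feature within one time bin.

    Assumption: population standard deviation (pstdev); the source does not
    say whether the population or the sample form is intended.
    """
    return mean(values), pstdev(values)


def generic_band(values, multiplier=1.0):
    """Return (lower, upper) bounds as mean -/+ multiplier * std.

    Assumption: this is a generic illustrative band, not the author's
    DNS or NetFlow threshold formula.
    """
    m, s = bin_stats(values)
    return m - multiplier * s, m + multiplier * s


def is_outside(value, band):
    """Return True when value lies below the lower or above the upper bound."""
    lower, upper = band
    return value < lower or value > upper


# Hypothetical observations of one feature x during one time bin t.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

m, s = bin_stats(samples)    # mean 5, standard deviation 2.0
band = generic_band(samples)  # (3.0, 7.0)
print(m, s, band, is_outside(9, band), is_outside(5, band))
```

With these hypothetical samples, the observation 9 falls outside the band and the observation 5 falls inside it. Switching `pstdev` to `stdev` would give the sample standard deviation instead, which widens the band slightly for small bins.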
The dynamic threshold is calculated using Formula 3.5 for the DNS threshold and Formula 3.6 for the NetFlow threshold. This threshold is based on the previous papers by Amidan et al. (2005) and Oshima et al. (2010). DNS and NetFlow have different formulas because they have different features. The variable k in the formulas is a constant used to fine-tune the threshold sensitivity. In this research, k is set to one. A value that lies outside the lower or upper bound threshold around the mean is determined as