I'm working on a host-based intrusion detection project using the ADFA-LD dataset, and I'm currently on the feature extraction module. I have constructed a phrase dictionary consisting of system call phrases of length 4. For feature extraction, I now need to compare these phrases against new system call traces (a sample follows):
sys_clock_gettime sys_poll sys_poll sys_clock_gettime sys_poll sys_poll sys_poll sys_clock_gettime sys_poll sys_clock_gettime sys_poll sys_poll sys_poll sys_poll sys_poll sys_poll sys_poll sys_poll sys_socketcall.......
How can I compare the dictionary phrases against the new traces? I'm working in Java.
My phrase dictionary:
sys_socketcall-sys_poll-sys_clock_gettime-sys_poll
sys_clock_gettime-sys_poll-sys_poll-sys_socketcall
sys_poll-sys_socketcall-sys_poll-sys_clock_gettime
sys_poll-sys_clock_gettime-sys_clock_gettime-sys_clock_gettime
sys_clock_gettime-sys_clock_gettime-sys_socketcall-sys_clock_gettime
sys_socketcall-sys_clock_gettime-sys_poll-sys_poll
sys_poll-sys_poll
I'm using '-' as the separator when comparing these phrases with the new traces, so I joined the unique system calls with '-'.
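
One way the comparison could work, as a minimal sketch rather than a definitive implementation: split the trace on whitespace, slide a window of 4 calls over it, join each window with '-', and count how often each dictionary phrase appears. The class name PhraseMatcher, the method countPhrases, and the constant PHRASE_LENGTH are hypothetical names chosen for illustration. The sketch assumes the trace is a single space-separated string and that overlapping windows should each be counted.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PhraseMatcher {
    // Assumption: every dictionary phrase has this many system calls.
    static final int PHRASE_LENGTH = 4;

    // Returns, for each dictionary phrase, the number of windows in the
    // trace that match it exactly. Overlapping windows are all counted.
    static Map<String, Integer> countPhrases(List<String> dictionary, String trace) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String phrase : dictionary) {
            counts.put(phrase, 0);
        }
        String[] calls = trace.trim().split("\\s+");
        for (int i = 0; i + PHRASE_LENGTH <= calls.length; i++) {
            // Build the window in the same '-'-joined form as the dictionary.
            String window = String.join("-", Arrays.copyOfRange(calls, i, i + PHRASE_LENGTH));
            // Only phrases already in the dictionary are counted.
            counts.computeIfPresent(window, (key, value) -> value + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList(
            "sys_socketcall-sys_poll-sys_clock_gettime-sys_poll",
            "sys_clock_gettime-sys_poll-sys_poll-sys_socketcall",
            "sys_poll-sys_socketcall-sys_poll-sys_clock_gettime");
        String trace = "sys_clock_gettime sys_poll sys_poll sys_socketcall "
                     + "sys_poll sys_clock_gettime sys_poll";
        // Each map entry is one feature: phrase and its occurrence count.
        countPhrases(dictionary, trace).forEach(
            (phrase, count) -> System.out.println(phrase + " : " + count));
    }
}
```

The resulting counts (or just 0/1 presence flags) could serve as the feature vector for a trace. Note that with a fixed window of 4, a shorter dictionary entry such as sys_poll-sys_poll would never match; handling mixed lengths would need one pass per phrase length.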