Track 1: Scalable Observation Infrastructure - Low-disturbance multi-level observation and production of enhanced data

The objective of this track is to efficiently capture online observations (alerts, events, and states) made anywhere in the system by a selected set of relevant surveillance systems (AV, HIDS, performance monitoring systems, software tracers and profilers, and in-system peeks) and to merge them into enhanced data that is best suited to detection analysis. This will enable quasi-real-time, feedback-directed observation (online control of software tracer probes as new situations arise) and the saving of selected data to disk for later offline analysis (forensics, software improvement).
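As an illustration of feedback-directed observation, the following minimal sketch turns detailed probes on while alerts are frequent and off again once the system calms down. It is not the project's implementation: the `ProbeController` class, the sliding-window policy, and the default window and threshold values are all assumptions made for the example, and a real controller would call into the tracer rather than set a flag.

```python
from collections import deque


class ProbeController:
    """Hypothetical controller: keeps detailed probes on while alerts are frequent."""

    def __init__(self, window_s=10.0, threshold=3):
        self.window_s = window_s      # length of the sliding window, in seconds (assumed value)
        self.threshold = threshold    # alerts within the window that trigger detailed probes (assumed value)
        self.recent_alerts = deque()  # timestamps of alerts still inside the window
        self.detailed_probes_enabled = False

    def observe(self, timestamp, kind):
        """Feed one observation; return True if detailed probes should be enabled."""
        if kind == "alert":
            self.recent_alerts.append(timestamp)
        # Drop alerts that have fallen out of the sliding window.
        while self.recent_alerts and timestamp - self.recent_alerts[0] > self.window_s:
            self.recent_alerts.popleft()
        # A real system would enable or disable tracer probes here; the sketch only records the decision.
        self.detailed_probes_enabled = len(self.recent_alerts) >= self.threshold
        return self.detailed_probes_enabled
```

Observations are assumed to arrive in timestamp order. The point of the design is that the cost of detailed tracing is paid only while the coarse, cheap observations suggest that something unusual is happening.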

Large installations, whether for telecom systems or government infrastructure, need to monitor each node closely, with low, almost negligible overhead, in order to track the performance and behavior of the system and detect possible anomalies. Anomalies may be caused by unusually high demand, security attacks, defective hardware or software, or improper configuration. The extent and precision of the observation data are key to maximizing the likelihood of detecting anomalies. Previous work concentrated on obtaining detailed, low-overhead execution traces from kernel and user-space static tracepoints that can be activated dynamically and remotely.

The proposed work will concentrate on obtaining additional sources of information, including interfaces to existing systems (AV, HIDS), dynamic tracepoints, performance counter samples, information from I/O subsystems (bus, disk subsystem, network adapter, etc.), and in-kernel peeks (observation of variables in the kernel). The main challenge is to design algorithms that achieve extremely low overhead even on complex, heterogeneous many-core architectures with multiple levels of virtualization (e.g. Kernel-based Virtual Machine, Java Virtual Machine). This highly efficient and accurate host tracing infrastructure must be able to integrate sources of information (events and states) from all levels (hardware, operating system, applications, anti-virus scans, host-based intrusion detection systems) and must interoperate within large distributed systems (e.g. clusters and clouds).
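One simple way to picture the integration of several sources is a timestamp-ordered merge of their individual streams. The sketch below is illustrative only: the `Observation` record, its field names, and the `merge_sources` helper are hypothetical, and it assumes that each source already delivers its observations sorted by timestamp and that all timestamps share a common clock, which a real multi-level system would have to ensure separately.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Observation(NamedTuple):
    """Hypothetical unified record for an alert, event, or state sample."""
    timestamp_ns: int  # assumed to come from a clock shared by all sources
    source: str        # e.g. "kernel", "hids", "perf"
    kind: str          # "alert", "event", or "state"
    payload: str


def merge_sources(*streams: Iterable[Observation]) -> Iterator[Observation]:
    """Lazily merge per-source streams, each already sorted by timestamp."""
    # heapq.merge holds only one pending item per stream, so memory use
    # stays small even when the individual streams are long.
    return heapq.merge(*streams, key=lambda obs: obs.timestamp_ns)


if __name__ == "__main__":
    kernel = [Observation(100, "kernel", "event", "sched_switch"),
              Observation(300, "kernel", "event", "sys_open")]
    hids = [Observation(200, "hids", "alert", "suspicious file access")]
    for obs in merge_sources(kernel, hids):
        print(obs.timestamp_ns, obs.source, obs.kind, obs.payload)
```

Because the merge is lazy, the same function could be applied to live streams as well as to stored traces, provided the ordering assumption holds for each input.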


Team members

Jean-Christian Kouamé, École Polytechnique de Montréal, Master's student


Documents and presentations

OpenStack analysis, state of the art

OpenStack Service Analysis

Hardware Tracing with Intel PT

Using Tracing to Analyze Hard Disk Performance

Speeding up State History Tree

Tracing embedded heterogeneous systems

Trace Compass Update

State of the art

Distributed traces modelling and critical path analysis