EDBT-ICDT 2018

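While tidying up my notes, I came across the ones I took at a conference two years ago.
Leaving them sitting there won't make them any more valuable, so I decided to write them up.
My memory of most of the content is already a bit hazy,
but I'll piece together what I can from the notes I took back then.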

The paper I found most interesting at the time was Interactive Rule Refinement for Fraud Detection.
Surprisingly, though, I didn't take many notes on it.

Day 1 - Keynote

In theoretical CS
- Polynomial time → easy/fast
  - However, that's not always the case
  - e.g., \(O(n^{100})\)
  - When n grows, even \(O(n^2)\) is not efficient
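
To put the last point in numbers, here is a rough back-of-the-envelope calculation. The input size \(n = 10^9\) and the machine speed of \(10^9\) simple operations per second are assumptions chosen for illustration, not figures from the talk:

\[
n^2 = (10^9)^2 = 10^{18}\ \text{operations}, \qquad \frac{10^{18}}{10^{9}\ \text{ops/s}} = 10^{9}\ \text{s} \approx 31.7\ \text{years}.
\]
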
We're stuck on many problems even at \(O(n^2)\)
No \(O(n^{2-\epsilon})\)-time algorithms are known for
- String matching
- Computational geometry
- Graph problems on sparse graphs
- Many problems from databases
- many other problems
Why are we stuck?
- Traditional hardness notions in complexity theory tell us little about actual running time
- The fine-grained hardness idea
  1. Identify a key hard problem
  2. ......
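
As a concrete picture of what "stuck at quadratic" looks like, here is a minimal sketch of the textbook dynamic program for edit distance, which fills an \(n \times m\) table. Edit distance is my own choice of example; the notes only say "string matching", so treat the link to the keynote as an assumption. The function name `edit_distance` is illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic program: O(len(a) * len(b)) time, O(len(b)) space."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]
```

The two nested loops are the quadratic cost: every pair of positions in the two strings is visited once.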

Currently, the ML community cares more about new models than about theory and fundamental ML design

Distributed ML
- Most ML systems use a "parameter server" model
  - Essentially a distributed key-value store
- Negatives
  - The parameter server compute model is very limiting
Data Parallel ML
- Each compute server runs same computation on different data
- Global state updated via aggregation
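
The pattern above can be sketched in a few lines. This is a single-process toy under my own assumptions, not code from the talk: the "parameter server" is just a dictionary, the "compute servers" are function calls on separate data shards, and the model is a hypothetical one-variable linear regression. All names (`ParameterServer`, `worker_gradient`, `train`) are made up for illustration.

```python
from typing import Dict, List, Tuple

Shard = List[Tuple[float, float]]  # (x, y) pairs held by one compute server


class ParameterServer:
    """Toy stand-in for a distributed key-value store of model parameters."""

    def __init__(self, params: Dict[str, float]) -> None:
        self.store = dict(params)

    def pull(self) -> Dict[str, float]:
        return dict(self.store)

    def push(self, grads: List[Dict[str, float]], lr: float) -> None:
        # Aggregation step: average the workers' gradients, then update.
        for key in self.store:
            avg = sum(g[key] for g in grads) / len(grads)
            self.store[key] -= lr * avg


def worker_gradient(params: Dict[str, float], shard: Shard) -> Dict[str, float]:
    """Same computation on every worker: MSE gradient for y = w*x + b."""
    w, b = params["w"], params["b"]
    n = len(shard)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in shard) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in shard) / n
    return {"w": grad_w, "b": grad_b}


def train(shards: List[Shard], steps: int = 500, lr: float = 0.05) -> Dict[str, float]:
    server = ParameterServer({"w": 0.0, "b": 0.0})
    for _ in range(steps):
        params = server.pull()
        grads = [worker_gradient(params, shard) for shard in shards]
        server.push(grads, lr)
    return server.pull()


# Two "compute servers", each holding half of the data for y = 2x + 1.
shards = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
model = train(shards)
```

Adding a worker here only means adding another shard to the list, which matches the observation that adding compute servers is the one easy way to scale out in this design.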

Scaling out is ineffective in a data-parallel parameter server
- There is no easy way to add machines and have a graph execute faster
The only easy way to scale out is to add compute servers

Current ML systems are easily applicable only to
- Problems with relatively small models
- Models that run on a single machine

Attack Vector: File Tampering
- Occurs at the OS level → outside DBMS control
  - Bypass DBMS control
Page Deconstruction
- Page Header
  - Checksum
  - PageID
  - Row Count
DBStorageAuditor
- Goal: find inconsistencies in storage
  - which are created by direct file manipulation
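
To make the page header fields concrete, here is a minimal sketch of checking a page against its own checksum. The byte layout (4-byte checksum, 4-byte PageID, 2-byte row count, little-endian) and the use of CRC32 are assumptions of mine; real DBMS page formats differ, and this is not DBStorageAuditor's actual algorithm, which my notes don't capture.

```python
import struct
import zlib
from typing import List, Tuple

# Hypothetical header layout: checksum (uint32), page_id (uint32), row_count (uint16).
HEADER = struct.Struct("<IIH")


def build_page(page_id: int, rows: List[bytes]) -> bytes:
    """Serialize a toy page; the checksum covers everything after itself."""
    body = struct.pack("<IH", page_id, len(rows)) + b"".join(rows)
    return struct.pack("<I", zlib.crc32(body)) + body


def parse_header(page: bytes) -> Tuple[int, int, int]:
    """Return (checksum, page_id, row_count) from the start of the page."""
    return HEADER.unpack_from(page, 0)


def is_consistent(page: bytes) -> bool:
    """True if the stored checksum matches the rest of the page."""
    checksum, _, _ = parse_header(page)
    return checksum == zlib.crc32(page[4:])


page = build_page(7, [b"alice", b"bob"])
tampered = bytearray(page)
tampered[-1] ^= 0xFF  # simulate a direct edit of the file, bypassing the DBMS
```

Here `is_consistent(page)` holds while `is_consistent(bytes(tampered))` does not. A checksum alone only catches edits that leave it stale.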

Time series: Any data that is ordered
Time Series Classification
- similarity-based kNN (e.g., kNN-ED, kNN-DTW)
  - similarity can be unreliable
- Shapelets
  - high computational complexity
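
As a rough illustration of the similarity-based approach, here is a minimal sketch of kNN with Euclidean distance (ED) and with dynamic time warping (DTW). The toy series, labels, and function names are my own assumptions. The example shows just one way a similarity measure can mislead: ED compares points index by index, so a shifted copy of the same shape can look farther away than a flat line.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Series = Sequence[float]


def euclidean(a: Series, b: Series) -> float:
    """Point-by-point distance; assumes both series have the same length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a: Series, b: Series) -> float:
    """Unconstrained DTW with absolute difference as the local cost, O(n*m)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_classify(
    query: Series,
    train: List[Tuple[Series, str]],
    distance: Callable[[Series, Series], float],
    k: int = 1,
) -> str:
    """Majority vote among the k training series closest to the query."""
    neighbors = sorted(train, key=lambda item: distance(query, item[0]))[:k]
    votes: Dict[str, int] = {}
    for _, label in neighbors:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


train_set = [
    ([0, 0, 0, 0, 0, 1, 2, 1, 0, 0], "bump"),
    ([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "flat"),
]
query = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]  # the same bump, shifted left
```

With this data, `knn_classify(query, train_set, dtw)` picks "bump", while `knn_classify(query, train_set, euclidean)` picks "flat".
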
Why multiscale
- sometimes global features matter more, and sometimes local features do
- in this research, both global and local features are considered
Visibility Graphs
Multiscale Visibility Graphs
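
My notes stop at the two names, so here is a minimal sketch of the standard natural visibility graph: two points are linked when the straight line between them passes above every point in between. The multiscale part is an assumption of mine, since the notes don't record how the paper builds its scales: the series is repeatedly coarsened by averaging adjacent pairs, and one graph is built per scale. All function names are illustrative.

```python
from typing import List, Sequence, Set, Tuple

Edge = Tuple[int, int]


def visibility_graph(series: Sequence[float]) -> Set[Edge]:
    """Natural visibility graph: link i and j if no point between blocks the view."""
    edges: Set[Edge] = set()
    n = len(series)
    for i in range(n):
        for j in range(i + 1, n):
            # Height of the line from (i, y_i) to (j, y_j) at position k.
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                edges.add((i, j))
    return edges


def coarsen(series: Sequence[float]) -> List[float]:
    """Assumed coarsening step: average each adjacent pair (drops an odd tail)."""
    return [(series[i] + series[i + 1]) / 2 for i in range(0, len(series) - 1, 2)]


def multiscale_visibility_graphs(series: Sequence[float], scales: int) -> List[Set[Edge]]:
    """One visibility graph per scale, from the finest series to the coarsest."""
    graphs: List[Set[Edge]] = []
    current = list(series)
    for _ in range(scales):
        if len(current) < 2:
            break
        graphs.append(visibility_graph(current))
        current = coarsen(current)
    return graphs
```

For the series `[1, 3, 2, 4]`, the finest graph links neighbours plus the two peaks at positions 1 and 3, while the coarser series `[2.0, 3.0]` gives a single edge. Features extracted from the fine graphs reflect local structure and those from the coarse graphs reflect global structure.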