整理筆記時，翻到兩年前去研討會的筆記

想說放著也不會增值，就整理出來了

雖然大部分的內容的印象都已經有點模糊了

不過就加減把當初的筆記湊起來

當時我覺得最有趣的論文是 Interactive Rule Refinement for Fraud Detection.

不過竟然沒有做到太多筆記

## Day 1 - Keynote

- In theoretical CS
- Polynomial time → easy/fast
- However, that's not always the case
- e.g., \(O(n^{100})\)
- When
`n`

grows, even \(O(n^2)\) is not efficient

- Polynomial time → easy/fast
- We're stuck on many problems even just in \(O(n^2)\)
- No \(N^{2-\epsilon}\) time algorithms known for
- String matching
- computational geometry
- graph problem in sparse graphs
- many problems from database
- many other problems

**Why are we stuck?**- The traditional hardness in complexity tells us little about runtime
- fine-grained hardness idea
- identify key hard problem
- ......

## Large Scale Machine Learning: Where Do Relational Systems Fit In? (by Chris Jermaine)

Currently, ML community cares about new models instead of theory and fundamental ML design

### ML vs AI

- ML is one approach to AI
- Classic AI: a programmer/expert imparting knowledge to a system
- ML is fundamentally statistical

### Intro to ML

- Distributed ML
- Most ML systems use a "parameter server" model
- Essentially a distributed key-value pair

- Negatives
- Parameter server compute model very limiting

- Most ML systems use a "parameter server" model
- Data Parallel ML
- Each compute server runs same computation on different data
- Global state updated via aggregation

#### Want to scale out to speed up learning?

- scale out ineffective in data parallel param server
- no easy way to add machines and have a graph execute faster

- Only easy way to scale out is to add compute servers

### Take-Home Point

- Current ML systems are
**easily**applicable only to- Relatively small model problems
- That is run on a single machine

## Detecting Database File Tampering through Page Carving

- Attack Vector: File Tampering
- Occurs at the OS level → outside DBMS control
- Bypass DBMS control

- Occurs at the OS level → outside DBMS control
- Page Deconstruction
- Page Header
- Checksum
- PageID
- Row Count

- Page Header
- DBStorageAuditor
- Goal: find inconsistency in storage
- which is created by direct file manipulation

- Goal: find inconsistency in storage

## Extracting Statistical Graph Features for Accurate and Efficient Time Series Classification

- Time series: Any data that is ordered
- Time Series Classification
- similarity-based kNN (e.g., kNN-ED, kNN-DTW)
- similarity can be unreliable

- Shaplets
- high computation complexity

- similarity-based kNN (e.g., kNN-ED, kNN-DTW)
- Why multiscale
- sometimes global features are more important while sometimes local features are more important
- in this research, both global and local are considered

- Visibility Graphs
- Multiscale Visibility Graphs

###### Comments

Do you like this article? What do your tink about it? Leave you comment below