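I used to dump a lot of messy in-class notes on this blog, which made it look cluttered. From now on I will probably keep most notes on the lecture slides and post only course information and summaries here, to keep the blog's quality up.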
CMSC5741 - Big Data Technology and Applications
Textbook: Mining of Massive Datasets (borrowed)
Instructor: Prof. Michael R. Lyu
Tutor: Zeng Jichuan
Exam: Midterm on Nov. 6
Assessment Scheme and Deadlines:
- 20% Assignment
- 40% Midterm examination
- 40% Project : Proposal, Presentation, Report
Background Knowledge: TensorFlow, Amazon EC2
CSCI5570 - Large Scale Data Processing Systems
website Account:
Username : csci5570
Password : huskydatalab
Instructor: Prof. James CHENG
Tutor: Tatiana Jin
Lecture/Lab:
- Tuesday 13:30 Lecture && Lab
- Wednesday 14:30 Lecture
Assessment Criteria:
- 30% Survey paper: select one topic (DDL: Dec 10, 2018)
- 70% Project: deadline Dec 20
CMSC5724 - Data Mining and Knowledge Discovery
Instructor: Yufei Tao
Tutor: Shangqi Lu
Assessment Criteria:
- 30% Project
- 30% Short Tests (three times in class)
- 40% Final (Open-book)
CMSC 5720 - Project I
Instructor: Prof. James CHENG
Options:
- NN-Descent (an approximate algorithm for k-NN graph construction)
- Search with Faiss (Facebook's similarity search library)
- Multi-probe with tree (a tree-based hashing method)
- LSH for MIPS (locality-sensitive hashing)
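- Data collection -> storage -> analysis system
- Topic modeling on PS architecture -> (LDA, FlexPS -> parameter server extension)
- Scheduling algorithms: synchronous/asynchronous tasks; test the algorithms on different clusters; distributed; task performance
- Matrix factorization: Distributed MF on Actor Framework (nomad, lftf acf, or akka -> CPU to GPU to scheduling)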
- Clustering-aware query (database query optimizer)