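A lot of the messy notes I took in earlier courses got dumped on this blog, which made it look cluttered. From now on I will probably keep most of my notes on the lecture slides, and only post course information and some summaries here, to keep the quality of the blog up.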


CMSC5741 - Big Data Technology and Applications

Textbook: Mining of Massive Datasets (already borrowed)

Instructor: Prof. Michael R. Lyu

Tutor: Zeng Jichuan

Exam: Midterm exam on Nov. 6

Assessment Scheme and Deadlines:

  • 20% Assignment
  • 40% Midterm examination
  • 40% Project : Proposal, Presentation, Report

Background Knowledge: TensorFlow, Amazon EC2


CSCI5570 - Large Scale Data Processing Systems

website Account:

  • Username : csci5570

  • Password : huskydatalab

Instructor: Prof. James CHENG

Tutor: Tatiana Jin

Lecture/Lab:

  • Tuesday 13:30 Lecture and Lab
  • Wednesday 14:30 Lecture

Assessment Criteria:

  • 30% Survey paper: select one topic (deadline: Dec 10, 2018)
  • 70% Project (deadline: Dec 20)

CMSC5724 - Data Mining and Knowledge Discovery

Instructor: Yufei Tao

Tutor: Shangqi Lu

Assessment Criteria:

  • 30% Project
  • 30% Short Tests (three times in class)
  • 40% Final (Open-book)

CMSC 5720 - Project I

Instructor: Prof. James CHENG

Options:

  1. NN-Descent (an approximate algorithm for building k-NN graphs)
  2. Search with Faiss (Facebook's similarity search library)
  3. Multi-probe with trees (a tree-based hashing method)
  4. LSH for MIPS (locality-sensitive hashing)
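  5. Data collection -> storage -> analysis system
  6. Topic modeling on a parameter server (PS) architecture (LDA; FlexPS, an extension of the parameter server)
  7. Scheduling algorithms: synchronous/asynchronous tasks; test the algorithms on different clusters and compare the performance of the distributed tasks
  8. Matrix factorization: Distributed MF on an actor framework (NOMAD, lftf acf, or Akka -> CPU to GPU to scheduling)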
  9. Clustering-aware query (database query optimizer)