Data Management Class Home Page (データマネージメント特論, Advanced Topics in Data Management)

Contact e-mail address:
morimo@hiroshima-u.ac.jp
If you have a question about the lecture, you can send it to this e-mail address.

When you email me, use the following format in the subject field
(so that your mail can be distinguished from spam):
[08F-FR-2] <Student ID number> <subject>
(e.g.: [08F-FR-2] M08xxxx Association Rule)

Topics

・Database Essentials
 (relational algebra, SQL, transaction management, relational model, etc.)
・Knowledge Discovery and Data Mining
 (association rule, decision tree, etc.)
・Information Retrieval
 (vector space model, latent semantic indexing, search engines)
・E-Commerce and Business Intelligence
 (Recommendation Systems)


Lecture Slides (for the first half of the class; a password is required)

10/03 Guidance print
10/10 Database Essentials (1) print
Relational data model, Relational Algebra, and SQL query
10/17 Database Essentials (2) print
DDL and Integrity Constraints
10/24 Database Essentials (3) print
Transaction Management, Concurrency Control
10/31 Data Warehouse print
Online Analytical Processing (OLAP)
11/07 Association Rule Mining print
Apriori algorithm and its fundamental features
11/14 Numeric Association Rule (1) print
Optimized Confidence Rule
11/21 Numeric Association Rule (2) print
Optimized Support Rule
11/28 Prediction print
Entropy and Discriminant Rules on Numerical Attributes
12/05 Classification Rules print
Discriminant Rules on Categorical Attributes
12/12 Classification Tree (aka Decision Tree) print
Prediction Model for Classification Problem
12/19 Regression Problem print
Regression Rules and Regression Tree
01/09 Range Rules on Numerical Attribute print
Touching Oracle of Range, Max-subarray algorithm
01/16 Region Rules print
Branch and Bound Search on Convex Hull
01/23 Region Rules print
Touching Oracle of X-monotone Region

Note: the schedule is subject to change depending on the progress of the class.

References



Homework
(Programming Exercise, Paper Reading, DBMS exercise)

 Mission I
  Compute Association Rules from "sample08F.csv"
  (See the print for the 11/07 class for details.)

 Mission II
  Compute the optimized confidence rule from "bkt08F.csv".
  (See the print for the 11/21 class for details.)

 Mission III
  Compute prediction rules (classification rules and regression rules)
  (See the prints for the 12/12 and 12/19 classes.)

 Mission IV
  Compute Optimal Range Rule
  Compute X-monotone Region on a Matrix
  (See the print for the 01/23 class.)
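
As a hedged illustration for the range rule part of Mission IV, the max-subarray algorithm named in the 01/09 class can be sketched as follows. This is only a sketch under assumptions: the bucket counts, the threshold `theta`, and the gain definition (hits minus `theta` times support) are hypothetical and may differ from the objective defined in the class print.

```python
def best_range(gains):
    """Return (best_sum, start, end) of the contiguous run with the largest sum.

    This is Kadane's max-subarray algorithm; start and end are inclusive indices.
    """
    if not gains:
        raise ValueError("gains must be non-empty")
    best_sum = cur_sum = gains[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(gains)):
        if cur_sum < 0:
            # A negative running sum can only hurt, so restart the run here.
            cur_sum = gains[i]
            cur_start = i
        else:
            cur_sum += gains[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end


# Hypothetical buckets over a numeric attribute: (support count, hit count).
buckets = [(10, 2), (10, 7), (10, 8), (10, 3), (10, 6)]
theta = 0.5  # assumed confidence threshold

# Assumed gain of a bucket: hits - theta * support.
gains = [hits - theta * support for support, hits in buckets]

print(best_range(gains))  # (5.0, 1, 2): buckets 1..2 form the best range
```

The scan is linear in the number of buckets, so it stays practical even when the attribute is divided into many buckets.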

Necessary Data and Reference for Homework
・Sample POS data for computing association rules
  sample08F.csv
・Sample bucket data for numeric association rules
  bkt08F.csv
・Sample training data of member profiles
  cls08F.csv
  reg08F.csv
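
As a rough illustration of what Mission I asks for, the sketch below computes support and confidence for rules between single items. It makes several assumptions: the transactions are a small hypothetical in-memory list (the layout of sample08F.csv is not described here), the thresholds are arbitrary, and only item pairs are considered, so the Apriori pruning covered in class is not implemented.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return sorted rules (lhs, rhs, support, confidence) between single items."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # ignore duplicate items within a transaction
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical POS transactions, not taken from sample08F.csv.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["milk"],
    ["bread", "milk"],
    ["butter"],
]

rules = pair_rules(transactions, min_support=0.4, min_confidence=0.8)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} => {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

With these toy values only "bread => milk" passes both thresholds; the reverse rule has confidence 0.75 and is filtered out.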

Paper Assignments and Presentation Schedule
(for the latter part of the class; subject to change depending on the progress of each session)

TBD

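Date  Speaker  Questioner
Each speaker must prepare 10 copies of a handout and explain the paper in about 25 minutes.
The assigned questioner must offer at least two questions or comments on the presentation.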


Assignment Papers for Each Topic (200X)

Progress of Data Mining Techniques


Textbooks and Reference Books (mainly used in the first half of the class)

Raghu Ramakrishnan and Johannes Gehrke, "Database Management Systems," McGraw-Hill.
Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann.
Takeshi Fukuda, Yasuhiko Morimoto, and Takeshi Tokuyama, "Data Mining (データマイニング)," Data Science Series 3, Kyoritsu Shuppan.