
Information Management, Master's Program, Year 1, Class A

Data Mining, Spring 2006, Thu. 9:10 am-12:00 noon (B320)

Instructor
Jinn-Yi Yeh, Ph.D. (葉進儀), Room: A817, Tel: (05)273-2899, Fax: (05)273-2893, Email: jyeh@mail.ncyu.edu.tw, Office Hours: Tuesday and Wednesday, 10:00 am-12:00 noon.

Required Text
Witten, I.H. and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd Ed., Morgan Kaufmann, 2005.

Supplementary Texts
1. Han, J. and M. Kamber, Data Mining: Concepts and Techniques, 2001, Morgan Kaufmann, NY.
2. Kantardzic, M., Data Mining: Concepts, Models, Methods, and Algorithms, 2002, Wiley-Interscience, NJ.
3. 丁一賢 and 陳牧言, 資料探勘 (Data Mining), 滄海書局, 2005.
4. Related research papers

Description
Data Mining studies algorithms and computational paradigms that allow computers to find patterns and regularities in databases, perform prediction and forecasting, and generally improve their performance through interaction with data. It is currently regarded as the key element of a more general process called Knowledge Discovery, which deals with extracting useful knowledge from raw data. The knowledge discovery process includes data selection, cleaning, and coding; the application of statistical, pattern recognition, and machine learning techniques; and reporting and visualization of the generated structures. The course will cover all of these issues and illustrate the whole process with examples of practical applications. Students will use recent Data Mining software.

Objectives
- To introduce students to the basic concepts and techniques of Data Mining.
- To develop skills in using recent data mining software for solving practical problems.
- To gain experience in doing independent study and research.

Prerequisites
Students must have basic knowledge of databases, algebra, discrete mathematics, and statistics.

Assignments and projects
There will be some projects requiring independent study and practical work with a data mining system (e.g. Weka or DBMiner) for solving data mining tasks, as well as 2 quizzes.

Tentative Schedule

Week  Topic                                               Reading Assignment                        Remark
1     Course Organization; Introduction                   Witten & Frank - Chapter 1
2     Data Warehouse and OLAP                             Han & Kamber - Chapter 2
3     Input: Concepts, instances, and attributes          Witten & Frank - Chapter 2
4     Output: Knowledge representation                    Witten & Frank - Chapter 3
5     Quiz 1
6     Data mining algorithms: Classification              Witten & Frank - Sections 4.1, 4.3, 4.4
7     Data mining algorithms: Association rules           Witten & Frank - Section 4.5
8     Data mining algorithms: Prediction                  Witten & Frank - Sections 4.2, 4.6, 4.7
9     Midterm Exam
10    Paper presentation
11    Introduction to Weka                                Witten & Frank - Chapters 9-15
12    Credibility: Evaluating what's been learned         Witten & Frank - Chapter 5
13    Implementations: Real machine learning schemes      Witten & Frank - Chapter 6
14    Quiz 2
15    Transformations: Engineering the input and output   Witten & Frank - Chapter 7
16    Moving on: Extensions and applications              Witten & Frank - Chapter 8
17    Paper presentation
18    Final Exam

Paper presentation
“Reading”: you will search the literature, critically review a recent paper on an appropriate topic, and then give an oral presentation (to be announced).

Course evaluation
- 40% quizzes and lab assignments
- 20% midterm exam
- 20% paper presentation
- 20% final exam (comprehensive)

Related research papers

Marketing

Mining Customer Value: From Association Rules to Direct Marketing
Authors: Wang, Ke; Zhou, Senqiang; Yang, Qiang, et al.
Source: Data Mining and Knowledge Discovery 11, no. 1 (2005): 57-79

Creating profitable customers through the magic of data mining
Authors: Ryals, Lynette
Source: Journal of Targeting, Measurement and Analysis for Marketing 11, no. 4 (2003): 343-349

Mining data to discover customer segments
Authors: Kelly
Source: Interactive Marketing 4, no. 3 (2003): 235-242

Web mining

Web outlier mining: Discovering outliers from web datasets
Authors: Agyemang, Malik; Barker, Ken; Alhajj, Reda
Source: Intelligent Data Analysis 9, no. 5 (2005): 473-486 (14 pages)

Mining Web Log Sequential Patterns with Position Coded Pre-Order Linked WAP-Tree
Authors: Ezeife, C.I.; Lu, Yi
Source: Data Mining and Knowledge Discovery 10, no. 1 (2005): 5-38

KM

Data Mining for the Management of Software Development Process
Authors: Álvarez-Macías, J. L.; Mata-Vázquez, J.; Riquelme-Santos, J. C.
Source: International Journal of Software Engineering and Knowledge Engineering 14, no. 06 (2004): 665-695

Data mining for shopping centres - customer knowledge-management framework
Authors: Dennis, Charles; Marsland, David; Cockett, Tony
Source: Journal of Knowledge Management 5, no. 4 (2001): 368-374

Bioinformatics

Common denominator procedure: a novel approach to gene-expression data mining for identification of phenotype-specific genes
Authors: Korn, René; Röhrig, Sascha; Schulze-Kremer, Steffen, et al.
Source: Bioinformatics 21, no. 11 (2005): 2766-2772

Microarray data mining with visual programming
Authors: Curk, Tomaz; Demsar, Janez; Xu, Qikai, et al.
Source: Bioinformatics 21, no. 3 (2005): 396-398

Mining SARS-CoV protease cleavage data using non-orthogonal decision trees: a novel method for decisive template selection
Authors: Yang, Zheng Rong
Source: Bioinformatics 21, no. 11 (2005): 2644-2650

Mining HIV protease cleavage data using genetic programming with a sum-product function
Authors: Yang, Zheng Rong; Dalby, Andrew R.; Qiu, Jing
Source: Bioinformatics 20, no. 18 (2004): 3398-3405

Finance

An Architecture for Advanced Services in Cyberspace Through Data Mining: A Framework With Case Studies in Finance and Engineering
Authors: Kim, Steven H.
Source: Journal of Organizational Computing and Electronic Commerce 10, no. 4 (2000): 257-270

Data mining in finance: using counterfactuals to generate knowledge from organizational information systems
Authors: Dhar, V.
Source: Information Systems 23, no. 7 (1998): 423

CRM

Data Mining: The Intelligence Behind CRM - If you want to understand your customers, first you have to understand your data.
Authors: Linoff, Gordon
Source: Inform 13, no. 9 (1999): 18 (8 pages)

Data Mining Application in Customer Relationship Management (CRM)
Authors: Bandosz, M.
Source: Prace naukowe Akademii Ekonomicznej imienia Oskara Langego we Wroclawiu, no. 1064 (2005): 23-30

Data Mining Oriented CRM Systems Based on MUSASHI: C-MUSASHI
Authors: Yada, K.; Hamuro, Y.; Katoh, N., et al.
Source: Lecture Notes in Computer Science, no. 3430 (2005): 152-173

Image

Advantages of Unbiased Support Vector Classifiers for Data Mining Applications
Authors: Navia-Vázquez, A.; Pérez-Cruz, F.; Artés-Rodríguez, A., et al.
Source: The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology 37, no. 2-3 (2004): 223-235 (13 pages)

Discriminatory Mining of Gene Expression Microarray Data
Authors: Wang, Zuyi; Wang, Yue; Lu, Jianping, et al.
Source: The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology 35, no. 3 (2003): 255-272 (18 pages)

Direct Manipulation Graphics for Data Mining
Authors: Unwin, Antony R.; Hofmann, Heike; Wilhelm, Adalbert F. X.
Source: International Journal of Image and Graphics 2, no. 1 (2002): 49-65

Data Mining - Using One-Class and Two-Class SVMs for Multiclass Image Annotation
Authors: Goh, K-S; Chang, E Y; Li, B
Source: IEEE Transactions on Knowledge and Data Engineering 17, no. 10 (2005): 1333 (14 pages)

Image mining for investigative pathology using optimized feature extraction and data fusion
Authors: Chen, Wenjin; Meer, Peter; Georgescu, Bogdan, et al.
Source: Computer Methods and Programs in Biomedicine 79, no. 1 (2005): 59 (14 pages)

Algorithm

Efficient Genetic Algorithm Based Data Mining Using Feature Selection with Hausdorff Distance
Authors: Sikora, Riyaz; Piramuthu, Selwyn
Source: Information Technology and Management 6, no. 4 (2005): 315-331

A Fuzzy Data Mining Algorithm for Finding Sequential Patterns
Authors: Hu, Yi-Chung; Chen, Ruey-Shun; Tzeng, Gwo-Hshiung, et al.
Source: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 11, no. 02 (2003): 173-193

Set covering submodular maximization: An optimal algorithm for data mining in bioinformatics and medical informatics
Authors: Genkin, Alexander; Kulikowski, Casimir A.; Muchnik, Ilya
Source: Journal of Intelligent & Fuzzy Systems 12, no. 1 (2002): 5-17 (13 pages)

