AM605PC

Data Mining Lab

Syllabus Overview

List of Experiments

Experiment 1: Data Processing Techniques

  • a) Data Cleaning
  • b) Data Transformation – Normalization
  • c) Data Integration
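
As a starting point for the cleaning and normalization parts of this experiment, the sketch below fills missing values with the mean and then applies min-max and z-score normalization. It is a minimal illustration using only the standard library; the function names and the sample `ages` list are made up for this example and are not taken from the lab manual.

```python
from statistics import mean, pstdev


def fill_missing_with_mean(values):
    """Data cleaning: replace None entries with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [new_min] * len(values)
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


def z_score_normalize(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    ages = [20, None, 40, 30]  # made-up sample column with one missing value
    cleaned = fill_missing_with_mean(ages)
    print(cleaned)
    print(min_max_normalize(cleaned))
    print(z_score_normalize(cleaned))
```

Mean imputation is only one possible cleaning strategy; median imputation or dropping incomplete rows may suit other datasets better.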

Experiment 2: Data Partitioning Techniques

  • Implement Horizontal Partitioning
  • Implement Vertical Partitioning
  • Implement Round Robin Partitioning
  • Implement Hash-based Partitioning
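
The four partitioning schemes listed above can be sketched on a list of dictionaries, one dictionary per row. This is an illustrative sketch only: the function names are hypothetical, and `zlib.crc32` is used as the hash function simply because it gives the same result on every run (Python's built-in `hash` is randomized for strings).

```python
import zlib


def horizontal_partition(rows, predicate):
    """Split rows into (matching, non_matching) using a row-level predicate."""
    matching = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return matching, rest


def vertical_partition(rows, key, column_groups):
    """Split columns into fragments; each fragment keeps the key column
    so the original rows can be rebuilt by joining on it."""
    return [[{c: r[c] for c in [key, *cols]} for r in rows] for cols in column_groups]


def round_robin_partition(rows, n):
    """Assign row i to partition i mod n."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, key, n):
    """Assign each row to partition crc32(key value) mod n."""
    parts = [[] for _ in range(n)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        parts[digest % n].append(row)
    return parts
```

Round robin spreads rows evenly regardless of content, while hash partitioning guarantees that rows sharing a key value land in the same partition.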

Experiment 3: Data Warehouse Schemas

  • Design Star Schema
  • Design Snowflake Schema
  • Design Fact Constellation Schema

Experiment 4: Data Cube and OLAP Operations

  • Construct Data Cube
  • Perform OLAP Operations: Roll-up, Drill-down, Slice, Dice, Pivot
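
The OLAP operations named above can be tried out on a tiny in-memory fact table before moving to a real warehouse. The sketch below is one possible illustration: the `SALES` rows are made-up data, and the helper names (`aggregate`, `slice_cube`, `dice`, `pivot`) are hypothetical. Roll-up and drill-down are both expressed as aggregation, over fewer or more dimensions respectively.

```python
from collections import defaultdict

# Made-up fact table: one dict per sale, with four dimensions and one measure.
SALES = [
    {"year": 2023, "quarter": "Q1", "region": "North", "product": "Pen", "amount": 100},
    {"year": 2023, "quarter": "Q2", "region": "North", "product": "Book", "amount": 150},
    {"year": 2023, "quarter": "Q1", "region": "South", "product": "Pen", "amount": 80},
    {"year": 2024, "quarter": "Q1", "region": "North", "product": "Pen", "amount": 120},
    {"year": 2024, "quarter": "Q2", "region": "South", "product": "Book", "amount": 200},
]


def aggregate(facts, dims, measure="amount"):
    """Sum the measure for every combination of the given dimensions.
    Fewer/coarser dims = roll-up; more/finer dims = drill-down."""
    totals = defaultdict(int)
    for f in facts:
        totals[tuple(f[d] for d in dims)] += f[measure]
    return dict(totals)


def slice_cube(facts, dim, value):
    """Slice: fix a single dimension to one value."""
    return [f for f in facts if f[dim] == value]


def dice(facts, **allowed):
    """Dice: keep facts whose value lies in the allowed set for each named dimension."""
    return [f for f in facts if all(f[d] in vals for d, vals in allowed.items())]


def pivot(facts, row_dim, col_dim, measure="amount"):
    """Pivot: cross-tabulate the measure with one dimension as rows, another as columns."""
    table = defaultdict(lambda: defaultdict(int))
    for f in facts:
        table[f[row_dim]][f[col_dim]] += f[measure]
    return {r: dict(cols) for r, cols in table.items()}


if __name__ == "__main__":
    print(aggregate(SALES, ["year"]))             # roll-up to year level
    print(aggregate(SALES, ["year", "quarter"]))  # drill-down to quarter level
    print(slice_cube(SALES, "region", "North"))
    print(dice(SALES, year={2023}, product={"Pen"}))
    print(pivot(SALES, "region", "year"))
```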

Experiment 5: ETL Operations

  • Perform Data Extraction
  • Apply Transformations
  • Load Processed Data into Target Warehouse (using Pentaho/Python)

Experiment 6: Attribute-Oriented Induction

  • Implement Attribute-Oriented Induction Algorithm for Concept Description

Experiment 7: Apriori Algorithm

  • Implement Apriori Algorithm for Frequent Itemset Mining
  • Generate Association Rules from Frequent Itemsets
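
A compact way to approach both tasks is sketched below: a level-wise Apriori search (join, prune, count) followed by rule generation from the frequent itemsets. It is a minimal sketch that assumes support is given as a fraction of transactions; the function names and the example baskets are made up for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1 candidates: every single item.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count step: keep candidates that meet the support threshold.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[lhs]  # lhs is frequent by downward closure
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


if __name__ == "__main__":
    baskets = [  # made-up transactions
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    freq = apriori(baskets, min_support=0.5)
    for lhs, rhs, conf in association_rules(freq, min_confidence=0.9):
        print(set(lhs), "->", set(rhs), round(conf, 2))
```

The assignment expression (`:=`) requires Python 3.8 or later.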

Experiment 8: FP-Growth Algorithm

  • Construct FP-Tree
  • Mine Frequent Patterns using FP-Growth Algorithm

Experiment 9: Decision Tree Induction

  • Implement Decision Tree Algorithm (e.g., ID3 or C4.5)
  • Visualize the Generated Tree

Experiment 10: Information Gain Calculation

  • Calculate Entropy and Information Gain for Attributes
  • Use Gain to Select Root Node in Decision Tree
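
The two steps above follow the standard definitions: entropy H(S) = -Σ p_i log2 p_i over the class proportions, and Gain(S, A) = H(S) minus the weighted entropy of the subsets produced by splitting on attribute A. The sketch below is one way to compute them; the function names and the four-row toy dataset are made up for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = defaultdict(list)
    for r in rows:
        groups[r[attribute]].append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_root(rows, attributes, target):
    """Pick the attribute with the highest information gain as the root node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


if __name__ == "__main__":
    data = [  # made-up toy dataset
        {"windy": "yes", "humidity": "high", "play": "no"},
        {"windy": "yes", "humidity": "low", "play": "no"},
        {"windy": "no", "humidity": "high", "play": "yes"},
        {"windy": "no", "humidity": "low", "play": "yes"},
    ]
    for attr in ("windy", "humidity"):
        print(attr, information_gain(data, attr, "play"))
    print("root:", best_root(data, ["windy", "humidity"], "play"))
```

In this toy data `windy` separates the classes perfectly, so it carries all of the available gain and is chosen as the root.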

Experiment 11: Naive Bayes Classification

  • Implement Naive Bayes Classifier
  • Classify Test Instances using Probability Estimation
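
For categorical attributes, a Naive Bayes classifier can be sketched as frequency counting plus a product of conditional probabilities. The version below is a minimal illustration, not a prescribed implementation: it assumes categorical features, uses Laplace (add-one) smoothing, and sums log-probabilities to avoid underflow. The function names and weather-style rows are made up.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(y, i)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values


def predict(model, row):
    """Return the class with the highest posterior log-probability."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_class, best_score = None, float("-inf")
    for y, count_y in class_counts.items():
        score = log(count_y / total)  # log prior
        for i, v in enumerate(row):
            # Laplace smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[(y, i)][v] + 1
            denominator = count_y + len(values[i])
            score += log(numerator / denominator)
        if score > best_score:
            best_class, best_score = y, score
    return best_class


if __name__ == "__main__":
    X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]  # made-up
    y = ["no", "no", "yes", "yes"]
    model = train_naive_bayes(X, y)
    print(predict(model, ("rainy", "cool")))
    print(predict(model, ("sunny", "hot")))
```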

Experiment 12: K-Nearest Neighbour (K-NN)

  • Implement K-NN Algorithm for Classification
  • Experiment with Different Distance Metrics (Euclidean, Manhattan)
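
A brute-force K-NN classifier with a pluggable distance function is enough to compare the two metrics. The sketch below assumes numeric feature tuples and majority voting; the function names and the sample points are made up for illustration.

```python
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """City-block (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """train is a list of (point, label) pairs; returns the majority label
    among the k training points closest to query."""
    neighbours = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [  # made-up labelled points
        ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
        ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
    ]
    for metric in (euclidean, manhattan):
        print(metric.__name__, knn_predict(train, (0.5, 0.5), k=3, distance=metric))
```

Choosing an odd `k` for two-class problems avoids tied votes; with ties, this sketch returns whichever tied label was encountered first among the neighbours.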

Experiment 13: K-Means Clustering

  • Implement K-Means Algorithm
  • Visualize Clusters and Analyze Convergence
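
One way to study convergence is to implement Lloyd's iteration directly and count how many passes it takes before the centroids stop moving. The sketch below assumes the initial centroids are supplied by the caller, which keeps runs reproducible; the function names and sample points are made up, and plotting is left out.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, clusters, iterations_run)."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, clusters, iteration


if __name__ == "__main__":
    pts = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]  # made-up points
    centroids, clusters, iters = k_means(pts, [(1, 1), (8, 8)])
    print(centroids, iters)
```

Trying different initial centroids on the same data shows how the starting point affects both the final clusters and the number of iterations.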

Experiment 14: BIRCH Clustering

  • Implement BIRCH Algorithm for Large Dataset Clustering
  • Understand CF Tree Construction

Experiment 15: PAM (Partitioning Around Medoids)

  • Implement PAM Algorithm (K-Medoids)
  • Compare with K-Means in Terms of Robustness to Outliers

Experiment 16: DBSCAN Clustering

  • Implement DBSCAN Algorithm
  • Identify Core, Border, and Noise Points
  • Analyze Clustering with Different Epsilon and MinPts Values
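
The three tasks above can be explored with a small brute-force DBSCAN that also reports whether each point is core, border, or noise. This is a sketch under stated assumptions: a point's epsilon-neighbourhood includes the point itself, distances are Euclidean, and neighbours are found by an O(n²) scan. The function name and sample points are made up.

```python
def dbscan(points, eps, min_pts):
    """Return (labels, kinds): labels[i] is a cluster id or -1 for noise,
    kinds[i] is 'core', 'border', or 'noise'."""
    n = len(points)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Epsilon-neighbourhood of every point (includes the point itself).
    neighbours = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbours]

    labels = [None] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        # Grow a new cluster outward from an unvisited core point.
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            p = frontier.pop()
            for q in neighbours[p]:
                if labels[q] is None:
                    labels[q] = cluster_id
                    if core[q]:  # only core points keep expanding the cluster
                        frontier.append(q)
        cluster_id += 1

    labels = [-1 if label is None else label for label in labels]
    kinds = ["core" if core[i] else ("noise" if labels[i] == -1 else "border")
             for i in range(n)]
    return labels, kinds


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]  # made-up points
    for eps in (0.5, 1.1):
        print(eps, dbscan(pts, eps=eps, min_pts=3))
```

With `eps=1.1` the four corner points form one cluster, `(2, 1)` attaches to it as a border point, and `(10, 10)` is noise; with `eps=0.5` no point has enough neighbours, so everything is noise.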