Results
Practical Exercises Results - 9.12.2020
 
Lecture Materials
Slides and files connected to the lectures will be available on MS Teams, in the group related to the lecture:
MS Teams link
  
        
          - What is data mining - a map
- Supervised, unsupervised and semi-supervised learning
- Inference vs. prediction
- Statistics vs. machine learning
		  - Regression vs. classification problems
- Building a model: training, validation and test data
- Model flexibility, overfitting, bias/variance decomposition
- Optimal (Bayes) model, naive Bayes model
- K-nearest neighbours model
- Curse of dimensionality
- Parametric vs. non-parametric models
          - Linear regression, model assumptions, scatter plots
- Point and interval estimation of model parameters
- Hypothesis testing, p-value, T-statistic, F-statistic
- R-squared, adjusted R-squared
- Collinearity, interaction terms
		  - Dealing with data
- Categorical vs. continuous variables
- One-hot encoding
- Outliers and high-leverage points
- Resampling methods: cross-validation and bootstrap
- Data representation, creating a data pipeline
		  - Classification algorithms overview
- Binomial and multinomial logistic regression
- Decision Trees + entropy and Gini index
- Random Forests + model ensembling technique
- No free lunch theorem + decision boundaries visualization
- Loss function: cross-entropy, Kullback–Leibler divergence
- Model hyperparameters and grid-search
		  - Neural networks and deep learning
- Automatic feature extraction
- Stochastic gradient descent
- Backpropagation algorithm
- Standard layers and activation functions
- Multi-input and multi-output networks
			- Convolutional neural networks
- Local connectivity and parameter sharing
- Convolution and MaxPooling layers
- Constraints on kernel size, strides and padding
- 1x1 convolution, Inception model and its evolution
			- Regularization techniques
- L1 and L2 regularization
- Dropout, data augmentation
- Transfer learning and fine-tuning
          - Unsupervised learning
- Clustering, anomaly detection
- K-means and hierarchical clustering
- Visualization techniques for multidimensional data