Geeky Codes-Machine Learning

Data Science, Machine Learning, Unsupervised Learning

Gaussian Mixture Models in Machine Learning

March 4, 2024 | July 2, 2025

A Gaussian mixture model (GMM) is a probabilistic model that assumes that the instances were generated from a mixture of…

Data Science, Machine Learning, Unsupervised Learning

Introduction to Unsupervised Learning Techniques

March 2, 2024 | July 1, 2025

Introduction: Although most of the applications of Machine Learning today are based on supervised learning (and as a result, this…

Data Science, Machine Learning

Kernel PCA in Machine Learning

February 29, 2024 | July 2, 2025

The post discusses Kernel Principal Component Analysis (kPCA), highlighting its use for nonlinear dimensionality reduction and suggesting two ways to select a kernel and tune its hyperparameters: grid search, and minimizing the reconstruction pre-image error.

Data Science, Machine Learning, Python

Hierarchical Clustering in Machine Learning

February 28, 2024 | July 2, 2025

The content discusses the K-Means and Hierarchical Clustering algorithms. K-Means requires the number of clusters to be specified in advance and is sensitive to initial centroids and outliers. Hierarchical Clustering offers agglomerative and divisive approaches that need no preset number of clusters. The document also explores various linkage methods, dendrograms for visualization, and the validity of clusters over time.

Data Science, Machine Learning, Python

Choosing the Right Number of Dimensions in Dimensionality Reduction

February 28, 2024 | July 1, 2025

The content discusses dimensionality reduction using PCA, emphasizing the importance of preserving a significant portion of variance, typically 95%. It explains how to compute PCA, options for variance preservation, and the benefits of compression on datasets like MNIST. Additionally, it introduces Randomized PCA and Incremental PCA for efficiency in handling large datasets.

Data Science, Machine Learning, Python

Main Approaches for Dimensionality Reduction

February 28, 2024 | July 1, 2025

This content discusses dimensionality reduction approaches, focusing on projection and Manifold Learning. It explains how projection simplifies high-dimensional data, exemplified by datasets like the Swiss roll. Principal Component Analysis (PCA) is highlighted as a key algorithm for preserving variance while reducing dimensions, with SVD as a method for determining principal components.

Data Science, Machine Learning, Pandas

Introduction to Dimensionality Reduction

February 27, 2024 | July 1, 2025

The text discusses the curse of dimensionality in machine learning, highlighting challenges in high-dimensional spaces. It suggests reducing features to improve training efficiency and visualization, while addressing potential information loss and risks of overfitting with increased dimensions. Dimensionality reduction techniques will be explored further.

Data Science, Decision Tree, Deep Learning, Machine Learning, Random Forest

Gradient Boosting in Machine Learning

February 25, 2024 | July 1, 2025

Another very popular Boosting algorithm is Gradient Boosting. Just like AdaBoost, Gradient Boosting works by sequentially adding predictors to an ensemble,…

Data Science, Decision Tree, Machine Learning

Boosting (Ensemble) Trees | Machine Learning from Scratch

February 25, 2024 | July 1, 2025

Introduction: Boosting (originally called hypothesis boosting) refers to any Ensemble method that can combine several weak learners into a strong…

Data Science, Machine Learning, Python

Sending Data in Unstructured File Form

February 23, 2024 | July 1, 2025

Unstructured data files consist of a series of bits. The file doesn’t separate the bits from each other in any…

Data Science, Machine Learning, Random Forest

Random Forests | Machine Learning from Scratch

February 23, 2024 | July 1, 2025

As we have discussed, a Random Forest is an ensemble of Decision Trees, generally trained via the bagging method (or…

Data Science, Machine Learning, Python

Uploading, Streaming, and Sampling Data Using Python

Geeky Codes | February 22, 2024 | July 1, 2025

Introduction: Storing data in local computer memory represents the fastest and most reliable means to access it. The data could…

Data Science, Deep Learning, Machine Learning

What is Artificial Neural Network (ANN)?

Geeky Codes | February 20, 2024 | July 1, 2025

A. Introduction to neural networks B. ANN architectures C. Learning methods D. Learning rule on supervised learning E. Feedforward neural network…

Data Science, Decision Tree, Machine Learning, Random Forest

What is Bagging and Pasting? Machine Learning from Scratch

Geeky Codes | February 19, 2024 | July 1, 2025

Introduction: One way to get a diverse set of classifiers is to use very different training algorithms, as just discussed.…

Data Science, Decision Tree, Machine Learning

What is Ensemble Learning? | Machine Learning from Scratch

Geeky Codes | February 15, 2024 | July 1, 2025

Introduction: Welcome to our comprehensive tutorial on Ensemble Learning! In this guide, we’ll delve into the fascinating world of Ensemble…

Data Science, Decision Tree, Machine Learning

Decision Tree Regression | Machine Learning from Scratch

Geeky Codes | February 14, 2024 | July 1, 2025

Decision Trees are also capable of performing regression tasks. Let’s build a regression tree using Scikit-Learn’s DecisionTreeRegressor class, training it…

Data Science, Decision Tree, Machine Learning

Gini Impurity or Entropy? How to decide the root node in decision tree?

February 12, 2024 | July 1, 2025

By default, the Gini impurity measure is used, but you can select the entropy impurity measure instead by setting the…

Data Science, Machine Learning, Pandas, Python

Linear Regression from Scratch: A Step-by-Step Guide

February 10, 2024 | July 1, 2025

Introduction: Linear regression is one of the fundamental techniques in machine learning and statistics used for modeling the relationship between…

Data Science, Decision Tree, Machine Learning

Decision Trees | Machine Learning from Scratch

February 10, 2024 | July 1, 2025

Like SVMs, Decision Trees are versatile Machine Learning algorithms that can perform both classification and regression tasks, and even multioutput…

Data Science, Machine Learning, Support Vector Machine

SVM Regression | Machine Learning from Scratch

February 9, 2024 | July 1, 2025

Introduction: As we mentioned earlier, the SVM algorithm is quite versatile: not only does it support linear and nonlinear classification,…