Machine Learning

Which one to use – RandomForest vs SVM vs KNN?

The choice of algorithm depends on a number of factors. A few to consider: the number of examples in the training set, the dimensionality of the feature space, whether the features are correlated, and whether overfitting is a problem. These are just a few of the factors on which the selection of the algorithm may depend. Once you have answers to all these questions, you can move ahead and decide on the algorithm. SVM: The main reason to use an SVM instead is that the problem might not be linearly separable. In that case, we will…
Clustering & Visualization of Clusters using PCA

Customer segmentation based on credit card usage behavior. The dataset for this notebook consists of the credit card usage behavior of customers, with 18 behavioral features. Segmentation of customers can be used to define marketing strategies. Content of this kernel: data preprocessing, clustering using KMeans, interpretation of clusters, and visualization of clusters using PCA.
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g.…
What is the difference between artificial and convolutional neural networks?

A Convolutional Neural Network (ConvNet/CNN) is a deep learning algorithm that can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image, and differentiate one from the other. The pre-processing required by a ConvNet is much lower than that required by other classification algorithms. While in primitive methods filters are hand-engineered, with enough training ConvNets can learn these filters/characteristics. The architecture of a ConvNet is analogous to the connectivity pattern of neurons in the human brain and was inspired by the organization of the visual cortex. Individual neurons…
What is the VGG 19 neural network?

VGG 19 is a convolutional neural network architecture that is 19 layers deep. The VGG net was designed mainly to win the ILSVRC ImageNet competition. Let’s take a brief look at the architecture of VGG19. Input: VGG-19 takes an input image of size 224×224. Convolutional Layers: VGG’s convolutional layers use a minimal receptive field, i.e., 3×3, the smallest possible size that still captures up/down and left/right. Each convolution is followed by a ReLU activation function. ReLU stands for rectified linear unit; it is a piecewise linear function that outputs the input if it is positive and zero otherwise. Stride is…
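The ReLU behaviour described in the excerpt can be sketched in a few lines of plain Python. This is only an illustration of the function itself, not VGG-19 code; the names `relu` and `relu_layer` and the sample values are made up for this example.

```python
def relu(x: float) -> float:
    """Rectified linear unit: pass positive inputs through, clamp the rest to zero."""
    return x if x > 0 else 0.0


def relu_layer(values):
    """Apply ReLU element-wise, as an activation layer would after a convolution."""
    return [relu(v) for v in values]


# Hypothetical pre-activation values (not taken from a real network).
activations = relu_layer([-2.0, -0.5, 0.0, 1.5, 3.0])
print(activations)
```

Negative inputs and zero map to zero, while positive inputs are returned unchanged, which is what makes the function piecewise linear.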
What is clustering in machine learning?

Clustering is one of the most popular techniques in unsupervised learning, where data is grouped based on the similarity of the data points. The basic principle behind clustering is the assignment of a given set of observations into subgroups, or clusters, such that observations in the same cluster have a degree of similarity. It is a method of unsupervised learning since there is no label attached to the data points. The machine has to learn the features and patterns all by itself, without any given input-output mapping. There are several clustering algorithms in machine learning. Some common ones are: Centroid-based clustering: The first and foremost…
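As a rough illustration of centroid-based clustering, here is a minimal k-means sketch in pure Python. The excerpt does not say which algorithm or data the post uses, so the choice of k-means, the function name `kmeans`, the sample points, and the starting centroids are all assumptions made for this example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on tuples of coordinates.

    `centroids` holds the starting positions; `max_iter` must be at least 1.
    """
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for members, old in zip(clusters, centroids):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D observations forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the grouping comes only from distances between points, which is the unsupervised aspect the excerpt describes. Fixed starting centroids are used here to keep the run deterministic; real implementations usually pick them randomly.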
What is the root mean square error?

Root mean square error (RMSE) is one of the most commonly used loss functions for regression problems. One way to assess how well a regression model fits a dataset is to calculate the RMSE, which is the standard deviation of the residuals. Residuals measure how far data points are from the regression line. They are simply prediction errors, found by subtracting the predicted value from the actual value. RMSE can be defined mathematically as RMSE = sqrt((1/n) Σ (y_i − ŷ_i)²), where y_i is the actual value, ŷ_i is the predicted value, and n is the number of observations. The lower the RMSE, the better a given model is able to “fit” a dataset. RMSE is a measure…
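The calculation described above (take the residuals, square them, average, then take the square root) can be sketched as follows. The function name `rmse` and the sample values are illustrative only and are not taken from the original post.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residual = actual value minus predicted value, squared so signs do not cancel.
    squared_residuals = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_residuals) / len(actual))


# Hypothetical actual and predicted values for four observations.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]
print(rmse(actual, predicted))
```

Here the residuals are 1, 0, -1 and -2, so the mean squared residual is 1.5 and the RMSE is its square root. A perfect model would give an RMSE of zero.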
What is Digital image processing in simple terms?

Before we dive deeper into digital image processing, we need to understand what an image actually is. A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels. Why do we process images? If you want to make a cup of tea, you need to follow some processing steps. In the same way, suppose you have pictorial data in the form of images or video generated by a device or camera sensor, and you want to make something else from it as per your requirements, for example to beautify, compress, crop, sharpen, enlarge,…
Stationarity Analysis in Time Series Data

Hey Geeks!!! In this blog, we'll dive into the concept of stationarity using time series data. We'll first understand what time-series data is, what stationarity is, why and when data should be stationary, etc. We'll use the dataset I created specifically for this blog to analyze whether the data is stationary or not. We'll also see how to convert non-stationary data to stationary. Index: Introduction; Import Libraries and Dependencies; Define TimeSeriesData Class; Import Dataset; Accumulating Number of Sales by Month; Create Object; Stationarity Tests; Graphical Test; Rolling-Statistics Test; Augmented Dickey-Fuller Test (ADF); Kwiatkowski-Phillips-Schmidt-Shin Test (KPSS); Zivot-Andrews Test; Conclusion; Convert Data to Stationary; Derivatives; Transformation using Logarithmic Function; ADF Test; KPSS Test; Zivot-Andrews Test; Rolling-Statistics Test; Conclusion. 1. Introduction 1.1…
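To give a flavour of the rolling-statistics test listed in the index, here is a small sketch using only the standard library. It is not the post's `TimeSeriesData` class; the function `rolling_stats`, the window size, and the sales figures are assumptions made for illustration. The idea is that a stationary series should show a roughly constant rolling mean and rolling standard deviation.

```python
from statistics import mean, pstdev


def rolling_stats(series, window):
    """Return (mean, standard deviation) for each full sliding window over the series."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    stats = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        stats.append((mean(chunk), pstdev(chunk)))
    return stats


# Hypothetical monthly sales with a steady upward trend.
sales = [10, 12, 14, 16, 18, 20, 22, 24]
for rolling_mean, rolling_std in rolling_stats(sales, window=4):
    print(rolling_mean, round(rolling_std, 3))
```

In this made-up series the rolling mean climbs with every window even though the rolling standard deviation stays constant, which hints at a trend and therefore at non-stationarity. Formal tests such as ADF or KPSS would be needed to confirm it.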
How to Create a Movie Recommendation System

In this notebook, I will try a few recommendation algorithms (content-based, popularity-based and collaborative filtering) and build an ensemble of these models to come up with our final movie recommendation system. We have two MovieLens datasets. Full Dataset: Contains 26,000,000 ratings and 750,000 tag applications applied to 45,000 movies by 270,000 users. Includes tag genome data with 12 million relevance scores across 1,100 tags. Small Dataset: Includes 100,000 ratings and 1,300 tag applications applied to 9,000 movies by 700 users. I will create a Simple Recommender using movies from the Full Dataset while…
How to Predict Whether a Movie Will Be a Flop or a Hit, and Its Revenue?

The invention of the motion picture camera in the late 19th century gave birth to the most powerful form of entertainment available: cinema. Movies have entertained audiences from the emergence of a single second of horse racing in the 1890s, to the introduction of sound in the 1920s, to the birth of color in the 1930s, to the rise of 3D movies in the early 2010s. Cinema had a humble background in terms of design, direction and acting (especially due to the very short length of early films), but since then the film industry around the world has been…