DataScience

How to do Feature Encoding and Exploratory Data Analysis

Categorical variables are those whose values are selected from a group of categories or labels. For example, the variable Gender, with the values male or female, is categorical, and so is the variable Marital status, with the values never married, married, divorced, or widowed. In some categorical variables, the labels have an intrinsic order. For example, in the variable Student's grade, the values A, B, C, and Fail are ordered, with A being the highest grade and Fail the lowest. These are called ordinal categorical variables. Variables in which the categories do not have an intrinsic order are…
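As a rough illustration of why the ordered/unordered distinction matters for encoding, here is a minimal sketch using only the standard library. The helper names and sample data are hypothetical, and the choice of one-hot encoding for the unordered variable is an assumption, not something the excerpt states.

```python
# Grade order follows the text: A is the highest grade, Fail the lowest.
GRADE_ORDER = ["Fail", "C", "B", "A"]

def ordinal_encode(values, ordered_categories):
    """Map each label to its rank in ordered_categories (0 = lowest)."""
    rank = {label: i for i, label in enumerate(ordered_categories)}
    return [rank[v] for v in values]

def one_hot_encode(values):
    """Return (categories, rows): one 0/1 column per distinct label, in sorted order."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Ordered labels keep their order as integers.
grades = ["A", "C", "Fail", "B"]
encoded_grades = ordinal_encode(grades, GRADE_ORDER)
print(encoded_grades)

# Unordered labels get one indicator column each, so no order is implied.
marital_status = ["married", "never married", "widowed", "married"]
status_categories, status_rows = one_hot_encode(marital_status)
print(status_categories)
print(status_rows)
```

Encoding an unordered variable with integers would suggest a ranking that does not exist, which is why the sketch treats the two cases differently.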
What are Bias and Variance in Machine Learning

As machine learning is increasingly used in applications, machine learning algorithms have come under greater scrutiny. With larger data sets, various implementations, algorithms, and learning requirements, it has become even more complex to create and evaluate ML models, since all those factors directly impact the overall accuracy and learning outcome of the model. This is further skewed by false assumptions, noise, and outliers. Machine learning models cannot be a black box. The user needs to be fully aware of their data and algorithms to trust the outputs and outcomes. Any issue in the algorithm or a polluted data set can negatively impact the ML model. The main…
Which one to use – RandomForest vs SVM vs KNN?

Deciding which algorithm to use depends on a number of factors. A few factors to look at are listed below:

- The number of examples in the training set.
- The dimensionality of the feature space.
- Do we have correlated features?
- Is overfitting a problem?

These are just a few factors on which the selection of the algorithm may depend. Once you have the answers to all these questions, you can move ahead to decide the algorithm.

SVM

The main reason to use an SVM instead is that the problem might not be linearly separable. In that case, we will…
Clustering & Visualization of Clusters using PCA

Customer Segmentation based on Credit Card usage behavior

The dataset for this notebook consists of the credit card usage behavior of customers, with 18 behavioral features. Segmentation of customers can be used to define marketing strategies.

Content of this Kernel:

- Data Preprocessing
- Clustering using KMeans
- Interpretation of Clusters
- Visualization of Clusters using PCA

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g.…
How do you count repeated words in a list in Python?

In this post, we will look at how to count repeated words in a Python list. It can be done in several ways.

Using collections.Counter()

# Import the Counter class.
from collections import Counter
words = ["a", "b", "a", "c", "c", "a", "c"]
duplicate_dict = Counter(words)
print(duplicate_dict)  # occurrences of each element
print(duplicate_dict['a'])  # occurrences of a specific element

Output:

Counter({'a': 3, 'c': 3, 'b': 1})
3

Using count()

letter = ["b", "a", "a", "c", "b", "a", "c", 'a']
counting = letter.count('a')
print(counting)

Output:

4

Hope this helps!
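Building on collections.Counter, a small sketch for keeping only the words that actually repeat might look like this. The function name repeated_words is hypothetical and not part of the original post.

```python
from collections import Counter

def repeated_words(words):
    """Return {word: count} for the words that appear more than once."""
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n > 1}

words = ["a", "b", "a", "c", "c", "a", "c"]
repeats = repeated_words(words)
print(repeats)
```

Counter makes a single pass over the list, whereas calling count() once per distinct word rescans the whole list each time, so the Counter approach tends to scale better on long lists.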
What is the difference between artificial and convolutional neural networks?

A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm that can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image, and be able to differentiate one from the other. The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training, ConvNets have the ability to learn these filters/characteristics. The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons…
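To make the idea of a filter concrete, here is a minimal pure-Python sketch of sliding a hand-written filter over a tiny image. The image, the filter values, and the function name are all hypothetical; a real ConvNet would learn the filter weights during training rather than have them written by hand.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the response map.

    This is cross-correlation, which is what deep learning libraries
    usually call convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-engineered vertical-edge filter: responds where brightness rises left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = convolve2d(image, vertical_edge)
print(response)
```

Every window in this small image overlaps the dark-to-bright boundary, so each response is strongly positive; on a uniformly colored patch the same filter returns zero.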
What is clustering in machine learning?

Clustering is one of the most popular techniques in unsupervised learning, where data is grouped based on the similarity of the data points. The basic principle behind clustering is the assignment of a given set of observations into subgroups or clusters such that observations in the same cluster have a degree of similarity. It is a method of unsupervised learning since there is no label attached to the data points. The machine has to learn the features and patterns all by itself, without any given input-output mapping. There are several types of clustering in machine learning. Some common clustering algorithms are: Centroid-based clustering: The first and foremost…
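As one illustration of centroid-based clustering, here is a minimal k-means sketch in plain Python. Choosing k-means is an assumption, since the excerpt is cut off before naming an algorithm, and the sample points and starting centroids are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return labels, centroids

# Two visibly separate groups of unlabeled points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers = kmeans(points, centroids=[(1, 1), (8, 8)])
print(labels)
print(centers)
```

No labels are supplied anywhere: the grouping comes only from distances between points, which is what makes the method unsupervised. In practice the starting centroids are usually chosen randomly, and the result can depend on that choice.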
What is Digital image processing in simple terms?

Before we dive deeper into digital image processing, we need to understand what an image actually is. A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels. Why do we process? If you want to make a cup of tea, you need to follow some processing steps. In the same way, suppose you have pictorial data in the form of an image or video generated by a device or sensor, such as a camera, and you want to turn it into something else as per your requirement, for example to beautify, compress, crop, sharpen, enlarge,…
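To see what "a finite set of digital values" looks like in practice, here is a small sketch that treats an image as a grid of pixel intensities and applies two of the operations mentioned, crop and enlarge. The pixel values and helper names are hypothetical, and enlarging by pixel repetition is just one simple approach.

```python
# Hypothetical 4x4 grayscale image: each number is one pixel's intensity (0-255).
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]

def crop(img, top, left, height, width):
    """Return the height x width block whose top-left pixel is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def enlarge(img, factor):
    """Enlarge by repeating every pixel `factor` times in both directions."""
    out = []
    for row in img:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide_row) for _ in range(factor))
    return out

center = crop(image, top=1, left=1, height=2, width=2)
print(center)

bigger = enlarge(center, 2)
print(bigger)
```

Both operations only rearrange or copy pixel values, which is the sense in which image processing is ordinary computation on a grid of numbers.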
How do I solve tough programming problems in HackerRank?

If you are a beginner, start with easy problems and try to solve as many as you can to build confidence. If you are not able to solve an easy problem, spend some time on it and try to think harder. If you still can't solve it, go to the editorial and read the approach, but don't read the code, and then try to solve it on your own. Time spent practicing gives a high return for sure. If you think you can solve easy problems, move on to medium ones. Use the same approach as you did with the easy problems…even giving more time…
What is the difference between the append() and insert() list methods in Python?

Difference between append() and insert()

append(): This method modifies an existing list by adding a new element at the end of the list.

Syntax: List_Name.append(item)

insert(): This method also modifies an existing list. The only difference between append() and insert() is that insert() lets us add an element at a specified index of the list, unlike append(), where we can add the element only at the end of the list.

Syntax: List_Name.insert(index, item)

Refer to the example below for a better understanding.
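A short sketch of the two methods side by side, using a hypothetical list of fruit names (this is an illustration, not the original post's example):

```python
fruits = ["apple", "banana"]  # hypothetical example list

# append() always adds the element at the end of the list.
fruits.append("cherry")
print(fruits)

# insert() places the element at the given index; later items shift right.
fruits.insert(1, "mango")
print(fruits)

# Inserting at index len(list) behaves like append().
fruits.insert(len(fruits), "kiwi")
print(fruits)
```

Both methods change the list in place and return None, so writing fruits = fruits.append("x") would replace the list with None.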