Machine Learning

How to Create a Movie Recommendation System

In this notebook, I will try out a few recommendation algorithms (content-based, popularity-based, and collaborative filtering) and build an ensemble of these models to arrive at our final movie recommendation system. We have two MovieLens datasets. Full Dataset: Contains 26,000,000 ratings and 750,000 tag applications applied to 45,000 movies by 270,000 users. Includes tag genome data with 12 million relevance scores across 1,100 tags. Small Dataset: Includes 100,000 ratings and 1,300 tag applications applied to 9,000 movies by 700 users. I will create a Simple Recommender using movies from the Full Dataset while…
Read More
How to Predict Whether a Movie Will Be a Flop or a Hit, and Its Revenue

The birth of the motion picture camera in the late 19th century gave rise to the most powerful form of entertainment available: cinema. Movies have entertained audiences from the emergence of one-second clips of horse racing in the 1890s, to the introduction of sound in the 1920s, to the birth of color in the 1930s, to the rise of 3D movies in the early 2010s. Cinema had humble beginnings in terms of design, direction, and acting (especially because films were so short in its early days) but since then, the film industry around the world has been…
Read More
How to Create a Simple Movie Recommendation System

Introduction: This is part of the Content Editor Internship. “Every time I go to a movie, it's magical, no matter what the movie is about.” - Steven Spielberg. Everyone loves movies regardless of age, gender, race, color, or location. We are all connected to each other through this amazing medium. Yet what is most interesting is how different our choices and combinations are when it comes to movie preferences. Some people like genre-specific movies, be they entertaining, romantic, or sci-fi, while others focus on the main characters and directors. Considering all of that,…
Read More
Starting Data Pipelines | Fundamentals of Data Engineering

This article gives a comprehensive introduction to data pipelines, with step-by-step definitions and code, to introduce the basics of data engineering. Data pipelines are widely used in data science and machine learning, where they are essential for integrating data from multiple streams to gain business intelligence for competitive and profitable analysis. What is a Data Pipeline? A data pipeline is a set of rules that moves and transforms data from multiple sources to a destination where new value can be obtained. In its simplest form, a pipeline may only extract data from various sources such as…
Read More
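As a rough illustration of the definition in the teaser above (move and transform data from multiple sources to a destination), here is a minimal extract-transform-load sketch. The two in-memory sources, the field names `user_id` and `amount`, and the `warehouse` dict are all hypothetical stand-ins chosen for this example; the article's own pipeline is not shown in the excerpt and may look quite different.

```python
import csv
import io
import json

# Hypothetical in-memory "sources"; a real pipeline would read files, APIs, or databases.
CSV_SOURCE = "user_id,amount\n1,10.5\n2,4.0\n"
JSON_SOURCE = '[{"user_id": 1, "amount": 2.5}, {"user_id": 3, "amount": 7.0}]'


def extract():
    """Read raw records from both sources into a single list of dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(json.loads(JSON_SOURCE))
    return rows


def transform(rows):
    """Normalize types and sum the amount per user."""
    totals = {}
    for row in rows:
        user_id = int(row["user_id"])
        totals[user_id] = totals.get(user_id, 0.0) + float(row["amount"])
    return totals


def load(totals, destination):
    """Write the aggregated values into the destination (here, a plain dict)."""
    destination.update(totals)
    return destination


warehouse = load(transform(extract()), {})
print(warehouse)
```

Keeping the three stages as separate functions makes it possible to swap a source or a destination without touching the transformation logic.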
What is Dimensionality Reduction? Overview, Objectives, and Popular Techniques

Table of Contents: What is Dimensionality Reduction; Why Dimensionality Reduction is Important; Dimensionality Reduction Methods and Approaches; Dimensionality Reduction Techniques; Dimensionality Reduction Example. Machine learning is not an easy task. Okay, so that's an understatement. Artificial intelligence and machine learning represent a major step in making computers think like humans, but both concepts are challenging to understand. Fortunately, the payoff is worth the effort. Today we are tackling dimensionality reduction, a key component of machine learning. We will cover its meaning, why it is important, how to do it, and give you a related example to illustrate…
Read More
Interpreting ACF and PACF | Time Series

Introduction: Autocorrelation analysis is an important step in the Exploratory Data Analysis (EDA) of time series. It helps in detecting hidden patterns and seasonality and in checking for randomness. It is especially important when you intend to use an ARIMA model for forecasting, because it helps to identify the AR and MA parameters for the ARIMA model. Overview: Fundamentals; Auto-Regressive and Moving Average Models; Stationarity; Autocorrelation Function and Partial Autocorrelation Function; Order of AR, MA, and ARMA Model; Examples; AR(1) Process; AR(2) Process; MA(1) Process; MA(2) Process; Periodical; Trend; White Noise; Random-Walk; Constant; 🚀 Cheat Sheet; Case Study; Bitcoin; Ethereum; Discussion on Random-Walk.
import numpy as np # linear algebra
from numpy.random import seed
import math
import…
Read More
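To make the idea of autocorrelation analysis in the teaser above more concrete, the sketch below computes a sample autocorrelation function (ACF) by hand and applies it to a simulated AR(1) process. The helper names `sample_acf` and `simulate_ar1`, the coefficient 0.8, and the series length are assumptions made for this illustration; the notebook itself presumably relies on a library for this. The partial autocorrelation function (PACF) is not covered here.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    variance_sum = sum(d * d for d in deviations)
    acf = []
    for lag in range(max_lag + 1):
        cov_sum = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
        acf.append(cov_sum / variance_sum)
    return acf


def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_(t-1) + noise with standard normal noise."""
    rng = random.Random(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0.0, 1.0))
    return values


ar1 = simulate_ar1(phi=0.8, n=2000)
acf_values = sample_acf(ar1, max_lag=5)
for lag, value in enumerate(acf_values):
    print(f"lag {lag}: {value:.3f}")
```

For a stationary AR(1) process the theoretical ACF at lag k is phi to the power k, so the printed values should decay roughly geometrically from 1 at lag 0.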
Predictive Analysis with different approaches

The goal of this notebook is not to build the best model for each time series. It is just a comparison of a few models when you have one time series. The notebook presents different approaches to forecasting a time series. In this notebook we will be using web traffic data from Kaggle. The plan of the notebook is: I. Importation & Data Cleaning; II. Aggregation & Visualization; III. Machine Learning Approach; IV. Basic Model Approach; V. ARIMA Approach (Autoregressive Integrated Moving Average); VI. (FB) Prophet Approach; VII. Keras Starter; VIII. Comparison & Conclusion. I. Importation & Data Cleaning: In this first part we will choose the Time…
Read More
All Cheat Sheets related to Machine Learning

Here we're going to provide cheat sheets for machine learning. Such cheat sheets are plentiful. Quality, concise technical cheat sheets, on the other hand... not so much. A good set of resources covering theoretical machine learning concepts would be invaluable. Shervine Amidi, graduate student at Stanford, and Afshine Amidi, of MIT and Uber, have created just such a set of resources. The VIP cheat sheets, as Shervine and Afshine have dubbed them (GitHub repo with PDFs available here), are structured around key top-level topics in Stanford's CS 229 Machine Learning course, including: Notation and general concepts; Linear models; Classification; Clustering; Neural networks; ... and much more. Links to individual cheat…
Read More
Feature engineering and SGDReg with Regularization With Students Performance Data

All Needed Imports
import pandas as pd
pd.options.display.max_colwidth = 80
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.svm import SVC # SVM model with kernels
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from sklearn.metrics import mean_squared_error
import warnings
warnings.filterwarnings('ignore')
Loading and Exploring Data: There are two files on student performance in two subjects: math and Portuguese (Portugal is the country the dataset is from). Important notice: the description (later on, as DESCR) tells that "there are…
Read More
Build Knowledge Graph Using Python

What is a Knowledge Graph? A Knowledge Graph is a set of data points connected by relations that describe a domain, for instance, a business, an organization, or a field of study. It is a powerful way of representing data because Knowledge Graphs can be built automatically and can then be explored to reveal new insights about the domain. The concept of Knowledge Graphs borrows from graph theory. In this particular representation, we store data as triples of the form (Entity 1, Relationship, Entity 2). Entity 1 and Entity 2 are called nodes and the Relationship is called an edge. Of course, in a real-world knowledge graph, there…
Read More
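The node-and-edge representation described in the teaser above can be sketched with plain Python data structures. The example facts (Alice, Acme, and so on) and the helper names `build_graph` and `related` are hypothetical, invented for this illustration only; the article itself may well use a dedicated graph library.

```python
from collections import defaultdict

# Hypothetical facts stored as (entity_1, relationship, entity_2) triples.
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Acme", "located_in", "Berlin"),
    ("Alice", "knows", "Bob"),
]


def build_graph(triples):
    """Index the triples by their first entity: node -> list of (edge, node)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def related(graph, entity):
    """Return the outgoing (relationship, entity) pairs for one node."""
    return graph.get(entity, [])


graph = build_graph(TRIPLES)
nodes = {entity for head, _, tail in TRIPLES for entity in (head, tail)}
print(sorted(nodes))
print(related(graph, "Alice"))
```

Storing edges in an adjacency mapping keeps lookups for a single entity cheap, which is the basic operation when exploring a graph for new connections.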