Blogs

Customer segmentation with clustering

Posted on April 1, 10112 | Laura Funderburk

Customer segmentation using clustering

This mini-project is based on this blog post by yhat.

Data

The dataset contains information on marketing newsletters/e-mail campaigns (e-mail offers sent to customers) and transaction-level data from customers. The transactional data shows which offer customers responded to, and what the customer ended up buying. The data is presented as an Excel workbook containing two worksheets. Each worksheet contains a different dataset.

%matplotlib inline
import pandas as pd
import sklearn
import seaborn as sns
import warnings
from sklearn import cluster
import numpy as np
warnings. [Read More]

Ethics in AI

Posted on April 1, 10112 | Laura Funderburk

Ethics in Artificial Intelligence: Introduction to the Fairlearn package

Literature and code in this notebook were inspired by Selbst et al., “Fairness and Abstraction in Sociotechnical Systems”, Fairlearn’s Python package documentation, and Fairlearn’s 2021 SciPy tutorial. SciPy 2021 Tutorial: Fairness in AI systems: From social context to practice using Fairlearn by Manojit Nandi, Miroslav Dudík, Triveni Gandhi, Lisa Ibañez, Adrin Jalali, Michael Madaio, Hanna Wallach, Hilde Weerts is licensed under CC BY 4. [Read More]

Hyperparameter tuning Decision Trees and Random Forest Walks

Posted on April 1, 10112 | Laura Funderburk

Classifying the “German Credit” Dataset

This dataset has two classes (these would be considered labels in Machine Learning terms) describing the worthiness of a personal loan: “Good” or “Bad”. The predictors cover attributes such as checking account status, duration, credit history, purpose of the loan, amount of the loan, savings accounts or bonds, employment duration, installment rate as a percentage of disposable income, personal information, other debtors/guarantors, residence duration, property, age, other installment plans, housing, number of existing credits, job information, number of people being liable to provide maintenance for, telephone, and foreign worker status. [Read More]

Predicting customer satisfaction

Posted on April 1, 10112 | Laura Funderburk

Predicting customer satisfaction

Attributes X1 to X6 hold the response to each survey question and take values from 1 to 5, where a smaller number indicates less agreement with the statement and a higher number indicates more.

X1 = my order was delivered on time
X2 = contents of my order was as I expected
X3 = I ordered everything I wanted to order
X4 = I paid a good price for my order
X5 = I am satisfied with my courier
X6 = the app makes ordering easy for me
Y = target attribute with values indicating 0 (unhappy) and 1 (happy) customers

import sys, os
import pandas as pd
import numpy as np
import matplotlib. [Read More]
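The attribute layout in the excerpt above could be checked before any modelling. The following is only a minimal sketch, assuming each survey response is held as a plain dict keyed by X1 to X6 and Y; the helper name `validate_response` and the two sample records are hypothetical and do not come from the original notebook.

```python
# Hypothetical helper: checks one survey response against the ranges
# stated in the excerpt (X1..X6 in 1..5, Y in {0, 1}).
FEATURES = ["X1", "X2", "X3", "X4", "X5", "X6"]


def validate_response(row):
    """Return a list of problems found in one response (empty if valid)."""
    problems = []
    for name in FEATURES:
        value = row.get(name)
        if not isinstance(value, int) or not 1 <= value <= 5:
            problems.append(f"{name} must be an integer from 1 to 5, got {value!r}")
    if row.get("Y") not in (0, 1):
        problems.append(f"Y must be 0 (unhappy) or 1 (happy), got {row.get('Y')!r}")
    return problems


# Made-up records for illustration only.
good = {"X1": 5, "X2": 3, "X3": 4, "X4": 4, "X5": 5, "X6": 5, "Y": 1}
bad = {"X1": 0, "X2": 3, "X3": 4, "X4": 4, "X5": 5, "X6": 7, "Y": 2}

print(validate_response(good))
print(validate_response(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a record at once.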

Recommender system

Posted on April 1, 10112 | Laura Funderburk

Movie recommender system

Data

The data used here has been compiled from various movie datasets like Netflix and IMDb.

Filename: movie_titles.csv
MovieID: does not correspond to actual Netflix movie IDs or IMDb movie IDs.
YearOfRelease: can range from 1890 to 2005 and may correspond to the release of the corresponding DVD, not necessarily its theatrical release.
Title: the Netflix movie title, which may not correspond to titles used on other sites. [Read More]
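A file with the three columns described above could be read roughly as follows. This is a sketch under stated assumptions, not the post's own loading code: it assumes the file is comma-delimited with no header row and that titles containing commas are quoted. The function name `load_movie_titles` and the sample rows are hypothetical.

```python
import csv
import io

# Made-up rows in the MovieID, YearOfRelease, Title order described above.
# An in-memory string stands in for movie_titles.csv so the sketch is self-contained.
SAMPLE = '1,1999,Example Movie One\n2,2004,"Example Movie, Part Two"\n'


def load_movie_titles(handle):
    """Map each MovieID to its release year and title."""
    movies = {}
    for movie_id, year, title in csv.reader(handle):
        movies[int(movie_id)] = {"year": int(year), "title": title}
    return movies


movies = load_movie_titles(io.StringIO(SAMPLE))
for movie_id, info in movies.items():
    print(movie_id, info["year"], info["title"])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle; the right encoding and the handling of missing years would need to be checked against the actual data.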

Time series forecasting

Posted on April 1, 10112 | Laura Funderburk

Time Series Forecasting

Time series are a big component of our everyday lives. They are used in medicine (EEG analysis), finance (stock prices), and electronics (sensor data analysis). Many machine learning models have been created to tackle these types of tasks; two examples are ARIMA (AutoRegressive Integrated Moving Average) models and RNNs (Recurrent Neural Networks).

Data Source

For time series analysis, we are going to deal with stock market analysis. [Read More]
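The "Integrated" part of ARIMA refers to differencing the series before fitting the autoregressive and moving-average terms. The sketch below illustrates only that differencing step in plain Python, not a full ARIMA model. The helper names `difference` and `undo_difference` and the price values are hypothetical and are not taken from the post.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[i] - series[i - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undo_difference(first_values, diffs, lag=1):
    """Rebuild the original series from its first `lag` values and the differences."""
    restored = list(first_values)
    for d in diffs:
        # The value `lag` steps back plus the stored difference gives the next value.
        restored.append(restored[-lag] + d)
    return restored


# Made-up closing prices for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]
diffs = difference(prices)
print(diffs)
print(undo_difference(prices[:1], diffs))
```

Differencing removes a steady trend so that the remaining series is closer to stationary; forecasts made on the differenced scale are converted back with the inverse step.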