Retail Analysis Of Walmart Data
Last Updated on May 3, 2021
Walmart, one of the leading retail chains in the US, would like to predict sales and demand accurately. Certain events and holidays affect sales on each day. Sales data are available for 45 Walmart stores. The business faces unforeseen demand and sometimes runs out of stock because its current machine learning algorithm is not suited to the task. An ideal ML algorithm will predict demand accurately and take into account economic factors such as CPI and the Unemployment Index.
Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks that include these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.
The historical data, in the file Walmart_Store_sales, cover sales from 2010-02-05 to 2012-11-01. Within this file you will find the following fields:
- Store - the store number
- Date - the week of sales
- Weekly_Sales - sales for the given store
- Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week)
- Temperature - Temperature on the day of sale
- Fuel_Price - Cost of fuel in the region
- CPI - Prevailing consumer price index
- Unemployment - Prevailing unemployment rate
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
Basic Statistics tasks
- Which store has the maximum sales?
- Which store has the maximum standard deviation, i.e., where do sales vary the most? Also find the coefficient of variation (the ratio of standard deviation to mean).
- Which store(s) had a good quarterly growth rate in Q3’2012?
- Some holidays have a negative impact on sales. Find the holidays which have higher sales than the mean sales in the non-holiday season for all stores together.
- Provide a monthly and semester view of sales in units and give insights.
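The first two statistics tasks can be sketched in a few lines. This is a minimal illustration that uses only the Python standard library and a handful of made-up (store, weekly sales) rows, not values from Walmart_Store_sales. The helper name `store_summary` is hypothetical, and the sample standard deviation is an assumed choice.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample rows (store, weekly_sales); not taken from the real file.
rows = [
    (1, 1500.0), (1, 1600.0), (1, 1700.0),
    (2, 900.0), (2, 2100.0), (2, 1500.0),
    (3, 1000.0), (3, 1010.0), (3, 990.0),
]

def store_summary(rows):
    """Return {store: (total, std, cv)} for (store, weekly_sales) pairs."""
    by_store = defaultdict(list)
    for store, sales in rows:
        by_store[store].append(sales)
    summary = {}
    for store, sales in by_store.items():
        sd = stdev(sales)  # sample standard deviation (an assumed choice)
        summary[store] = (sum(sales), sd, sd / mean(sales))
    return summary

summary = store_summary(rows)
# Store with the largest total sales.
top_sales_store = max(summary, key=lambda s: summary[s][0])
# Store whose weekly sales vary the most.
most_variable_store = max(summary, key=lambda s: summary[s][1])
```

With the data loaded into a pandas DataFrame, the same summary could likely be obtained with `df.groupby("Store")["Weekly_Sales"].agg(["sum", "std", "mean"])`.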
For Store 1 – Build prediction models to forecast demand
- Linear Regression – Utilize variables like date and restructure dates as 1 for 5 Feb 2010 (starting from the earliest date in order). Hypothesize if CPI, unemployment, and fuel price have any impact on sales.
- Change dates into days by creating new variable.
Select the model which gives best accuracy.
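The date restructuring and a one-predictor linear fit can be sketched as follows. This is a simplified illustration, not the full model: it regresses sales on the week index only, and the sales figures are made up. The helper names `week_index` and `fit_line` are hypothetical. Testing CPI, unemployment, and fuel price would need a multiple regression, which is not shown here.

```python
from datetime import date

START = date(2010, 2, 5)  # earliest date in the data, per the task statement

def week_index(d, start=START):
    """Map a weekly date to 1, 2, 3, ... with the start date as 1."""
    return (d - start).days // 7 + 1

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical weekly sales for illustration only.
dates = [date(2010, 2, 5), date(2010, 2, 12), date(2010, 2, 19), date(2010, 2, 26)]
sales = [100.0, 110.0, 120.0, 130.0]

xs = [week_index(d) for d in dates]
intercept, slope = fit_line(xs, sales)
```

A day-based variable, as in the second bullet, would use `(d - start).days + 1` in place of the weekly index.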
Permission Management System
Last Updated on May 3, 2021
Permission Management System is a web project that lets newly joined employees have their resumes validated by a manager. If the manager is impressed or feels that the employee is fit for the job, the manager grants the employee access to the official site, where the employee can manage their work and view all the details of the job. The aim of this project is to simplify the tasks of newly joining employees and of the company's manager.
Many employees apply for a job and the manager needs to validate the details of each one, so the project provides a database that stores all submitted resumes, removes the resumes that are not fit for the job, and validates the resumes of selected employees. If an employee is selected, the manager can give them access to the official site, where the employee can view current and previous employee details and manage data. The manager can also decide whether the person is given the role of a regular employee or of an admin who handles all the tasks on the official site.
The main objective of this project is to grant permissions to newly joined employees based on their resume and work. The project is developed as a web-based application that lets a company maintain its records and grant permissions to new employees. Later on, it can be adapted to other companies by making partial changes to the site and providing their company details online. The functional scope of Permission Management System is enabled through computing concepts, mainly database management, and its user interface is built with various web interfaces and technologies.
Who can use this application in real life?
1. Employees who want to apply for a job.
2. Managers who grant employees permissions for a job.
Black Friday Sales Prediction
Last Updated on May 3, 2021
Black Friday Sales Prediction predicts the sales of different products. The main goal of this project is to understand customer purchase behaviour across products of different categories. The data set is a purchase summary of various customers for selected high-volume products from last month. It also contains customer demographics (age, gender, marital status, city type, stay in current city), product details (product id and product category), and the total purchase amount from last month. Based on this data we will predict sales.
For simplicity I divided my project into small parts:
- Data Collection: I collected the data from 'Analytics Vidhya' as CSV files. There are two CSV files: the train data, used to train the model, and the test data, used for prediction with the trained model.
- Import Libraries: I imported different sklearn packages for the algorithm and for other tasks.
- Reading data: I read the data using the pandas 'read_csv()' function.
- Data Preprocessing: In this part I first found the missing values; then, depending on how much data was missing in a particular column, I either removed the column or imputed a value (mean, mode, or median).
I checked the unique values in each column. Then I applied label encoding to convert all string-type data to integer values. I computed the correlation matrix, which shows the correlation between each pair of columns.
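The imputation and label-encoding steps can be illustrated with a small standard-library sketch. The column names and values below are made up, and the helpers `impute` and `label_encode` are hypothetical stand-ins for whatever pandas or sklearn utilities the project actually used.

```python
from statistics import mean, mode

def impute(values, strategy="mean"):
    """Replace None entries with the mean or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [fill if v is None else v for v in values]

def label_encode(values):
    """Map each distinct string to an integer, in sorted order of the labels."""
    mapping = {label: i for i, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

# Hypothetical columns for illustration; the names and values are assumptions.
category = [3, None, 5, 4]
city = ["B", "A", None, "B"]
gender = ["F", "M", "M", "F"]

category_filled = impute(category, "mean")   # numeric column: fill with the mean
city_filled = impute(city, "mode")           # categorical column: fill with the mode
gender_codes, gender_map = label_encode(gender)
```

Keeping the returned mapping matters because the same encoding has to be applied to the test file later.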
Then I split the data and created a regression model, which I trained on the training dataset using the Random Forest algorithm. After creating the model I applied similar preprocessing to the test dataset, then fed the test dataset to the trained regression model to predict its target values. Finally, I measured the accuracy of the model by comparing its predicted target values with the actual target values given in the training dataset.
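A minimal sketch of the split, train, and evaluate steps, assuming scikit-learn and NumPy are installed. The features and target are synthetic stand-ins, not the Black Friday data. Holding out part of the training data for evaluation, and using RMSE as the metric, are both assumptions about how the accuracy was measured.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded features and purchase amounts.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 4)).astype(float)
y = 100.0 * X[:, 0] + 10.0 * X[:, 1] + rng.normal(0.0, 1.0, size=200)

# Hold out 20% of the labelled data to estimate accuracy (an assumed setup).
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Predict on the held-out rows and compute the root mean squared error.
pred = model.predict(X_valid)
rmse = float(np.sqrt(mean_squared_error(y_valid, pred)))
```

The same fitted `model` would then be applied to the preprocessed test file with `model.predict`.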
Ai Real Time Car And Pedestrian Tracking App
Last Updated on May 3, 2021
AI real-time car and pedestrian detecting app using Python and OpenCV
A real-time app written in Python using the OpenCV library.
Learning from this:
- Haar features and algorithms
- how the Haar cascade algorithm works in real time on grayscale images
- why it works better on grayscale images than on colored frames
- a few simple lines of code can do magic when the right things are put in the right places
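To make the grayscale point concrete: a Haar-like feature is a difference between sums of pixel intensities over adjacent rectangles, so a single intensity channel is all it needs. The sketch below is a simplified pure-Python illustration of one two-rectangle feature computed from an integral image. It is not OpenCV's implementation, and the patch values are made up.

```python
def integral_image(gray):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of intensities."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += gray[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum (a vertical-edge Haar-like feature)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Hypothetical 4x4 grayscale patch: dark left half, bright right half.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(patch)
edge_response = two_rect_feature(ii, 0, 0, 4, 4)
```

A large magnitude in `edge_response` signals a strong vertical edge. Because every rectangle sum costs only four lookups, many such features can be evaluated per frame, which helps with real-time use.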
The result from this:
- we can detect people and vehicles and identify them in real time, using either live webcam frames or a video file as input
- multiple objects can be detected at once in real time, even as their dimensions change from frame to frame
- this can help avoid accidents, as also suggested by Tesla in their dashcam video
Challenges from this:
- the most important challenge is training on the data, which is time-consuming, so for a simple prototype it is beneficial to use OpenCV's pre-trained data, which saves a lot of time
- understanding how the Haar algorithm works is another important challenge, as it has to be quite accurate to detect the face in real time
- importing OpenCV required installing multiple packages, and different versions of Python have different versions of the library
- detecting people alongside non-living vehicles is itself a challenge, handled here by using two different cascade classifiers, one for each
Hospital Readmission
Last Updated on May 3, 2021
Hospital readmissions of diabetic patients are expensive: hospitals face penalties if their readmission rate is higher than expected, and a high rate reflects inadequacies in the health care system. For these reasons, it is important for hospitals to focus on reducing readmission rates.
We have to identify the key factors that influence readmission for diabetes and predict the probability of patient readmission.
A leading hospital in the US is suddenly seeing an increase in patient readmissions in less than 30 days. This is a serious concern for the hospital, as it may indicate insufficient treatment or diagnosis when the patient was first admitted and later released with a clean bill of health. Hence it is in the hospital's interest to support its diagnoses with a better predictive model, which we are going to build.
Here the objective is to classify the patients treated by this hospital into two primary categories:
- Readmitted within 30 days
- Not readmitted
The dataset chosen is the one available on the UCI website, which contains patient data for the past 10 years for 130 hospitals. The code is written in Python using libraries such as scikit-learn, seaborn, and matplotlib. Different machine learning techniques for classification and regression, such as logistic regression and random forest, have been used to achieve the objective.
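Before any classifier is trained, the readmission label has to be collapsed into the two categories listed above. The sketch below assumes the label column uses the codes "<30", ">30", and "NO", and that readmissions after 30 days are grouped with "Not readmitted". Both are assumptions to check against the actual file, and the sample values are made up.

```python
from collections import Counter

def to_binary_target(label):
    """1 if readmitted within 30 days, else 0 (later readmission counts as 0)."""
    return 1 if label == "<30" else 0

# Hypothetical label values; the codes "<30", ">30", "NO" are an assumption.
readmitted = ["NO", "<30", ">30", "NO", "<30", "NO"]

target = [to_binary_target(v) for v in readmitted]
class_counts = Counter(target)
# Share of positive cases; useful for spotting class imbalance before modelling.
positive_rate = class_counts[1] / len(target)
```

The resulting `target` list is what a classifier such as logistic regression or a random forest would be trained to predict.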