Loan Analysis Using Machine Learning

Last Updated on May 3, 2021

About

In this project I built a machine learning classifier. I train the model on past data using a machine learning algorithm, and it then predicts an output based on what it has learned.

The data covers people who have taken loans from the bank. For each person I have records that indicate their ability to repay the loan amount, along with whether they repaid their loans in the past. The model learns from this data and then predicts an output. Because it is a classifier, it predicts yes or no, and the bank can use that output to decide whether to give a loan.

I reached the 99th percentile on this project, and my best result came from logistic regression.

So I think my model is well trained and can be useful in a bank application.
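The training loop behind a logistic regression classifier like the one described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the two features, the six records, the learning rate, and the function names are all made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this record drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 'yes' (give the loan) when the predicted probability is >= 0.5."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
    return "yes" if p >= 0.5 else "no"

# Hypothetical records: [repayment ability score in 0..1, repaid previous loan (0/1)]
X = [[0.9, 1], [0.8, 1], [0.7, 1], [0.3, 0], [0.2, 0], [0.1, 0]]
y = [1, 1, 1, 0, 0, 0]  # 1 = loan was repaid

w, b = train_logistic(X, y)
print(predict(w, b, [0.85, 1]), predict(w, b, [0.15, 0]))
```

In practice a library implementation would normally replace the hand-written loop; the sketch only shows how the model learns from past records and then outputs yes or no.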

More Details: loan analysis using Machine Learning


Regression Analysis On Walmart Sales Data

Last Updated on May 3, 2021

About

Walmart, one of the leading retail stores in the US, would like to predict sales and demand accurately. Certain events and holidays affect sales on each day. Sales data are available for 45 Walmart stores. The business faces a challenge from unforeseen demand and sometimes runs out of stock because its machine learning algorithm is not suited to the problem. An ideal ML algorithm will predict demand accurately and ingest factors such as economic conditions, including CPI and the Unemployment Index.

Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data.

 Dataset Description

This is the historical data which covers sales from 2010-02-05 to 2012-11-01, in the file Walmart_Store_sales. Within this file you will find the following fields:

·        Store - the store number

·        Date - the week of sales

·        Weekly_Sales - sales for the given store

·        Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week)

·        Temperature - Temperature on the day of sale

·        Fuel_Price - Cost of fuel in the region

·        CPI – Prevailing consumer price index

·        Unemployment - Prevailing unemployment rate

 Holiday Events

Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13

Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13

Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13

Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
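The holiday dates listed above can be turned into a lookup so that each weekly record is labelled with its holiday, if any. The sketch below is one possible way to do it; the names `HOLIDAYS`, `HOLIDAY_LOOKUP`, and `holiday_name` are hypothetical, and it assumes the Date field has already been parsed into a `date` object.

```python
from datetime import date, datetime

# Dates copied from the Holiday Events list, in day-month-year form.
HOLIDAYS = {
    "Super Bowl": ["12-Feb-10", "11-Feb-11", "10-Feb-12", "8-Feb-13"],
    "Labour Day": ["10-Sep-10", "9-Sep-11", "7-Sep-12", "6-Sep-13"],
    "Thanksgiving": ["26-Nov-10", "25-Nov-11", "23-Nov-12", "29-Nov-13"],
    "Christmas": ["31-Dec-10", "30-Dec-11", "28-Dec-12", "27-Dec-13"],
}

def parse_holiday(text):
    """Parse a string such as '12-Feb-10' into a date."""
    return datetime.strptime(text, "%d-%b-%y").date()

# Map each holiday week date to the holiday name.
HOLIDAY_LOOKUP = {
    parse_holiday(d): name for name, dates in HOLIDAYS.items() for d in dates
}

def holiday_name(week_date):
    """Return the holiday for a week date, or None for a non-holiday week."""
    return HOLIDAY_LOOKUP.get(week_date)

print(holiday_name(date(2010, 2, 12)))
print(holiday_name(date(2010, 2, 5)))
```

With this label in place, sales can be grouped by holiday and compared with the non-holiday mean.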

 Analysis Tasks

Basic Statistics tasks

1.     Which store has maximum sales

2.     Which store has maximum standard deviation, i.e., the sales vary a lot. Also, find out the coefficient of variation (the ratio of standard deviation to mean)

3.     Which store(s) have a good quarterly growth rate in Q3 2012

4.     Some holidays have a negative impact on sales. Find out holidays which have higher sales than the mean sales in non-holiday season for all stores together

5.     Provide a monthly and semester view of sales in units and give insights
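The first two tasks above reduce to grouping weekly sales by store and computing a few summary statistics. The following is a small sketch on made-up numbers, not the real Walmart_Store_sales data; the variable names are illustrative only.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (Store, Weekly_Sales) rows standing in for the real file.
rows = [
    (1, 300.0), (1, 310.0), (1, 290.0),
    (2, 200.0), (2, 50.0), (2, 140.0),
]

sales_by_store = defaultdict(list)
for store, weekly_sales in rows:
    sales_by_store[store].append(weekly_sales)

# Task 1: total sales per store.
totals = {s: sum(v) for s, v in sales_by_store.items()}
# Task 2: sample standard deviation and coefficient of variation (std / mean).
spread = {s: stdev(v) for s, v in sales_by_store.items()}
coeff_var = {s: stdev(v) / mean(v) for s, v in sales_by_store.items()}

top_store = max(totals, key=totals.get)
most_variable = max(spread, key=spread.get)
print(top_store, most_variable, coeff_var)
```

On the full dataset the same grouping would more likely be done with a data analysis library, but the quantities computed are the same.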

 Statistical Model

For Store 1 – Build prediction models to forecast demand

·        Linear Regression – Utilize variables like date and restructure dates as 1 for 5 Feb 2010 (starting from the earliest date in order). Hypothesize if CPI, unemployment, and fuel price have any impact on sales.

·        Change dates into days by creating a new variable.

Select the model which gives the best accuracy.
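The date restructuring and a one-variable linear regression can be sketched as below. This assumes the day index is counted in calendar days with 5 Feb 2010 mapped to 1; the sales figures are invented so that the fitted line is easy to check, and `day_index` and `fit_line` are hypothetical helper names.

```python
from datetime import date

START = date(2010, 2, 5)  # earliest date in the data, mapped to day 1

def day_index(d, start=START):
    """Convert a date into a day number, with the start date as 1."""
    return (d - start).days + 1

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented weekly sales for Store 1 that grow by 10 per day.
weeks = [date(2010, 2, 5), date(2010, 2, 12), date(2010, 2, 19), date(2010, 2, 26)]
sales = [1010.0, 1080.0, 1150.0, 1220.0]

days = [day_index(d) for d in weeks]
slope, intercept = fit_line(days, sales)
print(days, slope, intercept)
```

Testing whether CPI, unemployment, and fuel price affect sales would extend this to several predictors, which is beyond this one-variable sketch.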

More Details: Regression Analysis on Walmart Sales Data

Research Paper On Face Detection Using Haar Cascade Classifier

Last Updated on May 3, 2021

About

Abstract:

In the last several years, face detection has been listed as one of the most engaging fields in research. Face detection algorithms are used for the detection of frontal human faces. Face detection finds use in many applications such as face tracking, face analysis, and face recognition. In this paper, we discuss face detection using a Haar cascade classifier and OpenCV. We also focus on some of the face detection technology currently in use.



Conclusion:

In this study, we covered in detail a face detection technique that uses a Haar cascade classifier and OpenCV to get the desired output. Using the OpenCV library, the Haar cascade classifier was able to perform successful face detection with high accuracy and efficiency. We also used the OpenCV package to extract some of the features of the face and compare them. We also discussed some popular face detection methods, the future scope of face detection, and some of its applications. Finally, we conclude that the future of facial detection technology is bright. Security and surveillance are the major segments that will be deeply influenced. Other areas that are now welcoming it are private industries, public buildings, and schools.
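Haar cascade classifiers are built on Haar-like features, which compare pixel sums of adjacent rectangles and are computed quickly with an integral image. The sketch below shows only that building block on a tiny made-up image; it is not the paper's code and not OpenCV's implementation, and all function names are illustrative.

```python
def integral_image(img):
    """ii[r][c] holds the sum of all pixels above and to the left of (r, c)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of a rectangle using four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window (width assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Made-up 4x4 grayscale patch: bright on the left, dark on the right.
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
print(two_rect_feature(ii, 0, 0, 4, 4))  # 72 - 8 = 64
```

A large feature value signals a strong vertical edge in the window; a cascade combines many such features to decide whether a window contains a face.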

More Details: Research paper On FACE DETECTION USING HAAR CASCADE CLASSIFIER

Natural Language Processing

Last Updated on May 3, 2021

About

The problem statement is about allocating projects using a given dataset. We are provided with requirements such as project details (project name, project location, and required project skills) and candidate details (candidate id, location, candidate skills, and description). From the given dataset, we have to filter the best candidate based on the requirements and their skills. Our work is to check whether the candidate has the skills required for the project and to determine the evaluation status based on their location. If the candidate has the required skills and matches the location, the candidate is selected for that project; if not, we reject the candidate for that project. In that case the rejected candidates are checked against other projects. The foremost step is to clean up the data to highlight attributes.

Cleaning (or pre-processing) the data typically consists of a number of steps such as removing punctuation, tokenization, and removing stop words. I have taken a set of keywords most related to the skills given in the project, based on certain criteria. To describe the presence of keywords within the cleaned data, we need to vectorize the data with Bag of Words. We filter the candidate skills according to current trends, and candidates are prioritized by the number of skills (languages) they know. So, we use an NLP Toolkit to arrange the candidates by their preferences. By applying this process to the given dataset, we are able to filter 50% of the data. If the skills of the prioritized candidates match and the candidate is in the same location as the project, the similarities are calculated and the candidate is selected for that project; otherwise the candidate is rejected.
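The cleaning, Bag of Words, and matching steps described above can be sketched with the standard library alone. This is an illustrative stand-in for an NLP toolkit, not the project's code: the stop-word list, the skill keywords, the sample project and candidates, and the function names are all assumptions.

```python
import re
from collections import Counter

STOP_WORDS = {"and", "in", "with", "the", "a", "of", "to", "is"}  # tiny sample list
SKILL_KEYWORDS = {"python", "java", "sql", "nlp"}  # hypothetical keyword set

def clean(text):
    """Lowercase, strip punctuation, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(text):
    """Count how often each skill keyword appears in the cleaned text."""
    return Counter(t for t in clean(text) if t in SKILL_KEYWORDS)

def evaluate(candidate, project):
    """Select a candidate who has every required skill and the same location."""
    skills = set(bag_of_words(candidate["skills"]))
    required = set(bag_of_words(project["skills"]))
    same_location = candidate["location"].lower() == project["location"].lower()
    return "selected" if required <= skills and same_location else "rejected"

project = {"name": "Chatbot", "location": "Chennai", "skills": "Python, NLP"}
candidates = [
    {"id": 1, "location": "chennai", "skills": "Experienced in Python, SQL and NLP."},
    {"id": 2, "location": "Mumbai", "skills": "Python and NLP"},
    {"id": 3, "location": "Chennai", "skills": "Java, SQL"},
]

results = {c["id"]: evaluate(c, project) for c in candidates}
# Prioritize candidates by the number of distinct skills they know.
ranked = sorted(candidates, key=lambda c: len(bag_of_words(c["skills"])), reverse=True)
print(results, [c["id"] for c in ranked])
```

Rejected candidates would then be passed through `evaluate` again for the remaining projects.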

More Details: Natural Language Processing

Real Time Object Detection Using Tensorflow

Last Updated on May 3, 2021

About

Object detection is a computer vision technique in which a software system can detect, locate, and trace the object from a given image or video. The special attribute about object detection is that it identifies the class of object (person, table, chair, etc.) and their location-specific coordinates in the given image. The location is pointed out by drawing a bounding box around the object. The bounding box may or may not accurately locate the position of the object. The ability to locate the object inside an image defines the performance of the algorithm used for detection. Face detection is one of the examples of object detection.

These object detection algorithms might be pre-trained or can be trained from scratch. In most use cases, we use pre-trained weights from pre-trained models and then fine-tune them as per our requirements and different use cases.

Generally, the object detection task is carried out in three steps:

  • Small segments are generated in the input, producing a large set of bounding boxes that span the full image.

  • Feature extraction is carried out for each segmented rectangular area to predict whether the rectangle contains a valid object.

  • Overlapping boxes are reduced to a single bounding rectangle by keeping the highest-scoring box (Non-Maximum Suppression).
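The last step, Non-Maximum Suppression, can be sketched as a greedy loop over boxes sorted by score. The boxes, scores, and the 0.5 overlap threshold below are made up for illustration; boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the best-scoring box and drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes on one object, plus one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)  # indices of the boxes that survive
```

Here the second box overlaps the first with an IoU of about 0.82, so only the higher-scoring one is kept.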

TensorFlow, developed by Google Brain, is an open-source library for numerical computation and large-scale machine learning that eases the process of acquiring data, training models, serving predictions, and refining future results.

  • Tensorflow bundles together Machine Learning and Deep Learning models and algorithms. 
  • It uses Python as a convenient front-end and runs it efficiently in optimized C++.
  • Tensorflow allows developers to create a graph of computations to perform. 
  • Each node in the graph represents a mathematical operation and each connection represents data. Hence, instead of dealing with low-level details like figuring out proper ways to hitch the output of one function to the input of another, the developer can focus on the overall logic of the application.
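The idea of a computation graph can be shown with a toy example in plain Python. This is a conceptual sketch only and does not use or imitate the TensorFlow API; the `Node` class and the helper functions are hypothetical.

```python
class Node:
    """A graph node: an operation plus the nodes that feed data into it."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def evaluate(self, feed):
        # Evaluate upstream nodes first, then apply this node's operation.
        args = [node.evaluate(feed) for node in self.inputs]
        return self.op(feed, *args)

def placeholder(name):
    """A node whose value is supplied when the graph is run."""
    return Node(lambda feed: feed[name])

def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)

def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)

# Build the graph for x * y + x once, then run it with different inputs.
x = placeholder("x")
y = placeholder("y")
out = add(multiply(x, y), x)

print(out.evaluate({"x": 3, "y": 4}))  # 3 * 4 + 3 = 15
```

The graph is described once and the wiring between operations is handled by the nodes themselves, which is the convenience the list above refers to.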

The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train and deploy object detection models.

  • There are already pre-trained models in the framework, which are referred to as the Model Zoo.
  • It includes a collection of pre-trained models trained on various datasets such as the COCO (Common Objects in Context) dataset, the KITTI dataset, and the Open Images Dataset.

Several models are available, and they differ in architecture. They therefore provide different accuracies, with a trade-off between speed of execution and accuracy in placing bounding boxes.

Google Brain, the deep learning artificial intelligence research team at Google, developed TensorFlow for Google’s internal use, and it was released as open source in 2015. This Open-Source Software library is used by the research team to perform several important tasks.

TensorFlow is one of the most widely used deep learning libraries, and several real-world applications of deep learning have made it popular. Being an Open-Source library for deep learning and machine learning, TensorFlow finds a role to play in text-based applications, image recognition, voice search, and many more. Google uses TensorFlow across many of its own products.

Here mAP (mean average precision) is the average precision, i.e. the area under the precision-recall curve for detected bounding boxes, averaged over all object classes. It’s a good combined measure for how sensitive the network is to objects of interest and how well it avoids false alarms. The higher the mAP score, the more accurate the network is, but that comes at the cost of execution speed, which we want to avoid here.
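As a rough illustration of how an mAP score is obtained, the sketch below computes a non-interpolated average precision per class and then averages over classes. It assumes each detection has already been marked as a true or false positive (for example by an IoU threshold against ground-truth boxes) and sorted by descending confidence; benchmark suites such as COCO use interpolated variants, so real scores will differ. The data and function names are made up.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated AP: mean of the precision at each true positive.

    is_true_positive: one flag per detection, sorted by descending confidence.
    num_ground_truth: number of real objects of this class.
    """
    hits = 0
    total = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / num_ground_truth if num_ground_truth else 0.0

def mean_average_precision(per_class):
    """Average the per-class AP values."""
    aps = [average_precision(flags, n) for flags, n in per_class.values()]
    return sum(aps) / len(aps)

# Made-up detections: (true-positive flags, number of ground-truth objects).
detections = {
    "person": ([True, True, False], 2),
    "chair": ([True, False, True], 2),
}
score = mean_average_precision(detections)
print(round(score, 4))
```

A false positive ranked above a true positive lowers the precision at that recall step, which is why the "chair" class scores below the "person" class here.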

As my PC is a low-end machine with not much processing power, I am using the model ssd_mobilenet_v1_coco, which is trained on the COCO dataset. This model has a decent mAP score and a short execution time. Also, COCO is a dataset of 300k images of the 90 most commonly found objects, so the model can recognise 90 objects.

This brings us to the end of this project, where we learned how to use the Tensorflow Object Detection API to detect objects in images.

More Details: Real Time Object Detection using Tensorflow

False Alarm Detection System

Last Updated on May 3, 2021

About

This project was made for a chemical company that had sensors installed in various parts of the factory to detect H2S gas, which is hazardous to health. Every time one or more sensors detected an H2S leak, an emergency alarm rang to alert the workers. For every alarm, the company called a team to sanitize the place and check for the leak, and this was a big cost to the company.

A few of the alarms that ring are not even hazardous. The company gave us the data for each alarm, with a final column stating whether the alarm was dangerous or not.

The data had the following columns:

  • Ambient Temperature
  • Calibration (days)
  • Unwanted substance deposition (0/1)
  • Humidity (%)
  • H2S Content (ppm)
  • Dangerous (0/1)

The data was first pre-processed, and analysis libraries like NumPy and Pandas were used to make it ready to be utilized by a machine learning algorithm.

Issues such as feature scaling, categorical data, and missing values were handled with appropriate techniques.
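Two of the pre-processing steps mentioned above, filling missing values and standard scaling, can be sketched as follows. The write-up does not say which techniques were used, so mean imputation and z-score scaling are assumptions here, and the humidity readings and function names are made up.

```python
from statistics import mean, pstdev

def impute_mean(values):
    """Replace missing readings (None) with the mean of the known ones."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]

def standard_scale(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

# Made-up Humidity (%) column with one missing reading.
humidity = [40.0, None, 60.0, 50.0]
filled = impute_mean(humidity)
scaled = standard_scale(filled)
print(filled, scaled)
```

The same two functions would be applied column by column before the features are passed to the classifier.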

Then, we used a Logistic Regression model to build a classifier with the first five columns as independent variables and the Dangerous column as the dependent/target variable.

Now, whenever there is a leak and the alarm rings, the data is sent to us and we predict whether it is dangerous or not. Only if it is found dangerous is the team called to sanitize the place and fix the leak. This saved a lot of money for the company.

More Details: False Alarm Detection System
