Customer Segmentation

Last Updated on May 3, 2021


I used the K-means clustering algorithm to segment customers. Grouping customers into segments gives a better understanding of them, which in turn can be used to increase the company's revenue.
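As a rough illustration of how K-means groups customers, here is a minimal dependency-free sketch. The customer features (annual income and spending score) and all values are hypothetical, not taken from the project, and a real project would more likely use a library implementation such as scikit-learn's `KMeans`.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups with plain Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters

# Hypothetical customers: (annual income in thousands, spending score).
customers = [(15, 80), (16, 85), (17, 78), (70, 20), (72, 15), (75, 22)]
centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Each resulting cluster can then be treated as one customer segment, for example low-income high-spenders versus high-income low-spenders.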

More Details: Customer Segmentation


False Alarm Detection System

Last Updated on May 3, 2021


This project was made for a chemical plant that had sensors installed in various parts of the factory to detect H2S gas, which is hazardous to health. Every time one or more sensors detected an H2S leak, an emergency alarm rang to alert the workers. For every alarm, the company called in a team to sanitize the area and check for the leak, which was a big cost to the company.

Some of the alarms that ring are not actually hazardous. The company gave us the data for each alarm with the following columns; the final column states whether the alarm was dangerous or not.

Ambient Temperature


Unwanted substance deposition (0/1)

Humidity (%)

H2S Content(ppm)

Dangerous (0/1)


The data was first pre-processed with analysis libraries such as NumPy and pandas to make it ready for a machine learning algorithm.

Feature scaling (standardization), categorical data and missing values were handled with appropriate techniques.

Then we used a Logistic Regression model to build a classifier, with the first five columns as the independent variables and the Dangerous column as the dependent (target) variable.

Now, whenever there is a leak and the alarm rings, the data is sent to us and we predict whether it is dangerous or not. The team is called to sanitize the area and fix the leak only if the alarm is found to be dangerous. This saved the company a lot of money.
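The workflow above (scale the features, fit a logistic regression, then classify each new alarm) can be sketched as follows. This is only an illustration under stated assumptions: the readings are invented, only four of the feature columns are used, and the model is a small hand-written gradient-descent version rather than the library model the project used.

```python
import math

def fit_scaler(rows):
    """Compute per-column mean and standard deviation for standard scaling."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(columns, means)]
    return means, stds

def scale(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights and bias with stochastic gradient descent on the log loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * error * xj for wj, xj in zip(w, xi)]
            b -= lr * error
    return w, b

# Hypothetical alarms: (temperature, deposition 0/1, humidity %, H2S ppm).
readings = [(25, 0, 40, 2.0), (30, 1, 55, 3.5), (28, 0, 60, 1.0), (27, 1, 45, 4.0),
            (26, 0, 50, 12.0), (31, 1, 65, 15.0), (29, 0, 42, 11.0), (24, 1, 58, 14.0)]
dangerous = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical target column

means, stds = fit_scaler(readings)
weights, bias = train_logistic([scale(r, means, stds) for r in readings], dangerous)

def is_dangerous(reading):
    """Return 1 if the alarm is predicted dangerous, else 0."""
    z = sum(w * x for w, x in zip(weights, scale(reading, means, stds))) + bias
    return int(sigmoid(z) >= 0.5)

print(is_dangerous((27, 0, 50, 13.5)), is_dangerous((27, 1, 50, 1.5)))
```

New readings are scaled with the statistics learned from the training data, so the classifier sees them on the same scale it was trained on.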

More Details: False Alarm Detection System


Smart Health Monitoring App

Last Updated on May 3, 2021


The proposed solution is an online, mobile-based application. The app will contain information about the prenatal and postnatal periods. It will help a pregnant woman learn about pregnancy milestones and know when to worry and when not to. To use the app, the user registers by entering her name, age, mobile number and preferred language. The app will be user friendly: it will be multilingual and include an audio-video guide to help people with impaired hearing or sight, keeping in mind women who live in rural areas and those deprived of primary education. The app will have two sections, prenatal and postnatal.

In an emergency, for example when the water breaks, the user can send an emergency notification through FCM (Firebase Cloud Messaging). The app first tries to read the location from the phone's GPS; if GPS is off, the Geolocation API is used instead. Using the Wi-Fi nodes the mobile device can detect, the Internet, Google's datasets and nearby cell towers, a precise location is generated and sent via Geocoding to FCM, which in turn generates push notifications. These are sent to the device tokens of registered users, hospitals, nearby doctors and so on, so that the necessary action can be taken and timely help provided.

More Details: Smart Health Monitoring App


Salary Predictor

Last Updated on May 3, 2021


This is a web app created using an open-source Python library called Streamlit, which is mainly used to build web apps for machine learning and data science. For this project I collected the required data from Kaggle. I used the Sklearn library to get a model for the data and fitted it using the library's built-in methods. The web app has two pages, Home and Prediction. The Home page displays the collected data and a scatter plot of it drawn with the matplotlib library. The Prediction page has a text field where we can enter an employee's years of experience and a button that shows the predicted salary for that employee. A Streamlit web app runs at a localhost URL by default, so it has to be deployed to be publicly accessible; I deployed it on the Heroku platform. In this project I only used a small dataset to test how everything works. A larger dataset could also be used, but the training process would differ: the data should be split into training and testing sets so that the model can be evaluated properly, and more advanced algorithms may be needed. Depending on our requirements, we can train a machine learning model, save it to a file and load that file when building the web app.
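The fit, save and predict steps described above can be sketched roughly as follows. This is an assumption-laden illustration: the experience and salary figures are made up, and a closed-form least-squares line stands in for Sklearn's linear regression so the example has no dependencies. The Streamlit page itself is not shown; it would pass the value from its text field to `predict_salary`.

```python
import pickle

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def predict_salary(years, model):
    """Predict a salary from years of experience using a fitted model dict."""
    return model["intercept"] + model["slope"] * years

# Hypothetical training data: years of experience and salary.
experience = [1, 2, 3, 4, 5]
salary = [40000, 45000, 50000, 55000, 60000]

slope, intercept = fit_line(experience, salary)
model = {"slope": slope, "intercept": intercept}

# Serialize the fitted model, as one would before shipping it with the web app,
# then restore it the way the app would at start-up.
saved = pickle.dumps(model)
restored = pickle.loads(saved)

print(predict_salary(6, restored))
```

Saving the fitted parameters separately from the app means the web page only has to load and call the model, not retrain it on every request.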

More Details: Salary Predictor


Determination of a Person’s Health

Last Updated on May 3, 2021



The project was built with the intent of helping society. According to World Health Organization estimates, more than 1.9 billion adults were overweight in 2016. This number is very high, and obesity is largely preventable.

The project was built using Data Analysis and Machine Learning in Python, with a GUI output page. The machine first analyses the existing data and then draws a conclusion about a person’s health from the factors he or she provides.

In this project, gender and either height or weight will be given to the machine. If the height is given then the weight will be predicted and vice-versa. Through these predictions the machine will tell us about the health of a person.

The main goal is to help improve the health of society.

The dataset used is from the UCI repository. It includes four attributes:

1.     Gender

2.     Height

3.     Weight

4.     Index

The machine is trained on these attributes to determine a person’s weight and the health category it falls into.

The categories are-

1.     0 – Underweight

2.     1 – Normal weight

3.     2 – Healthy

4.     3 – Overweight

5.     4 – Obesity

The steps followed, in order, were:

1.     Loading dataset (using pandas library)

2.     Dataset cleaning (using pandas and numpy libraries)

3.     Dataset pre-processing

4.     Data visualization (using the seaborn and matplotlib libraries)

4.1  Univariate analysis

4.2  Bivariate analysis

5.     Correlation matrix 

The machine learning algorithms applied were-

1.     Linear Regression

2.     Logistic Regression

3.     KNN Classifier

4.     Decision Tree Classifier

5.     Random Forest Classifier

The Random Forest Classifier gave the highest accuracy, about 95%, while Logistic Regression gave the lowest, about 76%.
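To give a feel for how one of the listed classifiers assigns an Index category, here is a minimal hand-written k-nearest-neighbours sketch. The rows are hypothetical, not taken from the dataset, only height and weight are used, and the features are left unscaled for brevity; the project itself used library classifiers.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical rows: ((height in cm, weight in kg), Index category).
training_rows = [
    ((175, 50), 0), ((180, 55), 0), ((170, 48), 0),
    ((175, 70), 2), ((180, 75), 2), ((170, 65), 2),
    ((175, 110), 4), ((180, 120), 4), ((170, 105), 4),
]

print(knn_predict(training_rows, (176, 52)))   # neighbours are all category 0
print(knn_predict(training_rows, (172, 108)))  # neighbours are all category 4
```

Comparing several classifiers then comes down to running each one on the same held-out rows and measuring how often the predicted category matches the true one.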

The user in the GUI page will be asked:

1. Full name

2. Gender

3. Whether they know their height or weight

4. Their height or weight

More Details: Determination of a Person’s Health


NYC Yellow Taxi Prediction

Last Updated on May 3, 2021


I did this project in the second semester of my M.Tech studies at Ahmedabad University. In NYC, taxicabs come in two varieties, yellow and green; they are widely recognizable symbols of the city. Taxis painted yellow (medallion taxis) are able to pick up passengers anywhere in the five boroughs. Green taxis can pick up passengers in Upper Manhattan, the Bronx, Brooklyn, Queens and Staten Island.

The yellow taxi cab was first introduced in 1915 by a car salesman named John Hertz. Hertz decided to paint his taxis yellow because of a study by a Chicago university to establish which color would grab the attention of passers-by most easily. The results showed that yellow with a touch of red was the most noticeable. As a result, Hertz started to paint all his taxicabs yellow and went on to start the Chicago-based Yellow Cab Company in 1915.

During pre-processing we found many outliers in the data, such as a 100-dollar fare for a 0-mile trip. There were also a few outliers in the rate code ID. We removed them all and cleaned the data. After cleaning, we visualized the data and found several insights: most people travel alone in a taxi, Area 236 has the most taxi bookings, and people rarely travel late at night (1 to 6 am). For the prediction part, we predicted the fare using different regression methods, and for taxi bookings we used K-means clustering.
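The outlier removal described above can be sketched as a simple row filter. The column names (`trip_distance`, `fare_amount`, `RatecodeID`), the set of valid rate codes and the sample rows are assumptions for illustration; the project's actual cleaning rules are not given.

```python
# Assumed set of valid rate codes; anything else is treated as an outlier.
VALID_RATE_CODES = {1, 2, 3, 4, 5, 6}

def is_plausible(trip):
    """Keep trips with a positive distance, a positive fare and a known rate code."""
    return (
        trip["trip_distance"] > 0
        and trip["fare_amount"] > 0
        and trip["RatecodeID"] in VALID_RATE_CODES
    )

# Hypothetical rows, including the 0-mile, 100-dollar case mentioned above.
trips = [
    {"trip_distance": 2.5, "fare_amount": 12.0, "RatecodeID": 1},
    {"trip_distance": 0.0, "fare_amount": 100.0, "RatecodeID": 1},
    {"trip_distance": 8.1, "fare_amount": 30.5, "RatecodeID": 99},
    {"trip_distance": 17.0, "fare_amount": 52.0, "RatecodeID": 2},
]

clean_trips = [trip for trip in trips if is_plausible(trip)]
print(len(trips), "rows before cleaning,", len(clean_trips), "after")
```

Filtering before visualization and modelling keeps impossible trips from distorting both the charts and the fare regression.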

More Details: NYC yellow taxi prediction
