Cab Fare Prediction

Last Updated on May 3, 2021


Aim: to predict cab fare based on parameters such as source, destination, and time of pickup.

libraries used: pandas, numpy, seaborn, matplotlib, scikit-learn.

algorithms used: linear regression, decision trees, random forest.

problem type: regression

dataset: provided by institute

model evaluation: MAE, RMSE, MSE, residual plot.

project status: web app and deployment are still pending.
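
The evaluation metrics listed above can be computed as in the following minimal sketch. It uses plain Python instead of scikit-learn so that it runs without dependencies. The function name `regression_metrics` and the sample fares are illustrative assumptions, not taken from the project.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE for two equal-length sequences of fares."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Residual = actual fare minus predicted fare; these are also the
    # values one would plot in a residual plot.
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}


# Hypothetical actual and predicted fares, for illustration only.
actual = [10.0, 12.0, 8.0, 20.0]
predicted = [9.0, 14.0, 8.0, 17.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE is in the same unit as the fare, which usually makes it easier to interpret than MSE.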

More Details: cab fare prediction


Smart Glasses For Visually Impaired

Last Updated on May 3, 2021


This is our second-year Hardware & Software Tools project. We wanted to invent something that would benefit handicapped people in some way. We came up with this idea for glasses that could help blind people sense whether there is an object in front of them that they might hit their head on. The white cane that they use when walking helps them navigate the ground but does not do much for obstacles higher up. Using an Arduino Pro Mini MCU, an ultrasonic sensor, and a buzzer, we created glasses that sense the distance to an object in front and beep to alert the person that something is in front of them. They are simple and inexpensive to make. Credit to for some of the parts.
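
The sense-and-beep logic described above can be sketched as follows. This is a Python illustration of the arithmetic only, not the Arduino firmware. The speed-of-sound constant is the usual approximation for air at about 20 °C, and the 100 cm alert threshold and the function names are assumptions made for the example.

```python
# Approximate speed of sound in air at about 20 degrees C, in cm per microsecond.
SPEED_OF_SOUND_CM_PER_US = 0.0343


def echo_to_distance_cm(echo_duration_us):
    """Convert a round-trip echo time (microseconds) to a one-way distance (cm)."""
    # The pulse travels to the object and back, so halve the total path.
    return echo_duration_us * SPEED_OF_SOUND_CM_PER_US / 2


def should_beep(distance_cm, threshold_cm=100.0):
    """Return True when an obstacle is closer than the assumed alert threshold."""
    # A non-positive reading is treated as "no valid echo".
    return 0 < distance_cm < threshold_cm


# Hypothetical echo durations, for illustration only.
for echo_us in (1500, 12000):
    distance = echo_to_distance_cm(echo_us)
    print(f"{echo_us} us -> {distance:.1f} cm, beep={should_beep(distance)}")
```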


These “Smart Glasses” are designed to help blind people read and translate typed text written in English. Inventions of this kind can motivate blind students to complete their education despite their difficulties. The main objective is to develop a new way of reading text for blind people and to facilitate their communication. The first task of the glasses is to scan any text image and convert it into audio, which the person listens to through a headphone connected to the glasses. The second task is to translate the whole text, or some words of it, when the user presses a button that is also connected to the glasses.

The glasses use several technologies to perform these tasks: OCR, gTTS, and Google translation. Text in the image is detected using OpenCV and Optical Character Recognition (OCR) with Tesseract and the Efficient and Accurate Scene Text Detector (EAST). The text is converted into speech using text-to-speech technology (gTTS) and translated using the Google translation API. The glasses are equipped with an ultrasonic sensor that measures the distance between the user and the object bearing the text, so that a clear picture can be taken. The picture is taken when the user presses the button. In addition, a motion sensor is used to introduce the user to the locations of the university’s halls, classes, and labs using a radio-frequency identification (RFID) reader. All computing and processing is done on a Raspberry Pi 3 B+ and a Raspberry Pi 3 B.

In the results, the combination of OCR with the EAST detector gave high accuracy: the glasses recognized almost 99% of the text. However, the glasses have some drawbacks: they support only English, and images can be captured only at a distance of 40-150 cm. As a future plan, it is possible to support more languages and to make the design smaller and more comfortable to wear.
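
The flow described above (check the distance, run OCR on the captured image, optionally translate, then speak) could be organized roughly as in the sketch below. The OCR, translation, and speech steps are passed in as plain functions, and the stand-ins used in the demo are hypothetical placeholders, not the Tesseract, Google translation, or gTTS APIs. Only the 40-150 cm capture range comes from the text.

```python
MIN_CAPTURE_CM = 40    # capture range reported in the project description
MAX_CAPTURE_CM = 150


def in_capture_range(distance_cm):
    """Return True when the object is at a distance that gives a clear picture."""
    return MIN_CAPTURE_CM <= distance_cm <= MAX_CAPTURE_CM


def read_aloud(image, distance_cm, ocr, speak, translate=None):
    """Run OCR on the image and speak the result; return the spoken text or None."""
    if not in_capture_range(distance_cm):
        return None
    text = ocr(image).strip()
    if not text:
        return None
    if translate is not None:
        text = translate(text)
    speak(text)
    return text


# Hypothetical stand-ins for the real OCR, translation, and speech components.
def fake_ocr(image):
    return "  Lecture Hall 2  "


def fake_translate(text):
    return text.upper()


spoken = []
result = read_aloud("frame.png", 80, fake_ocr, spoken.append, fake_translate)
print(result, spoken)
```

Passing the components in as functions keeps the control flow testable without a camera, network access, or audio hardware.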


More Details: Smart Glasses for visually impaired


Finding Donors For Charity Ml

Last Updated on May 3, 2021


In this project, you will employ several supervised algorithms of your choice to accurately model individuals' income using data collected from the 1994 U.S. Census. You will then choose the best candidate algorithm from preliminary results and further optimize this algorithm to best model the data. Your goal with this implementation is to construct a model that accurately predicts whether an individual makes more than $50,000. This sort of task can arise in a non-profit setting, where organizations survive on donations. Understanding an individual's income can help a non-profit better understand how large a donation to request, or whether or not they should reach out to begin with. While it can be difficult to determine an individual's general income bracket directly from public sources, we can (as we will see) infer this value from other publicly available features.

The dataset for this project originates from the UCI Machine Learning Repository. The dataset was donated by Ron Kohavi and Barry Becker, after being published in the article "Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid". You can find the article by Ron Kohavi online. The data we investigate here is the original dataset with small changes, such as removing the 'fnlwgt' feature and records with missing or ill-formatted entries.
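
The clean-up described above can be illustrated with a minimal sketch in plain Python. It assumes that missing entries appear as an empty string or '?', and the sample rows and the column names other than 'fnlwgt' are made up for the example.

```python
# Assumed markers for a missing entry; adjust to match the actual file.
MISSING_MARKERS = {"", "?"}


def clean_records(records, drop_features=("fnlwgt",)):
    """Drop the given features, then drop any record with a missing entry."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in drop_features}
        if any(v is None or str(v).strip() in MISSING_MARKERS for v in row.values()):
            continue
        cleaned.append(row)
    return cleaned


# Hypothetical census-style rows, for illustration only.
raw = [
    {"age": 39, "workclass": "State-gov", "fnlwgt": 77516, "income": "<=50K"},
    {"age": 54, "workclass": "?", "fnlwgt": 180211, "income": ">50K"},
    {"age": 52, "workclass": "Self-emp", "fnlwgt": 209642, "income": ">50K"},
]
cleaned = clean_records(raw)

# Encode the target: 1 if the individual makes more than $50,000, else 0.
labels = [1 if row["income"] == ">50K" else 0 for row in cleaned]
print(cleaned, labels)
```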

More Details: Finding Donors for Charity ML


Tic-Tac-Toe Game

Last Updated on May 3, 2021


This project is a Tic Tac Toe game played against a simple artificial intelligence. An artificial intelligence (or AI) is a computer program that can intelligently respond to the player’s moves. This game doesn’t introduce any complicated new concepts. The artificial intelligence that plays Tic Tac Toe is really just a few lines of code.

Two people play Tic Tac Toe with paper and pencil. One player is X and the other player is O. Players take turns placing their X or O. If a player gets three of their marks on the board in a row, column or one of the two diagonals, they win. When the board fills up with neither player winning, the game ends in a draw.

This project doesn’t introduce many new programming concepts. It makes use of our existing programming knowledge to make an intelligent Tic Tac Toe player. The player makes their move by entering the number of the space they want to move to. These numbers are in the same places as the number keys on your keyboard's keypad.

First, you must figure out how to represent the board as data in a variable. On paper, the Tic Tac Toe board is drawn as a pair of horizontal lines and a pair of vertical lines, with either an X, O, or empty space in each of the nine spaces.

In the program, the Tic Tac Toe board is represented as a list of strings. Each string will represent one of the nine spaces on the board. To make it easier to remember which index in the list is for which space, they will mirror the numbers on a keyboard’s number keypad.

The strings will either be 'X' for the X player, 'O' for the O player, or a single space ' ' for a blank space.

So if a list with ten strings was stored in a variable named board, then board[7] would be the top-left space on the board. board[5] would be the center. board[4] would be the left side space, and so on. The program will ignore the string at index 0 in the list. The player will enter a number from 1 to 9 to tell the game which space they want to move on.
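
A minimal sketch of this board representation is shown below. The function names `new_board` and `render_board` are chosen for this example and are not taken from the project.

```python
def new_board():
    """Return ten strings; index 0 is unused so indexes 1-9 match the keypad."""
    return [" "] * 10


def render_board(board):
    """Return the board as text, with 7-8-9 on the top row like a number keypad."""
    rows = [(7, 8, 9), (4, 5, 6), (1, 2, 3)]
    lines = [" | ".join(board[i] for i in row) for row in rows]
    return "\n---------\n".join(lines)


board = new_board()
board[7] = "X"  # top-left space
board[5] = "O"  # center space
print(render_board(board))
```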

Creating a program that can play a game comes down to carefully considering all the possible situations the AI can be in and how it should respond in each of those situations. The Tic Tac Toe AI is simple because there are not many possible moves in Tic Tac Toe compared to a game like chess or checkers.

Our AI first checks whether any possible move would let it win. If not, it checks whether it must block a winning move by the player. Otherwise, the AI simply chooses any available corner space, then the center space, then the side spaces. This is a simple algorithm for the computer to follow.

The key to implementing our AI is making copies of the board data and simulating moves on the copy. That way, the AI code can see whether a move results in a win or a loss. Then the AI can make that move on the real board. This type of simulation is effective at predicting whether a move is good or not.
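
The move-selection order and the copy-and-simulate idea described above can be sketched as follows. This is one possible implementation under the keypad board layout described earlier; the function names are assumptions made for the example.

```python
import random

# All eight winning lines, using keypad numbering (7-8-9 is the top row).
WIN_LINES = [(7, 8, 9), (4, 5, 6), (1, 2, 3),
             (7, 4, 1), (8, 5, 2), (9, 6, 3),
             (7, 5, 3), (9, 5, 1)]


def is_winner(board, mark):
    """Return True if the mark fills any row, column, or diagonal."""
    return any(all(board[i] == mark for i in line) for line in WIN_LINES)


def free_spaces(board):
    return [i for i in range(1, 10) if board[i] == " "]


def winning_move(board, mark):
    """Simulate each free move on a copy; return a move that wins, else None."""
    for move in free_spaces(board):
        trial = board.copy()  # never touch the real board while simulating
        trial[move] = mark
        if is_winner(trial, mark):
            return move
    return None


def choose_move(board, ai_mark, player_mark):
    """Pick a move: win if possible, else block, else corner, center, side."""
    for mark in (ai_mark, player_mark):
        move = winning_move(board, mark)
        if move is not None:
            return move
    for group in ((1, 3, 7, 9), (5,), (2, 4, 6, 8)):
        options = [i for i in group if board[i] == " "]
        if options:
            return random.choice(options)
    return None  # the board is full


# Hypothetical position: the player (X) threatens the top row.
board = [" "] * 10
board[7] = board[8] = "X"
board[5] = "O"
print(choose_move(board, "O", "X"))
```

Because the simulation runs on a copy, the real board changes only when the game applies the chosen move.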

More Details: Tic-Tac-Toe Game


Air Quality Analysis And Prediction Of Italian City

Last Updated on May 3, 2021


Problem statement

  • Predict: the value of CO in mg/m^3 (reference value) from the available data. Make assumptions if you need to, but state them.
  • Forecast: the value of CO in mg/m^3 for the next 3 weeks, as hourly averaged concentrations.

Data Set Information

The sensor device was located in the field in a significantly polluted area, at road level, within an Italian city. Data were recorded from March 2004 to February 2005 (one year), representing the longest freely available recordings of field-deployed air quality chemical sensor device responses. Ground-truth hourly averaged concentrations for CO, Non-Methane Hydrocarbons, Benzene, Total Nitrogen Oxides (NOx), and Nitrogen Dioxide (NO2) were provided by a co-located certified reference analyzer. Evidence of cross-sensitivities, as well as both concept and sensor drift, is present, as described in De Vito et al., Sens. and Act. B, Vol. 129, 2, 2008 (citation required), possibly affecting the sensors' concentration estimates.

Data collection

0 Date (DD/MM/YYYY)

1 Time (HH.MM.SS)

2 True hourly averaged concentration CO in mg/m^3 (reference analyzer)

3 PT08.S1 (tin oxide) hourly averaged sensor response (nominally CO targeted)

4 True hourly averaged overall Non-Methane Hydrocarbons concentration in microg/m^3 (reference analyzer)

5 True hourly averaged Benzene concentration in microg/m^3 (reference analyzer)

6 PT08.S2 (titania) hourly averaged sensor response (nominally NMHC targeted)

7 True hourly averaged NOx concentration in ppb (reference analyzer)

8 PT08.S3 (tungsten oxide) hourly averaged sensor response (nominally NOx targeted)

9 True hourly averaged NO2 concentration in microg/m^3 (reference analyzer)

10 PT08.S4 (tungsten oxide) hourly averaged sensor response (nominally NO2 targeted)

11 PT08.S5 (indium oxide) hourly averaged sensor response (nominally O3 targeted)

12 Temperature in °C

13 Relative Humidity (%)

14 AH Absolute Humidity.
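
Because the date and time are stored in separate columns with the formats shown above, a first step is usually to combine them into one timestamp. The sketch below does this and also builds the hourly timestamps for a 3-week forecast horizon. The function names and the sample values are assumptions made for the example.

```python
from datetime import datetime, timedelta


def parse_timestamp(date_str, time_str):
    """Combine a DD/MM/YYYY date and an HH.MM.SS time into one datetime."""
    return datetime.strptime(f"{date_str} {time_str}", "%d/%m/%Y %H.%M.%S")


def forecast_hours(last_timestamp, weeks=3):
    """Return the hourly timestamps that follow last_timestamp for the given weeks."""
    hours = weeks * 7 * 24
    return [last_timestamp + timedelta(hours=h) for h in range(1, hours + 1)]


# Hypothetical row values, for illustration only.
ts = parse_timestamp("10/03/2004", "18.00.00")
horizon = forecast_hours(ts)
print(ts, len(horizon))
```

A 3-week horizon at hourly resolution corresponds to 3 × 7 × 24 = 504 forecast steps.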

More Details: Air quality analysis and Prediction of Italian city



Bank Loan Default Case

Last Updated on May 3, 2021


The objective of this problem is to predict whether a person is ‘Defaulted’ or ‘Not Defaulted’ on the basis of the given 8 predictor variables.

The data consists of 8 Independent Variables and 1 dependent variable. The variables are:

I. Age: a continuous variable. This feature depicts the age of the person.
II. Ed: a categorical variable. This feature has the education category of the person converted to numerical form.
III. Employ: a categorical variable. This feature contains information about the geographic location of the person. This column has also been converted to numeric values.
IV. Income: a continuous variable. This feature contains the gross income of each person.
V. DebtInc: a continuous variable. This feature gives the ratio of an individual’s debt to his or her gross income.
VI. Creddebt: a continuous variable. This feature gives the debt-to-credit ratio, a measure of how much a person owes their creditors as a percentage of their available credit.
VII. Othdebt: a continuous variable. It gives any other debt a person owes.
VIII. Default: a categorical variable and the dependent variable. It tells whether a person is a Default (1) or Not-Default (0).

After extensive exploratory data analysis, the data is given to multiple models (Logistic Regression, Decision Tree classifier, Random Forest classifier, KNN, and Gradient Boosting classifier), each with and without hyperparameter tuning. The final results are then compared on metrics such as precision, recall, and AUC-ROC.
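
The comparison metrics named above can be computed as in the following minimal sketch. It uses plain Python so that it runs without dependencies, and it computes AUC-ROC as the fraction of (defaulter, non-defaulter) pairs that the model ranks correctly, with ties counted as half. The labels, scores, and the 0.5 decision threshold are made up for the example.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means Default."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """Return AUC-ROC via pairwise ranking of positive against negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted default probabilities.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, auc)
```

Precision and recall depend on the chosen threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported together.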

More Details: Bank_Loan_Default_Case
