Opinion Mining On Twitter Dataset
Last Updated on May 3, 2021
Opinion mining is the task of extracting the opinion a sentence expresses about a subject. In this project, I collected data from Twitter and proposed a methodology that classifies tweets into three classes, "Offensive language", "Hate speech" and "Neutral speech", using various machine learning algorithms. I have also briefly analysed the results.
Truth-Sockets
Last Updated on May 3, 2021
A website where you can play a game of truth with others in sync.
In the game, you can create a room or join an existing one. Once you have joined a room, you are asked to add questions that you might ask someone in the game. Once everyone has entered their questions and is ready to play, you click to begin the game. This starts a 10-second countdown, after which all the questions entered by the users in the room are laid out randomly on cards that are face down, so you can't see the questions.
The game then chooses a random person to flip a card and answer the question on it. The card is flipped in sync for all the users in the room.
The project is built in a Node.js environment and uses Socket.io to communicate with servers and across all the users in a room. When a user selects a card, a message regarding that particular card is sent to the server which in turn broadcasts the message to all the clients in a room and thus allowing the game to be played in sync. The server maintains the status of all the rooms and their current state.
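The server-side bookkeeping described above can be sketched roughly as follows. This is written in Python rather than the project's Node.js, and plain in-memory lists stand in for Socket.io emits. The Room class, its method names, and the message fields are all hypothetical and are not taken from the project's code.

```python
import random


class Room:
    """In-memory state for one game room (hypothetical structure)."""

    def __init__(self, name):
        self.name = name
        self.inboxes = {}     # player name -> messages "delivered" to that client
        self.questions = []   # questions submitted before the game starts
        self.cards = []       # shuffled questions, one per face-down card
        self.flipped = set()  # indices of cards already turned over

    def join(self, player):
        self.inboxes[player] = []

    def add_question(self, text):
        self.questions.append(text)

    def broadcast(self, message):
        # Stand-in for a Socket.io room emit: every client gets the same message.
        for inbox in self.inboxes.values():
            inbox.append(message)

    def begin(self, rng=random):
        # Lay the questions out in random order and pick who flips first.
        self.cards = list(self.questions)
        rng.shuffle(self.cards)
        chosen = rng.choice(sorted(self.inboxes))
        self.broadcast({"event": "begin", "cards": len(self.cards), "turn": chosen})
        return chosen

    def flip(self, index):
        # The server records the flip, then tells every client to show that card.
        if index in self.flipped:
            return None
        self.flipped.add(index)
        message = {"event": "flip", "card": index, "question": self.cards[index]}
        self.broadcast(message)
        return message


room = Room("demo")
for player in ("ana", "ben"):
    room.join(player)
room.add_question("What is your biggest fear?")
room.add_question("Who was your first crush?")
room.begin(random.Random(0))
flipped = room.flip(0)
```

Because every state change goes through the server and is then broadcast, all clients in the room see the same sequence of messages, which is what keeps the cards in sync.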
To run it, clone the repo, move into the project folder, and run the command node server.js
Smart Fridge
Last Updated on May 3, 2021
A Smart Fridge that uses Computer Vision to log food, keeps users updated by SMS, and provides recommendations.
We saw the brand new Samsung Family Hub smart fridge at CES 2017, which requires manual data entry for the goods stored inside. We were inspired to create a smart fridge that can automatically log what's inside, let users access the data remotely, and recommend information to users based on what they have in the fridge.
What it does
This is an IoT-based smart fridge that uses Computer Vision to automatically log food, informs users through text messages of what's stored inside and of expiration dates, and recommends healthier and better use of the user's current storage through features such as checking nutrition and searching for recipes related to some items.
How we built it
We used a button on an Arduino board to emulate the action of “closing the fridge door”. The signal created by the button is sent to a PC through a serial COM port. When the PC receives that signal, the Kinect camera is triggered to capture a photo of the current contents of the fridge. The photo is then compressed and sent to our web server.
Our web server is written in Python with Flask and deployed on Google App Engine Flexible Environment. It also contains some logic for responding to Twilio messages, which is described later. When the web server receives the photo, it puts the photo in Google Cloud Storage and keeps some basic image metadata in a Google Cloud Datastore database.
Then the Google Cloud Vision API is called to analyze the photo and label it by what the item is and which category it belongs to. The labels returned by the Cloud Vision API are passed to the Google KnowledgeGraph API to be narrowed down to things people would normally put in a fridge. The results from Google KnowledgeGraph are then stored in the Google Cloud Datastore database. In this way the fridge identifies the items put in it by automatically capturing and analyzing photos.
Every time new items are added to the fridge, Twilio sends an SMS notification to inform the user. Users can also text Twilio some basic commands to:
- Check what is currently in the fridge
- Check which item is about to pass its expiration date
- Check the nutrition of the food stored
- Search for recipes related to some items
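As a minimal sketch of how such text commands might be routed, the function below handles the first two commands in the list. The command words ("list", "expiring"), the two-day warning window, and the shape of the inventory are assumptions made for illustration; the source does not say how the Flask/Twilio handler was written or how expiration dates were obtained.

```python
from datetime import date, timedelta


def handle_command(text, inventory, today):
    """Return the reply for one incoming SMS.

    inventory maps an item name to its expiration date. The command words
    and the two-day window are invented for this sketch.
    """
    command = text.strip().lower()
    if command == "list":
        return ", ".join(sorted(inventory)) or "The fridge is empty."
    if command == "expiring":
        soon = today + timedelta(days=2)
        items = sorted(name for name, expires in inventory.items() if expires <= soon)
        return ", ".join(items) or "Nothing is about to expire."
    return "Unknown command. Try: list, expiring."


# Hypothetical fridge contents.
inventory = {"milk": date(2017, 1, 10), "eggs": date(2017, 1, 20)}
reply = handle_command(" LIST ", inventory, today=date(2017, 1, 9))
```

In a real deployment the returned string would be sent back through Twilio; the nutrition and recipe commands would need external data sources and are left out here.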
Challenges we ran into
1) Capturing the Kinect photo with the least noise and incorporating an Arduino-based trigger for the photo
2) Integrating the local image capture, Python web server, Google Cloud Platform, and Twilio so that they work together flawlessly. Specifically, the challenges include the following:
- Image format conversion
- Image compression and processing
- Handling HTTP POST/GET requests between the local machine and the web server for images, and between the web server and Twilio for sending and receiving texts
- Creating an appropriate database structure to store images and item labels
3) At first, it was really hard to pick the right label from the roughly 10 labels returned by the Cloud Vision API. We used KnowledgeGraph first to narrow the list down to 3-5 labels, and then manually processed them according to how “general” or “specific” they are.
4) There were some misleading parts in the Python documentation of the Cloud Vision API. The URI stated in the documentation is not in the format required by the actual function. We finally figured it out by looking at the C# version of that documentation.
Accomplishments that we're proud of
We finished it early enough to write this :p
What we learned
We learned a great deal about both technical and non-technical matters over the course of development.
What's next for Smart Fridge
Computer Vision System
- Better recognition of photos containing multiple items of different categories
- More accurate and systematic labeling of new items
Data log-in/Request methods
- Use speech recognition to log in data, complementary to Computer Vision
- A smarter Twilio assistant capable of natural language processing
Data Utilization Features
- Automatically refill necessities through Google Express
Logistic Regression
Last Updated on May 3, 2021
Problem Statement :
- X Education sells online courses to industry professionals. The company markets its courses on several websites and search engines like Google.
- Once these people land on the website, they might browse the courses, fill out a form for a course, or watch some videos. When they fill out a form providing their email address or phone number, they are classified as a lead. The company also gets leads through past referrals.
- Once these leads are acquired, employees from the sales team start making calls, writing emails, etc. Through this process, some of the leads get converted while most do not. The typical lead conversion rate at X Education is around 30%.
- X Education needs help in selecting the most promising leads, i.e. the leads that are most likely to convert into paying customers.
- The company needs a model in which a lead score is assigned to each lead, such that customers with a higher lead score have a higher chance of conversion and customers with a lower lead score have a lower chance of conversion.
- The CEO, in particular, has given a ballpark of the target lead conversion rate to be around 80%.
- Source the data for analysis
- Clean and prepare the data
- Exploratory Data Analysis.
- Feature scaling and splitting the data into train and test datasets.
- Building a logistic regression model and calculating the lead score.
- Evaluating the model by using different metrics - Specificity and Sensitivity or Precision and Recall.
- Applying the best model in Test data based on the Sensitivity and Specificity Metrics.
- Designed a logistic regression model and calculated the lead score.
- Predicted the leads with an accuracy of 80% and found the important features responsible for a good conversion rate, i.e. the ones that contribute most to the probability of a lead getting converted.
- Prepared a PowerPoint presentation with clear visualizations for clients and managers.
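To make the scoring and evaluation steps above concrete, here is a small dependency-free sketch. It assumes the lead score is the predicted conversion probability scaled to 0-100, which the source does not state; the weights, bias, and labels below are made-up numbers, not results from the project.

```python
import math


def lead_score(features, weights, bias):
    """Logistic-regression probability of conversion, scaled to 0-100.

    The 0-100 scale is an assumption made for this sketch.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return round(100 * probability)


def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels, where 1 = converted."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up inputs for illustration only.
score = lead_score(features=[0.0, 0.0], weights=[1.2, -0.7], bias=0.0)
metrics = sensitivity_specificity(actual=[1, 1, 0, 0], predicted=[1, 0, 0, 0])
```

Sensitivity is the share of actual conversions the model catches, and specificity is the share of non-conversions it correctly rejects. Moving the probability cut-off trades one against the other.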
Comcast Telecom Consumer Complaints
Last Updated on May 3, 2021
Comcast is an American global telecommunication company. The firm has been providing terrible customer service and continues to fall short despite repeated promises to improve. Only last month (October 2016), the authority fined them $2.3 million after receiving over 1,000 consumer complaints.
The existing database will serve as a repository of public customer complaints filed against Comcast.
It will help to pin down what is wrong with Comcast's customer service.
- Ticket #: Ticket number assigned to each complaint
- Customer Complaint: Description of complaint
- Date: Date of complaint
- Time: Time of complaint
- Received Via: Mode of communication of the complaint
- City: Customer city
- State: Customer state
- Zipcode: Customer zip
- Status: Status of complaint
- Filing on behalf of someone
To perform these tasks, you can use any of the different Python libraries such as NumPy, SciPy, Pandas, scikit-learn, matplotlib, and BeautifulSoup.
- Import data into Python environment.
- Provide the trend chart for the number of complaints at monthly and daily granularity levels.
- Provide a table with the frequency of complaint types.
- Identify which complaint types are the most frequent, i.e., whether they concern the internet, network issues, or other domains.
- Create a new categorical variable with the values Open and Closed. Open and Pending are to be categorized as Open; Closed and Solved are to be categorized as Closed.
- Provide the state-wise status of complaints in a stacked bar chart. Use the categorized variable from Q3. Provide insights on:
- Which state has the maximum complaints
- Which state has the highest percentage of unresolved complaints
- Provide the percentage of complaints resolved to date that were received through the Internet and customer care calls.
The analysis results are to be provided with insights wherever applicable.
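The status recoding and the per-state unresolved percentage from the tasks above can be sketched as follows. Pandas would be the natural tool given the library list, but this version uses only the standard library to stay self-contained. The sample rows are invented, and only the column names State and Status come from the data dictionary.

```python
from collections import Counter

# Mapping described in the task: Open/Pending -> Open, Closed/Solved -> Closed.
STATUS_GROUP = {"Open": "Open", "Pending": "Open", "Closed": "Closed", "Solved": "Closed"}


def unresolved_share_by_state(rows):
    """Return {state: fraction of complaints whose grouped status is Open}."""
    totals = Counter()
    unresolved = Counter()
    for row in rows:
        totals[row["State"]] += 1
        if STATUS_GROUP[row["Status"]] == "Open":
            unresolved[row["State"]] += 1
    return {state: unresolved[state] / totals[state] for state in totals}


# Invented sample rows, for illustration only.
rows = [
    {"State": "Georgia", "Status": "Open"},
    {"State": "Georgia", "Status": "Solved"},
    {"State": "Texas", "Status": "Pending"},
    {"State": "Texas", "Status": "Pending"},
]
shares = unresolved_share_by_state(rows)
worst_state = max(shares, key=shares.get)
```

The same per-state counts of Open and Closed complaints are what a stacked bar chart would plot.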
Dog And Cat Image Classification
Last Updated on May 3, 2021
Dog and cat image classification
The project classifies an image as either a dog or a cat. The model is a Convolutional Neural Network (CNN), a deep learning architecture widely used for image recognition and classification. The project was developed in Python, an interpreted, high-level, general-purpose programming language, using Jupyter Notebook.
Libraries and Functions used-
Various Python libraries, classes and functions were used while developing the ML model:
1. tensorflow- A library for building and training neural networks
2. load_model- A Keras function that loads a saved model and reconstructs it identically
3. tkinter- A Python GUI toolkit
4. PIL- The Python Imaging Library, which supports operations on images
5. filedialog- A tkinter module used for selecting a file or directory
6. playsound- A module used for playing audio
7. ImageDataGenerator- A class of the Keras library used for real-time data augmentation
8. flow_from_directory- A method of ImageDataGenerator that reads images from a directory and yields augmented batches
9. keras preprocessing- The data preprocessing module of Keras, which provides utilities for working with image data
10. load_img- Loads an image in PIL format
11. img_to_array- Converts the image into a NumPy array
12. expand_dims- Adds an extra dimension; with axis=0 it turns a single image into a batch of one
In this neural network 2 activation functions were used-
The methods followed were:
1. Pre-processing of data
1.1 Training data
1.2 Testing data
2. Building CNN
2.1 Adding the first convolution layer
2.3 Adding the second convolution layer
2.5 Full connection
2.6 Output layer
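For readers unfamiliar with what a convolution layer computes, the sketch below applies one small filter to a tiny image in plain Python. It illustrates the operation only: the project itself used TensorFlow/Keras layers, and the image and kernel values here are invented.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding and sum the products.

    This is the sliding-window operation (strictly, cross-correlation) that a
    convolution layer applies with each of its learned filters.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kernel_h)
                for j in range(kernel_w)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A tiny image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A 1x2 kernel that responds where brightness increases from left to right.
edge_kernel = [[-1, 1]]
feature_map = convolve2d(image, edge_kernel)
```

The resulting feature map is non-zero only at the column where the image changes from dark to bright. In a trained CNN the kernel values are learned rather than hand-picked.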
The accuracy at the last (50th) epoch was 97%.
This function loads the ML model, takes the image selected by the user, and pre-processes it. The pre-processed image is then passed to the ML model, which returns the prediction. As output, the code plays a sound corresponding to the prediction.
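A rough sketch of that prediction step, using the functions named in the library list, might look like this. Several details are assumptions because the source does not state them: the 64x64 input size, a single output score where 1 means dog and 0 means cat, the 0.5 threshold, and the file paths passed in by the caller. The sound playback and the tkinter file dialog are omitted.

```python
def label_from_score(score, threshold=0.5):
    """Map a single score in [0, 1] to a class name.

    Assumption: class index 1 means "dog" and 0 means "cat".
    """
    return "dog" if score >= threshold else "cat"


def predict_image(model_path, image_path, size=(64, 64), scale=1.0):
    """Load the saved model, pre-process one image, and return the label.

    size and scale are assumptions; they must match whatever was used in training.
    """
    # Imported here so label_from_score works without TensorFlow installed.
    import numpy as np
    from tensorflow.keras.models import load_model
    from tensorflow.keras.preprocessing.image import img_to_array, load_img

    model = load_model(model_path)
    image = load_img(image_path, target_size=size)  # PIL image resized to the model input
    batch = np.expand_dims(img_to_array(image) * scale, axis=0)  # shape (1, height, width, 3)
    score = float(model.predict(batch)[0][0])
    return label_from_score(score)
```

A call such as predict_image("model.h5", "photo.jpg") would return "dog" or "cat"; both file names here are placeholders.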
The final page asks the user to select an image from the local computer. The tab’s name is ‘Image Classifier’.
Once the user selects the image, the model predicts whether it shows a dog or a cat and plays a sound announcing the prediction.