Dice Simulator
Last Updated on May 3, 2021
Python offers various packages for designing a GUI (Graphical User Interface). Tkinter is the most common of them: it is fast, easy to use, and provides a powerful object-oriented interface for building GUI applications. Also, once you develop an application, you can use it on any platform, which reduces the amendments required to run the app on Windows, Mac, or Linux.
A dice is a simple cube with the numbers 1 to 6 written on its faces. Simulation is the making of a computer model. Thus, a dice simulator is a simple computer model that can roll a dice for us.
The first step is importing the required modules: Tkinter, which is used to make GUI applications, and the random module, which is used to generate random numbers.
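As a minimal sketch of this step, the random module can already produce a die roll on its own. The helper name roll_die is an assumption made for illustration and is not taken from the original program; the Tkinter import is shown only as a comment so the snippet also runs on a machine without a display.

```python
import random
# import tkinter  # GUI toolkit, needed once the window is built


def roll_die(sides=6):
    """Return a random integer from 1 to `sides`, inclusive on both ends."""
    return random.randint(1, sides)
```

Calling roll_die() returns one of the six face values each time.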
The next step is building a top-level widget to make the main window of our application, where the buttons, labels, and images will reside. We also give it a title with the title() function.
The third step is designing the buttons:
Here, we use pack() to arrange our widgets; by default it stacks them one below another in the order they are packed. The ‘BlankLine’ label is used to skip a line, whereas the ‘HeadingLabel’ label gives the window a heading.
The ‘rolling_dice’ function is executed every time the button is clicked. This is achieved through the ‘command=rolling_dice’ parameter passed when defining the button.
Then ‘root.mainloop()’ starts the event loop, which keeps the main window open and responding to clicks. It acts as the main function of our program.
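Putting the steps together, the program might look roughly like the sketch below. The names root, BlankLine, HeadingLabel and rolling_dice come from the description above; the window title, button text, fonts and the result_label widget are assumptions. Here rolling_dice takes the label to update as an argument, so the button uses a small lambda instead of a bare command=rolling_dice. The GUI is wrapped in main(), and tkinter is imported inside it, so that the file can be loaded without opening a window.

```python
import random


def rolling_dice(label):
    """Roll one die and display the result on the given label widget."""
    value = random.randint(1, 6)
    label.config(text=str(value))
    return value


def main():
    import tkinter as tk  # imported here so loading the file needs no display

    root = tk.Tk()                       # top-level widget: the main window
    root.title("Dice Roller")            # assumed window title

    BlankLine = tk.Label(root, text="")  # empty label used to skip a line
    BlankLine.pack()

    HeadingLabel = tk.Label(root, text="Dice Simulator",
                            font=("Helvetica", 16, "bold"))
    HeadingLabel.pack()

    result_label = tk.Label(root, text="", font=("Helvetica", 48))
    result_label.pack()

    # The button calls rolling_dice every time it is clicked.
    button = tk.Button(root, text="Roll the Dice",
                       command=lambda: rolling_dice(result_label))
    button.pack()

    root.mainloop()                      # start the event loop
```

Calling main() opens the window; each click on the button shows a new number.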
We have successfully developed a cool application – Dice Rolling Simulator in Python. Now, you can just click on a button and get your next number.
Sense+
Last Updated on May 3, 2021
Sense+ makes the approach to helping those in need proactive compared to the traditional reactive approach. It utilises speech, facial recognition and other technologies to infer emotions of users.
The global pandemic has revealed the growing issue and importance of mental health, in particular one’s accessibility to mental health services and the detection of someone suffering from stress, anxiety or other mental health conditions.
We have personally seen that being mentally well allows us to work and study productively.
What worries us is the ongoing issue of those who are mentally unwell not approaching anyone because of the societal stigma around seeking treatment.
Our project/proof of concept aims to make the approach to helping those in need proactive, rather than waiting for individuals to come forward by themselves, all whilst helping to reduce the stigma associated with suffering from mental health issues.
What it does
Our program integrates voice and facial recognition to detect/infer an individual’s emotions.
The voice component uses sentiment analysis to detect keywords in an audio transcript. These keywords are categorised as neutral, positive or negative. Natural language processing and regular expressions are utilised to break audio transcripts down into multiple sentences/segments.
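The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists are hypothetical, the function names are invented here, and the project's actual sentiment analysis may work differently.

```python
import re

# Hypothetical keyword lists; a real system would use a much larger lexicon.
POSITIVE = {"happy", "great", "good", "calm"}
NEGATIVE = {"sad", "tired", "anxious", "stressed"}


def split_sentences(transcript):
    """Split a transcript into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [part for part in parts if part]


def classify_sentence(sentence):
    """Label a sentence by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyse_transcript(transcript):
    """Return a (sentence, label) pair for every sentence in the transcript."""
    return [(s, classify_sentence(s)) for s in split_sentences(transcript)]
```

For example, analyse_transcript("I feel anxious and tired. Today was great!") labels the first sentence as negative and the second as positive.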
The facial recognition component uses convolutional neural networks to pick up features of a person’s face in order to identify emotions. Videos are broken down into multiple frames, which are fed into the neural network to make the prediction.
This model is trained and validated using Facial Expression Recognition data from Kaggle (2013).
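The frame-by-frame flow might be sketched as follows. The CNN itself is left out; the sketch only shows how frames could be sampled from a video and how per-frame labels could be combined into one result. The sampling interval, the majority-vote aggregation and the function names are assumptions, while the seven labels are the emotion classes of the FER2013 dataset.

```python
from collections import Counter

# The seven emotion classes of the FER2013 dataset, in label order 0..6.
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def sample_frame_indices(total_frames, fps, every_seconds=1.0):
    """Return indices of frames spaced roughly `every_seconds` apart."""
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def aggregate_predictions(frame_labels):
    """Return the most common label and each label's share of the frames."""
    if not frame_labels:
        return None, {}
    counts = Counter(frame_labels)
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    top_label = counts.most_common(1)[0][0]
    return top_label, shares
```

In such a setup each sampled frame would be passed to the trained network, and the list of predicted labels would then go to aggregate_predictions.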
As of now we have nearly turned the above concept into an app that allows users to upload multiple videos, which are then analysed so that results/predictions about the emotional state of an individual are returned.
The implication is that it can help indicate whether the user should seek professional help, or at the very least make them aware of their current mental state.
How we built it
The frontend was developed using Java (Android Studio), whilst our backend was developed in Python with the help of packages such as TensorFlow, Keras and speech recognition. The frontend and backend communicate through the Amazon AWS platform. AWS Lambda is utilised so our code can be run serverless and asynchronously. S3 is employed as a bucket to which the frontend uploads videos so the backend can process them. Additionally, output from the backend is stored as JSON in S3 so the frontend can retrieve it for display purposes.
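A rough sketch of the backend side of this flow is shown below, using only the standard library. It assumes the Lambda function is triggered by an S3 upload notification; the JSON layout, the result key naming and the placeholder prediction are hypothetical, and the step that writes the result back to S3 (for example with an AWS SDK client) is deliberately left out.

```python
import json
from urllib.parse import unquote_plus


def parse_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        pairs.append((bucket, key))
    return pairs


def build_result(key, emotions):
    """Serialise a prediction as the JSON document the frontend would read."""
    return json.dumps({"video": key, "emotions": emotions}, sort_keys=True)


def lambda_handler(event, context):
    """Entry point in the shape AWS Lambda expects for Python functions."""
    results = []
    for bucket, key in parse_s3_event(event):
        emotions = {"neutral": 1.0}  # placeholder: the real model would run here
        results.append({
            "bucket": bucket,
            "result_key": key + ".json",  # hypothetical naming scheme
            "body": build_result(key, emotions),
        })
    return results
```

Keeping the event parsing and the JSON building in separate functions makes both easy to test locally before anything is deployed.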
Challenges we ran into
The main challenge we faced was learning how to make our frontend and backend communicate. Mentors from Telstra, Atlassian and Australia Post provided us with insights into solving this issue, though we did not quite get everything integrated into a single working piece of software.
Learning aspects of AWS was also challenging considering no one on our team had any prior experience.
On top of that, applying TensorFlow and Keras in a full project context was challenging because of the lack of resources (hardware), and training on the data was a time-consuming process.
Accomplishments that we're proud of
Despite not completing a functioning prototype at this point in time, we are proud that we delved into new software, tools and packages that we had no prior experience with and tried our best to utilise them. Finally, we are proud of how we conducted ourselves as a team, given the diverse range of skills and knowledge within it.
What we learned
First of all, communicating as a team is crucial. Main points include team ideation, being critical and delegating appropriately according to each team member’s strengths. Another point is learning to approach mentors or team members when you are struggling. Overcoming the stigma or anxiety of admitting to being ‘lost’ is an important lesson, and we found that when we overcame these barriers, we were able to progress.
What's next for Sense+
At the moment Sense+ remains at its core an idea, not necessarily a piece of deliverable software. In the future we seek to improve accuracy when analysing and detecting emotion. This includes, but isn’t limited to, more sophisticated sentiment analysis, improved modelling, and taking advantage of other biometrics that may come with the advance of technology, such as heartbeat detection.
In terms of reach and usage, one possible use is that companies could employ such software to monitor the well-being of employees. In the future the software could be more passive so that individuals can be monitored (with consent and confidentiality, of course) in a more natural manner. This could yield more accurate information on employee well-being than self-reports, where people may lie because of stigma and fear, and could in turn boost overall productivity and mental well-being within the company.
Other sectors in which this could be applied are hospitals and education.
Virtual Dental Clinic
Last Updated on May 3, 2021
This project has been ongoing under the guidance of Dr. Sateesh Kumar Peddoju, Department of Computer Science & Engineering, from November 2020 to the present. In this project we are creating a platform using Node.js where patients can consult dentists about their symptoms in a virtual environment, made available via both a web-based application and a mobile application compatible with Android and iOS devices. Patients will be able to connect easily with dentists for timely collaboration and consultation, according to their own time and space constraints. They can consult a dentist of their choice via audio/video streaming and text-based messaging, and can receive a diagnosis and prescription at a time and place more convenient to them. Patients will have to upload their current symptoms, and the dentists, on the other hand, will analyze the patient’s reports and prior records to write and upload the prescriptions. The application will also maintain patients’ records in a secure database for future reference. We specified the functional and non-functional requirements and the design for such an application, with emphasis on the efficiency, reliability, and security of the services provided by the application and of the data stored. The developed application will offer patients a quick, easy, and secure way of consulting a dentist of their choice.
Machine Learning Implementation On Crop Health Monitoring System.
Last Updated on May 3, 2021
The objective of our study is to provide a solution for Smart Agriculture by monitoring the agricultural field, which can assist farmers in increasing productivity to a great extent. Weather forecast data obtained from the IMD (India Meteorological Department), such as temperature and rainfall, together with a repository of soil parameters, gives insight into which crops are suitable for cultivation in a particular area. Thus, the proposed system takes the location of the user as an input. From the location, the soil moisture is obtained. The processing part also takes two more datasets into consideration: one obtained from the weather department, forecasting the weather expected in the current year, and the other being static data. This static data covers crop production and the demand for various crops, obtained from various government websites. The proposed system applies machine learning and prediction algorithms such as Decision Tree, Naive Bayes and Random Forest to identify patterns in the data and then process it according to the input conditions. This in turn will propose the most feasible crops for the given environmental conditions. Thus, the system requires only the location of the user and suggests a number of profitable crops, giving the farmer a direct choice about which crop to cultivate. As the previous year’s production is also taken into account, the prediction is expected to be more accurate.
Covid Tracker On Twitter Using Data Science And AI
Last Updated on May 3, 2021
Hi folks, I hope you are doing well in these difficult times! We are all going through the unprecedented time of the coronavirus pandemic. Some people lost their lives, but many of us successfully defeated this new strain, i.e. Covid-19. The outbreak was declared a pandemic by the World Health Organization on 11th March 2020. This article will analyze various types of “Tweets” gathered during pandemic times. The study can be helpful for different stakeholders.
For example, the Government can use this information in policymaking, as it can learn how people are reacting to this new strain and what challenges they are facing, such as food scarcity, panic attacks, etc. For-profit organizations can benefit from analyzing sentiments: for instance, one of the tweets tells us about the scarcity of masks and toilet paper, so these organizations could start producing such essential items and thereby make profits. Various NGOs can decide their strategy for rehabilitating people by using pertinent facts and information.
In this project, we are going to predict the sentiments of COVID-19 tweets. The data was gathered from Twitter, and I’m going to use a Python environment to implement this project.
The given challenge is to build a classification model to predict the sentiment of Covid-19 tweets. The tweets have been pulled from Twitter and manual tagging has been done. We are given information like Location, Tweet At, Original Tweet, and Sentiment.
Approach To Analyze Various Sentiments
Before we proceed further, one should know what is meant by Sentiment Analysis. Sentiment Analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer’s attitude towards a particular topic is Positive, Negative, or Neutral. (Oxford Dictionary)
The following is the standard operating procedure for tackling a sentiment analysis project. We will go through this procedure to predict what we are supposed to predict!
- Exploratory Data Analysis.
- Data Preprocessing.
- Classification Models.
Let’s Guess some tweets
I will read a tweet; can you tell me whether its sentiment is Positive, Negative, or Neutral? So the first tweet is “Still shocked by the number of #Toronto supermarket employees working without some sort of mask. We all know by now, employees can be asymptomatic while spreading #coronavirus”. What’s your guess? Yeah, you are correct. This is a Negative tweet because it contains negative words like “shocked”.
If you weren’t able to guess the above tweet, don’t worry, I have another tweet for you. Let’s guess this tweet: “Due to the Covid-19 situation, we have increased demand for all food products. The wait time may be longer for all online orders, particularly beef share and freezer packs. We thank you for your patience during this time”. This time you are absolutely correct in predicting this tweet as “Positive”. Words like “thank you” and “increased demand” are optimistic in nature, hence they put the tweet in the positive category.
The original dataset has 6 columns and 41157 rows. In order to analyze the sentiments, we require just two columns, named Original Tweet and Sentiment. There are five types of sentiment: Extremely Negative, Negative, Neutral, Positive, and Extremely Positive, as you can see in the following picture.
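Keeping only those two columns and counting the labels could be sketched as below with the standard csv module. The three rows are a made-up sample standing in for the real file, and the exact column spellings (for instance OriginalTweet and TweetAt) are assumptions about how the headers appear in the CSV.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample standing in for the real file; not actual dataset rows.
SAMPLE_CSV = """UserName,ScreenName,Location,TweetAt,OriginalTweet,Sentiment
1,101,London,16-03-2020,Shelves are empty again,Negative
2,102,,17-03-2020,Thank you to all store workers,Positive
3,103,Toronto,17-03-2020,Store opens at nine,Neutral
"""


def load_tweets(file_obj):
    """Keep only the tweet text and its sentiment label from each row."""
    reader = csv.DictReader(file_obj)
    return [(row["OriginalTweet"], row["Sentiment"]) for row in reader]


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
label_counts = Counter(label for _, label in tweets)
```

With the real file, load_tweets would be given an opened file object instead of the in-memory sample, and label_counts would show how many tweets fall under each of the five sentiments.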
Summary Of Dataset
Basic Exploratory Data Analysis
Columns such as “UserName” and “ScreenName” do not give any meaningful insights for our analysis, hence we are not using these features for model building. All the tweet data was collected in March and April 2020. The following bar plot shows the number of unique values in each column.
There are some null values in the location column, but we don’t need to deal with them as we are only going to use two columns, i.e. “Sentiment” and “Original Tweet”. The largest share of tweets came from London (11.7%), as is evident from the following figure.
Some words, like ‘coronavirus’ and ‘grocery store’, have the maximum frequency in our dataset, as we can see from the following word cloud. There are various #hashtags in the tweets column, but they are almost the same across all sentiments, hence they do not give us meaningful information.
Word cloud showing the words with the maximum frequency in our Tweet column
When we explore the ‘Sentiment’ column, we find that most people have positive sentiments about various issues, which shows their optimism during pandemic times. Very few people have extremely negative thoughts about Covid-19.
The preprocessing of the text data is an essential step as it makes the raw text ready for mining. The objective of this step is to clean out noise that is less relevant to the sentiment of a tweet, such as punctuation (., ?, ” etc.), special characters (@, %, &, $, etc.), numbers (1, 2, 3, etc.), Twitter handles, links (https: / http:), and terms that don’t carry much weight in the context of the text.
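A minimal cleaning function along these lines might look as follows. The exact regular expressions are an assumption, not the ones used in the project; note that this version keeps the word part of a hashtag and only drops the ‘#’ sign.

```python
import re


def clean_tweet(text):
    """Lower-case a tweet and strip links, handles, numbers and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # links
    text = re.sub(r"@\w+", " ", text)                    # Twitter handles
    text = re.sub(r"[^a-z\s]", " ", text)                # digits, punctuation, symbols
    return re.sub(r"\s+", " ", text).strip()             # collapse whitespace
```

The order matters here: links and handles are removed before the general character filter, otherwise their letters would survive as stray words.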
Also, we need to remove stop words from tweets. Stop words are those words in natural language that have very little meaning, such as “is”, “an”, “the”, etc. To remove stop words from a sentence, you can divide your text into words and then remove the word if it exists in the list of stop words provided by NLTK.
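The filtering step itself is short. In the sketch below a small hand-written set stands in for NLTK's English stop-word list, so the snippet runs without downloading any corpus; the set contents and the function name are assumptions.

```python
# A small hand-written subset; NLTK's English list is much longer.
STOP_WORDS = {"is", "an", "the", "a", "of", "to", "and", "in"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Split text on whitespace and drop any word found in stop_words."""
    return [word for word in text.split() if word.lower() not in stop_words]
```

To use the full NLTK list instead, the stop_words argument could be given set(stopwords.words("english")) from nltk.corpus, once the stopwords corpus has been downloaded.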
Then we need to normalize tweets by using stemming or lemmatization. “Stemming” is a rule-based process of stripping the suffixes (“ing”, “ly”, “es”, “ed”, “s” etc.) from a word. For example, “play”, “player”, “played”, “plays” and “playing” are different variations of the word “play”.
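To make the rule-based idea concrete, here is a deliberately naive suffix stripper. It is a toy illustration only, not the Porter algorithm or any stemmer shipped with NLTK, and the minimum stem length is an arbitrary assumption.

```python
SUFFIXES = ("ing", "ly", "es", "ed", "s")  # the suffixes named in the text


def naive_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word
```

With these five suffixes, “played”, “plays” and “playing” all reduce to “play”, while “player” is left unchanged because “er” is not in the list.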
Stemming does not always turn the original words into meaningful words. For example, “considered” gets stemmed into “consid”, which is not a real word and looks like a spelling mistake. A better way is to use lemmatization instead of stemming.
Lemmatization is a more powerful operation, and it takes into consideration the morphological analysis of the words. It returns the lemma, which is the base form of all the inflectional forms of a word.
Here, in the lemmatization process, we are converting the word “raising” to its basic form “raise”. We also need to convert all tweets to lower case before we do the normalization.
We can also include tokenization. In tokenization, we convert a group of sentences into tokens. It is also called text segmentation or lexical analysis. It basically splits the data into small chunks such as words. In Python, tokenization can be done with the NLTK library’s word_tokenize() function.
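As a rough stand-in for word_tokenize() that needs no extra downloads, a single regular expression can split text into words and punctuation marks. The pattern and function name are assumptions, and the output differs from NLTK's in details, for instance in how contractions are split.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)
```

Each word becomes one token and each punctuation mark becomes its own token, which is usually enough to feed the later counting steps.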
To turn the tokens into numeric features, we can use a count vectorizer or a TF-IDF vectorizer. A count vectorizer creates a sparse matrix of all words and the number of times each is present in a document.
TFIDF, short for term frequency-inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. The TF–IDF value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which helps to adjust for the fact that some words appear more frequently in general. (wiki)
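Both representations can be worked through by hand on a toy corpus. The sketch below uses the plain textbook formula, term frequency times log(N / document frequency); library implementations such as scikit-learn's TfidfVectorizer apply smoothing and normalisation by default, so their numbers will differ. The function names are assumptions.

```python
import math
from collections import Counter


def count_vectorize(docs):
    """Return the sorted vocabulary and one word-count row per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocab])
    return vocab, rows


def tf_idf(docs):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency)."""
    vocab, rows = count_vectorize(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    weighted = []
    for row in rows:
        length = sum(row) or 1  # avoid dividing by zero for an empty document
        weighted.append([(c / length) * math.log(n_docs / doc_freq[j])
                         for j, c in enumerate(row)])
    return vocab, weighted
```

For the two documents "panic buying masks" and "masks sold out", the word "masks" appears in both, so its TF-IDF weight is zero, while words unique to one document get a positive weight.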
Building Classification Models
The given problem is an ordinal multiclass classification problem. There are five types of sentiment, so we have to train our models to give the correct label for each tweet in the test dataset. I am going to build different models: Naive Bayes, Logistic Regression, Random Forest, XGBoost, Support Vector Machines, CatBoost, and Stochastic Gradient Descent.
I first treated the problem as multiclass classification, where the dependent variable takes the values Positive, Extremely Positive, Neutral, Negative, and Extremely Negative. I also converted it into binary classification, i.e. I clubbed all tweets into just two types, Positive and Negative. You can also go for three-class classification, i.e. Positive, Negative and Neutral, which may achieve greater accuracy than the five-class setup. In the evaluation phase, we will be comparing the results of these algorithms.
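The relabelling can be expressed as a simple mapping. The three-class version follows directly from the label names; for the binary version, the text does not say which side Neutral tweets were assigned to, so the sketch exposes that as a parameter rather than guessing. All names here are assumptions.

```python
THREE_CLASS = {
    "Extremely Positive": "Positive",
    "Positive": "Positive",
    "Neutral": "Neutral",
    "Negative": "Negative",
    "Extremely Negative": "Negative",
}


def to_three_class(label):
    """Merge the two 'Extremely' labels into their plain counterparts."""
    return THREE_CLASS[label]


def to_binary(label, neutral_as="Positive"):
    """Collapse to two classes; where Neutral goes is a modelling choice."""
    merged = THREE_CLASS[label]
    return neutral_as if merged == "Neutral" else merged
```

Applying one of these functions to the Sentiment column before training gives the three-class or binary target.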
Feature importance (variable importance) describes which features are relevant. It can help with a better understanding of the problem being solved and can sometimes lead to model improvements through feature selection. The top three important feature words are panic, crisis, and scam, as we can see from the following graph.
In this way, we can explore more from various textual data and tweets. Our models try to predict the various sentiments correctly. I trained several models on our dataset; some show greater accuracy than others. For multiclass classification, the best model for this dataset was CatBoost. For binary classification, the best model for this dataset was Stochastic Gradient Descent.