E-Commerce Website

Last Updated on May 3, 2021


An e-commerce website developed with the Django framework. The site has four modules. The first is the homepage, which shows a list of products. The second is a contact page. The third is a tracker page, which helps a user track an item.

More Details: E-commerce website


Image Captioning Bot Using Rnn And Cnn

Last Updated on May 3, 2021


What does an Image Captioning Problem entail?

Suppose you see this picture –

What is the first thing that comes to your mind? (PS: Let me know in the comments below!)

Here are a few sentences that people could come up with :

A man and a girl sit on the ground and eat .
A man and a little girl are sitting on a sidewalk near a blue bag eating .
A man wearing a black shirt and a little girl wearing an orange dress share a treat .

A quick glance is sufficient for you to understand and describe what is happening in the picture. Automatically generating this textual description from an artificial system is the task of image captioning.

The task sounds straightforward: the generated output should describe, in a single sentence, what is shown in the image, including the objects present, their properties, the actions being performed, and the interactions between the objects. But replicating this behaviour in an artificial system is a huge task, as with any other image processing problem, hence the use of complex and advanced techniques such as Deep Learning.

Before I go on, I want to give special thanks to Andrej Karpathy et al., who helped me understand the topic with their insightful course, CS231n.


Methodology to Solve the Task

The task of image captioning can be divided into two modules logically – one is an image based model – which extracts the features and nuances out of our image, and the other is a language based model – which translates the features and objects given by our image based model to a natural sentence.

For our image based model (viz. the encoder), we usually rely on a Convolutional Neural Network. For our language based model (viz. the decoder), we rely on a Recurrent Neural Network. The image below summarizes the approach given above.

Usually, a pretrained CNN extracts the features from our input image. The feature vector is linearly transformed to have the same dimension as the input dimension of the RNN/LSTM network. This network is trained as a language model on our feature vector.

For training our LSTM model, we predefine our label and target text. For example, if the caption is “A man and a girl sit on the ground and eat.”, our label and target would be as follows –

Label – [ <start>, A, man, and, a, girl, sit, on, the, ground, and, eat, . ] 

Target – [ A, man, and, a, girl, sit, on, the, ground, and, eat, ., <end> ]

This is done so that our model understands the start and end of our labelled sequence.
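The shift between label and target can be sketched in a few lines of Python. This is only an illustration of the idea described above: the helper name `make_label_and_target` is hypothetical and is not part of the code used later in the walkthrough.

```python
# Hypothetical helper (not from the walkthrough code): build the decoder
# input ("label") and the target by shifting one tokenized caption.
START, END = "<start>", "<end>"


def make_label_and_target(caption_tokens):
    """Return (label, target) for one tokenized caption."""
    label = [START] + list(caption_tokens)   # what the LSTM reads at each step
    target = list(caption_tokens) + [END]    # what it should predict at each step
    return label, target


tokens = "A man and a girl sit on the ground and eat .".split()
label, target = make_label_and_target(tokens)

# At every position i the model sees label[i] and is trained to emit target[i].
for seen, expected in zip(label, target):
    print(f"{seen:>8} -> {expected}")
```

Both sequences have the same length, so the loss can be computed position by position.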



Walkthrough of Implementation

Let’s look at a simple implementation of image captioning in PyTorch. We will take an image as input, and predict its description using a Deep Learning model.

The code for this example can be found on GitHub. The original author of this code is Yunjey Choi. Hats off to his excellent examples in PyTorch!

In this walkthrough, a pre-trained ResNet-152 model is used as the encoder, and the decoder is an LSTM network.

To run the code given in this example, you have to install the prerequisites. Make sure you have a working Python environment, preferably with Anaconda installed. Then run the following commands to install the rest of the required libraries.

git clone https://github.com/pdollar/coco.git

cd coco/PythonAPI/
python setup.py build
python setup.py install

cd ../../

git clone https://github.com/yunjey/pytorch-tutorial.git
cd pytorch-tutorial/tutorials/03-advanced/image_captioning/

pip install -r requirements.txt

After you have set up your system, download the dataset required to train the model. Here we will be using the MS-COCO dataset. To download the dataset automatically, run the following commands:

chmod +x download.sh
./download.sh

Now you can go on and start your model building process. First – you have to process the input:

# Search for all the possible words in the dataset and 
# build a vocabulary list
python build_vocab.py   

# resize all the images to bring them to shape 224x224
python resize.py

Now you can start training your model by running the command below:

python train.py --num_epochs 10 --learning_rate 0.01

Just to peek under the hood and check out how we defined our model, you can refer to the code written in the model.py file.

import torch
import torch.nn as nn
import torchvision.models as models
from torch.nn.utils.rnn import pack_padded_sequence


class EncoderCNN(nn.Module):
    def __init__(self, embed_size):
        """Load the pretrained ResNet-152 and replace top fc layer."""
        super(EncoderCNN, self).__init__()
        resnet = models.resnet152(pretrained=True)
        modules = list(resnet.children())[:-1]      # delete the last fc layer.
        self.resnet = nn.Sequential(*modules)
        self.linear = nn.Linear(resnet.fc.in_features, embed_size)
        self.bn = nn.BatchNorm1d(embed_size, momentum=0.01)
        self.init_weights()                         # otherwise init_weights is never used

    def init_weights(self):
        """Initialize the weights."""
        self.linear.weight.data.normal_(0.0, 0.02)

    def forward(self, images):
        """Extract the image feature vectors."""
        with torch.no_grad():                       # keep the pretrained ResNet frozen
            features = self.resnet(images)
        features = features.view(features.size(0), -1)
        features = self.bn(self.linear(features))
        return features


class DecoderRNN(nn.Module):
    def __init__(self, embed_size, hidden_size, vocab_size, num_layers):
        """Set the hyper-parameters and build the layers."""
        super(DecoderRNN, self).__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, vocab_size)
        self.init_weights()                         # otherwise init_weights is never used

    def init_weights(self):
        """Initialize weights."""
        self.embed.weight.data.uniform_(-0.1, 0.1)
        self.linear.weight.data.uniform_(-0.1, 0.1)

    def forward(self, features, captions, lengths):
        """Decode image feature vectors and generates captions."""
        embeddings = self.embed(captions)
        # Prepend the image feature as the first input step of every sequence.
        embeddings = torch.cat((features.unsqueeze(1), embeddings), 1)
        # lengths must be sorted in decreasing order (default enforce_sorted=True).
        packed = pack_padded_sequence(embeddings, lengths, batch_first=True)
        hiddens, _ = self.lstm(packed)
        outputs = self.linear(hiddens[0])           # hiddens[0] is the packed data tensor
        return outputs

    def sample(self, features, states=None):
        """Samples captions for given image features (Greedy search)."""
        sampled_ids = []
        inputs = features.unsqueeze(1)
        for i in range(20):                                    # maximum sampling length
            hiddens, states = self.lstm(inputs, states)        # (batch_size, 1, hidden_size)
            outputs = self.linear(hiddens.squeeze(1))          # (batch_size, vocab_size)
            predicted = outputs.max(1)[1]                      # (batch_size)
            sampled_ids.append(predicted)                      # keep the chosen word id
            inputs = self.embed(predicted)
            inputs = inputs.unsqueeze(1)                       # (batch_size, 1, embed_size)
        sampled_ids = torch.stack(sampled_ids, 1)              # (batch_size, 20)
        return sampled_ids.squeeze()

Now we can test our model using:

python sample.py --image='png/example.png'

For our example image, our model gives us this output:

<start> a group of giraffes standing in a grassy area . <end>

And that’s how you build a Deep Learning model for image captioning!



The model we saw above is just the tip of the iceberg. A lot of research has been done on this topic. At the time of writing, the state-of-the-art model in image captioning is Microsoft’s CaptionBot. You can look at a demo of the system on their official website (www.captionbot.ai).

I will list down a few ideas which you can use to build a better image captioning model.


More Details: Image Captioning Bot using RNN and CNN



Bank Loan Default Case

Last Updated on May 3, 2021


The objective of this problem is to predict whether a person is ‘Defaulted’ or ‘Not Defaulted’ on the basis of the 8 given predictor variables.

The data consists of 8 independent variables and 1 dependent variable. The variables are:

I. Age: a continuous variable. This feature gives the age of the person.
II. Ed: a categorical variable. This feature holds the education category of the person, converted to numerical form.
III. Employ: a categorical variable. This feature contains information about the geographic location of the person. This column has also been converted to numeric values.
IV. Income: a continuous variable. This feature contains the gross income of each person.
V. DebtInc: a continuous variable. This feature gives the ratio of an individual’s debt to their gross income.
VI. Creddebt: a continuous variable. This feature gives the debt-to-credit ratio, a measurement of how much a person owes their creditors as a percentage of their available credit.
VII. Othdebt: a continuous variable. It covers any other debt a person owes.
VIII. Default: the dependent variable, a categorical variable. It tells whether a person is a Default (1) or Not-Default (0).

After extensive exploratory data analysis, the data is fed to multiple models (Logistic Regression, Decision Tree classifier, Random Forest classifier, KNN, and Gradient Boosting classifier), each with and without hyperparameter tuning. The final results are then compared on metrics such as precision, recall, and AUC-ROC score.
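To make the comparison metrics concrete, here is a minimal pure-Python sketch of precision, recall, and AUC-ROC. The labels and scores are made-up illustration values, not results from this project, and the function names are hypothetical; in practice a library such as scikit-learn would normally be used.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = Default, 0 = Not-Default)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: true labels and predicted default probabilities.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # 0.5 is an assumed threshold

precision, recall = precision_recall(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.3f} recall={recall:.3f} auc={auc:.3f}")
```

Precision and recall depend on the chosen threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is why the models are compared on both kinds of metric.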

More Details: Bank_Loan_Default_Case


Indian Railways

Last Updated on May 3, 2021


- Designed and implemented an Indian Railways website, built between 13th March and 20th March 2021.

- On this website, the user can fetch the schedule of trains all over India, including Arrival Time, Departure Time, and the number of stoppages with their Station Name and Code.

- Used HTML, CSS, JavaScript, and Bootstrap, along with a REST API, to fetch the detailed schedule of various trains in India.

- The website takes the Train Number and Date as input from the user. After the user clicks the Get Schedule button, a number of Flex Cards appear on the screen based on the number of stoppages, including the Source and Destination stations.

- Each card has two faces. The front shows the Serial no. at the top and the Station Name with its Code below; the back shows the no. of Days, Arrival Time, and Departure Time.

- By default the front side of the card is shown; hovering over the card reveals the details on the back.

- The project also includes some CSS animation and a live timer clock in the middle of the page.

More Details: Indian Railways



LogisticRegression

Last Updated on May 3, 2021


Problem Statement :

  • X Education sells online courses to industry professionals. The company markets its courses on several websites and search engines like Google.
  • Once these people land on the website, they might browse the courses, fill in a form for a course, or watch some videos. When they fill in a form providing their email address or phone number, they are classified as a lead. The company also gets leads through past referrals.
  • Once these leads are acquired, employees from the sales team start making calls, writing emails, etc. Through this process, some of the leads get converted while most do not. The typical lead conversion rate at X Education is around 30%.

Business Goal:

  • X Education needs help in selecting the most promising leads, i.e. the leads that are most likely to convert into paying customers.
  • The company needs a model that assigns a lead score to each lead, such that customers with a higher lead score have a higher conversion chance and customers with a lower lead score have a lower conversion chance.
  • The CEO, in particular, has given a ballpark of the target lead conversion rate to be around 80%.


  • Source the data for analysis
  • Clean and prepare the data
  • Exploratory Data Analysis.
  • Feature scaling and splitting the data into train and test datasets.
  • Building a logistic regression model and calculating the Lead Score.
  • Evaluating the model using different metrics: Specificity and Sensitivity, or Precision and Recall.
  • Applying the best model to the test data, based on the Sensitivity and Specificity metrics.
  • Solution: designed a logistic regression model and calculated the Lead Score.
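
The scoring and evaluation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual model: the weights, bias, feature values, and the cutoff of 50 are all made up, and the function names are hypothetical.

```python
import math


def lead_score(features, weights, bias):
    """Map a logistic regression probability to a 0-100 lead score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-z))      # logistic (sigmoid) function
    return round(100 * probability)


def sensitivity_specificity(converted, scores, cutoff):
    """Sensitivity and specificity when leads with score >= cutoff are called hot."""
    tp = sum(1 for c, s in zip(converted, scores) if c == 1 and s >= cutoff)
    fn = sum(1 for c, s in zip(converted, scores) if c == 1 and s < cutoff)
    tn = sum(1 for c, s in zip(converted, scores) if c == 0 and s < cutoff)
    fp = sum(1 for c, s in zip(converted, scores) if c == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Assumed coefficients for two scaled features (purely illustrative).
weights, bias = [1.2, 0.8], -1.0
leads = [[2.0, 1], [0.1, 0], [1.0, 0], [0.0, 1]]
converted = [1, 0, 0, 1]                          # made-up ground truth

scores = [lead_score(x, weights, bias) for x in leads]
sensitivity, specificity = sensitivity_specificity(converted, scores, cutoff=50)
print(scores, sensitivity, specificity)
```

Moving the cutoff trades sensitivity against specificity, which is why the best model is chosen by looking at both metrics together.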

Key Achievement: 

  • Predicted the leads with an accuracy of 80% and found the important features responsible for a good conversion rate, i.e. the ones that contribute most to the probability of a lead getting converted.
  • Prepared a PowerPoint presentation with great visualization for clients and managers.
More Details: LogisticRegression


''Human Identification And Detection Of Diseases By Extracting Sclera Veins''

Last Updated on May 3, 2021


Biometrics includes physiological and knowledge-based methods. Sclera vein recognition is one such method, used as a highly accurate way to identify a person and also to detect diseases.

Many researchers have developed methods for identification. In this project, the eye image first has to be partitioned efficiently into clusters according to the region of interest (ROI); segmentation is applied for this purpose.

Here the k-means clustering method is used to cluster the image and separate the sclera from the rest of the eye image. It divides the image into three clusters (the sclera, the iris, and the area around the eye) so that the sclera part can be taken for person identification.
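
As a rough illustration of the clustering step, the sketch below runs k-means with three clusters on a handful of grayscale intensities. It is a simplified stand-in, not the project's implementation: the pixel values are made up, the helper `kmeans_1d` is hypothetical, and treating the brightest cluster as the sclera is an assumption.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Plain k-means on scalar values; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids spread evenly between min and max.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, labels


# Made-up grayscale intensities: dark iris, mid-tone skin, bright sclera.
pixels = [20, 25, 30, 110, 120, 115, 240, 250, 245]
centroids, labels = kmeans_1d(pixels, k=3)

# Assumption: the brightest cluster corresponds to the sclera.
sclera_cluster = max(range(3), key=lambda c: centroids[c])
sclera_pixels = [p for p, label in zip(pixels, labels) if label == sclera_cluster]
print(centroids, sclera_pixels)
```

On a real eye image the same idea would be applied to every pixel (often on colour features rather than a single intensity), and the pixels of the selected cluster would form the sclera mask.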

Images of sclera vessel patterns are saturated, the vessel patterns are organized in multiple layers, and the vein features of the sclera are quite complex.

Enhancement is therefore necessary, because the vessel patterns are not prominent in the sclera. A Gabor filter is used to filter out unwanted parts and noise. Features are then extracted with local binary patterns (LBP), and classification is performed to identify the person and detect diseases.
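
The local binary pattern step can be sketched in plain Python. This shows the basic 3x3 LBP only, as an assumption about the variant used; the function names and the sample pixel values are illustrative and do not come from the project.

```python
def lbp_code(patch):
    """8-bit local binary pattern of the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left pixel.
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:          # neighbour at least as bright as the centre
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels of a 2-D list image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist


# Made-up 4x4 grayscale patch standing in for an enhanced sclera region.
image = [
    [5, 9, 1, 4],
    [4, 6, 7, 2],
    [2, 3, 8, 6],
    [7, 1, 5, 9],
]
histogram = lbp_histogram(image)
print(lbp_code([row[:3] for row in image[:3]]), sum(histogram))
```

The resulting 256-bin histogram is the kind of feature vector that could then be passed to a classifier.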

More Details: ''Human Identification and Detection of Diseases by Extracting Sclera Veins''
