e-commerce-classifier

A predictive app is to be deployed soon based on this project.

Please find the project here.

This consists of the notebook that predicts the category of the items of the e-commerce shopping list as given in the dataset to one of the 27 categories. You could find the dataset at https://www.kaggle.com/c/uw-cs480-fall20. This was a part of the Kaggle in-class competition.

It includes a text classification technique using RNN with LSTM unit, image classification technique using InceptionResNet model and an ensemble learning technique.

Text Classification

We have build and trained a basic RNN to classify the noisy text description into one of the 27 ctaegories. The RNN reads the description as a series of words - outputting a prediction and “hidden state” at each step. We take the final prediction to be the output i.e., which class the word belongs to.

Preparing the Data

We pre-process the data by converting the text into ASCII and stemming them to get the words in the base form. Now, We build a dictionary of all unique words in the noisy text description of the products. We now turn the description into tensors to make use of them in the model. To represent a single word, we use “one-hot vector” of size <1 x len(dictionary)>. A one-hot vector is filled with 0s except for a 1 at index of the current word. We join a bunch of those 2D tensors to make a product description.

Model

We usde LSTM units because it mitigates the problem of vanishing gradient and gradient explosion by the use of gated structures.

Training

Now we train the model on the train dataset. For the loss function nn.NLLLoss is appropriate since the last layer of the RNN is nn.LofSoftmax. Each loop is trained by:

creating input and target tensors
creating a zeroed initial hidden state
read each word in and keep hidden state for the next word
comapre final output to target
back propagate
return the output and loss

Image Classification

The image of each item can appear in different poses or even on or off human models. Our dataset consists of 43k images across 27 classes. And our training dataset contains 41.6k images. Currently, CNNs are the best machine learning models to classify images. Training these powerful neural nets takes a lot of computing power and data. So, we go for Transfer Learning where we use a pre trained model(InceptionResNet) that has been trained on Image-Net dataset, since, it is capable of extracting useful features from images of a wide range of classes.

Preprocessing

In order to use our images with a network trained on the ImageNet dataset we need to rescale the images to 224 x 224 and normalize them as per ImageNet standards.

We also applied Data Augmentation, which is the generation of more training data by cropping and re-aligning the original image of the train dataset to make the network more robust to varying image orientations of the same product.

Tying the models together

A predictive app is to be deployed soon based on this project.