It's been a while since I've practiced programming interview questions, and I worry that my skills are lacking. It's always important to work through some every now and then to stay sharp, so here we go:
Let's start with Fizz Buzz:
Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
This solution was actually inspired by Joel Grus. We can represent numbers with binary encoding:
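Something along these lines should do it. This is a sketch against the TensorFlow 1.x API in the spirit of Joel's post; `NUM_DIGITS` and the helper names are my own choices, not necessarily the original code:

```python
import numpy as np
import tensorflow as tf

NUM_DIGITS = 10  # 2^10 = 1024, enough bits for every number we'll train on

def binary_encode(i, num_digits):
    # Represent the integer i as a vector of its binary digits.
    return np.array([i >> d & 1 for d in range(num_digits)])

def fizz_buzz_encode(i):
    # One-hot encode the four possible answers:
    # [the number itself, "fizz", "buzz", "fizzbuzz"]
    if i % 15 == 0:
        return np.array([0, 0, 0, 1])
    elif i % 5 == 0:
        return np.array([0, 0, 1, 0])
    elif i % 3 == 0:
        return np.array([0, 1, 0, 0])
    else:
        return np.array([1, 0, 0, 0])
```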
Now we can generate some training data. We'll train on the numbers 101 to 1024, since the actual problem asks us to solve FizzBuzz for 1–100 and we don't want the model to see those.
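Roughly:

```python
# Train on everything from 101 up to 1024 so the model never sees 1-100.
trX = np.array([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = np.array([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])
```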
And now, let's define our multilayer perceptron model that will actually perform the bulk of the work. First, we define the inputs to the model:
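Two placeholders: one for the binary-encoded numbers and one for the one-hot labels. The hidden-layer width here is a guess, not a tuned value:

```python
NUM_HIDDEN = 100  # hidden layer width, picked more or less arbitrarily

X = tf.placeholder(tf.float32, [None, NUM_DIGITS])
Y = tf.placeholder(tf.float32, [None, 4])
```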
Next, let's define the weights and biases that we will learn via backpropagation:
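For example, small random weights and zero biases for one hidden layer and one output layer:

```python
def init_weights(shape):
    # Small random starting values; backprop will do the rest.
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

w_h = init_weights([NUM_DIGITS, NUM_HIDDEN])  # input -> hidden
b_h = tf.Variable(tf.zeros([NUM_HIDDEN]))
w_o = init_weights([NUM_HIDDEN, 4])           # hidden -> output
b_o = tf.Variable(tf.zeros([4]))
```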
And now, let's feed our input through the model:
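One ReLU hidden layer, then a linear layer that produces the logits:

```python
def model(X, w_h, b_h, w_o, b_o):
    h = tf.nn.relu(tf.matmul(X, w_h) + b_h)
    return tf.matmul(h, w_o) + b_o  # logits over the 4 classes

py_x = model(X, w_h, b_h, w_o, b_o)
```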
Then, let's compute the loss and tell TensorFlow to minimize it:
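Softmax cross-entropy against the one-hot labels, minimized with plain gradient descent (the learning rate is a guess):

```python
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y))
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost)
predict_op = tf.argmax(py_x, 1)  # the class with the highest logit wins
```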
Now, let's actually feed our real training data into that model:
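A basic mini-batch loop, shuffling every epoch. The epoch count and batch size are arbitrary choices for this sketch:

```python
BATCH_SIZE = 128

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(2000):
    # Shuffle the training data, then step through it in mini-batches.
    p = np.random.permutation(len(trX))
    trX, trY = trX[p], trY[p]
    for start in range(0, len(trX), BATCH_SIZE):
        end = start + BATCH_SIZE
        sess.run(train_op, feed_dict={X: trX[start:end], Y: trY[start:end]})
```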
And print out the results:
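Finally, map each prediction back to a string and compare it to the real answer, something like:

```python
def fizz_buzz(i, prediction):
    # Map the predicted class index back to the string we should print.
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]

numbers = np.arange(1, 101)
teX = np.transpose(binary_encode(numbers, NUM_DIGITS))
teY = sess.run(predict_op, feed_dict={X: teX})

correct = 0
for i, predicted_class in zip(numbers, teY):
    actual = fizz_buzz(i, np.argmax(fizz_buzz_encode(i)))
    predicted = fizz_buzz(i, predicted_class)
    correct += predicted == actual
    print("%d: %s (%s)" % (i, predicted, "✔" if predicted == actual else "x"))

print("Total Accuracy: %.2f%%" % (100.0 * correct / len(numbers)))
```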
Results:
1: (✔)
2: (✔)
3: (x)
...
97: (✔)
98: (✔)
99: fizz (✔)
Total Accuracy: 89.89%
For all that work, we achieved an embarrassingly low 90% accuracy. I was going to try to fix this, but I quickly lost interest. If the interviewer asks you about the runtime, make something up and insist you're correct until they believe you. Let's move on to something a bit more interesting:
Write a program that counts the number of unique characters in a string.
This sounds like a problem that can be solved with an LSTM. Let's generate some data.
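A sketch of that data generation, where the alphabet, MAX_LENGTH, and example count are my guesses based on the output further down:

```python
import random
import numpy as np
import tensorflow as tf

ALPHABET = "abcdefg"   # small alphabet, matching the sample output below
MAX_LENGTH = 12        # every string is exactly this long
NUM_EXAMPLES = 10000

def one_hot(char):
    # One-hot encode a single character over the alphabet.
    vec = np.zeros(len(ALPHABET))
    vec[ALPHABET.index(char)] = 1.0
    return vec

def make_example():
    s = "".join(random.choice(ALPHABET) for _ in range(MAX_LENGTH))
    x = np.array([one_hot(c) for c in s])  # shape [MAX_LENGTH, len(ALPHABET)]
    y = float(len(set(s)))                 # the number of unique characters
    return s, x, y

examples = [make_example() for _ in range(NUM_EXAMPLES)]
strings = [s for s, _, _ in examples]
train_X = np.array([x for _, x, _ in examples])
train_Y = np.array([[y] for _, _, y in examples])
```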
Next, let's create our LSTM model:
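One LSTM layer whose final output is regressed onto the count. Again TF 1.x style; the hidden size and optimizer are assumptions, and in older 1.x releases the cell lives under tf.contrib.rnn instead:

```python
NUM_HIDDEN = 64

X = tf.placeholder(tf.float32, [None, MAX_LENGTH, len(ALPHABET)])
Y = tf.placeholder(tf.float32, [None, 1])

cell = tf.nn.rnn_cell.BasicLSTMCell(NUM_HIDDEN)
outputs, _ = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

# Regress the unique-character count from the LSTM's last output.
last_output = outputs[:, -1, :]
w = tf.Variable(tf.random_normal([NUM_HIDDEN, 1], stddev=0.01))
b = tf.Variable(tf.zeros([1]))
prediction = tf.matmul(last_output, w) + b

cost = tf.reduce_mean(tf.square(prediction - Y))
train_op = tf.train.AdamOptimizer(0.001).minimize(cost)
```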
Now, let's go ahead and run our model:
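A straightforward training loop that, every ten epochs, prints the cost, a few sample predictions, and the accuracy we get by rounding the regression output to the nearest integer:

```python
BATCH_SIZE = 100

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(300):
    for start in range(0, len(train_X), BATCH_SIZE):
        end = start + BATCH_SIZE
        sess.run(train_op, feed_dict={X: train_X[start:end], Y: train_Y[start:end]})

    if epoch % 10 == 0:
        c, preds = sess.run([cost, prediction], feed_dict={X: train_X, Y: train_Y})
        print("cost: %s, epoch: %d" % (c, epoch))
        for i in range(5):
            print("string: %s, pred: %d, actual: %s"
                  % (strings[i], int(round(float(preds[i][0]))), train_Y[i][0]))
        accuracy = np.mean(np.round(preds) == train_Y) * 100
        print("%d%% accuracy" % accuracy)
```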
Here is the output from epochs 280 and 290:
cost: 0.0496655665338, epoch: 280
string: eeaccbdeagac, pred: 6, actual: 6.0
string: gefaddbbcfac, pred: 7, actual: 7.0
string: acedcbgdcagf, pred: 7, actual: 7.0
string: aadebcdacefg, pred: 7, actual: 7.0
string: abeeaebcbbag, pred: 5, actual: 5.0
100% accuracy
cost: 0.0413268692791, epoch: 290
string: ebbfbfaebede, pred: 5, actual: 5.0
string: fgaccdbabedg, pred: 7, actual: 7.0
string: dgdcbaefcdad, pred: 7, actual: 7.0
string: ffagdfedccad, pred: 6, actual: 6.0
string: gfggfgbebcdb, pred: 6, actual: 6.0
99% accuracy
Hopefully the interviewer won't be disappointed that we can't solve this problem for any string longer than MAX_LENGTH, but overall it achieved 99% accuracy. I didn't handle variable-length inputs here, but we could easily have done so by adding padding.
Now, instead of just printing out the number of unique characters, print out the actual unique characters in the order they appear.
Ex: “acabdb” => “acbd”
Oof. That's a much tougher problem. This time the value we need to return is a sequence of characters instead of a single character or number, so we'll need some form of sequence-to-sequence model to learn this relationship.
Again, we can one-hot encode the sequences, but this time we'll add some padding, since the length of the output string is variable. Instead of trying to return a variable-length sequence, we'll just return a sequence the same length as the input and pad the output with spaces.
For example, “abdab” maps to its unique characters “abd”, padded with two trailing spaces to match the input length.
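A sketch of how that encoding might look, assuming six-character strings over a–f to match the output below (the helper names are mine):

```python
import random
import numpy as np
import tensorflow as tf

ALPHABET = "abcdef "   # the input characters plus the space we pad with
MAX_LENGTH = 6
NUM_EXAMPLES = 10000

def one_hot(char):
    vec = np.zeros(len(ALPHABET))
    vec[ALPHABET.index(char)] = 1.0
    return vec

def unique_in_order(s):
    # "acabdb" -> "acbd", padded with spaces out to MAX_LENGTH.
    seen = []
    for c in s:
        if c not in seen:
            seen.append(c)
    return "".join(seen).ljust(MAX_LENGTH)

def make_example():
    s = "".join(random.choice(ALPHABET.strip()) for _ in range(MAX_LENGTH))
    x = np.array([one_hot(c) for c in s])
    y = np.array([one_hot(c) for c in unique_in_order(s)])
    return s, x, y
```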
Let's create a training and test split.
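Say, 90% for training and the rest held out for testing (the ratio is arbitrary):

```python
examples = [make_example() for _ in range(NUM_EXAMPLES)]
split = int(0.9 * NUM_EXAMPLES)

train_X = np.array([x for _, x, _ in examples[:split]])
train_Y = np.array([y for _, _, y in examples[:split]])

test_strings = [s for s, _, _ in examples[split:]]
test_X = np.array([x for _, x, _ in examples[split:]])
test_Y = np.array([y for _, _, y in examples[split:]])
```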
Now we can set up our model. Let's start with the encoder:
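The encoder is a single LSTM; we keep only its final state, which summarizes the whole input string. The hidden size is another guess:

```python
NUM_HIDDEN = 128

X = tf.placeholder(tf.float32, [None, MAX_LENGTH, len(ALPHABET)])
Y = tf.placeholder(tf.float32, [None, MAX_LENGTH, len(ALPHABET)])

with tf.variable_scope("encoder"):
    encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(NUM_HIDDEN)
    # We only need the final state; it summarizes the input string.
    _, encoder_state = tf.nn.dynamic_rnn(encoder_cell, X, dtype=tf.float32)
```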
And now, the decoder:
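The decoder here is a second LSTM that starts from the encoder's final state. As a simplification for this sketch, it re-reads the one-hot input at each step rather than feeding back its own previous prediction, which a more serious seq2seq model would do:

```python
with tf.variable_scope("decoder"):
    decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(NUM_HIDDEN)
    decoder_outputs, _ = tf.nn.dynamic_rnn(
        decoder_cell, X, initial_state=encoder_state)
```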
Now, we can compute the predictions by running the decoder's hidden outputs through a fully connected layer:
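We flatten the time dimension, apply the fully connected layer, and reshape back so every time step gets its own distribution over characters:

```python
w = tf.Variable(tf.random_normal([NUM_HIDDEN, len(ALPHABET)], stddev=0.01))
b = tf.Variable(tf.zeros([len(ALPHABET)]))

# [batch, time, hidden] -> [batch * time, hidden] -> [batch, time, alphabet]
flat = tf.reshape(decoder_outputs, [-1, NUM_HIDDEN])
logits = tf.reshape(tf.matmul(flat, w) + b, [-1, MAX_LENGTH, len(ALPHABET)])
predictions = tf.argmax(logits, axis=2)
```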
Now, we can compute the loss:
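Cross-entropy at every position of the output sequence, averaged over the batch and the time steps:

```python
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
train_op = tf.train.AdamOptimizer(0.001).minimize(cost)
```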
And now let's run our model:
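We train and, every 25 epochs, print the cost, a few held-out examples, and the exact-match accuracy on the test set:

```python
def decode(indices):
    # Turn a row of character indices back into a string, dropping the padding.
    return "".join(ALPHABET[i] for i in indices).rstrip()

BATCH_SIZE = 100
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for epoch in range(26):
    for start in range(0, len(train_X), BATCH_SIZE):
        end = start + BATCH_SIZE
        sess.run(train_op, feed_dict={X: train_X[start:end], Y: train_Y[start:end]})

    if epoch % 25 == 0:
        c = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
        preds = sess.run(predictions, feed_dict={X: test_X})
        print("training cost: %s, epoch: %d" % (c, epoch))
        correct = 0
        for i in range(len(test_X)):
            expected = decode(np.argmax(test_Y[i], axis=1))
            predicted = decode(preds[i])
            correct += predicted == expected
            if i < 5:
                print("%s: %s - %s" % (test_strings[i], expected, predicted))
        print(100.0 * correct / len(test_X), "% accuracy")
```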
training cost: 2.07600975037, epoch: 0
cebbdf: cebdf -
dffcbc: dfcb -
aadeab: adeb -
faceec: face -
bfaeec: bfaec -
0.0 % accuracy
training cost: 0.219934388995, epoch: 25
cebbdf: cebdf - cebdf
dffcbc: dfcb - dfcb
aadeab: adeb - adeb
faceec: face - face
bfaeec: bfaec - bfaec
100.0 % accuracy
It looks like by epoch 25, our model has a fairly good understanding of how to solve this problem.
So those are three pretty easy warm-up problems. In the next post, I'll (try to) implement some more complex architectures to solve more difficult interview problems.
Please do not solve real interview problems in this way.