Machine Learning and Artificial Intelligence for Bioinformatics
Homework 5 – Due October 6th at 10am
Each of the 3 questions below is worth 33.333 points (you get 0.001 free to reach 100).
This homework needs to be completed on the Google Colaboratory, and the results submitted as screenshots in a .doc
or .pdf. Please include the completed run of the corresponding code the question refers to along with your written
answer (you can include additional code if you want). You will need to
Also, you are welcome to run TensorFlow code outside of the Colaboratory if you have such a setup. Please note, though,
that the submission needs to follow the same format, meaning code cells -> output as shown on the Colaboratory (for
example, do not submit Python interactive command terminal code).
In preparation for the homework, you can review the Google Colaboratory notebook posted in the last lecture. Please watch
the following videos in order to become familiar with the Colaboratory (feel free to watch any additional ones on YouTube):
https://www.youtube.com/watch?v=i-HnvsehuSw
https://www.youtube.com/watch?v=RLYoEyIHL6A
For the code in the questions below, each shaded box of code can run in a separate Google Colaboratory cell.
Note: You need to run cells from top to bottom (since the top code cells generate dependencies for the lower cells), so you
have to copy-paste and run the code cells in your own Google Colaboratory, in the same order shown in the code each
question points you to. Then, as the questions request (for example, adjusting the number of epochs), you
have to edit the code in the corresponding cells and re-run each cell. If you are still confused about how this works, re-watch
the above tutorial videos on the Google Colaboratory, as well as additional videos.
Question 1.
NOTE: Instead of “from keras.layers.normalization import BatchNormalization”, use “from keras.layers import
BatchNormalization”.
Run the following code on the Colaboratory. Tip: If you are logged in to your Google account, click the “Copy to Drive”
button at the top. This will make a full copy of this Google Colaboratory sheet under your own account, and save you a
lot of typing and copy-pasting compared to starting a new sheet and transferring everything over manually.
https://colab.research.google.com/github/AviatorMoser/keras-mnist-tutorial/blob/master/MNIST in Keras.ipynb
a. How many different types of neural networks (and what kind of networks) are being used to classify the digits?
Show the corresponding part of the code where these networks are implemented.
b. Run the code with both types of neural networks that are in it. Based on the metrics, which one classifies
the digits better? Please explain your answer by also defining the metrics (so you understand what each metric
means).
c. Try a different activation function instead of softmax in the final layer and see what happens with the
model predictions and its metrics. Choose one from the list at
https://www.tensorflow.org/api_docs/python/tf/keras/activations
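For parts b and c, it can help to see on a tiny example what a final-layer activation and the reported metrics compute. The sketch below uses plain Python rather than Keras. The raw scores and labels are made-up values, and the helper names (`softmax`, `sigmoid`, `accuracy`, `cross_entropy`) are illustrative only; they are not taken from the tutorial notebook.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 across the classes."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Squash each score independently into (0, 1); the outputs need not sum to 1."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted class equals the true class."""
    matches = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return matches / len(true_labels)


def cross_entropy(probabilities, true_label):
    """Categorical cross-entropy for one sample: -log(probability of the true class)."""
    return -math.log(probabilities[true_label])


# Made-up raw scores (one row per sample, one column per class) and true labels.
batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3], [1.2, 0.2, 0.9]]
true_labels = [0, 1, 2]

softmax_probs = [softmax(row) for row in batch_logits]
sigmoid_scores = [sigmoid(row) for row in batch_logits]
predicted = [row.index(max(row)) for row in softmax_probs]  # most probable class
mean_loss = sum(
    cross_entropy(p, t) for p, t in zip(softmax_probs, true_labels)
) / len(true_labels)

print("softmax row sums:", [round(sum(row), 6) for row in softmax_probs])
print("sigmoid row sums:", [round(sum(row), 3) for row in sigmoid_scores])
print("accuracy:", accuracy(predicted, true_labels))
print("mean cross-entropy loss:", mean_loss)
```

In this made-up batch, two of the three samples are classified correctly. The softmax rows each sum to 1, while the sigmoid rows do not, which is one thing to keep in mind when comparing loss values after swapping the final activation.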
Due March 13th 2024
Question 2.
Run the following code on the Colaboratory (you can skip the part showing the images if you wish). You will need to
copy this code into a new, clean sheet of the Google Colaboratory.
https://www.tensorflow.org/tutorials/load_data/images
a. Modify the number of Convolutional and Max Pooling layers, for example add a pair or two, and remove a layer
or two:
model = tf.keras.Sequential([
  tf.keras.layers.experimental.preprocessing.Rescaling(1./255),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  ….
Then rerun the training with the modifications
model.compile(
…
and also
model.fit(
train_ds,
..
What do you observe changing in the metrics? (Just run it for 3 epochs, as it is.)
b. Modify the number of epochs, increasing them gradually (you might reach a point where it gets too slow in the
Google Colaboratory). What do you observe in the metrics as you increase the epochs? Is there a point where
the metrics plateau?
c. In which part of the code do we split the dataset into training / validation sets, and in what proportions? What is the purpose
of doing this?
d. Look at the structure of the Convolutional Neural Network as specified in the code for this image classification
example: https://www.tensorflow.org/tutorials/images/classification. What are the differences? Make those
adjustments to the code you modified in a – c above, and re-run the model (use 5 epochs or so). What
do you observe in the model metrics?
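When adding or removing Conv2D / MaxPooling2D pairs in part a, it can be useful to track how the spatial size of the feature maps shrinks, since too many pairs will eventually leave nothing to pool. The plain-Python sketch below assumes the defaults shown in the snippet above (3x3 kernels, stride 1, padding='valid', 2x2 pooling) and an assumed 180x180 input image; check the image size actually set in your notebook. The function names are illustrative, not part of TensorFlow.

```python
def conv2d_output_size(size, kernel_size=3, stride=1):
    """Spatial size after a convolution with padding='valid'."""
    return (size - kernel_size) // stride + 1


def max_pool_output_size(size, pool_size=2):
    """Spatial size after max pooling with stride == pool_size and padding='valid'."""
    return (size - pool_size) // pool_size + 1


def trace_spatial_size(input_size, layers):
    """Return the spatial size after each layer in a list of 'conv' / 'pool' names."""
    sizes = [input_size]
    for layer in layers:
        if layer == "conv":
            current = conv2d_output_size(sizes[-1])
        elif layer == "pool":
            current = max_pool_output_size(sizes[-1])
        else:
            raise ValueError(f"unknown layer type: {layer}")
        if current < 1:
            raise ValueError(f"feature map vanished after layer '{layer}'")
        sizes.append(current)
    return sizes


ASSUMED_INPUT_SIZE = 180  # assumption: height and width of the input images

three_pairs = trace_spatial_size(ASSUMED_INPUT_SIZE, ["conv", "pool"] * 3)
four_pairs = trace_spatial_size(ASSUMED_INPUT_SIZE, ["conv", "pool"] * 4)

print("3 conv/pool pairs:", three_pairs)
print("4 conv/pool pairs:", four_pairs)
```

Under these assumptions, each convolution trims 2 pixels from each side length and each pooling layer roughly halves it, so every extra pair leaves the dense layers with a much smaller feature map to work with.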
Question 3.
Run the following code on Deep Learning for genomics on the Google Colaboratory:
https://colab.research.google.com/github/TankMermaid/1000-genomes-genetic-maps/blob/master/A_Primer_on_Deep_Learning_in_Genomics_Public.ipynb
a. Describe in a couple of sentences the overall function of this neural network for bioinformatics predictions:
what predictions are taking place, what data are used, and what type of neural network are we using? From
which parts of the code can you find the answers to each of these points?
b. How many prediction classes does this neural network have, and what are these classes? In addition to finding
this from the text cells in the code, please also point to the parts of the actual code that demonstrate the
number of prediction classes (it should be one of the final layers in the network).
c. What portion of the data do we use for training, validation and testing? Where do you see that in the code?
d. Run the code in your Google Colaboratory up to the point where we have the model loss / accuracy graphs
(including printing these graphs). What do you observe in these graphs if you modify the testing and validation
portions of the datasets? You would need to re-run the cells from the top (where we define the training /
validation portions) up to and including the cells generating the graphs. Similarly, if you significantly reduce the
number of epochs, what do you observe in those graphs?
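For parts c and d, note that when a test set is held out first and a validation fraction is then taken from what remains, the validation fraction applies to the remaining samples, not to the full dataset. The plain-Python sketch below illustrates that arithmetic. The sample count and both fractions are hypothetical values chosen for illustration, not the ones used in the notebook, and the libraries used there may round the counts slightly differently.

```python
def split_counts(total_samples, test_fraction, validation_fraction):
    """Return (train, validation, test) counts for a two-stage split.

    Stage 1 holds out `test_fraction` of all samples for testing.
    Stage 2 holds out `validation_fraction` of the REMAINING samples
    for validation; whatever is left is used for training.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    if not 0 <= validation_fraction < 1:
        raise ValueError("validation_fraction must be in [0, 1)")
    test_count = round(total_samples * test_fraction)
    remaining = total_samples - test_count
    validation_count = round(remaining * validation_fraction)
    train_count = remaining - validation_count
    return train_count, validation_count, test_count


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
TOTAL = 1000
train, validation, test = split_counts(TOTAL, test_fraction=0.2, validation_fraction=0.1)

for name, count in [("train", train), ("validation", validation), ("test", test)]:
    print(f"{name}: {count} samples ({count / TOTAL:.0%} of all data)")
```

With these hypothetical fractions, the 10% validation share is taken from the 800 samples left after the test split, so it amounts to 8% of the full dataset. Changing either fraction therefore also changes how many samples remain for training.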