Fossil Classification Project


Trevor

Hello Fellow Forum-Goers,

 

Lately I have been somewhat inactive on the forum, and I have not had the opportunity to go fossil hunting in New Jersey since I am at college. Those things do not deter me, though. I am here today to tell you about a project I have been doing on fossil classification, specifically classification of some fossil species from the Cretaceous of New Jersey. The goal is to be able to give my computer a photo of a fossil and have it tell me the probability that it is any one of several New Jersey Cretaceous fossil species.

 

For this project, I began by taking photos of some of my fossils. Here are some examples of what the photos looked like:

[Photos: DRUM.jpg, RAT.jpg, RAY.jpg]

Left to right: Anomoeodus phaseolus, Ischyodus bifurcatus, Brachyrhizodus wichitaensis

 

The data consisted of around 150 photos, spanning across 6 fossil species. Not represented in the photos above were: 

Belemnitella americana 

Enchodus petrosus

Ischyrhiza mira

 

To label this data later, I wrote a CSV file with labels. From the contents of this file, you can see how many photos of each species there were. Note that the common names of these species are used as the labels.

 

id,species
IMG_4749,Belemnite-1
IMG_4780,Belemnite-2
IMG_4812,Ray-1
IMG_4813,Ray-2
IMG_4814,Ray-3
IMG_4815,Ray-4
IMG_4816,Ray-5
IMG_4817,Ray-6
IMG_4818,Ray-7
IMG_4819,Ray-8
IMG_4820,Ray-9
IMG_4821,Ray-10
IMG_4822,Ray-11
IMG_4823,Ray-12
IMG_4824,Ray-13
IMG_4825,Ray-14
IMG_4826,Ray-15
IMG_4827,Ray-16
IMG_4828,Ray-17
IMG_4829,Ray-18
IMG_4830,Ray-19
IMG_4831,Ray-20
IMG_4832,Ray-21
IMG_4833,Ray-22
IMG_4834,Ratfish-1
IMG_4835,Ratfish-2
IMG_4836,Ratfish-3
IMG_4837,Ratfish-4
IMG_4838,Ratfish-5
IMG_4839,Ratfish-6
IMG_4840,Ratfish-7
IMG_4841,Ratfish-8
IMG_4842,Ratfish-9
IMG_4843,Ratfish-10
IMG_4844,Ratfish-11
IMG_4845,Ratfish-12
IMG_4846,Ratfish-13
IMG_4847,Ratfish-14
IMG_4848,Ratfish-15
IMG_4849,Ratfish-16
IMG_4850,Ratfish-17
IMG_4851,Ratfish-18
IMG_4852,Ratfish-19
IMG_4853,Ratfish-20
IMG_4854,Ratfish-21
IMG_4855,Ratfish-22
IMG_4856,Ratfish-23
IMG_4857,Ratfish-24
IMG_4858,Ratfish-25
IMG_4859,Ratfish-26
IMG_4860,Ratfish-27
IMG_4861,Ratfish-28
IMG_4862,Ratfish-29
IMG_4863,Ratfish-30
IMG_4864,Ratfish-31
IMG_4865,Ratfish-32
IMG_4866,Ratfish-33
IMG_4867,Ratfish-34
IMG_4868,Enchodus-1
IMG_4869,Enchodus-2
IMG_4870,Enchodus-3
IMG_4871,Enchodus-4
IMG_4872,Enchodus-5
IMG_4873,Enchodus-6
IMG_4875,Enchodus-7
IMG_4876,Enchodus-8
IMG_4877,Enchodus-9
IMG_4878,Enchodus-10
IMG_4879,Enchodus-11
IMG_4888,Enchodus-12
IMG_4889,Enchodus-13
IMG_4890,Enchodus-14
IMG_4891,Enchodus-15
IMG_4892,Enchodus-16
IMG_4903,Pychodont-1
IMG_4905,Pychodont-2
IMG_4906,Pychodont-3
IMG_4907,Pychodont-4
IMG_4908,Pychodont-5
IMG_4909,Pychodont-6
IMG_4910,Pychodont-7
IMG_4911,Pychodont-8
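
The per-species counts can be tallied directly from the labels file. Here is a minimal sketch, assuming every label has the form Name-Number as in the file above. The helper name count_species and the shortened sample string are only for illustration and are not part of the original program.

```python
import collections

def count_species(csv_text):
    """Count rows per species, assuming labels look like 'Ray-12'."""
    counts = collections.Counter()
    lines = csv_text.strip().split("\n")
    for line in lines[1:]:  # skip the "id,species" header row
        if not line.strip():
            continue  # ignore blank lines
        image_id, label = line.strip().split(",")
        species = label.rsplit("-", 1)[0]  # "Ray-12" -> "Ray"
        counts[species] += 1
    return counts

# A shortened sample in the same format as the file above
sample = "id,species\nIMG_4749,Belemnite-1\nIMG_4812,Ray-1\nIMG_4813,Ray-2\n"
print(count_species(sample))
```

Counting the rows listed above by hand gives 2 Belemnite, 22 Ray, 34 Ratfish, 16 Enchodus and 8 Pychodont, which is 82 labeled photos in total.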

Now, with the labels and data, I began to write a program in the programming language Python that fed in the images and then used Keras (a machine learning library that has the tools for something called a convolutional neural network).

 

Here is the beginning of the code:

 

import numpy  # NumPy, which allows for easy use of vectors and matrices to work with data
import collections  # gives me OrderedDict, a dictionary that remembers insertion order
import os  # lets me get the path of an image on my computer
import imageio  # lets me write edited images to other folders
from PIL import Image  # lets me manipulate the images, for example by flipping or rotating them
from random import shuffle  # lets me randomly shuffle the training data

 

This code puts each fossil image ID and its label together into something like a container. This "container" is called a dictionary.

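species_dictionary = collections.OrderedDict()

# "with" closes the file automatically once it has been read
with open("fossil_labels.csv", "r") as our_file:
    file_contents = our_file.read().split('\n')

# Start at 1 to skip the "id,species" header row
for iteration in range(1, len(file_contents)):
    if not file_contents[iteration].strip():
        continue  # skip blank lines, such as one left by a trailing newline
    file_contents[iteration] = file_contents[iteration].split(',')
    # Key is the image ID (e.g. IMG_4812), value is its label (e.g. Ray-1)
    species_dictionary[file_contents[iteration][0]] = file_contents[iteration][1]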
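
The network needs numbers rather than names, so each label has to become a vector before training. The post skips that step, so what follows is only a sketch of one common approach (one-hot encoding), not necessarily what the skipped code does. The names one_hot_labels and class_names are hypothetical.

```python
import numpy

def one_hot_labels(species_dictionary):
    """Turn labels such as 'Ray-3' into one-hot rows, one row per image."""
    # Drop the trailing "-number" to get the species name
    names = [label.rsplit("-", 1)[0] for label in species_dictionary.values()]
    class_names = sorted(set(names))  # fixed, alphabetical class order
    encoded = numpy.zeros((len(names), len(class_names)), dtype="float32")
    for row, name in enumerate(names):
        encoded[row, class_names.index(name)] = 1.0
    return class_names, encoded

# Three made-up entries in the same format as species_dictionary
example = {"IMG_4812": "Ray-1", "IMG_4834": "Ratfish-1", "IMG_4813": "Ray-2"}
class_names, encoded = one_hot_labels(example)
print(class_names)
print(encoded)
```

Each row has a single 1 in the column of its species, which is the shape of target a softmax output layer is usually trained against.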

 

I will skip the other code and now discuss convolutional neural networks, which are used in image classification.

So, right now I have all the fossil images with their labels. The network can now use these to find clusters of pixels in a given photo that correspond to a particular fossil species. Over time, the network learns that this or that cluster of pixels is common to a single fossil species. Then, it can recognize that cluster in a new image that it has not been trained on.
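
To make "finding clusters of pixels" a bit more concrete, here is a small NumPy sketch of what a single convolution filter does: it slides a small grid of weights over the image and gives a large response wherever the pixels under it match its pattern. The toy image and the hand-made edge filter are invented for illustration; in the real network the filter values are learned during training.

```python
import numpy

def convolve2d_valid(image, kernel):
    """Slide kernel over image (no padding) and sum the elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = numpy.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            output[i, j] = numpy.sum(patch * kernel)
    return output

# Toy 5x5 "photo": dark on the left, bright on the right
image = numpy.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# Hand-made 3x3 filter that responds to a dark-to-bright vertical edge
kernel = numpy.array([[-1, 0, 1]] * 3, dtype=float)
response = convolve2d_valid(image, kernel)
print(response)
```

The response is strongest where the filter sits over the edge and zero over the uniformly bright area. Strictly speaking this sliding product is a cross-correlation, which is the operation deep learning libraries usually call convolution.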

 

Here are the convolutional layers:

 

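from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense

# IMG_SIZE, training_images and training_labels come from the code skipped above
model = Sequential()

# Five blocks of: 3x3 convolution -> 2x2 max pooling -> batch normalization
# input_shape ends in 1 because the images have a single (grayscale) channel
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(96, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.2))  # randomly drop 20% of activations during training

# Classifier head: flatten the feature maps, then two fully connected layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(5, activation='softmax'))  # one probability per label class

# Note: with a softmax output over several classes, 'categorical_crossentropy' is the usual loss choice
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(training_images, training_labels, batch_size=50, epochs=10, verbose=1)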
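
It can help to trace how the image shrinks as it passes through the five convolution and pooling blocks. The sketch below does that arithmetic under the Keras defaults I believe apply here (no padding and stride 1 for Conv2D, stride equal to the pool size for MaxPooling2D). The post never states IMG_SIZE, so the value 128 is purely an assumed example, and feature_map_sizes is a hypothetical helper.

```python
def feature_map_sizes(img_size, filters=(32, 64, 64, 96, 32)):
    """Trace the width/height after each Conv2D(3x3) + MaxPooling2D(2x2) pair."""
    sizes = []
    size = img_size
    for n_filters in filters:
        size = size - 3 + 1   # 3x3 convolution, no padding, stride 1
        size = size // 2      # 2x2 max pooling halves the size (rounding down)
        sizes.append((size, size, n_filters))
    return sizes

IMG_SIZE = 128  # assumed value for illustration only
sizes = feature_map_sizes(IMG_SIZE)
for shape in sizes:
    print(shape)
height, width, channels = sizes[-1]
flattened = height * width * channels
print("Flatten() output length:", flattened)
```

Under these assumptions a 128 x 128 image ends up as a 2 x 2 x 32 block, so Flatten() would hand 128 numbers to the first Dense layer. By the same arithmetic, IMG_SIZE would need to be at least 94 for anything to be left after the last pooling layer.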

 

I do not expect this code to be fully understood. The network uses weights, or sensitivities, to different pixel clusters. Then, as it learns how its prediction for a training photo compares to the actual label I gave it, it updates the weights to reflect this. By the end, this "error loss" should approach 0, and when it does, we know that its predictions correspond very closely with the actual labels of the training photos, which should allow it to classify fossil images for these six species well. If it were trained on 10 species, it should be able to classify all 10.
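
Here is a small worked example of how a loss shrinks toward 0 as predictions get better. It uses softmax followed by categorical cross-entropy, a loss commonly paired with a softmax output. The code above uses binary_crossentropy, which is computed differently, so treat this as an illustration of the idea rather than the exact numbers Keras would print. The scores are made up.

```python
import numpy

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = scores - numpy.max(scores)  # subtract max for numerical stability
    exps = numpy.exp(shifted)
    return exps / numpy.sum(exps)

def categorical_crossentropy(true_one_hot, probabilities):
    """Loss is -log of the probability assigned to the correct class."""
    return float(-numpy.sum(true_one_hot * numpy.log(probabilities)))

true_label = numpy.array([0.0, 0.0, 1.0, 0.0, 0.0])  # correct class is index 2

# Untrained network: equal scores, so every class gets probability 0.2
unsure = softmax(numpy.array([1.0, 1.0, 1.0, 1.0, 1.0]))
# Trained network: a much higher score for the correct class
confident = softmax(numpy.array([0.0, 0.0, 8.0, 0.0, 0.0]))

loss_unsure = categorical_crossentropy(true_label, unsure)
loss_confident = categorical_crossentropy(true_label, confident)
print(loss_unsure, loss_confident)
```

The evenly spread guess costs ln(5), about 1.61, while the confident correct guess costs close to 0. Training nudges the weights so that more photos look like the second case.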

 

Here is the output of the training:

# Epoch 1/10
# 164/164 [==============================] - 497s 3s/step - loss: 0.3975 - acc: 0.8329
# Epoch 2/10
# 164/164 [==============================] - 139s 846ms/step - loss: 0.1026 - acc: 0.9610
# Epoch 3/10
# 164/164 [==============================] - 139s 848ms/step - loss: 0.0427 - acc: 0.9902
# Epoch 4/10
# 164/164 [==============================] - 126s 771ms/step - loss: 0.0232 - acc: 0.9939
# Epoch 5/10
# 164/164 [==============================] - 119s 728ms/step - loss: 0.0153 - acc: 0.9963
# Epoch 6/10
# 164/164 [==============================] - 2258s 14s/step - loss: 0.0066 - acc: 0.9976
# Epoch 7/10
# 164/164 [==============================] - 141s 861ms/step - loss: 0.0047 - acc: 1.0000
# Epoch 8/10
# 164/164 [==============================] - 135s 824ms/step - loss: 0.0048 - acc: 1.0000
# Epoch 9/10
# 164/164 [==============================] - 132s 803ms/step - loss: 0.0027 - acc: 1.0000
# Epoch 10/10
# 164/164 [==============================] - 122s 746ms/step - loss: 0.0043 - acc: 1.0000

You can see that the loss keeps going down with more and more training.
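
One caveat about the acc column, hedged because it depends on the Keras version: as far as I know, some Keras 2 releases report a per-output "binary" accuracy when the loss is binary_crossentropy, and that number can look high even when the top guess is wrong. The NumPy sketch below only illustrates the difference between the two ways of scoring. The two predictions are invented, and the function names mirror the Keras metric names but are written from scratch here.

```python
import numpy

def binary_accuracy(true_one_hot, predicted):
    """Fraction of individual outputs that match after rounding at 0.5."""
    return float(numpy.mean(true_one_hot == numpy.round(predicted)))

def categorical_accuracy(true_one_hot, predicted):
    """Fraction of images whose highest-probability class is the right one."""
    hits = numpy.argmax(true_one_hot, axis=1) == numpy.argmax(predicted, axis=1)
    return float(numpy.mean(hits))

# Two made-up images with 5 classes; both predictions pick the WRONG class
truth = numpy.array([[1, 0, 0, 0, 0],
                     [0, 0, 1, 0, 0]], dtype=float)
guess = numpy.array([[0.1, 0.9, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.2, 0.8, 0.0]], dtype=float)

bin_acc = binary_accuracy(truth, guess)
cat_acc = categorical_accuracy(truth, guess)
print(bin_acc, cat_acc)
```

Both guesses are wrong, yet the per-output score is still 0.6 because three of the five outputs in each row happen to agree. The numbers in the log are also measured on the training photos themselves, so holding some photos back for testing would give a fairer picture.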

 

For the future, I definitely need to take more photos to get more data, and to train the network on a graphics processing unit (GPU) as opposed to the normal CPU in a laptop. The GPU is better at parallel processing and can train the network much faster, possibly in seconds (on my computer it took 15 minutes).

 

Well that is the current state of the project. I still need to do more but thank you for staying here till the end. I hope you have a nice day.

 

-Trevor


: )


Sounds like a great project Trevor.

So in the future you hope to develop this into a "Fossil Recognition App"?

 

Then what will we have to squabble about in the Fossil ID topic?!!! :default_rofl:



2 minutes ago, caldigger said:

Sounds like a great project Trevor.

So in the future you hope to develop this into a "Fossil Recognition App"?

 

Then what will we have to squabble about in the Fossil ID topic?!!! :default_rofl:

 

Ha, that's much more difficult! I don't think the ID section will be going away anytime soon. Also, these networks are really buggy sometimes and if a fossil is rotated 180 degrees they can give strange predictions. This is more for fun. I also do not know anything about app development.
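
One common way to make a network less sensitive to rotation is to also train it on rotated and mirrored copies of each photo. This is a minimal NumPy sketch of that idea and not the code from the project. The helper augment_with_rotations is hypothetical, and the 2 x 2 array stands in for a real square grayscale photo.

```python
import numpy

def augment_with_rotations(image):
    """Return the image rotated by 0, 90, 180 and 270 degrees, plus a mirror."""
    variants = [numpy.rot90(image, k) for k in range(4)]  # k quarter-turns
    variants.append(numpy.fliplr(image))                  # left-right mirror
    return variants

# Tiny stand-in for a grayscale photo
photo = numpy.array([[1, 2],
                     [3, 4]])
variants = augment_with_rotations(photo)
for variant in variants:
    print(variant.tolist())
```

Every copy keeps the label of the original photo, so each fossil gives the network five training examples instead of one.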

: )


wow... I usually understand what is going on here at TFF, but this is way over my head.  I hope it works...

 

 


Awesome idea Trevor! Looks like it will be a lot of work with the picture taking and training, but I hope it pays off.

“You must take your opponent into a deep dark forest where 2+2=5, and the path leading out is only wide enough for one.” ― Mikhail Tal


@Trevor i know it would be a great idea if i could understand it... ;)


On The Hunt For The Trophy Otodus!

 


That’s a cool idea! Hopefully, it pans out well.

Each dot is 50,000,000 years:

Hadean............Archean..............................Proterozoic.......................................Phanerozoic...........

                                                                                                                    Paleo......Meso....Ceno..

                                                                                                           Ꞓ.OSD.C.P.Tr.J.K..Pg.NgQ< You are here

Doesn't time just fly by?

 


It's well beyond me - all I can see is the robots will be taking over very soon......

BTW I don't know if this has any bearing on your ultimate IDs, but shouldn't 'pychodont' be 'pycnodont'?


On 4/30/2019 at 3:53 AM, Wrangellian said:

It's well beyond me - all I can see is the robots will be taking over very soon......

BTW I don't know if this has any bearing on your ultimate IDs, but shouldn't 'pychodont' be 'pycnodont'?

 

Ha, I have to redo part of it I guess. That shouldn't be too difficult, thank you for pointing that out. 

: )

