I'm Oli, a Physicist disguised as a Data Scientist. Feel free to skip
the pleasantries and
take a look at my projects.
I have 3 years of professional experience in Data Science, and I currently lead a team of two other Data Scientists. I graduated with a First Class Honours degree in Physics from Queen Mary University
of London, and was awarded the Principal's Prize for outstanding academic achievement.
I have worked on a range of projects, including Generative AI, Data Analysis and Visualisation, Conversational AI using Natural Language Processing, and much more.
This portfolio contains some of my favourite projects that
I have worked on throughout my professional career and my studies.
When COVID-19 began to sweep across the UK, there was a huge
emphasis on the R number. Government officials insisted
that we must get R below 1, and keep it there. No
exact values were ever broadcast, only wide ranges in which
R may lie, and so I decided to try to estimate
R (within error) for each day. The figure on the left
shows some of my key findings for the UK up until the middle of
May 2020 (see my GitLab for more).
Languages: Python (Pandas, NumPy).
Generate courses for any topic at any level in any language within minutes.
The tool uses a variety of large language models to create course outlines, lesson content, quizzes and interactive content, along with translation models to convert the material into any language, and voice synthesis models to bring the content to life with audio.
Two distinct image generation models are used to generate lesson images, depending on the type of content.
Languages: Python (OpenAI, Anthropic, Microsoft Azure AI Speech Services, AWS).
I built a chatbot to make life easier for students, for example by enabling them to change their enrolled courses,
find out who teaches each of their courses, and book appointments with their
college's mental health professional.
Languages: Python (using the RASA framework).
Perovskites are chemical compounds that are used in fuel cells
and electrochemical sensing, so knowing their properties is very
important. I built a dataset of over 100 perovskites, each with
over 30 features, and used classification
and regression models to predict these features.
Languages: Python (Scikit-Learn, Pandas, NumPy).
As an extension of the previous project, I utilised artificial neural
networks with a dataset of 2 million compounds to predict the formation
energy, which is a property that dictates compound stability. I compared
various neural network structures to find which performed best.
Languages: Python (Keras, Pandas, NumPy).
Fitness is a passion of mine, and so I decided to start
developing a mobile app that meets my workout needs. The long-term plan for this
fitness tracker is to accommodate those who want the
most detailed analysis of their workouts, for example by
importing Apple Watch workout data. This is an ongoing
project that I work on in my spare time.
Languages: JavaScript and React Native for front-end, Python for
data analysis.
Sleep is one of the most important factors in living a healthy
and happy life, and I find it extremely interesting. I tracked some
key metrics describing sleep quality, along with many variables
that I hypothesised could influence quality of sleep.
For example, the interactive graph shows how
temperature affects my sleep.
Languages: Python.
Tools: Power BI, Excel.
Ever wondered how the number of pieces per Lego set has
changed over the last 70 years? I did, and so I used some
Lego data to find the answer (along with the answers to some
other questions...).
Languages: Python.
Dielectric materials are used in all sorts of electrical
components. I investigated the temperature at which the
behaviour of these dielectrics changes, and what effect this
has on the energy barrier for molecules within the
dielectrics.
Languages: Python.