Finally completed the training of my GPT (decoder-only) model. It's a tiny language model with 167k parameters, trained on the screenplay of the Titanic movie. The results are not great, but they're fine (rough sketch of the setup below).
Also looked a bit into LLM fine-tuning and stuff
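For scale, here's roughly what a ~167k-parameter decoder-only model can look like. This is just a sketch; the vocab size, context length, and layer sizes below are my guesses, not the actual config:

```python
import torch
import torch.nn as nn

# Assumed sizes: the post only gives the ~167k total, not the config.
vocab_size = 80    # distinct characters in the screenplay (guess)
block_size = 128   # context length (guess)
n_embd, n_head, n_layer = 64, 4, 3

class TinyGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, n_embd)
        self.pos_emb = nn.Embedding(block_size, n_embd)
        layer = nn.TransformerEncoderLayer(
            d_model=n_embd, nhead=n_head, dim_feedforward=4 * n_embd,
            batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layer)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):
        B, T = idx.shape
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(T, device=idx.device))
        # causal mask so each position only attends to earlier ones (decoder-style)
        mask = torch.triu(torch.full((T, T), float("-inf"), device=idx.device),
                          diagonal=1)
        return self.lm_head(self.blocks(x, mask=mask))

model = TinyGPT()
print(sum(p.numel() for p in model.parameters()))  # ~168k with these guessed sizes
```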
Cya;)
Day 99 of ML
Worked some more on improving APIs and models
Implemented multi-head attention and blocks in the transformer, but the model won't train locally due to limited GPU; I'll try training it on a cloud GPU tmrw
(Btw here's the code ->)
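The attachment isn't reproduced here; as a generic stand-in, a multi-head self-attention module plus a transformer block looks roughly like this (all sizes are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_embd, n_head, block_size = 64, 4, 128  # assumed sizes

class MultiHeadSelfAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # project to queries/keys/values
        self.proj = nn.Linear(n_embd, n_embd)      # re-combine the heads
        self.register_buffer(
            "tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(n_embd, dim=-1)
        # split channels into heads: (B, n_head, T, head_dim)
        q, k, v = (t.view(B, T, n_head, C // n_head).transpose(1, 2)
                   for t in (q, k, v))
        att = (q @ k.transpose(-2, -1)) * (C // n_head) ** -0.5
        att = att.masked_fill(self.tril[:T, :T] == 0, float("-inf"))  # causal
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(B, T, C)  # merge heads back
        return self.proj(out)

class Block(nn.Module):
    """Transformer block: communication (attention) then computation (MLP)."""
    def __init__(self):
        super().__init__()
        self.sa = MultiHeadSelfAttention()
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(),
            nn.Linear(4 * n_embd, n_embd))
        self.ln1, self.ln2 = nn.LayerNorm(n_embd), nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.sa(self.ln1(x))      # residual connections
        return x + self.mlp(self.ln2(x))

x = torch.randn(2, 16, n_embd)
print(Block()(x).shape)  # torch.Size([2, 16, 64])
```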
Cya;)
Day 97 of ML
More hard work towards model deployment and testing (also might learn AWS, since it's needed for work)
More progress on building the transformer; starting to implement attention tmrw. Btw, this is how the results look after a little training
Cya;)
Day 96 of ML
Worked on improving previous models and data pipelines
More head-scratching on building the transformer from scratch
I'm not doing much because I have college from 9-5 daily, then internship work, and only after that can I study (I think I need to grind more)
Cya;)
Day 95 of ML
(getting back on track)
Finally completed the entire API with FastAPI and Docker, and also deployed the models (feeling very satisfied rn; built a big thing from scratch; a serving sketch is below)
Got back to LLM study and started coding, going with a decoder block that writes quotes
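Not the actual API, just a generic sketch of the serving pattern with FastAPI (the endpoint name, input fields, and TorchScript load are all made up):

```python
# Minimal FastAPI serving sketch. Everything here is a placeholder,
# not the real API from this project.
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = torch.jit.load("model.pt")   # assumes a TorchScript export exists
model.eval()

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest):
    with torch.no_grad():
        x = torch.tensor(req.features).unsqueeze(0)  # batch of one
        y = model(x)
    return {"prediction": y.squeeze(0).tolist()}

# Run with: uvicorn main:app --reload
# The Docker image just needs fastapi, uvicorn, and torch installed.
```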
Cya;)
Day 94 of ML
PRETTY BIG DAY, as I completed ~80-85% of the API work; now just minor things and some tweaking are left (learnt FastAPI in the process)
Watched the Karpathy Transformers video for the second time, and now I understand more of the concepts (need to start coding now)
Cya;)
Day 92 of ML
Worked on fixing the broken learning rate of one of the models (still unfixed; suggestions welcome)
Also learnt about how tokens communicate inside self-attention and coded it (still don't know why sensei showed the 2nd method just for historical purposes when the 1st was easier; sketch below)
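For reference, the "communication" trick in two equivalent forms (tiny made-up tensors): a lower-triangular matmul average, and the masked-softmax version. The softmax form is the one that generalises, because the zeros can become learned affinity scores, which is exactly what attention does:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, C = 1, 4, 2            # batch, time, channels (tiny, for illustration)
x = torch.randn(B, T, C)

# Version A: average over all previous tokens via a lower-triangular matmul.
wei = torch.tril(torch.ones(T, T))
wei = wei / wei.sum(1, keepdim=True)   # each row sums to 1
out_a = wei @ x                        # (T, T) @ (B, T, C) -> (B, T, C)

# Version B: the same result via masked softmax. This is the form that
# scales up, since the zeros can be replaced by learned scores later.
scores = torch.zeros(T, T)
scores = scores.masked_fill(torch.tril(torch.ones(T, T)) == 0, float("-inf"))
out_b = F.softmax(scores, dim=-1) @ x

print(torch.allclose(out_a, out_b))    # True
```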
Cya;)
Day 91 of ML
Travelled back to uni and started the grind, but my charger broke and now I have to fix it
Still managed to complete the data-processing part for GPT (sketch below)
Decided to build it alongside the lectures rather than only watching them; also building it on a different dataset
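The data-processing step for a char-level model boils down to this (a sketch; the filename and the 90/10 split are assumptions):

```python
import torch

# Char-level data prep sketch; "input.txt" and the split ratio are assumptions.
text = open("input.txt", encoding="utf-8").read()
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

data = torch.tensor(encode(text), dtype=torch.long)
n = int(0.9 * len(data))
train_data, val_data = data[:n], data[n:]      # hold out the tail for validation

def get_batch(split, block_size=128, batch_size=32):
    d = train_data if split == "train" else val_data
    ix = torch.randint(len(d) - block_size, (batch_size,))
    x = torch.stack([d[i:i + block_size] for i in ix])
    y = torch.stack([d[i + 1:i + block_size + 1] for i in ix])  # next-char targets
    return x, y
```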
Cya;)
Day 90 of ML
Worked on data preprocessing and the API, and learnt about parallel processing for work
Learnt some assembly and wrote my first code in it
Learnt some things about attention; will start coding it tmrw (ig coding it is the best way to learn it)
Cya;)
Day 88 of ML
Mostly caught up with work (models, APIs, etc.)
Learnt more about RNNs (seq2seq and non-seq2seq concepts)
Also got a little info on attention (found this great vid, do watch)
Cya;)
Day 87 of ML
Fixed all the bugs and issues in the API (really happy about that; I had almost given up hope, jk)
Learnt some more about RNNs. I was thinking of implementing one on my own (with minor help); is that a good idea?
Ain't feeling too good today; will continue tmrw
Cya;)
Day 86 of ML
Worked a lot on the API and finally solved a major bug that was bugging me (now it shouldn't take much longer to finish)
Also continued learning about Transformers, learnt a little about RNNs (FAFO), and found a great lecture on them; will continue with it
Cya;)
Day 84 of ML
Reimplemented the entire DistilBERT model because I couldn't find the issue (and the issue still persists) :(
Also, the wifi is down again so I can't study. Today was so gloomy (just one thing happening after another), so I'll just read some manga or sleep
Cya;)
Day 83 of ML
Rebuilt the previous model with a different approach (changed the encoder and stuff), but now the model is overfitting :( will have to try another approach
Also started the GPT part of the series; really excited to build it...
Day 82 of ML
Completed the data pipeline for one of the models (overcame all the bugs to finally build the API)
Started working on another one's
Finally implemented the WaveNet part of wordmore, but couldn't train it completely (it was taking a lot of time; will do it tmrw)
Cya;)
Day 81 of ML
Handled a lot of data processing for the API today until the data changed (so I did it again), but yeah, the models are now training over the API
Used some techniques you guys suggested (they're working)
I remember wordmore, and I swear I'll implement WaveNet tmrw
Cya;)
Day 80 of ML
Made the routes for the API and also set up the Docker image (it took a lot of time since I had zero idea about this)
PyTorchified the wordmore codebase (added functions)
Guys, I'm experiencing low productivity these days; any suggestions are welcome!
Cya;)
Day 79 of ML
Worked on the deployment of the DL models. Only the other major parts remain now, but I've at least figured out the basic stuff.
Couldn't work on WaveNet today; got too busy with the above stuff, and I'm a little confused about the dims, so I'm planning to take it slowly
Cya;)
Day 77 of ML
Learnt about Docker today and made progress in FastAPI: built an API with an endpoint
Did some more backprop in the code (I think I'm good at this)
Started the WaveNet part; will build with it tmrw
Also, it's a coincidence that I'm posting about day 77 on 7/7 XD
Cya;)
Day 76 of ML
Got confused about whether to use Flask or FastAPI and ended up learning about both (spent more than half the day figuring this out)
Completed some more backprop exercises; this time I did most of the backprop on my own, without help (except at the end)
Cya;)
Day 75 of ML
Rebuilt the entire transformer model I made a few days ago to find its shortcomings and improve it (took the entire day, but it was worth it + I understood most of the code this time)
Advanced on my way to becoming a backprop ninja (this is really helping me out)
Cya;)
Day 73 of ML
Learnt about batch normalisation, implemented Kaiming initialisation for the weights, and added batch norms to the layers, bringing the loss down to ~2.05 (still need to work on that; sketch below)
Researched flagging strategies and stuff (work)
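Roughly what that looks like in PyTorch (a sketch; the layer sizes are assumptions, not the actual model's):

```python
import torch
import torch.nn as nn

# Sketch of a Linear -> BatchNorm -> Tanh stack with Kaiming init.
# Sizes are assumptions, not the real model's.
n_embd, n_hidden, vocab_size = 30, 200, 27

layers = nn.Sequential(
    nn.Linear(n_embd, n_hidden, bias=False),  # bias is redundant before BatchNorm
    nn.BatchNorm1d(n_hidden),
    nn.Tanh(),
    nn.Linear(n_hidden, vocab_size),
)

for m in layers:
    if isinstance(m, nn.Linear):
        # Kaiming init keeps activation scale roughly constant through tanh layers
        nn.init.kaiming_normal_(m.weight, nonlinearity="tanh")

x = torch.randn(32, n_embd)
print(layers(x).std())  # activations stay in a healthy range at init
```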
Cya;)
Day 72 of ML
Learnt about activations, gradients, initialisation, and a lot about batch norms (sensei is a god for explaining these things so simply)
Did some research and compared model results
Will apply everything I learnt tomorrow
Cya;)
Day 70 of ML
Made and tested multiple models today (for work; took almost the entire day)
Resumed the MLP part of makemore (watched the lecture; implementation left)
Finally watched the champions lift the trophy.
LET'S GOOOOOO!!!!
Cya;)
Day 69 of ML
Completed the wordmore model implementation in Go (NN and MLP left)
Made a model using AutoGluon, and man, it is great (somehow feels like cheating though)
Also didn't realise today was the 69th day; I could've done smthn special. Missed opportunity, smh
Cya;)
Day 68 of ML
Started the makemore implementation (wordmore actually, with English words) in Golang, and just the starting parts gave me a hard time
Played around with transformer models (work); also explored other modelling tools like AutoGluon, etc.
Finally some IND vs ENG
Cya;)
Day 67 of ML
Added matrix multiplication and other minor functions to the Go autograd (thinking of taking it multidimensional)
Revised the makemore model (also planning to make it in Go)
Did a lot of model training, tuning, and debugging for the work model
Cya;)
Day 66 of ML
Added layers, an MLP, and other minor functions to the autograd project
Implemented the first Karpathy NN video in Golang (the project is complete up to there)
Did a lot of data processing and model-search research for internship tasks
Ended the day with rest
Cya;)
Day 65 of ML
Added a whole neuron func and some other minor functions to the model; also, half of the layer is done
Improved the file structure and created its packages (I had no idea about packages (skill issue))
Secured an internship
Ended the day doing internship tasks
Cya;)
Day 64 of ML
Added activation functions (sigmoid, ReLU, tanh) to the Go autograd (suggest a good name for it)
Created the backward function to backpropagate through all nodes (layers and MLP left)
Ended the day studying const pointers vs pointers to const
Cya;)
Day 62 of ML
Learned the theory part of MLP makemore (context window, embeddings, etc.)
Couldn't build it though; somehow slowed down today
Will come back tmrw!!!
Cya;)
Day 61 of ML
Added a neural network to newsmore (single layer), but the loss is the same (an NN with 1 layer is pretty useless imo)
It's yapping gibberish; I'll add layers tmrw and see how it goes
Also got some ideas for it; will implement them in later days
Got back pain
Cya;)
Day 60 of ML
Started working on makemore
Learnt about making a bigram model, sampling from a multinomial distribution, and predicting characters (below is a set of headlines made by the model; a sketch of the idea follows)
Will make the NN-based version tmrw
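The bigram idea in a nutshell: count character pairs, normalise the counts into probabilities, and sample the next character with torch.multinomial. A minimal sketch with a stand-in word list (the real dataset was headlines):

```python
import torch

words = ["hello", "help", "hold"]           # stand-in corpus, not the real data
chars = sorted(set("".join(words)))
stoi = {ch: i + 1 for i, ch in enumerate(chars)}
stoi["."] = 0                               # '.' marks the start/end of a word
itos = {i: ch for ch, i in stoi.items()}
V = len(stoi)

# Count how often each character follows each character.
N = torch.zeros((V, V))
for w in words:
    seq = ["."] + list(w) + ["."]
    for a, b in zip(seq, seq[1:]):
        N[stoi[a], stoi[b]] += 1

P = (N + 1) / (N + 1).sum(1, keepdim=True)  # smoothed row-wise probabilities

# Sample a new "word" one character at a time.
g = torch.Generator().manual_seed(42)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```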
Cya;)
Day 59 of ML
Today was so scattered that I can't even recall what I did throughout the day.
Played around with tokenizers and stuff, experimented with Mojo, and finally settled on Golang.
I need to make a proper schedule and stick to it ig.
Cya;)
Day 58 of ML
Studied attention, embedding matrices, etc.
Also learnt about tokenizers (this sh*t is not as easy as it looks)
Couldn't do hyperparameters (since some of you suggested Bayesian, guess I'll pivot to that)
Will cook tomorrow
Cya;)
Day 57 of ML
Learnt about Transformers (intro to attention); gotta learn more since I'm making a project around them
Also continued the hyperparameter study (types of hyperparameters)
Couldn't focus after all the things that happened, will watch anime and sleep now
Cya;)
Day 55 of ML
Finally, the exams ended and I returned home (to a lot of free time)
Learnt more about tuning and stuff
Dealt with some important work
Decided on a possible future project
Cya;)
Day 54 of ML
Completed the previous model today and am done with it
Back to learning hyperparameter tuning
Learnt how to choose batch size, what training throughput is, and how batch size affects other hyperparameters
More about it to come...
Cya:)
Day 53 of ML
Finally managed to get the model to train by creating dataloaders for the dataset and then training over them (~72%)
Have to tune the hyperparameters
Learnt about dataloaders, batches, etc. Actually, I've learnt a lot about these things in the last few days (sketch below)
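For reference, the dataloader pattern (a generic sketch, not the actual pipeline; shapes and hyperparameters are made up):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Generic sketch: wrap tensors in a Dataset, let the DataLoader handle
# shuffling and batching. All shapes here are made up.
X = torch.randn(1000, 20)                 # 1000 samples, 20 features
y = torch.randint(0, 2, (1000,))          # binary labels
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = torch.nn.Sequential(torch.nn.Linear(20, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(5):
    for xb, yb in loader:                 # one mini-batch at a time
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
```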
Cya;)
Day 51 of ML
Learning about fine-tuning models through the playbook and applying it
Learning about Transformers and how to train models on numeric and text data for a project (any resources for this are welcome)
Cya;)
Day 49 of ML
Played around with regularisation methods (L1, L2, dropout) and implemented them all together in the model (turned out inefficient)
Read Karpathy's steps for training a model (will implement them tmrw)
Will participate in a competition tmrw; suggest a good one
Cya;)
Day 48 of ML
Learnt the concept of regularisation: L2, L1, dropout, and many more, and found a great resource
Replaced dropout with L2, and the model is performing a tiny bit better (tiny)
Real question: can we use dropout and L2 together?
Resource below
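On the question above: yes, they're routinely combined. Dropout lives inside the model, and L2 comes in as the optimizer's weight_decay. A minimal sketch (sizes and values are assumptions):

```python
import torch.nn as nn
import torch.optim as optim

# Dropout is part of the model; L2 regularisation is the optimizer's
# weight_decay term. Layer sizes and hyperparameters are made up.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.3),        # randomly zero 30% of activations during training
    nn.Linear(64, 2),
)
# weight_decay adds an L2 penalty on the weights to the update
opt = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```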
Cya;)
Day 47 of ML
Tried improving yesterday's Kaggle model by adjusting hyperparameters and adding neurons and layers, but the result is the same (please suggest how to improve it)
Studied data science concepts for tomorrow's exams (great revision)
Will grind harder tomorrow
Cya;)
Day 46 of ML
Participated in the Kaggle competition and made the model myself; used the wrong optimizer at first, but the results improved after changing it (took a lot of time). Will improve it more tomorrow
Also completed the implementation of the neuron, layers, and MLP
Cya;)
Day 45 of ML
Completed the neuron, layers, and MLP part of the NN video; I'll implement it in code tomorrow
Got the TOC exam tomorrow, so I spent most of the day studying for it
Will do more tomorrow
Cya;)
Day 44 of ML
Implemented backpropagation from scratch in Python; it took a lot of time + notes (pen + paper), but it was all worth it (Karpathy sensei is a god fr; sketch below)
Will implement the rest of the layer and MLP tomorrow
Was thinking of implementing it in CPP; should I do it?
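The core of that from-scratch backprop, as a minimal micrograd-style sketch (not the exact code from that day):

```python
import math

class Value:
    """A scalar that remembers how it was made, so gradients can flow back."""
    def __init__(self, data, children=()):
        self.data, self.grad = data, 0.0
        self._backward = lambda: None
        self._prev = set(children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad                # d(a+b)/da = 1
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t ** 2) * out.grad  # d tanh/dx = 1 - tanh^2
        out._backward = _backward
        return out

    def backward(self):
        # topological order: children before parents, then replay in reverse
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for c in v._prev:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# One neuron: tanh(w*x + b), then backprop through it.
x, w, b = Value(0.5), Value(-3.0), Value(2.0)
y = (w * x + b).tanh()
y.backward()
print(x.grad, w.grad)
```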
Cya;)
Day 42 of ML
Have a computer networks exam tomorrow, so I studied for it almost the entire day
Finally completed ~17.25 hours of the PyTorch course video
Learnt about creating non-linear layers for simple CNN models
Will compensate for the lack of ML tomorrow
Cya:)
Day 41 of ML
Watched CS231n's backprop lecture to understand the basics in depth; I might implement it from scratch
~16.5 hours into the PyTorch course; learnt to make dataloaders, training- and testing-loop functions, and other things
Will revise CNNs before sleep
Cya:)
Day 40 of ML
Got ~16 hours into the PyTorch course and am now learning computer vision basics
Learnt about dataloaders, the flatten layer, and other layers in the model
Couldn't do much since exams are ongoing and I had to study for tomorrow; will make up for it tomorrow ;)
Cya :)
Day 38 of ML
Did a lot of linear algebra today, since I have an exam on it tomorrow (good for ML)
Made an MNIST model on Kaggle for the competition (needs some improvements; tomm)
No PyTorch today, but will do it tomm
also came across this great vid, check it out
Cya :)
Day 37 of ML
Completed ~14 hours of the PyTorch course; learnt about non-linear layers and how to use their functions
Learnt about multi-class classification models and the various activation functions to use (softmax, sigmoid, etc.)
Will take part in competition tom
Cya:)
Day 36 of ML
Almost 10.5 hours into PyTorch; learnt about making classification models
Learnt about 'Sequential'. Had a lot of doubts about how to add activations across the layers, but cleared them up
No CPP today :(
Also stumbled upon a great video by Samson Zhang!
Cya;)
Day 35 of ML
~8.5 hours into the course and I completed the second part; learnt about losses, optimisers, and other stuff
Tried building my own model on my laptop; had to cut 1.1 million rows down to 10k rows (CPU issue)
Finally some CPP to end the day
Cya ;)
Day 34 of ML
6 hours into the PyTorch course; learnt the fundamentals of tensors
Learnt about losses and optimisers
Made my first model as a class (all the models I made so far were from scratch, as taught in the fast.ai course); adding the train loop tomorrow
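The class-based pattern, for reference (a generic sketch with made-up sizes; the training loop at the end is the part planned for tomorrow):

```python
import torch
from torch import nn

# Generic sketch of the class-based style; sizes and data are made up.
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(in_features=1, out_features=1)

    def forward(self, x):
        return self.linear(x)

model = LinearRegressionModel()
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.rand(100, 1)
y = 3 * X + 0.7                       # toy target the model should recover
for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```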
Cya;)
Day 33 of ML
Decided to learn PyTorch in detail and found this video
Learnt everything about tensors and the various functions for matmul and more (can't build stuff due to bad internet + compute issues)
Finally ended the day with regular CPP.
Cya ;)
Day 32 of ML
Spent half the day figuring out hidden layers and neurons until I made one from scratch
Added the layer to the model, but the laptop's memory ran out
Eureka moment when I saw how to make an NN with classes in PyTorch
Finally some CPP to end the day
Cya ;)
Day 30 of ML
Completed the fast.ai Part 1 course today
Learnt about convolutions
Also learnt about pooling and dropout
Since the course is done, I'll take it slow from here; maybe rewatch some parts and focus on building now
Finally, a CPP quiz to end the day
Day 29 of ML
Did a lot of CPP; spent more than half the day on it
Continued fast.ai lesson 8
Learnt about embeddings and making an embedding function from scratch in PyTorch
Will dig in more tomorrow!!!
Day 28 of ML
Did CPP for most of the day
Continued the 3b1b NN series; watched the backprop video and the calculus behind it
I was thinking of making an NN in CPP, but the calculus part looks harsh; is there a CPP library to deal with these derivatives?
Also, is it viable to make?
Day 27 of ML
Studied weight decay
Learned about NNs from 3b1b, since I wanted to understand them in more depth
For the next few days, I've decided to take things a little slow and implement, as well as learn in depth, what I've studied so far.
Now I gotta watch Demon Slayer
Cya :)
Day 26 of ML
Learnt about collaborative filtering, a very good way of creating recommendation systems (sketch below)
Got to know the importance of bias terms
Will focus on creating basic neural networks
Would have slept today and not done anything if not for this streak.
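The core idea as a sketch: each user and each item gets a learned embedding, a prediction is their dot product, and the bias terms capture "this user rates everything high" and "this item is just popular" (all counts and sizes are made up):

```python
import torch
from torch import nn

# Dot-product collaborative filtering sketch; counts and sizes are made up.
n_users, n_items, n_factors = 100, 50, 16

class CollabFilter(nn.Module):
    def __init__(self):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, n_factors)
        self.item_emb = nn.Embedding(n_items, n_factors)
        self.user_bias = nn.Embedding(n_users, 1)   # per-user rating tendency
        self.item_bias = nn.Embedding(n_items, 1)   # per-item popularity

    def forward(self, user, item):
        dot = (self.user_emb(user) * self.item_emb(item)).sum(dim=1)
        return dot + self.user_bias(user).squeeze(1) + self.item_bias(item).squeeze(1)

model = CollabFilter()
users = torch.tensor([0, 1, 2])
items = torch.tensor([10, 4, 7])
print(model(users, items))   # predicted ratings for three (user, item) pairs
```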
Cya :)
Day 24 of ML
Done with exams, resumed the grind.
Revised the topics I learnt earlier
Made notes in Obsidian
PS: This is my brain XD
Day 23 of ML
Learnt about gradient accumulation (sketch below)
Learnt about GPU usage and the importance of batch size
Learnt about ensembling models
Will learn more of it tomorrow
Have a special thing to say to y'all, but I'll do it in the morning
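Gradient accumulation in a nutshell: run several small batches, scale each loss down, and only step the optimizer every N batches, so it behaves like one big batch that wouldn't fit on the GPU. A generic sketch (model, data, and N are placeholders):

```python
import torch
from torch import nn

# Generic gradient-accumulation sketch; model, data, and N are placeholders.
model = nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 4          # 4 small batches act like one 4x-larger batch

opt.zero_grad()
for step in range(100):
    xb = torch.randn(8, 10)                      # small batch that fits in memory
    yb = torch.randint(0, 2, (8,))
    loss = loss_fn(model(xb), yb) / accum_steps  # scale so grads average out
    loss.backward()                              # grads add up across calls
    if (step + 1) % accum_steps == 0:
        opt.step()                               # update once per accumulated batch
        opt.zero_grad()
```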
Day 22 of ML
Learnt about fastkaggle
Learnt the proper way of analysing problems on Kaggle and the proper way to make submissions
Understood the difference between ResNet and ConvNeXt models (speed vs accuracy)
This college is really cooking me fr
Day 21 of ML
Learnt about OOB error
Learnt about partial dependence
Learnt about bagging and boosting
Will study them again for better understanding
Currently I'm not able to study more since I have a lot of assignments this week (exams next week); I'm extremely sorry
Day 20 of ML
Revised binary splits as they seemed confusing
Started lecture 6 of the course
Learnt about decision trees, Gini, bagging, and feature importance
Will learn about OOB error tomorrow
I am half asleep as I write this
Day 18 of ML
Tried adding a hidden layer to the previously made model (faced errors)
Seeing a lot of buzz around LLMs, I decided to get an overview, so I watched Andrej Karpathy's 'Intro to LLMs'; got a lot to learn.
Was too busy with assignments today; will do it tomm
Day 17 of ML
Made a model and did my first Kaggle competition submission today
Completed ch. 5 and learnt about binary splits
Will revisit some basic data analysis topics before sleep
Wanted to get an overview of LLMs; will do it tomorrow
Day 16 of ML
Revised the previously learnt NN model algorithm (got a bit confused by it) and made notes on it
Completed lecture 5 of the fast.ai DL course
Will study data analysis concepts since I wanna learn the basics
Will train models from scratch on a bigger dataset tomorrow
Day 15 of ML
Reached the halfway point of lecture 5 of fast.ai
Used PyTorch for the very first time in my life
Made a simple linear model from scratch and trained it using gradient descent (sketch below)
Will make another model tomm
Taking a little break today, maybe I'll watch some anime
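That from-scratch exercise is essentially this (a sketch with synthetic data; the actual numbers aren't from the lesson):

```python
import torch

# From-scratch linear model trained with gradient descent (sketch).
# The data is synthetic; the real exercise's numbers are not shown here.
X = torch.rand(100, 1)
y_true = 2.0 * X + 1.0 + 0.05 * torch.randn(100, 1)

w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for epoch in range(200):
    y_pred = X * w + b
    loss = ((y_pred - y_true) ** 2).mean()   # MSE
    loss.backward()                          # autograd fills w.grad and b.grad
    with torch.no_grad():
        w -= lr * w.grad                     # gradient descent step
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()

print(w.item(), b.item())  # should approach 2.0 and 1.0
```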
Day 14 of ML
Made my first NLP model, for tweet sentiment analysis, and the memory ran out
Learnt about making a neural network from scratch
Learnt about tensors in PyTorch and how matrix multiplication works in it
I'll try to implement this and make my own NN tomm
Day 13 of ML
Made my own 'Tweet Sentiment Analysis' model
Collected 3 datasets (2 failed, 1 was good), so I started building it
Faced tons of errors; solved them
Finally trained it until the CPU space ran out, so I'll have to resume tomm
Began fast.ai lecture 5
Day 12 of ML
Couldn't understand NLP well, so I studied it again and understand it better now
Made a matrix multiplication algo in CPP for no reason (most of it is hardcoded; will make it better tomm)
I have data analysis tomm, so I'll study that (a good chance to revise for ML)
Day 11 of ML
Revised some concepts from yesterday's lesson
Studied NLP; got introduced to Transformers (will learn them in detail)
Learnt the concepts of tokenization and numericalization
Was caught up in a college assignment, so couldn't do much
I'll continue NLP tomorrow
Day 10 of ML
Deployed my first ML model
Learnt about neural networks in depth (ch. 3: loss, activation fns, gradient descent, etc.) (might do it again)
Huge thanks to @jeremyphoward, who taught NNs in the best way possible
Did some CP problems
Satisfied? Somewhat.
Therefore, will do some CP
Day 9 of ML
MADE another model (eye disease detection; posted it)
Learnt about deploying ML models (fast.ai lecture 2)
Was awestruck to learn that I can make NextJS websites with an ML API (expect some projects soon)
Revised some CPP
Didn't do much today; will try tomm
Day 8 of ML
Rewatched fast.ai lecture 1 to understand everything better
Understood all the concepts, like data blocks, learning, and fine-tuning
Made my own simple model
Tomorrow I'll try to make a more complex model
Is the grind over for today? Maybe not. Will do CP tonight
Day 6 of ML
Learnt about gradient descent and stochastic gradient descent today
Also learnt about backpropagation
(Will learn about all of them in detail)
Finally coded my first neural network using TF.
Couldn't do anything for CP, sadly.
College sucked all the energy from me
Day 5 of ML
Revised CPP (almost 60% of my time), since I'm planning to do CP after a year.
Learnt the basics of deep learning
Learnt the relationship between a neuron and a neural network
Learnt about the layers, the activation function, and finally the cost function.
Day 4 of ML
Forgot to update this yesterday
Learnt about NLP
Learnt about the bag-of-words model
Made my first NLP model
LFG
Day 3 of ML
Made my first reinforcement learning model with Thompson Sampling.
Understood some really essential topics, went through Upper Confidence Bound (UCB), and ultimately figured out that Thompson Sampling is the better choice.
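Thompson Sampling for Bernoulli bandits, as a minimal sketch (the "true" rates are made up for illustration):

```python
import random

# Thompson Sampling for Bernoulli bandits (e.g., picking the best ad).
# The true rates below are made up for illustration.
true_rates = [0.10, 0.15, 0.25]
wins = [0] * 3    # successes per arm
fails = [0] * 3   # failures per arm

random.seed(0)
for _ in range(10000):
    # Sample a plausible rate for each arm from its Beta posterior,
    # then play the arm whose sample is highest.
    samples = [random.betavariate(wins[i] + 1, fails[i] + 1) for i in range(3)]
    arm = samples.index(max(samples))
    if random.random() < true_rates[arm]:
        wins[arm] += 1
    else:
        fails[arm] += 1

print(wins, fails)  # pulls concentrate on the best arm (index 2)
```

Each arm keeps a Beta posterior over its success rate, and sampling from those posteriors balances exploration and exploitation on its own, which is why it came out ahead of UCB in the exercise.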