Motivation
Let me start with a very short story.
I did my first project involving Computer Vision when I was 15 years old, completely fascinated by technology and by creative solutions to all kinds of problems.
At the time, I thought it would be cool to turn my PC into a touchscreen device, so I took the naked LCD panel and diffuser layer from an old screen and built them into a cardboard box. I also disassembled my webcam and replaced its RGB filter with a makeshift infrared filter made from the black disk of an old floppy disk. The sketchy IR camera, together with a few IR LEDs, was placed inside the box, and whenever I touched the diffuser, it reflected the IR light back to the camera sensor.

Using CCV from the since-vanished company NUi Group (https://github.com/nuigroup/ccv2), I calibrated the four corners of the screen. Together with the TUIO mouse driver, that was enough to track my fingers and use them as multi-touch input.
I can't really describe what it feels like as a teenager to build a touchscreen PC for exactly $0. That small project opened a huge window for me. It showed me that cameras are not only for recording fun and memorable moments - they can also be used to build things, solve problems, and interact with the world in completely different ways.
Fast forward to 2026
A little over a decade later, I graduated in Media Informatics and Visual Computing, and I now have almost 8 years of combined professional experience in 3D design, product development, and Java development. The first satisfied my love for DIY projects, the latter my love for IT.
In a way, Computer Vision as my ultimate career goal feels like the combination of those two worlds. Cameras and image processing have a very strong connection to the real world, especially if you consider Computer Vision as part of robotics - and that is exactly the field I am absolutely in love with.
However, having experience only from my studies is a turn-off for companies looking to hire a Computer Vision Engineer.
Nine years ago, I managed to get a 3D designer job working with SolidWorks after just two weeks of practicing day and night, turning my hobby and personal interest into a profession. Computer Vision is of course a more complex topic, but I am convinced that with the same motivation and enthusiasm, the same thing can happen again.
About the Computer Vision Code-Along series
Let's climb this mountain together. Follow along if you're interested.
If you are in a similar situation and want to work in this field, helping the world with your own vision and your computer's vision, stick with me. In this series, I'll be working on three kinds of projects: Kaggle competitions, real-life problems, and totally made-up problems that nobody ever asked for a solution to - let's call those fun projects.
The focus of every project is to learn something new, gain experience, and overcome problems, whether they are technical or simply a matter of missing skills.
What to expect and what not to expect
This series is about modern Computer Vision, using neural networks in the first season and vision transformers (ViT) in the second. Some basic but solid knowledge of traditional Computer Vision methods is required to keep up.
It is not a shortcut to expertise in modern Computer Vision. Expect a rather slow pace, and don't expect to find the best possible solutions here. That is exactly the point of this series: you're learning with me, but more importantly for yourself. Think, code, debug, experiment, and let others know in the comments if you came up with a different solution.
Over the next few months, at roughly 1-2 episodes a week, we'll go through different Computer Vision techniques and work on projects related to them in a learning-by-doing manner. If your learning style is very theory-first, this series might not be the perfect fit for you, although I still recommend following along, because we'll talk about theory as well.
You'll also get full transparency into my technical struggles. At first glance, some parts may feel redundant, but these insights are part of the journey. This is not a course. It is a series of blog posts aimed at exploring, learning, trying different paths, and gaining experience in this field.
If you stay with me until the end, you'll hopefully become the proud owner of a beautiful GitHub repo and gain insight and experience in modern Computer Vision.
Where to start
Depending on your learning style and your starting point, there are different ways to begin, but the most important step is to get familiar with OpenCV.
If you are completely new to Computer Vision, I strongly recommend building solid foundations in traditional Computer Vision first.
For complete beginners, I also made a small Jupyter Notebook as an appetizer that showcases OpenCV filters using nothing but your webcam. You can find it here: https://github.com/slelo/CVA-S0E0-OpenCV-Playground
Once you're comfortable with these basics, I wholeheartedly recommend - and kind of require - completing the Deep Learning Specialization by Andrew Ng on Coursera (https://www.coursera.org/specializations/deep-learning). It gives you a solid understanding of what is happening under the hood, and the assignments also make you implement many of those ideas yourself.
I'll be using PyCharm as a development environment and Python 3.10 and 3.11 by default for compatibility reasons. If we use other tools in later projects, I'll let you know.
Foreshadowing
In the next episode, we'll use U-Nets for image segmentation in an inactive Kaggle competition. Until then, you can read more about them here: https://towardsdatascience.com/understanding-u-net-61276b10f360/
Please make sure you have a basic understanding of Convolutional Neural Networks. To build better intuition, I also recommend reading about AlexNet, ResNet, and MobileNet, and learning how they work and why they became so popular. This video and the following ones in the playlist will help: https://www.youtube.com/watch?v=-bvTzZCEOdM&list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF&index=12
The next episode will be linked here when it's ready.
Thank you for reading, and your thoughts are more than welcome in the comments.











