The original video.
FEM Distribution of wires
For various reasons, my friend needed to know the precise distribution of wires inside a cable.
This cable consists of 6 bundle of wires twisted together around a central column. Each bundle consists of 416 strands of wires, and within each bundle these 416 wires are wrapped among themselves in a helical fashion as well.

In the video above, the coloured lines denote the sheaths wrapping around each bundle. The dot traces out the expected path of a single strand of wire within the cross-section of the cable as we move up the cable.
The problem was, a circular rotation pattern won't work for all 416 strands, as some will collide with the sheath while others will leave a lot of empty spaces near the corners of the sheath.
So the first attempt to even out the distribution of wires was to start with a rotating circle, tessellated with 416 smaller circles, and then apply elliptical grid mapping to transform it into a square. (This is not the only way to "square a circle". Other types of circle-to-square transformations can be found here.)

The square can then be mapped into the shape of the sheath (called a sextant).
This is done using an area-preserving transformation, to minimize the amount of distortion to the density distribution of wires.
This gives the following expected distribution of wires as we move up the cable.
But this is still quite far from our intuitive expectation of the distribution of wires inside the cable: It is still empty near the corners and too tightly packed near the centre of each bundle.
So my next idea was to pretend the centre of each wire was a charged particle, repelling them from each other.
To simplify the process, we consider the sextants to be identical copies of each other so that we can focus on only a single sextant.

Using a technique called overrelaxation to speed up the simulation, the distributions of the centre of the wires at each "slice" of the bundle were obtained.
But if we scan across the bundle, we will notice something wrong:
The wires "jumps" and swap location with each other when we scan across the bundle!
This is because we haven't implemented any restriction on how close the wire in the previous slice should be to the next slice. Therefore to achieve the lowest energy configuration, the distribution of wires in one slice may be very different from that of a neighbouring slice.
To fix this issue, we can extend the repulsive force to neighbouring slices as well. Let's see how that turns out:

Well, that ... wasn't supposed to happen. Everything went out of bounds. The force was too strong with this one.
The repulsive force made everything go out of bounds.
Reducing the repulsive force would only make each simulation more time consuming to run. But we tried it, and it still failed to converge.
So we removed the repulsive force and added an attractive force, pulling the centre of wires in neighbouring slices together. Still, no luck. Run it for long enough and the same divergence happened.
After numerous time-consuming experiments, and days of pouring over the data, I found out the source of the problem:
We thought the repulsive force should get exponentially stronger as we approach the centre.
Instead, a more mellow force profile (seen below) will lead to convergence more quickly:

The original profile (left) gives a solution with a very small region of stability: when the wires are far apart, they will repel each other and stay relatively motionless; but if any wire wanders close enough to another wire, it will be launched away from the latter with excessive force, barrelling into other wires, and causing a chain reaction that leads to chaos.
Whereas the newer force profile (right) has no such problem.
This gives rise to the following result:
Another interesting feature about my simulation is that it is memoryless, i.e. the wires "forget" the momentum it has gathered in the previous steps. Therefore it behaves like particles pushing on each other in a damping medium, like a viscous fluid.
What I'm doing in this model is basically an FEM simulation (Finite Element Modelling), but at a much faster speed, and a much lower computational cost, leveraging on NumPy's array operation abilities and branchless programming.
And if you're wondering how does the entire model look in 3D ...
And here's a static screenshot of the same bundle of 417 wires in higher resolution:

Well, all I can say is, the wires are too dense for us to see clearly what's going on in 3D. But at least we can see their outlines follows a spiral pattern.
Fourier transform a video
Some years ago, while learning about Fourier transform, I was enamoured of the fact that every curve has a "dual".
Therefore, when I saw the Arctic Monkey's music video "Do I Wanna Know", I trialled my Python skills with the challenge of Fourier transforming that video.
Play the following two videos simultaneously and you will see what I mean.
The fourier transformed version of the video.
Each frame of the video is obtained by taking the fourier transform of the curve of the original video's frame, then removing the imaginary part. The left side of the frame corresponds to low frequencies (long wiggles) and the right side of the frame corresponds to high frequencies (short wiggles).
It follows the same art style, speed and layout.
I stopped before a second line started to appear on the original video. Including the second line would have made the project too complex to manage using the python skills I had at the time.
Machine learning classification of products
Another friend of mine needed to manually classify up to several tens of thousands of products into the correct product categories, using only a simplified name given by the supplying company.
He is given a list of product ID's with their shortened names and descriptions, and is told to fit them into around ≈100 different predetermined categories.
Product ID | Name | Description | required category (has to be manually inputted) |
---|---|---|---|
1445245 | Savage Rainbow sequin festival wings | Body Jewellery | Accessories |
1418947 | Yoki Straw Circle Cross Body Bag | nan | Bags ( Days bags and Backpacks) |
1408225 | Sass & Belle swan salt & pepper shaker set | Living | Bar and Kitchen Utensils |
1492224 | Miss Patisserie Mango Bath Slab | Beauty Body | Bath Fizzers and Bombs |
1478389 | Hollister braided leather belt | nan | Belts |
1459198 | Monki scoop neck bikini top in black | Swimwear | Bikini |
1407710 | ASOS DESIGN white pinstripe suit blazer | nan | Blazer |
1356938 | Laura Mercier Natural Cheek Colour Blush - Chai | Beauty Base | Blusher |
The rightmost column is empty when the data is given.
The current practice is to manually label some of them, and then use fuzzy matching to match the rest. However, fuzzy matching has a spectacularly poor track record, reaching classification accuracy of ... drum roll please... a whopping 50%.
I suggested that perhaps fuzzy matching is an incorrect solution to the problem, and that he should instead use machine learning, which will have a much better effect as it can take advantage of existing information.
The machine learning algorithm extracts the name and description of the product, break it down to individual words, and learn the correlation between individual words and each category.
For example, the word "blazer" may appear in only a few products, all of which are categorized as "Blazer". Therefore the machine-learning model will correlate the word "blazer" strongly with the category "Blazer".
However, the word "short" may also appear in products that aren't wearable shorts (e.g. "short dress"). Therefore the word "short", on its own, should not be strongly correlated with the category "shorts".
4 different machine learning techniques were tested: SVM, logistic regression, random forest, and naïve Bayes. In the end, (linear) SVM was chosen as it came out on top with the highest accuracy, correctly predicting the category of the product 98% of the time. It also has the advantage of being the fastest among all of them and has no extraneous parameters to be tuned to optimize its performance: it works straight out of the box.
You can see a demonstration of its ability below. After being trained on 7211 correctly labelled product entries, the model was given the following unlabelled data:
Product ID | Name | Description |
---|---|---|
1482522 | Bershka slim fit jersey shorts in blue stripe | Shorts |
1373686 | Jeepers Peepers diamond sunglasses with blue lens | Mens |
1407648 | Moss Copenhagen button down maxi smock dress in vintage floral | Dresses - Day |
1460135 | ASOS DESIGN Plus fluro denim jacket in green | nan |
1449771 | Pour Moi Hot Spots ditsy padded underwired bikini top in black multi | Fuller Bust Bikinis |
1460208 | New Look Petite stripe linen crop trouser in cream pattern | Casual Bottoms |
1443942 | Boohoo crinkle high waist bikini bottoms with frill in khaki | Swimwear |
1465396 | Reclaimed Vintage inspired t-shirt with crystal print | Jersey Tops |
Since it has learned the correlation between words in the name and description and the categories, it can categorize each element in the table above, assigning it a possible category and the confidence score for it to be categorized as that category.
Product ID | Most likely category | Confidence score | second most likely category | Confidence score |
---|---|---|---|---|
1482522 | Shorts | 3.888540496683325 | Swimwear | -0.791359085767619 |
1373686 | Sunglasses | 1.3341932961661953 | Leggins / Jeggings | -0.9093674148743226 |
1407648 | Dresses | 2.1543932602501075 | Skin Care with Glass | -1.0688194890538223 |
1460135 | Coats / Jackets | 1.249923199928676 | Jewellery | -0.9958331476526012 |
1449771 | Bikini | 1.0576986345236594 | Skin Care with Glass | -1.0439104987337329 |
1460208 | Trousers (casual) / Jeans | 0.5250788065435212 | Pyjamas | -0.7996031592490555 |
1443942 | Bikini | 1.5026838844402144 | Swimwear | -0.5166287623573032 |
1465396 | T-shirts / Tops / Polos | 1.5382531417881091 | One Piece / Jumpsuit / Body Suit | -1.0075232111688528 |
This is great ... we spend months trying to manually label the data ... and this program takes only [10 seconds]!