Okay so like, i need to share this again because as of now no one has indicated to me that they understand just how absurd this is.
https://youtu.be/J7IKKzzApg8?si=04tTU-EeDF-A09mL
This is a zoom of a 41 iteration dragon curve. That's right, 41 iterations. Why? Because I can :D Fun fact: usually when you generate a dragon curve, the memory requirement doubles each iteration, because you are copying and rotating the current curve. Double the segments every time, so at best, without any clever optimization, it's O(2^x).
So how hard is it to make, say, 20 iterations? A long, long time ago, like 10 years, I followed an article that said you kind of have to write it in C so it's more performant, because 20 iterations on their machine took 20 minutes. 2^20 is just over one million (1,048,576).
Now let's be more optimistic. Let's say computation time on a single CPU core has sped up 20 times, so you can generate a 20 iteration curve in 1 minute.
So 21 would take at least 2 minutes.
Now our scaling looks something more like 1m*2^(x-20). If we plug in 21 more iterations to bring us up to 41, well, we already know 2^20 is about one million, so 2^21 would be about 2 million minutes, or roughly 4 years.
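Here's a rough sketch of that copy-and-rotate doubling, tracking only the left/right turns instead of actual coordinates. The function names and the "R first" convention are my own assumptions for illustration, not code from the video.

```python
def dragon_turns(iterations):
    """Turn sequence of a dragon curve built by repeated folding.

    Each fold keeps the existing turns, adds one new turn, then appends
    a reversed and flipped copy of the existing turns, so the list
    roughly doubles in size every iteration.
    """
    turns = []  # 0 iterations: a single segment with no turns
    for _ in range(iterations):
        mirrored = ["L" if t == "R" else "R" for t in reversed(turns)]
        turns = turns + ["R"] + mirrored
    return turns


def segment_count(iterations):
    """A curve with k turns has k + 1 segments."""
    return len(dragon_turns(iterations)) + 1


if __name__ == "__main__":
    for n in range(6):
        print(n, segment_count(n))  # segment count doubles each time
```

Storing the whole list is exactly what blows up: at 41 iterations this list would need about 2^41 entries.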
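If you want to check the napkin math yourself, here's a tiny sketch of the estimate. The 1 minute baseline at 20 iterations is the optimistic assumption from above, and the function name is made up for this example.

```python
MINUTES_PER_YEAR = 60 * 24 * 365


def estimated_minutes(iterations, base_iterations=20, base_minutes=1.0):
    """Naive estimate: the time doubles with every extra iteration."""
    return base_minutes * 2 ** (iterations - base_iterations)


if __name__ == "__main__":
    minutes = estimated_minutes(41)
    print(f"{minutes:,.0f} minutes")
    print(f"{minutes / MINUTES_PER_YEAR:.2f} years")
```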
But guess what. This is rendering in real time. It would be incorrect to say that it generates the whole curve over 60 times per second, but it does generate the visible part of the curve that often.
So what clever optimization trick gets me over a 120 million times speed boost at 41 iterations? Parallelization. See, this isn't just any dragon curve you'd see in C or Java or Python.
Nah, girlie's running on the Unity engine's CG/HLSL shader language. The trick is that instead of generating the curve and placing the segments with that info, we repeat the process that forms the dragon curve backwards. A dragon curve starts with a single segment, and it has 1 midpoint. When folded, it turns into 2 segments and has 2 midpoints.
So what we do is test every pixel, find its closest hypothetical midpoint, and then perform the transformation backwards. Since all the midpoints originate from one segment, each backwards step halves the number of midpoints by merging them on top of each other. This means that after your desired number of iterations, we can simply check whether the pixel we started on collapsed back into a specific midpoint.
To make the midpoints integers, scale everything up: an example would be the line segment from (0,0) to (0,2). In that case the midpoint would be (0,1).
This scaling even allows us to keep track of everything via modulo operations. The vector math ends up working out so that most of the time you're rotating and scaling, but the net displacement from the two is a fixed size vector along a cardinal direction, which, when converted to integer math, is the same as either adding xor subtracting just 1 from your current x xor y coordinate, based on the direction the rotation would take it.
We can add a color gradient by keeping track of the turns in the curve. I don't remember the exact details, but it's literally as simple as left? 0 : right? 1. Then converting that binary number to decimal gives you the position along the curve of the midpoint nearest to that pixel.
Now there are some tricky edge cases, like 0 iterations, or the first and last segment only being one line segment instead of a curve. Additionally, if we try to smooth the color a bit by sampling all 4 directions, we'd get color bleeding between two very different parts of the curve that have been folded into proximity. You could probably weight this based on the proximity between segments, but I don't recall if I ever bothered.
But yeah, it's some really fascinating stuff. I don't fully understand why the binary counting works, but uh, I don't think I need to.
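To make the integer trick concrete, here's a small sketch assuming segment endpoints sit on even coordinates (so every segment has length 2). With that assumption, a midpoint always has exactly one odd coordinate, and a modulo check tells you which way the segment runs. The helper names are hypothetical, and this is only the bookkeeping idea, not the shader itself.

```python
def midpoint(p, q):
    """Midpoint of a segment whose endpoints lie on even coordinates."""
    return ((p[0] + q[0]) // 2, (p[1] + q[1]) // 2)


def orientation(mid):
    """Classify a lattice point using only the parity of its coordinates."""
    x_odd = mid[0] % 2 == 1
    y_odd = mid[1] % 2 == 1
    if x_odd and not y_odd:
        return "horizontal"  # segment runs along x
    if y_odd and not x_odd:
        return "vertical"    # segment runs along y
    return "not a midpoint"  # both even (an endpoint) or both odd


if __name__ == "__main__":
    m = midpoint((0, 0), (0, 2))
    print(m, orientation(m))
```

Python's `%` always returns a non-negative result for a positive modulus, so the parity check also holds for negative coordinates. In HLSL you'd have to handle the sign yourself.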
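For what it's worth, the link between turns and binary numbers is a known property of the dragon curve (its turns follow the regular paperfolding sequence): the turn at corner n depends only on the bit just above the lowest set bit of n. The sketch below shows that property with a made-up function name and an arbitrary L/R convention. It is not necessarily how the shader recovers the index.

```python
def turn_at(n):
    """Turn direction at the n-th corner of the curve (n starts at 1).

    Find the lowest set bit of n, then look at the bit just above it:
    if that bit is set the turn goes one way, otherwise the other.
    """
    lowest = n & -n
    return "L" if n & (lowest << 1) else "R"


if __name__ == "__main__":
    # First seven turns, i.e. a 3 iteration curve
    print("".join(turn_at(n) for n in range(1, 8)))
```

No list of previous turns is needed, which is why this kind of bit math suits a per-pixel shader.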
I've been working on a raymarcher in my free time to practice making my own custom sdfs. And now a friend of mine asked me to make þem a volumetric tornado for Unity HDRP.
...some of you may know my distaste of Unity HDRP >:[
However, þis gave me an excuse to overwrite þe render πpeline wiþ my own handiwork, so I begrudgingly agreed — I didn't really have a choice eiþer way to begin wiþ þough 😅
Here's a couple photos of my current implementation, which will hopefully improve in coming weeks:
...currently one of þe rotation matrices isn't *quite* right, so I'll have to go þrough þose again, but for now I want to rework þe sdf to be more 'tornado-y'
p.s. I may have downloaded ahk for þe sole purpose of replacing th wiþ 'þ'. Do let me know if it's too much, I can turn it off to improve legibility of my posts :D
Honors Project & Dissertation - Generating physically inspired lightning effects in the GPU using parallelization techniques
Finally, time to upload my Honors Project! I really loved working on this, as one of my main interests is physics simulation in graphics programming.
Here's the explanation for this project - which is basically comparing two algorithms for rendering lightning geometry, parallelizing them, and seeing which one is faster.
Keep reading for the explanation! However, if you want the full documentation and breakdown of this project, click here.
Physically based or inspired lightning is not feasible to implement in interactive applications such as games because of the iterative nature of its algorithms, which makes them computationally expensive.
Accurate lightning simulation is described by the Dielectric Breakdown Model (DBM), which involves solving Laplace’s equation. Solving it with the conjugate gradient method (CGM) has been shown to be computationally expensive; however, this method has been substituted with a rational function to produce faster results.
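For reference, this is the usual textbook form of the DBM, written here as a sketch. The symbols (\phi for the electric potential, \eta for the branching exponent, p_i for the growth probability of candidate cell i) follow the common convention in the DBM literature and are not taken from the project's documentation, which may use different boundary conditions.

```latex
% Potential field between the discharge and the ground
\nabla^2 \phi = 0

% Probability that candidate cell i, adjacent to the discharge,
% is chosen as the next growth site (sum over all n candidates)
p_i = \frac{\phi_i^{\eta}}{\sum_{j=1}^{n} \phi_j^{\eta}}
```

The expensive part is that the potential has to be solved again every time the discharge grows by one cell, which is where the CGM cost comes from.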
Additionally, these methods have been tested in single-threaded CPU applications, but there’s evidence that the CGM can be optimized if it is computed in parallel on the GPU.
This research attempts to determine whether it’s possible to improve the performance of physically inspired lightning generation by running a rational function method on the GPU with the aid of parallel multithreading.
The application is developed in C++ using the Direct3D 11 API. It ports open-source code of the CGM and rational method from OpenGL to D3D11. Testing is carried out using a performance benchmark measuring the computation times of the parallelized rational method using a compute shader against its non-parallelized version and the CGM.
Despite the embarrassingly parallel nature of the algorithm, the parallelized version was 3 times slower than its non-parallelized version, albeit 3 times faster than the CGM. Resources consumed by the rational method increased ninefold compared to its non-parallelized version as well.
In conclusion, the parallelized version proved to be slower due to the CPU-GPU data transfer overhead.
A hypothesis for further research is that using group shared memory could compensate for this overhead.
Personally I would've loved to have more time to work on this as I believe I could've proved my hypothesis with group shared memory, but I'm not as knowledgeable yet. I need to study more graphics programming!
The idea is simple:
use a buffer to store a render of the scene that is updated periodically
the execution?
....
I don't want to talk about it...
If I don't continue to have to beat Unity into a pulp every time I do something, you might see another update from me in a few hours of a better effect that makes it harder to spot!