Welcome back! In a previous discussion on optimization, we touched on the importance of optimization within complex systems. To tackle these non-intuitive problems, we were introduced to man-made and nature-inspired techniques. Topping the list of nature-inspired algorithms was Particle Swarm Optimization (PSO) — easily one of my favorite optimization routines and one that we can get up and running quickly.
In this tutorial, we’ll do just that — start with a blank script and build a PSO routine from scratch using the Julia programming language. Before we start, it’s worth noting that you can easily find a pre-built PSO package and implement it into your algorithm pipeline. Personally, though, I find that building my own code helps me understand the algorithm on a deeper level. With that, let’s get started!
Every analysis algorithm is based on some sort of dataset and that’s where we’ll start as well. Since the goal of our optimization routine is to find a global maximum, we’ll want to build a dataset that has both local and global extrema. For our purposes, a combination of sines and cosines will work nicely within a specified window. Let’s begin by building our “toy” data.
For illustration purposes, we’ll define a two-variable function:

f(x, y) = cos(x/5) * sin(y/3)
In reality, though, many systems will be much more complex and contain more than two variables. In fact, this is one of the strongest traits of optimization routines like PSO: instead of sprinkling particles over two dimensions (e.g., R^2), we would sprinkle them over the N-dimensional space (e.g., R^N) of our more complex problem!
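To make that concrete, here is a minimal sketch of how a particle population might be initialized in an N-dimensional space. The dimension count, bounds, and variable names here are illustrative assumptions, not part of this tutorial’s code:

```julia
using Random

# Hypothetical N-dimensional initialization sketch (names and values assumed)
Random.seed!(1)
N = 5                 # number of problem dimensions (assumed)
particles = 20
lo, hi = -10.0, 10.0  # search-space bounds in every dimension (assumed)

# Each row is one particle's position in R^N
positions = lo .+ (hi - lo) .* rand(particles, N)
```

Everything below works in two dimensions only to keep the pictures easy to draw; the same machinery generalizes by storing one column per dimension.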
```julia
using PyPlot

# Generate dummy data
x = -10:0.1:10
y = -10:0.1:10
z = cos.(x'./5) .* sin.(y./3)

# We can also store this in a function
f = (x,y) -> cos(x/5) * sin(y/3)

# Plot data
figure()
imshow(z); axis("off")
show(); gcf()
```
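As a sanity check, we can also locate the grid maximum of this toy function directly, so we know what answer the swarm should converge to. This snippet simply repeats the definitions above so it stands alone:

```julia
# Locate the maximum of the toy function on the grid (same defs as above)
x = -10:0.1:10
y = -10:0.1:10
z = cos.(x'./5) .* sin.(y./3)

val, idx = findmax(z)   # idx is a CartesianIndex (row, col)
# Rows of z index y and columns index x, per the broadcast above
ybest, xbest = y[idx[1]], x[idx[2]]
```

For this window the maximum sits at x = 0 and y = 3π/2 ≈ 4.71, where cos(x/5) and sin(y/3) both peak.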
With our two-dimensional space defined, we can start sprinkling in our particles. We begin by choosing random positions within the bounds of our initial function space. For each particle position (xpos, ypos), we find the value at that coordinate (zpos). We then store each of these arrays in our current dataframe (curDF) for bookkeeping.
```julia
using Random
using DataFrames
using Distributions # provides Uniform

# Generate initial population
particles = 20
generations = 20
Random.seed!(121)
xpos = rand(Uniform(minimum(x), maximum(x)), particles)
ypos = rand(Uniform(minimum(y), maximum(y)), particles)
curDF = DataFrame(X=float(xpos[:]), Y=float(ypos[:]))

# Evaluate the function at each particle position
for i in 1:nrow(curDF)
    zval = f(curDF[i,"X"], curDF[i,"Y"])
    if i==1; global zpos = [zval]
    else; append!(zpos, zval) end
end

# Store particles/values in a dataframe
curDF = DataFrame(X=xpos, Y=ypos, Value=zpos)
```
With the particle locations stored in our dataframe, we can easily plot their locations over the background function. We can also sort the dataframe to find the largest value in our existing particles, and plot this point with a red marker.
```julia
# Sort dataframe for largest value
sort!(curDF, :Value, rev=true)
newpop = copy(curDF)

figure()
imshow(reverse(z, dims=1), extent=[minimum(x),maximum(x),minimum(y),maximum(y)])
#scatter(x=newpop.X, y=newpop.Y, s=1, c="black")
scatter(x=newpop.X, y=newpop.Y, s=1, c=newpop.Value)
scatter(x=newpop[1,"X"], y=newpop[1,"Y"], s=5, c="red")
show(); gcf()
```
To convince ourselves that our dataframe values match the coordinate system we think they do, we’ll plot each particle in both black (below image, left) and in the color of the background function (below image, right). Given the disappearance of the black dots, we can see the particle values do indeed match the background value! And the position of our “global best” checks out as well, since we’re looking for the location of the [current] global maximum.
Update particle locations
At this point, we’ve only defined our analysis function (f) and initialized our particles; we have yet to set our optimization in motion! Here is where the “magic” happens. Since our particles are now aware of where the “global best” is within our function, each particle will calculate the distance between itself and the best global position (bestX, bestY).

Once the distance and direction are known, the particle will define a new position for itself one step closer to the best location. The size of the step is determined by two things: (a) the distance away from the best location and (b) the rate (r) of the particle’s travel. The particle also “jiggles” a little bit; each step it takes towards the best location is slightly offset by a parameter j. After the current particle takes a step towards the optimal location, it re-evaluates its value on f. The new position and value information are stored in a new dataframe, newDF.

After all particles have completed their analysis within the current timestep, the new dataframe (newDF) is copied to a dataframe for the new population (newpop); this updated dataframe is re-sorted, and a new global best position is chosen. Once the new global best position is chosen, the cycle starts over again!
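Before looking at the full loop, the update rule for a single particle can be sketched in isolation. The r and j values and all coordinates below are made-up numbers for illustration only:

```julia
using Random

# Single-particle update sketch; r, j, and the coordinates are assumed values
Random.seed!(0)
r, j = 0.1, 0.2            # travel rate and jiggle amplitude
bestX, bestY = 2.0, -3.0   # current global best position
curx, cury = -8.0, 6.0     # this particle's current position

# Step a fraction r of the way toward the best, then jiggle by up to ±j
newx = curx + r*(bestX - curx) + (rand()*2 - 1)*j
newy = cury + r*(bestY - cury) + (rand()*2 - 1)*j
```

Each application of this rule moves the particle a tenth of the remaining distance toward the best position, with the jiggle keeping the swarm from collapsing onto a single path.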
```julia
# Particle travel rate and jiggle amplitude
r = 0.1
j = 0.2

# For each generation
for t in 1:generations

    # Identify current dataset & current best
    if t==1; global newpop = copy(curDF)
    else; global newpop = copy(newDF) end
    sort!(newpop, :Value, rev=true)
    global bestX, bestY, bestZ = [newpop[1,a] for a in ["X","Y","Value"]]

    # Iterate through each particle
    for i in 1:nrow(newpop)

        # Create new position: step toward the best, plus a jiggle
        curx, cury = newpop[i,"X"], newpop[i,"Y"]
        difx, dify = bestX-curx, bestY-cury
        newx, newy = curx + r*difx, cury + r*dify
        newx = newx + (rand()*2 - 1)*j
        newy = newy + (rand()*2 - 1)*j

        # Handle outliers due to jiggling (clamp to the function window)
        newx = clamp(newx, minimum(x), maximum(x))
        newy = clamp(newy, minimum(y), maximum(y))

        # Value at new location
        zval = f(newx, newy)

        # Store results
        newpos = DataFrame(X=[newx], Y=[newy], Value=[zval])
        if i==1; global newDF = newpos
        else; append!(newDF, newpos) end
    end
end
```
During each timestep, we can also save out each image of the particle locations plotted over the background function. Although this is not shown in the code, it is straightforward to implement at the end of each timestep, and gives us a great visual to double-check our algorithm! The resulting animation shows the solution convergence over 20 generations.
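One way this could be done is with a small helper called at the end of each generation. The save_frame function and the frames directory name here are my own additions, not part of the original code, and it assumes the x, y, z, and newpop variables from the main loop are in scope when called:

```julia
using PyPlot

# Hypothetical helper: save one frame per generation (call after newpop
# has been updated inside the generation loop)
function save_frame(t, x, y, z, newpop; outdir="frames")
    mkpath(outdir)
    figure()
    imshow(reverse(z, dims=1), extent=[minimum(x),maximum(x),minimum(y),maximum(y)])
    scatter(x=newpop.X, y=newpop.Y, s=1, c="black")
    savefig(joinpath(outdir, "gen_$(lpad(t, 3, '0')).png"))
    close()
end
```

The zero-padded filenames (gen_001.png, gen_002.png, …) keep the frames in order for stitching into an animation afterwards.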
In this tutorial, we took a first stab at developing a PSO algorithm from scratch in the Julia programming language. Although the function we optimized here is intuitive, most real-world problems will not be. Nonetheless, it gives us an idea of the internal workings of PSO and a feel for how the optimization unfolds over many iterations.
In our next post, we’ll start taking a look at another optimization routine, the Emperor Penguin Colony! Then again, there are many options; who knows which one we’ll choose next! For now, thank you so much for dropping by! To stay updated as I post, feel free to like, comment, and subscribe. See you next time!