Multithreading Revisited – Using Jobs
I wrote about multithreading using Runspaces here, but I also wanted to talk about running them the Powershell way using jobs. I want to make sure I give credit where it’s due, because most of my work in Jobs is based on Ryan Witschger’s work. He writes a great Powershell blog called Get-Blog.com (great site name, too). Anyway, read on for my riff on Powershell jobs and the 3 step process you need to use them.
Powershell Jobs
Jobs were introduced to Powershell in v2.0, and give Powershell an easy way to multithread your scripts without getting into some ugly code, which frankly Runspaces do get you into, if only a little. But Powershell Jobs do have one major limitation when compared to Runspaces that we have to deal with: the complete lack of throttling. Powershell will let you submit as many jobs as you like, with no regard to the fact that you can overload your computer and essentially bring most of the processing to a halt! Luckily, this isn’t too difficult to deal with.
The cmdlets you can use with Jobs are:
- Start-Job
- Stop-Job
- Get-Job
- Receive-Job
- Remove-Job
- Wait-Job
- Suspend-Job
I won’t be dealing with Stop, Wait or Suspend in this post as I haven’t really had a need to use them yet. This is especially true of Wait-Job, since the whole point is multi-threading a bunch of processes so why would I want my main script to completely stop processing while waiting on one job? But I expect there are some use cases for it, I just haven’t run into one yet!
Start-Job is pretty straightforward: you submit a scriptblock to it and Powershell will start the scriptblock as a background process. Get-Job is used to retrieve the status of any background jobs. When a background job completes it saves the data, but to get it back into your primary script you have to use Receive-Job (think of it as Receive-DataFromJob). After you’ve received the data the job will still show up in Get-Job though, so to fully remove it from existence you need to use Remove-Job.
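To make that lifecycle concrete, here is a minimal sketch of one job going through all four cmdlets. The scriptblock (doubling the numbers 1 to 3) is made up purely for illustration, and I'm using Wait-Job here only so the example finishes predictably.

```powershell
# Start a background job; the scriptblock here is a made-up example.
$Job = Start-Job -ScriptBlock { 1..3 | ForEach-Object { $_ * 2 } }

# Get-Job reports the current state (Running, Completed, Failed, ...).
Get-Job -Id $Job.Id | Select-Object Id, Name, State

# Block until this one job finishes, only to keep the example deterministic.
Wait-Job -Job $Job | Out-Null

# Receive-Job pulls the job's output back into the main script.
$Result = Receive-Job -Job $Job

# The job stays in the job list until it is explicitly removed.
Remove-Job -Job $Job

$Result   # 2, 4, 6
```

Once Remove-Job has run, Get-Job no longer lists the job, which is why the data has to be received first.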
Anytime you’re submitting background jobs, by any method really, you have to go through 3 steps:
- Submit the jobs
- Wait for the jobs to finish
- Collect the data
Submit the Jobs
It’s by using Get-Job that we can do our own home-grown style of thread throttling so let’s jump into it:
$MaxThreads = 10
While (@(Get-Job | Where { $_.State -eq "Running" }).Count -ge $MaxThreads) {
    Write-Verbose "Waiting for open thread...($MaxThreads Maximum)"
    Start-Sleep -Seconds 3
}
Here we define how many background jobs we’re going to allow, then use Get-Job to find out how many jobs are still running. So why surround the whole thing in a @()? When you use Get-Job (and this applies to most Powershell cmdlets), if only 1 item comes back from the Where cmdlet then the output will be a single job object. If more than one item comes back you will get an array of job objects. In Powershell 2.0 the Count property only exists on arrays, so if only 1 job comes back as “Running” then Count returns $null (or throws an error if strict mode is on). Technically that’s ok, since $null isn’t greater than or equal to $MaxThreads so the While loop will still process properly, but that’s just sloppy code. By surrounding the whole thing in @() we force the output into an array, so even if only 1 item is returned it will be an array with 1 element. Powershell 3.0 and later do return a Count of 1 for a single object, but the @() wrapper is what keeps the script working on 2.0.
If that count is greater than or equal to our $MaxThreads value then we’ll have the script wait for 3 seconds and try again. If it’s under that value the While loop will finish and we’ll move on to the next line of code.
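If you want to see the @() behavior for yourself without starting any jobs, here is a small sketch that uses plain numbers in place of job objects. The variable names are just for illustration.

```powershell
# One item out of the pipeline: without @() this is a single value, not an array.
$Single  = 1..5 | Where-Object { $_ -eq 3 }

# The same pipeline wrapped in @() is always an array, even with 1 or 0 items.
$Wrapped = @(1..5 | Where-Object { $_ -eq 3 })
$Empty   = @(1..5 | Where-Object { $_ -gt 10 })

$Single -is [array]    # False
$Wrapped -is [array]   # True
$Wrapped.Count         # 1
$Empty.Count           # 0
```

The empty case matters too: when no jobs are running, the wrapped expression still gives a Count of 0 instead of nothing at all.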
$Scriptblock = {
    Param (
        [string]$CN
    )
    Get-Process -ComputerName $CN
}
Start-Job -ScriptBlock $Scriptblock -ArgumentList "."
Here we define a scriptblock to the variable $Scriptblock (creative, huh?), and this is another example about how Powershell is really not picky about what it will accept into a variable. You could define the entire script within the -Scriptblock parameter by surrounding your code in curly brackets too but I like to do it separate to make things a bit more friendly to read.
Unlike Runspaces, Powershell Jobs can reference variables from the main script if you use the $using: designation. So in the case above you could use:
$ComputerName = "."
$Scriptblock = {
    Get-Process -ComputerName $using:ComputerName
}
So why didn’t I? The $using: scope was introduced with Powershell 3.0 and I’ve found that most people are still using Powershell 2.0, and I want our script to be compatible with the biggest audience possible. With Powershell 2.0 we have to pass the variables to the scriptblock using the -ArgumentList parameter and accept them using Param. Yes, you could use $args since that also holds the parameters, but it goes back to making your script readable, doesn’t it? Without reverse engineering the whole script you won’t know what $args is holding, but if you name the parameter it is pretty obvious. Don’t take shortcuts with your code, you really will regret it later when you’re trying to decipher a script you wrote 2 years ago!
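Here is a quick sketch of how that looks with more than one parameter. The values in -ArgumentList are bound to the Param block by position, so the order matters. The parameter names $Name and $Count and the values passed in are hypothetical, just to show the mechanics.

```powershell
# Named parameters make it obvious what the scriptblock expects.
# ($Name and $Count are made-up parameters for this illustration.)
$Scriptblock = {
    Param (
        [string]$Name,
        [int]$Count
    )
    "$Name x $Count"
}

# -ArgumentList values are bound by position: "widget" -> $Name, 3 -> $Count.
$Job = Start-Job -ScriptBlock $Scriptblock -ArgumentList "widget", 3

# Wait for the job and pull its output back, then clean up.
$Output = $Job | Wait-Job | Receive-Job
Remove-Job -Job $Job

$Output   # widget x 3
```

The $args equivalent would be "$($args[0]) x $($args[1])", which works but tells the reader nothing about what those two values are.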
Wait For the Jobs to Finish
Now we’ve created a throttling capability within Powershell Jobs, and we’ve submitted our jobs, assuming we put it into some kind of loop. The next step is to wait for all of the background jobs to complete before we can process the data. This might look a little familiar!
While (@(Get-Job | Where { $_.State -eq "Running" }).Count -ne 0) {
    Write-Verbose "Waiting for background jobs..."
    Start-Sleep -Seconds 3
}
Pretty much the same code here, except now we’ve adjusted the check to be not equal to zero (you could use -gt 0 too). This loop will continue to run until all background jobs are completed and then it’ll exit. We’re almost done with our 3 step process.
Collect the Data
We used Get-Job to monitor the job status and we’ll use it and Receive-Job to collect the data. As I mentioned in this post the most performant way to build a PSObject is to use a ForEach statement and assign the results to a variable. So let’s do exactly that.
$Data = ForEach ($Job in (Get-Job)) {
    Receive-Job $Job
    Remove-Job $Job
}
With this snippet of code we loop through all the jobs, receive the data (and automatically assign it to $Data) and then remove the job. We can do both in this loop because Remove-Job doesn’t return any data, otherwise Powershell would try to put it into $Data and who knows what we’d end up with! But now $Data is loaded with all of the data we need in it and we can process it as we would any other array of objects.
And the output doesn’t need to be an object, it could be a simple string or hashtable as well.
Here’s a fun little script you can run to see how this all works when put together:
1..10 | % {
    $MaxThreads = 4
    While (@(Get-Job | Where { $_.State -eq "Running" }).Count -ge $MaxThreads) {
        Write-Host "Waiting for open thread...($MaxThreads Maximum)"
        Start-Sleep -Seconds 3
    }

    $Scriptblock = {
        Param (
            [string]$CN
        )
        Get-Process -ComputerName $CN
    }
    Start-Job -ScriptBlock $Scriptblock -ArgumentList "."
}

While (@(Get-Job | Where { $_.State -eq "Running" }).Count -ne 0) {
    Write-Host "Waiting for background jobs..."
    Get-Job    #Just showing all the jobs
    Start-Sleep -Seconds 3
}

Get-Job    #Just showing all the jobs
$Data = ForEach ($Job in (Get-Job)) {
    Receive-Job $Job
    Remove-Job $Job
}
$Data | Select ProcessName,Product,ProductVersion | Format-Table -AutoSize
I know this is an old post, but just happened upon it today. You were talking about not having a use case for Wait-Job that you could think of and then you re-invented the wheel when you have something for it already :). Replace this:
While (@(Get-Job | Where { $_.State -eq "Running" }).Count -ne 0) {
    Write-Verbose "Waiting for background jobs..."
    Start-Sleep -Seconds 3
}
With:
Get-Job | Wait-Job
Does exactly the same thing.
Absolutely right, and when I’m running a script from a scheduled task that’s exactly what I do now. The only advantage my original code has is that if you run the script interactively it does give some feedback, and you could certainly put in some limits in case a background job hangs up.
All that said, I pretty much exclusively use PoshRSJob from Boe Prox for multi-tasking. Ironically, a colleague of mine contributed “Wait-RSJob” to that project! I also recently made a small contribution which really felt good.
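For anyone wondering what a limit on hung jobs might look like with Wait-Job, here is one possible sketch using its -Timeout parameter. The two jobs and the 15 second limit are arbitrary values picked for the illustration, not something from the original script.

```powershell
# Two made-up jobs: one finishes right away, one simulates a hung job.
$Jobs = @(
    Start-Job -ScriptBlock { "fast" }
    Start-Job -ScriptBlock { Start-Sleep -Seconds 300; "slow" }
)

# Wait for all jobs, but give up after 15 seconds (an arbitrary limit).
$Jobs | Wait-Job -Timeout 15 | Out-Null

# Anything still running after the timeout is treated as hung and stopped.
$Hung = @($Jobs | Where-Object { $_.State -eq "Running" })
$Hung | Stop-Job

# Collect data only from the jobs that actually completed, then clean up.
$Data = $Jobs | Where-Object { $_.State -eq "Completed" } | Receive-Job
$Jobs | Remove-Job -Force

$Data
```

The trade-off is the same one mentioned above: you get no feedback while waiting, but the script can no longer sit there forever on one stuck job.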
Thank you so much! Very useful!