Building PSObject Performance
You know I love working with objects in Powershell, but are some methods better at building them than others? I ran into an interesting technique recently and wanted to test it against my normal way of doing things. Read on to see which technique is faster.
What is a PSObject?
Powershell is an object-oriented style interpreted language, and pretty much everything works off of objects. The most interesting object of all is the PSObject. Whenever you run a cmdlet in Powershell, in almost all cases, the output will be an Object. These objects can then be piped into many other cmdlets. Where it gets interesting is when you begin building your own objects and then using the built-in cmdlets to manipulate them.
If you’ve read much of my blog you’ve seen that I use PSObjects all the time to create a dataset, then use Export-CSV, ConvertTo-HTML (my current favorite), Out-Gridview, etc. to display that data. It’s really a fantastic method for reporting and I hope Microsoft continues to expand the reporting capabilities within Powershell.
What’s the best way to build a PSObject?
There are dozens of ways to build PSObjects, and since most cmdlets simply output them, using a cmdlet and piping it to Select-Object is often the easiest way of doing it. The next level up has to be the Expression capability within Select, which lets you manipulate the object before displaying it (and, to be honest, I often forget about this capability!). If you haven’t tried this, it is a very cool feature of Select. Try running this:
Get-Service | Select DisplayName,@{Label="State";Expression={If ($_.Status -eq "Running") { "We're Good"} Else { "NOT RUNNING PANIC!!" }}},@{Label="Date";Expression={(Get-Date)}}
With Select you can choose which properties you want in your object, but you can also manipulate them. Notice the second field starts with @{Label="something";Expression={scriptblock}}? You can change the label/header to whatever you want using the Label key, and then produce any output you want with Expression. That’s a full scriptblock there too, so you can get as fancy/crazy as you want in there. You’ll also notice that the last expression isn’t even a property of what Get-Service returns, but something I manually created. And, of course, any output from this is already in object form so you can continue the pipe into any other cmdlet. How about:
...previous command above ... | Where { $_.State -eq "NOT RUNNING PANIC!!" }
This will now filter your custom objects and only display the services that aren’t in a “Running” status. I touched on this earlier when talking about the Service Information script.
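Put together, the whole pipeline might look something like the sketch below. It uses a few hand-made objects in place of Get-Service so it behaves the same on any machine; the service names and the $Services, $Report and $Problems variables are made up for illustration.

```powershell
# Sample objects standing in for Get-Service output (names are made up)
$Services = @(
    New-Object PSObject -Property @{ DisplayName = 'Print Spooler';  Status = 'Running' }
    New-Object PSObject -Property @{ DisplayName = 'Windows Update'; Status = 'Stopped' }
    New-Object PSObject -Property @{ DisplayName = 'Themes';         Status = 'Running' }
)

# Build custom objects with a calculated "State" property
$Report = $Services | Select-Object DisplayName,
    @{Label = 'State'; Expression = { if ($_.Status -eq 'Running') { "We're Good" } else { 'NOT RUNNING PANIC!!' } }}

# Filter on the calculated property; the text has to match what the
# Expression produced, including both exclamation marks
$Problems = @($Report | Where-Object { $_.State -eq 'NOT RUNNING PANIC!!' })
$Problems
```

Because the filter compares against the calculated text rather than the original Status property, a small mismatch in the string quietly returns nothing instead of raising an error.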
But most of the time I build my PSObjects manually using the New-Object cmdlet. Here’s an example:
$ArrayOfObjects = @()
ForEach ($Address in $Addresses) {
    # Append one new object per address
    $ArrayOfObjects += New-Object PSObject -Property @{
        Name    = $Name
        Address = $Address
        State   = $State
    }
}
This defines $ArrayOfObjects as an array, then on every pass through the loop adds a new object with three properties (the name of each property and its value). But I recently ran across another technique for building an object, from the MS Scripting Guy, that had never occurred to me. Let me write it out using the same example.
$ArrayOfObjects = ForEach ($Address in $Addresses) {
    # Whatever the loop body outputs is collected into the variable
    New-Object PSObject -Property @{
        Name    = $Name
        Address = $Address
        State   = $State
    }
}
This takes advantage of the fact that Powershell will capture whatever a statement outputs into a variable, and when that output is multiple objects it will automatically create an array of those objects. This technique also has the advantage of performance. As I touched upon here, we know += adds a lot of overhead to a script and the above avoids it completely, so let’s run some tests and see what we get.
$Path = "c:\windows"
Measure-Command {
    $Folders1 = Foreach ($Folder in (gci $Path -Recurse -Directory)) {
        New-Object PSObject -Property @{
            Name        = $Folder.FullName
            Created     = $Folder.CreationTime
            LastUpdated = $Folder.LastWriteTime
        }
    }
}
Measure-Command {
    $Folders2 = @()
    Foreach ($Folder in (gci $Path -Recurse -Directory)) {
        $Folders2 += New-Object PSObject -Property @{
            Name        = $Folder.FullName
            Created     = $Folder.CreationTime
            LastUpdated = $Folder.LastWriteTime
        }
    }
}
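One thing worth knowing about this behavior: the array only appears when the loop emits more than one object. The sketch below, using made-up addresses, shows the difference and one way to guarantee an array by wrapping the statement in @( ). The variable names are only for illustration.

```powershell
$Addresses = '10 Main St', '22 Oak Ave', '7 Elm Rd'   # made-up sample data

# Every object the loop body emits is collected into the variable
$Many = foreach ($Address in $Addresses) {
    New-Object PSObject -Property @{ Address = $Address }
}
$Many.GetType().Name    # Object[]

# With a single item the variable holds the bare object, not an array
$One = foreach ($Address in @('10 Main St')) {
    New-Object PSObject -Property @{ Address = $Address }
}
$One -is [array]        # False

# Wrapping the statement in @( ) gives an array either way
$Always = @(foreach ($Address in @('10 Main St')) {
    New-Object PSObject -Property @{ Address = $Address }
})
$Always -is [array]     # True
```

If later code indexes into the result or relies on it being an array, the @( ) wrapper is a cheap safeguard.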
You’ll need Powershell 3.0 to run the test yourself. And here are the results:
Test                         Result
--------------------------   ---------------------------
Assign ForEach to Variable   15 seconds 393 milliseconds
Manually Build Object        30 seconds 837 milliseconds
Running the script on my Windows directory, just assigning the output from ForEach takes half the time, which makes it twice as fast! Huge performance gain, and another indictment of the += technique, which honestly makes me sad.
I’m a huge proponent of making my code readable and am willing to take a small hit in performance to do it, and I have to admit I like the manual PSObject building technique better than the assign to variable technique because it’s very straightforward and clear what’s happening. Assigning the output of a Foreach statement to a variable just seems less clear. And if the hit was 10-15% I would probably ignore it and continue doing things the way I have. But how do you ignore 50%?! I can’t say I won’t use the old technique, because every script is different and you won’t always be able to build a nice pretty Foreach statement to assign your information. But when you can, you definitely should, as the performance is clearly superior. Imagine cutting the run time by 50% on a folder tree with thousands, if not millions, of files!
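If you want to repeat the comparison without crawling a disk, something like the sketch below works on in-memory data. The item count and the Id/Square properties are arbitrary choices for illustration, and the timings will differ from machine to machine.

```powershell
$Count = 5000            # arbitrary size chosen for illustration
$Items = 1..$Count

# Technique 1: assign the output of foreach to a variable
$Assign = Measure-Command {
    $Result1 = foreach ($i in $Items) {
        New-Object PSObject -Property @{ Id = $i; Square = $i * $i }
    }
}

# Technique 2: start with an empty array and append with +=
$Append = Measure-Command {
    $Result2 = @()
    foreach ($i in $Items) {
        $Result2 += New-Object PSObject -Property @{ Id = $i; Square = $i * $i }
    }
}

'{0,-28} {1,10:N0} ms' -f 'Assign foreach to variable', $Assign.TotalMilliseconds
'{0,-28} {1,10:N0} ms' -f 'Append with +=', $Append.TotalMilliseconds
```

Arrays in Powershell have a fixed size, so each += builds a new array and copies every existing element into it. That copying is why the gap tends to widen as $Count grows.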
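For the scripts where a single Foreach assignment doesn’t fit, one alternative (not something tested in this post, so treat it as a sketch) is to collect objects in a generic list and call its Add method, which avoids the array copy that += performs. The server and printer names below are hypothetical.

```powershell
# Hypothetical inputs gathered in two separate passes
$Servers  = 'web01', 'web02'
$Printers = 'prn01'

# A growable list; Add() appends in place instead of rebuilding an array
$Inventory = New-Object 'System.Collections.Generic.List[object]'

foreach ($Server in $Servers) {
    $Inventory.Add((New-Object PSObject -Property @{ Name = $Server; Type = 'Server' }))
}
foreach ($Printer in $Printers) {
    $Inventory.Add((New-Object PSObject -Property @{ Name = $Printer; Type = 'Printer' }))
}

$Inventory.Count    # 3
$Inventory | Where-Object { $_.Type -eq 'Server' }
```

The list pipes into other cmdlets just like an array does, so the rest of a reporting script should not need to change.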
Consider instead:
$Folders1 = Foreach ($Folder in (gci $Path -Recurse -Directory)) {
    [pscustomobject]@{
        Name        = $Folder.FullName
        Created     = $Folder.CreationTime
        LastUpdated = $Folder.LastWriteTime
    }
}
Then look at how fast it’ll run…
Thanks, Al. This is the new technique introduced with PowerShell 3.0. I should have done this in the original post since I was using -Directory on GCI (which is also a 3.0 parameter)! I ran it again and the New-Object technique took 26 seconds, the += took a whopping 52 seconds and the PS 3.0 technique you mentioned managed it in 18! Nice little gain!
What does your testing show?