Building PSObject Performance

You know I love working with objects in Powershell, but are some methods better at building them than others? I ran into an interesting technique recently and wanted to test it against my normal way of doing things. Read on to see which technique is faster.

What is a PSObject?

Powershell is an object-oriented style interpreted language, and pretty much everything works off of objects. The most interesting object of all is the PSObject. Whenever you run a cmdlet in Powershell, in almost all cases, the output will be an Object. These objects can then be piped into many other cmdlets. Where it gets interesting is when you begin building your own objects and then using the built-in cmdlets to manipulate them.

If you’ve read much of my blog you’ve seen that I use PSObjects all the time to create a dataset, then use Export-CSV, ConvertTo-HTML (my current favorite), Out-Gridview, etc. to display that data. It’s really a fantastic method for reporting and I hope Microsoft continues to expand the reporting capabilities within Powershell.
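To make that concrete, here's a minimal sketch of the build-a-dataset-then-report pattern. The service names and the Checked property are made-up sample data, not output from a real cmdlet, and I'm using the ConvertTo- cmdlets so nothing gets written to disk.

```powershell
# Hypothetical sample data standing in for real cmdlet output.
$Report = "Spooler", "BITS", "W32Time" | ForEach-Object {
    New-Object PSObject -Property @{
        Name    = $_
        Checked = $true
    }
}

# Select-Object pins the column order, since a plain hashtable doesn't guarantee one.
$Csv  = $Report | Select-Object Name, Checked | ConvertTo-Csv -NoTypeInformation
$Html = $Report | Select-Object Name, Checked | ConvertTo-Html -Title "Service Report"

$Csv
```

The same $Report could just as easily go to Export-CSV or Out-Gridview; the objects don't care where they end up.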

What’s the best way to build a PSObject?

There are dozens of ways to build PSObjects. Most cmdlets simply output them, so running a cmdlet and piping it to Select-Object is often the easiest way of doing it. The next level up has to be the Expression capability within Select, which lets you manipulate the object before displaying it (and, to be honest, I often forget about this capability!). If you haven’t tried this, it is a very cool feature of Select. Try running this:

Get-Service | Select DisplayName,@{Label="State";Expression={If ($_.Status -eq "Running") { "We're Good"} Else { "NOT RUNNING PANIC!!" }}},@{Label="Date";Expression={(Get-Date)}}

With Select you can choose which properties you want in your object, but you can also manipulate them. Notice the second field starts with @{Label="something";Expression={scriptblock}}? You can change the label/header to whatever you want using the Label key, and then produce any output you want with Expression. That’s a full scriptblock there too, so you can get as fancy/crazy as you want in there. You’ll also notice that the last expression isn’t even a property from Get-Service, but something I manually created. And, of course, any output from this is already in object form so you can continue the pipe into any other cmdlet. How about:

...previous command above ... | Where { $_.State -eq "NOT RUNNING PANIC!!" }

This takes your custom objects and displays only the services that aren’t in a “Running” status. I touched on this earlier when talking about the Service Information script.
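If you want to try the Select-plus-Where combination without touching real services, here's a rough sketch that runs against three made-up objects. The $Services data is purely hypothetical; only the Label/Expression pattern comes from the command above.

```powershell
# Hypothetical stand-ins for service objects, so the sketch runs anywhere.
$Services = @(
    New-Object PSObject -Property @{ DisplayName = "Print Spooler"; Status = "Running" }
    New-Object PSObject -Property @{ DisplayName = "Windows Time";  Status = "Stopped" }
    New-Object PSObject -Property @{ DisplayName = "BITS";          Status = "Running" }
)

$Checked = $Services | Select-Object DisplayName,
    @{Label = "State"; Expression = { If ($_.Status -eq "Running") { "We're Good" } Else { "NOT RUNNING PANIC!!" } }},
    @{Label = "Date";  Expression = { Get-Date }}

# The filter string has to match the Expression output exactly.
$Problems = $Checked | Where-Object { $_.State -eq "NOT RUNNING PANIC!!" }
$Problems
```

Only the "Windows Time" entry comes out the other end, carrying the State and Date properties that never existed on the original objects.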

But most of the time I build my PSObjects manually using the New-Object cmdlet. Here’s an example:

$ArrayOfObjects = @()
ForEach ($Address in $Addresses)
{  $ArrayOfObjects += New-Object PSObject -Property @{
      Name = $Name
      Address = $Address
      State = $State
   }
}

This defines $ArrayOfObjects as an array, then adds a new object to the array with 3 properties (the name of each property and its value). But I recently ran across another technique to build an object from the MS Scripting Guy that had never occurred to me. Let me write it out using the same example.

$ArrayOfObjects = ForEach ($Address in $Addresses) {
   New-Object PSObject -Property @{
      Name = $Name
      Address = $Address
      State = $State
   }
}

This takes advantage of the fact that Powershell will capture whatever a statement outputs into a variable, and when that output is multiple objects it will automatically create an array of those objects. This technique also has the advantage of performance. As I’ve touched on before, we know += adds a lot of overhead to a script and the above avoids it completely, so let’s run some tests and see what we get.
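One thing to watch with this technique: you only get an array when the loop emits more than one object. Here's a small sketch with throwaway objects (the Number property is just for illustration) showing the three cases, plus the @() wrapper that forces an array every time.

```powershell
$Many = ForEach ($n in 1..3) { New-Object PSObject -Property @{ Number = $n } }
$One  = ForEach ($n in 1..1) { New-Object PSObject -Property @{ Number = $n } }
$None = ForEach ($n in @())  { New-Object PSObject -Property @{ Number = $n } }

$Many -is [array]   # True: three objects were collected
$One  -is [array]   # False: a single object is stored as-is
$null -eq $None     # True: nothing was emitted

# Wrapping the loop in @() always gives an array, whatever the count.
$Safe = @(ForEach ($n in 1..1) { New-Object PSObject -Property @{ Number = $n } })
$Safe -is [array]   # True
```

If later code relies on .Count or indexing, the @() version is probably the safer habit.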

$Path = "c:\windows"
Measure-Command {
   $Folders1 = Foreach ($Folder in (gci $Path -Recurse -Directory))  {
      New-Object PSObject -Property @{
         Name = $Folder.FullName
         Created = $Folder.CreationTime
         LastUpdated = $Folder.LastWriteTime
      }
   }
}

Measure-Command {
   $Folders2 = @()
   Foreach ($Folder in (gci $Path -Recurse -Directory))
   {  $Folders2 += New-Object PSObject -Property @{
         Name = $Folder.FullName
         Created = $Folder.CreationTime
         LastUpdated = $Folder.LastWriteTime
      }
   }
}

You’ll need Powershell 3.0 to run the test yourself. And here are the results:

Test                            Result
--------------------------      ---------------------------
Assign ForEach to Variable      15 seconds 393 milliseconds
Manually Build Object           30 seconds 837 milliseconds

Running the script on my Windows directory, just assigning the output from ForEach cuts the run time by 50%! Huge performance gain, and another indictment of the += technique, which honestly makes me sad.
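For anyone who wants to check the math, here's the arithmetic on the two timings from the results table. The variable names are mine; the numbers are straight from the test.

```powershell
# Timings from the results table, in seconds.
$AssignSeconds = 15.393   # ForEach output assigned to a variable
$AppendSeconds = 30.837   # array built with +=

$Speedup      = $AppendSeconds / $AssignSeconds
$PercentSaved = (1 - $AssignSeconds / $AppendSeconds) * 100

"{0:N2}x as fast, {1:N1}% less time" -f $Speedup, $PercentSaved
```

That works out to roughly half the run time, or about twice as fast, depending on which way you prefer to say it.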

I’m a huge proponent of making my code readable and am willing to take a small hit in performance to do it, and I have to admit I like the manual PSObject building technique better than the assign to variable technique because it’s very straightforward and clear what’s happening. By assigning the output of a Foreach statement to a variable it just seems less clear what’s happening. And if the hit was 10-15% I would probably ignore it and continue doing things the way I have. But how do you ignore 50%?! I can’t say I won’t use the old technique, because every script is different and you won’t always be able to build a nice pretty Foreach statement to assign your information. But when you can, you definitely should, as the performance is clearly superior. Imagine a 50% gain in performance if you ran the above test on a folder tree with thousands, if not millions, of files!
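If you'd rather not crawl c:\windows to see the difference, here's a sketch of the same comparison on in-memory data. The $Count value and the Index/Square properties are arbitrary choices of mine, and the timings will vary by machine and Powershell version, so treat it as a way to run your own test rather than a set of results.

```powershell
$Count = 2000   # arbitrary; raise it to watch the gap change

$AssignTime = Measure-Command {
    $Assigned = ForEach ($i in 1..$Count) {
        New-Object PSObject -Property @{ Index = $i; Square = $i * $i }
    }
}

$AppendTime = Measure-Command {
    $Appended = @()
    ForEach ($i in 1..$Count) {
        # Arrays are fixed size, so each += builds a new, larger array.
        $Appended += New-Object PSObject -Property @{ Index = $i; Square = $i * $i }
    }
}

"Assign ForEach to variable: {0:N0} ms" -f $AssignTime.TotalMilliseconds
"Build with +=:              {0:N0} ms" -f $AppendTime.TotalMilliseconds
```

Both loops end up with the same objects, so the only thing being measured is how the collection gets built.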

January 14, 2013 - Posted by Martin9700 | Powershell - Performance | Foreach, performance, PSObject

6 Comments »

Consider instead:

$Folders1 = Foreach ($Folder in (gci $Path -Recurse -Directory)) {
   [pscustomobject]@{
      Name = $Folder.FullName
      Created = $Folder.CreationTime
      LastUpdated = $Folder.LastWriteTime
   }
}

Then look at how fast it’ll run…

Comment by Al Feersum | August 15, 2013 | Reply
- Thanks, Al. This is the new technique introduced with PowerShell 3.0. I should have used it in the original post, since I was already using -Directory on GCI (which is also a 3.0 parameter)! Ran it again and the New-Object technique took 26 seconds, the += took a whopping 52 seconds and the PS 3.0 technique you mentioned managed it in 18! Nice little gain!
  
  Comment by Martin9700 | August 15, 2013 | Reply
- What does your testing show?
  
  Comment by Martin9700 | August 15, 2013 | Reply
