September 2006 - Posts

Introduction

Imagine the following situation: you've created a large software library with lots of types defined in it, and lots of people are already using it. However, the library has grown so large that it is becoming more and more difficult to maintain. So, you want to split the existing library into multiple assemblies, but without breaking any existing consumer applications.

Do you recognize this situation? Then the TypeForwardedToAttribute is for you. I'll explain why.

A simple demo scenario

Put yourself in the middle of a 1.0 release of some simple library (company A). Here it is:

public class Bar
{
   public static void Do()
   {
      System.Console.WriteLine("[lib]Bar::Do");
   }
}

public class Foo
{
}

Just imagine it's a lot bigger with real functionality and compile it to a dll:

>csc /t:library lib.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

Right, your library has shipped. Let's put ourselves in the mindset of a library consumer (company B) and write the following:

class Demo
{
   public static void Main()
   {
      Bar.Do();
   }
}

Compiling and running it holds no surprises (and company B is happy):

>csc /r:lib.dll demo.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

>demo
[lib]Bar::Do

Time for changes. Company A dislikes its bloated lib library (or whatever the reason might be) and wants to split the library into multiple separate assemblies. So, we'll create a newlib.cs file containing the Bar class:

public class Bar
{
   public static void Do()
   {
      System.Console.WriteLine("[lib]Bar::Do");
   }
}

This one gets compiled into newlib.dll, as follows:

>csc /t:library newlib.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

And lib.cs is stripped down to:

public class Foo
{
}

Assume you compile this library now:

>csc /t:library lib.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

But what about company B? Xcopying both assemblies lib.dll and newlib.dll and trying to execute demo.exe will be a nasty experience:

>demo

Unhandled Exception: System.TypeLoadException: Could not load type 'Bar' from assembly 'lib, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null'.
   at Demo.Main()

Okay, we could recompile demo.exe, or you might wonder why lib.dll (changed, but with the same version number as before) and newlib.dll got copied to company B in the first place. These are perfectly valid questions, but just assume you want to fix this issue without recompiling anything on company B's side.

Back to company A for a better solution. Just add one single line to lib.cs, which contains our original but stripped-down library:

[assembly: System.Runtime.CompilerServices.TypeForwardedTo(typeof(Bar))]

public class Foo
{
}

Now compile, but reference the newlib.dll assembly, which is where the Bar type lives now:

>csc /r:newlib.dll /t:library lib.cs
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.112
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

Now copy lib.dll and newlib.dll to company B. Guess what? demo.exe just works fine:

>demo
[lib]Bar::Do

Recall that nothing has changed in demo.exe, so it still contains this assembly reference in its metadata:

.assembly extern lib
{
   .ver 0:0:0:0
}

and a call to the Bar::Do method defined in lib:

IL_0001: call void [lib]Bar::Do()

Magic happens in lib.dll however, where you'll find this piece of metadata in the manifest:

.class extern forwarder Bar
{
   .assembly extern newlib
}

So, when the type loader of the CLR kicks in and searches for type Bar in the lib.dll assembly, the metadata tells it that the Bar type has been relocated to another assembly, "newlib":

.assembly extern newlib
{
   .ver 0:0:0:0
}
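The forwarding can also be observed from managed code. The sketch below is an illustration only: the class name ForwardProbe and its helper are made up for this example, and it assumes the forwarded lib.dll and newlib.dll from above sit next to the executable. It asks for Bar in lib, exactly as demo.exe does, and reports the assembly that actually defines the type. If the forwarder is followed, the reported name should be newlib rather than lib.

```csharp
using System;

class ForwardProbe
{
    // Returns the simple name of the assembly that actually defines the type,
    // or null when the type (or the assembly named in the request) cannot be found.
    public static string DefiningAssembly(string assemblyQualifiedName)
    {
        Type t = Type.GetType(assemblyQualifiedName, false);
        return t == null ? null : t.Assembly.GetName().Name;
    }

    static void Main()
    {
        // "Bar, lib" requests type Bar from assembly lib, like the IL call [lib]Bar::Do().
        string owner = DefiningAssembly("Bar, lib");
        Console.WriteLine(owner ?? "Bar not found (are lib.dll and newlib.dll next to this exe?)");
    }
}
```

Without the two libraries in place the probe simply reports that Bar was not found, so it is safe to run anywhere.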

Conclusion

As usual you should consider carefully whether this feature can be helpful to you. It might be much better to ship another version of your library, split up in multiple assemblies and have your consumers use that one if they want or need to do so. "It works fine, so keep your hands off." might be a good attitude (I think it is).

Btw, there are some caveats too: using this feature carelessly could create cycles. The C# compiler knows about this: CS0731. And what about this warning? The classes in System.Runtime.CompilerServices are for compiler writers' use only. Anyway, now you know the meaning of this attribute when you encounter it somewhere.

To forward or not to forward? Shakespeare probably never considered this question in his oeuvre (I guess), but you should. Have fun!


Introduction

String freezing is a lesser-known feature of .NET 2.0, not to be confused with string interning. The feature allows you to freeze a string in a GC segment at compile time (or better: at native image generation time, cf. ngen.exe) instead of having this done at runtime (the default behavior).

The good side: reduction of private pages, more efficiency.
The dark side: the native image of the assembly containing the frozen string cannot be unloaded (because of possible references to the frozen string that the runtime doesn't have control over).

As always, use performance optimizations with care and evaluate the applicability of this technique. If you decide to give it a go, measure the net gains of applying it. The most suitable place to use this technique is when you have "string-rich" assemblies that shouldn't be unloaded because the app you're building can't live without them.

Play time

As usual, a good starting point is the MSDN documentation for the thing: StringFreezingAttribute. The first thing to notice is that the attribute has to be applied on the assembly level. That is, all strings (known at compile time, cf. ldstr) in the assembly will end up in a pre-allocated GC segment spit out by the native image generator. That brings us to the second implication: the assembly has to be ngen-ed, which is the process of turning a managed code assembly into a native image well-suited for execution on the target processor. I've blogged about ngen in the past, so you might want to check out that post too.

So, let's create a simple application and see what it does. However, before you start, you'll have to compile my unmanaged AddressOf library which we'll use to display addresses of objects at runtime. Now create a console application (I've called it "DotNetPlayground") and reference the AddressInspector.dll file (or whatever it is called) containing the unmanaged AddressOf implementation. Time to add some code to Program.cs. Here you go:

using System;
using System.Runtime.CompilerServices;

//[assembly: StringFreezing()]

namespace DotNetPlayground
{
   class Program
   {
      static void Main(string[] args)
      {
         string s = "Hello world!";
         AddressInspector.AddressOf(s);
      }
   }
}

That's right, we'll first compile without string freezing turned on. Now go to the Visual Studio 2005 command line, navigate to the folder of your app and execute it a few times:

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
014040D0

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
015940D0

Different addresses, right? Now ngen the assembly:

...\DotNetPlayground\bin\Debug>ngen DotNetPlayground.exe
Microsoft (R) CLR Native Image Generator - Version 2.0.50727.112
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.
Installing assembly ...\DotNetPlayground\bin\Debug\DotNetPlayground.exe
Compiling 1 assembly:
    Compiling assembly ...\DotNetPlayground\bin\Debug\DotNetPlayground.exe ...
DotNetPlayground, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null <debug>

and execute again:

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
014E1948

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
01671948

Random addresses again, and that's normal because the string isn't pre-allocated in a GC segment. That's what string freezing is responsible for, so let's turn it on:

using System;
using System.Runtime.CompilerServices;

[assembly: StringFreezing()]

namespace DotNetPlayground
{
   class Program
   {
      static void Main(string[] args)
      {
         string s = "Hello world!";
         AddressInspector.AddressOf(s);
      }
   }
}

Before we carry on with compilation, switch back to the command line to unregister the ngen-ed image:

...\DotNetPlayground\bin\Debug>ngen uninstall DotNetPlayground.exe

Compile and execute:

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
015E1948

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
01701948

Just what we expected, since we haven't created a native image yet. That's important to remember: without a native image, the StringFreezingAttribute has no effect. It's there to serve ngen.exe in its mission of creating a native image exactly the way you want it to be.

...\DotNetPlayground\bin\Debug>ngen DotNetPlayground.exe
Microsoft (R) CLR Native Image Generator - Version 2.0.50727.112
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.
Installing assembly ...\DotNetPlayground\bin\Debug\DotNetPlayground.exe
Compiling 1 assembly:
    Compiling assembly ...\DotNetPlayground\bin\Debug\DotNetPlayground.exe ...
DotNetPlayground, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null <debug>

Now execute and feel happy:

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
30004448

...\DotNetPlayground\bin\Debug>DotNetPlayground.exe
30004448

So, what has happened? The string "Hello world!" ended up in the portion of the native image that "prepopulates" the GC segment. When the application (i.e. PE file) is loaded into memory, the string ends up at the same place every time (leaving relocations etc. aside). When the native-code equivalent of ldstr is executed, the fixed address is returned instead of performing an allocation at runtime on the GC heap.
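If you don't want to build the unmanaged AddressOf library, a rough stand-in can be written in plain C#. This is a sketch under assumptions, not the AddressInspector implementation: the class AddressProbe is made up for this example, and it pins the string and prints the address of its first character, which lies a few bytes past the object address that AddressOf displays. For comparing runs (random versus fixed address) that constant offset doesn't matter.

```csharp
using System;
using System.Runtime.InteropServices;

class AddressProbe
{
    // Pins the string so the GC cannot move it, then reads the address of its
    // character data. Note: this is the first character, not the object header.
    public static IntPtr AddressOfFirstChar(string s)
    {
        GCHandle handle = GCHandle.Alloc(s, GCHandleType.Pinned);
        try
        {
            return handle.AddrOfPinnedObject();
        }
        finally
        {
            handle.Free();
        }
    }

    static void Main()
    {
        string s = "Hello world!";
        Console.WriteLine(AddressOfFirstChar(s).ToString("X8"));
    }
}
```

The address is only meaningful as a snapshot: once the handle is freed, an ordinary heap string may move again, whereas a frozen string stays put.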

Going deeper

So, what happened to our native image? Time for some inspection, so navigate to the NativeImages location on your machine (%windir%\assembly\NativeImages_v2.0.50727_32), cd into the DotNetPlayground folder that was created and then into a subfolder with a random-looking name (dir for it):

C:\Windows\assembly\NativeImages_v2.0.50727_32\DotNetPlayground\4b5032f0418220bbb61a84c7a39af83b>dir
 Volume in drive C is Windows Vista
 Volume Serial Number is C4FB-B1E7

 Directory of C:\Windows\assembly\NativeImages_v2.0.50727_32\DotNetPlayground\4b5032f0418220bbb61a84c7a39af83b

22/09/2006  13:22    <DIR>          .
22/09/2006  13:22    <DIR>          ..
22/09/2006  13:22            19.456 DotNetPlayground.ni.exe
               1 File(s)         19.456 bytes
               2 Dir(s)  25.134.407.680 bytes free

This is where the native image (ni) lives. You can ildasm it, but there won't be much IL left; only the metadata remains. A more interesting tool is the PE/COFF dumper dumpbin.exe that comes with the Platform SDK, the VS2005 tools and the new Windows SDK (so make sure it's on your search path by launching the right command prompt).

Inspection of the headers using dumpbin.exe /headers DotNetPlayground.ni.exe tells us:

Dump of file DotNetPlayground.ni.exe

...

OPTIONAL HEADER VALUES

                 ...
        30000000 image base (30000000 to 30013FFF)
                 ...

The address we saw before (30004448) falls within the image's address range, as we expected. From the rest of the headers you can find the section that the 30004448 address belongs to:

SECTION HEADER #2
   .data name
    2854 virtual size
    4000 virtual address (30004000 to 30006853)
         ...

When you now perform a dumpbin of the raw section data (e.g. using /all) you'll find the following:

  30004440: 00 00 00 00 EE 4E 00 8E E4 C6 0F 79 0D 00 00 00  ....¯N..õã.y....
  30004450: 0C 00 00 00 48 00 65 00 6C 00 6C 00 6F 00 20 00  ....H.e.l.l.o. .
  30004460: 77 00 6F 00 72 00 6C 00 64 00 21 00 00 00 00 00  w.o.r.l.d.!.....
  30004470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
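Reading the dump by hand can be double-checked with a few lines of C#. The sketch below is illustrative only: the bytes are copied from the dump above, starting at 3000444C (right after the four bytes at 30004448, where the object reference points). Interpreting the two leading 32-bit values as a buffer length of 13 and a string length of 12 is an assumption based on the CLR 2.0 String layout, not something dumpbin tells you. The class name FrozenStringDump is made up for this example.

```csharp
using System;
using System.Text;

class FrozenStringDump
{
    // Bytes copied from the dump, starting at address 3000444C.
    static readonly byte[] Raw =
    {
        0x0D, 0x00, 0x00, 0x00,                         // 3000444C: assumed buffer length
        0x0C, 0x00, 0x00, 0x00,                         // 30004450: assumed string length
        0x48, 0x00, 0x65, 0x00, 0x6C, 0x00, 0x6C, 0x00, // H e l l
        0x6F, 0x00, 0x20, 0x00, 0x77, 0x00, 0x6F, 0x00, // o ' ' w o
        0x72, 0x00, 0x6C, 0x00, 0x64, 0x00, 0x21, 0x00  // r l d !
    };

    // Reads a little-endian 32-bit value, independent of the host's byte order.
    public static int ReadInt32(int offset)
    {
        return Raw[offset] | (Raw[offset + 1] << 8) | (Raw[offset + 2] << 16) | (Raw[offset + 3] << 24);
    }

    // Decodes the UTF-16 characters that follow the two length fields.
    public static string Text()
    {
        int length = ReadInt32(4);
        return Encoding.Unicode.GetString(Raw, 8, length * 2);
    }

    static void Main()
    {
        Console.WriteLine(ReadInt32(0)); // 13
        Console.WriteLine(ReadInt32(4)); // 12
        Console.WriteLine(Text());       // Hello world!
    }
}
```

The 13 would then be the 12 characters plus the terminating null that is visible in the dump right after the exclamation mark.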

"Hello world!" pops out of misty clouds :-). CLR Profiler lovers (Vista users, please read the installation note concerning ProfilerObj.dll registration) can also use the EnumModuleFrozenObjects method from ICorProfilerInfo2 to enumerate frozen objects (and strings). Well, sort of: the CLR Profiler tool doesn't have this support out of the box, but the Profiling API does. An example:

HRESULT hr = E_FAIL;

ICorProfilerObjectEnum* pEnum = NULL;
hr = m_pICorProfilerInfo2->EnumModuleFrozenObjects(moduleId, &pEnum);

if (SUCCEEDED(hr))
{
   ULONG celt = 0;
   hr = pEnum->GetCount(&celt);

   if (SUCCEEDED(hr) && celt > 0)
   {
      ObjectID* pFrozenObjectIds = new ObjectID[celt];

      while (1)
      {
         hr = pEnum->Next(celt, pFrozenObjectIds, &celt);

         // Stop on failure (otherwise we'd loop forever) or when nothing is left.
         if (FAILED(hr) || 0 == celt)
            break;

         for (ULONG i = 0; i < celt; ++i)
            LogEntry("Frozen object on address 0x%08X\r\n", pFrozenObjectIds[i]);
      }

      delete[] pFrozenObjectIds;
   }

   pEnum->Release();
}

Here, the LogEntry function is a profiler callback function that ensures atomic printf and/or WriteFile logging. You can find the original code this is based on in the January 2005 MSDN Magazine article entitled "CLR Profiler: No Code Can Hide from the Profiling API in the .NET Framework 2.0" (see download).

The way profiling works is by turning it on by means of two environment variables:

  • COR_PROFILER is set to the profiler GUID (this is the GUID set in the C++ profiler code by assigning it to CLSID_PROFILER)

    #define PROFILER_GUID "{18884ADE-B15B-4af8-BE6C-FE5117BA4B32}"

    extern const GUID CLSID_PROFILER = { 0x18884ade, 0xb15b, 0x4af8, { 0xbe, 0x6c, 0xfe, 0x51, 0x17, 0xba, 0x4b, 0x32 } };

  • COR_ENABLE_PROFILING is set to 1 to turn on profiling

How to run this? Well, grab the project from the article mentioned above and open up ProfilerCallback.cpp. Add the following #define:

#define SHOW_MODULE_LOADS 1

Now compile it. Go to a command prompt with administrative privileges and register profiler.dll (required on Vista):

...\CLRProfiler\bin\Debug>regsvr32 Profiler.dll

Now set two environment variables:

...\CLRProfiler\bin\Debug>set COR_ENABLE_PROFILING=1
...\CLRProfiler\bin\Debug>set COR_PROFILER={18884ADE-B15B-4af8-BE6C-FE5117BA4B32}

Now run our DotNetPlayground.exe file and be patient (it might take up to a few minutes). A file called output.log will appear in the current directory that can be used for profiling analysis. (Note: check the output of the application to ensure the native image was executed before checking the output.log file.)

Enjoy and don't get frozen by string freezing. Remember: shooting yourself in the foot is only a few seconds away in the world of computing.


Hot from the build lab: the Visual Studio "Orcas" September CTP. There are tons of new features in there:

  • LINQ to Objects (aka Standard Query Operators from System.Query, see LINQ-SQO) in the System.Core.dll assembly.
  • Partial C# 3.0 and VB 9.0 support (this is work in progress of course).
  • VSTS improvements.
  • Better support for Windows Vista apps.
  • VSTO for Office 2007.
  • .NET Framework and CLR improvements with a new add-in model (hot!), GC latency control (hot!), threading performance improvements, partial trust reflection, enhancements to working with dates and timezones.

This CTP is released as a Virtual PC image and can be grabbed over here. Advantage: fast setup; Disadvantage: 3.5 GB download size.

Have fun!


Browsing the SSCLI can be enlightening from time to time (if not all the time). Take a look at the following function implemented in console.cs:

[HostProtection(UI=true)]
[CLSCompliant(false)]
public static void WriteLine(String format, Object arg0, Object arg1, Object arg2, Object arg3, __arglist)
{
   Object[] objArgs;
   int argCount;

   ArgIterator args = new ArgIterator(__arglist);

   //+4 to account for the 4 hard-coded arguments at the beginning of the list.
   argCount = args.GetRemainingCount() + 4;

   objArgs = new Object[argCount];

   //Handle the hard-coded arguments
   objArgs[0] = arg0;
   objArgs[1] = arg1;
   objArgs[2] = arg2;
   objArgs[3] = arg3;

   //Walk all of the args in the variable part of the argument list.
   for (int i=4; i<argCount; i++) {
      objArgs[i] = TypedReference.ToObject(args.GetNextArg());
   }

   Out.WriteLine(format, objArgs);
}

Hmm, looks weird, doesn't it? It uses that dark keyword __arglist (it really is a keyword; look at the keyword coloring in VS2005), not to mention TypedReference and ArgIterator. So, what is it and what does it do?

Compile the following piece of code:

using System;

class Program
{
   static void Main(string[] args)
   {
      Foo(__arglist(1, 2, 3));
   }

   static void Foo(__arglist)
   {
      ArgIterator iter = new ArgIterator(__arglist);
      for (int n = iter.GetRemainingCount(); n > 0; n--)
         Console.WriteLine(TypedReference.ToObject(iter.GetNextArg()));
   }
}

This simply prints the following on the screen:

1
2
3

Now take a look at the IL code:

.method private hidebysig static vararg void Foo() cil managed
{
   .locals init ([0] valuetype [mscorlib]System.ArgIterator iter,
      [1] int32 n,
      [2] bool CS$4$0000)
   IL_0000: nop
   IL_0001: ldloca.s iter
   IL_0003: arglist
   IL_0005: call instance void [mscorlib]System.ArgIterator::.ctor(valuetype [mscorlib]System.RuntimeArgumentHandle)
   ...
} // end of method Program::Foo

Welcome to mysteria lane again: RuntimeArgumentHandle pops up. The information on MSDN reveals that it's intended for C/C++ programming language support. A look at the CIL instruction set in Partition III, section 3.4 of the ECMA-335 CLI standard tells us that arglist is used to return the argument list handle for the current method. So, by executing arglist (metadata shows that the method supports this, cf. vararg) a pointer to the argument list is pushed on the stack.

All of this mysterious stuff makes one wonder about possible other hidden secrets, and guess what: tokens.h in the csharp folder of the SSCLI reveals four such keywords:

TOK(L"__arglist" , TID_ARGS , TFF_MSKEYWORD | TFF_TERM , 0 , NOPARSEFN , OP_NONE , OP_NONE , OP_ARGS , KEYWORD )
TOK(L"__makeref" , TID_MAKEREFANY , TFF_MSKEYWORD | TFF_TERM , 0 , NOPARSEFN , OP_NONE , OP_NONE , OP_MAKEREFANY , KEYWORD )
TOK(L"__reftype" , TID_REFTYPE , TFF_MSKEYWORD | TFF_TERM , 0 , NOPARSEFN , OP_NONE , OP_NONE , OP_REFTYPE , KEYWORD )
TOK(L"__refvalue" , TID_REFVALUE , TFF_MSKEYWORD | TFF_TERM , 0 , NOPARSEFN , OP_NONE , OP_NONE , OP_REFVALUE , KEYWORD )

Don't worry about the macro stuff in here. The basic usage of each of these (undocumented == don't use) keywords is shown below (you can find out about the syntax by inspecting and understanding the tokens.h file):

int i = 1;

TypedReference tr = __makeref(i); // tr = &i
__refvalue(tr, int) = 2; // *tr = 2

Console.WriteLine(TypedReference.ToObject(tr));
Console.WriteLine(__refvalue(tr, int)); // kind of a "cast back" (à la 'tr as int')

Console.WriteLine(TypedReference.GetTargetType(tr));
Console.WriteLine(__reftype(tr)); // i.GetType()

This prints out:
2
2
System.Int32
System.Int32

IL analysis reveals three more dark IL instructions (next to arglist):

.locals init ([0] int32 i,
   [1] typedref tr)
IL_0001: ldc.i4.1
IL_0002: stloc.0
IL_0003: ldloca.s i
IL_0005: mkrefany [mscorlib]System.Int32
IL_000a: stloc.1
IL_000b: ldloc.1
IL_000c: refanyval [mscorlib]System.Int32
IL_0011: ldc.i4.2
IL_0012: stind.i4
IL_0013: ldloc.1
IL_0014: call object [mscorlib]System.TypedReference::ToObject(typedref)
IL_0019: call void [mscorlib]System.Console::WriteLine(object)
IL_001f: ldloc.1
IL_0020: refanyval [mscorlib]System.Int32
IL_0025: ldind.i4
IL_0026: call void [mscorlib]System.Console::WriteLine(int32)
IL_002c: ldloc.1
IL_002d: call class [mscorlib]System.Type [mscorlib]System.TypedReference::GetTargetType(typedref)
IL_0032: call void [mscorlib]System.Console::WriteLine(object)
IL_0038: ldloc.1
IL_0039: refanytype
IL_003b: call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
IL_0040: call void [mscorlib]System.Console::WriteLine(object)

mkrefany - Push a typed reference to a pointer of the given type onto the stack (4.18) - On line IL_0005 this pops the address of i (obtained in IL_0003) from the stack and pushes a typed reference ([1]) to i on the stack.

refanyval - Push the address stored in a typed reference (4.22) - On line IL_000c this pops the typed reference ([1]) from the stack and pushes the stored address (&i) on the stack (cf. stind in line IL_0012 uses the address to do the assignment of 2, kind of *tr = 2).

refanytype - Push the type token stored in a typed reference (4.21) - On line IL_0039 this pops the typed reference ([1]) from the stack and pushes the type token (i.e. a handle to the type) on the stack. Line IL_003b uses this token to construct a Type object out of it.

Now, what can we do with all of this? Nothing much to worry about; you haven't missed yet another powerful feature in C#. As we do have the params keyword at our service in C#, we don't need this construct. Anyway, when browsing the SSCLI source this knowledge can be handy to have.
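For comparison, here is a sketch of the earlier Foo example rewritten with params. The class name ParamsDemo is made up for this illustration; everything else is plain, documented C#.

```csharp
using System;

class ParamsDemo
{
    // The compiler packs the trailing arguments into an object[] at the call site,
    // so no ArgIterator or TypedReference is needed to walk them.
    public static void Foo(params object[] args)
    {
        foreach (object arg in args)
            Console.WriteLine(arg);
    }

    static void Main()
    {
        Foo(1, 2, 3); // same call shape as Foo(__arglist(1, 2, 3))
    }
}
```

This prints 1, 2 and 3 on separate lines, just like the __arglist version. The price is an array allocation (plus boxing of the integers) per call, whereas the vararg version passes its arguments on the stack.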

However, there is one nice (but useless) thing you can do:

[DllImport("msvcrt40.dll")]
public static extern int printf(string format, __arglist);

static void Main(string[] args)
{
   printf("Hello %s!\n", __arglist("Bart"));
}

Woohoo, "Hello Bart!" on my screen. Guess I'll stick with Console.WriteLine anyhow :-). And you should too; needless to say, it's very easy to shoot yourself in the foot using undocumented stuff.

Time to return to reality and wipe out this journey through the dark side of .NET from our memories. Happy __arglist-less coding!


Introduction

Time for a cool .NET 2.0 feature that might prove useful in some scenarios: string interning. If you don't know what it is, check out the resources at the end of the article first. Simply stated: with string interning, the runtime keeps a hashtable of strings while an application runs. When a string with the same contents as an interned one is requested (through a literal or an explicit String.Intern call), no new heap allocation happens; instead, a reference to the existing (identical) string is returned.

What do you think the output of the following code fragment will be (turn on your eye blocking filter for green letters :-))?

//
// Compile time intelligence ==> &sa == &sb
//
string sa = "Hello World" + "!";
string sb = "Hello World!";
Console.WriteLine("sa = {0}", sa);
Console.Write("&sa = "); AddressOf(sa);
Console.WriteLine("sb = {0}", sb);
Console.Write("&sb = "); AddressOf(sb);
Console.WriteLine("(&sa == &sb) == {0}\n", Object.ReferenceEquals(sa, sb));

Note: I rely on the unmanaged AddressInspector (AddressOf method) created in a previous blog post. You can just omit the AddressOf calls and trust the Object.ReferenceEquals call to display the right result (it does), but seeing the real address can be more "sexy" :-).

The answer to the question above is this:

sa = Hello World!
&sa = 01591CC8
sb = Hello World!
&sb = 01591CC8
(&sa == &sb) == True

Did you expect this? Maybe, maybe not. Right, what's going on here? Look at the IL for the first two lines:

IL_0001: ldstr "Hello World!"
IL_0006: stloc.0
IL_0007: ldstr "Hello World!"
IL_000c: stloc.1

Alrighty, the concatenation is performed at compile time thanks to C# compiler goodness. That is, there is no runtime call to String.Concat. Hang on, doesn't ldstr create a new string object? Exactly; section III.4.15 of the CLI spec says: "The ldstr instruction pushes a new string object representing the literal stored in the metadata as string." So, two allocations, one physical address. Welcome to the world of string interning.

Some demos and experiments

Let's start with something easy:

//
// Duplication of references ==> &s1 == &s2
//
string s1 = "Hello World";
Console.WriteLine("s1 = {0}", s1);
Console.Write("&s1 = "); AddressOf(s1);

string s2 = s1;
Console.WriteLine("s2 = {0}", s2);
Console.Write("&s2 = "); AddressOf(s2);

Console.WriteLine("(&s2 == &s1) == {0}\n", Object.ReferenceEquals(s1, s2));

No surprise that s1 = s2 results in two identical addresses (or references, whatever terminology you like most). In the end, System.String is a reference type, thus assignment is pointer/address/reference assignment:

s1 = Hello World
&s1 = 01591DB8
s2 = Hello World
&s2 = 01591DB8
(&s2 == &s1) == True

So, we now have a string s1 containing "Hello World". What happens if we allocate another string with the same contents? An experiment:

//
// Another string with the same value ==> &s1 == &s3 (!)
//
string s3 = "Hello World";
Console.WriteLine("s3 = {0}", s3);
Console.Write("&s3 = "); AddressOf(s3);

Console.WriteLine("(&s3 == &s1) == {0}\n", Object.ReferenceEquals(s1, s3));

Spitting out the following:

s3 = Hello World
&s3 = 01591DB8
(&s3 == &s1) == True

This, again, is string interning at your service. What happens is that the JIT checks the table with interned strings in the current appdomain (to be precise: on the module level, see SSCLI code later) and if the string is found in the (hash)table, that reference is returned.

Now, what if we append a character to s1 and assign it back to s1? Because strings are immutable, we'll end up with a new string object (and thus a new address/reference):

//
// Strings are immutable; changing them allocates a new string ==> different address!
//
s1 += "!";
Console.WriteLine("s1 = {0}", s1);
Console.Write("&s1 = "); AddressOf(s1);

In IL, this looks like this:

IL_00fa: ldloc.2
IL_00fb: ldstr "!"
IL_0100: call string [mscorlib]System.String::Concat(string, string)
IL_0105: stloc.2

You see, there is no such thing as ldstr "Hello World!" this time. The compiler can't foresee this concatenation (it would even be useless if it could in this case, because we've used the unmodified s1 object before), so a runtime String.Concat call is emitted to IL, effectively creating a new string. Allocation of this string doesn't happen using ldstr but through a so-called "framed allocation" that directly calls into the GC for memory allocation (see SSCLI code further on).

Because the newly created string (obtained through concatenation) isn't allocated through ldstr (see further for SSCLI analysis), it doesn't end up in the (hash)table of interned strings. Therefore, a subsequent ldstr "Hello World!" won't find a match in the interned strings table and will result in a new allocation (which in turn does end up in the interned strings table!):

//
// Another string, equal to the modified s1 string ==> &s1 != &s4
//
string s4 = "Hello World!";
Console.WriteLine("s4 = {0}", s4);
Console.Write("&s4 = "); AddressOf(s4);

Console.WriteLine("(&s4 == &s1) == {0}\n", Object.ReferenceEquals(s1, s4));

performs

IL_0124: ldstr "Hello World!"
IL_0129: stloc.s s4

and results in

s1 = Hello World!
&s1 = 015973C8
s4 = Hello World!
&s4 = 01591CC8
(&s4 == &s1) == False

How can we solve this? The answer: intern s1. Why? To reduce the memory footprint caused by duplicate strings. What happens? The runtime finds the interned string behind s4, which already has the same contents as s1, so the String.Intern(s1) call returns the address of s4.

Note: You might wonder what the implications are of interning and having multiple variables point to the same string behind the scenes. Doesn't that cause problems when someone tries to change the string through one variable? The answer is of course no, because strings are immutable. Changing a string isn't possible without creating another one (i.e. all instance methods on System.String return a new string instance).

//
// String interning ==> &s1 == &s4
//
Console.WriteLine("\n*** INTERNING s1 ***\n");
s1 = String.Intern(s1);
Console.WriteLine("s1 = {0}", s1);
Console.Write("&s1 = "); AddressOf(s1);

Console.WriteLine("(&s4 == &s1) == {0}\n", Object.ReferenceEquals(s1, s4));

This displays (compare the new value for &s1 with &s4 in the previous experiment):

*** INTERNING s1 ***

s1 = Hello World!
&s1 = 01591CC8
(&s4 == &s1) == True
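The same effect can be reproduced without the AddressInspector library. The sketch below is a stand-alone illustration (the class InternCheck and its helper are made up for this example): it builds a second string object with the same contents as a literal and shows that the two only become the same object once they go through String.Intern.

```csharp
using System;

class InternCheck
{
    // True when both strings map to one and the same instance in the intern table.
    public static bool SameObjectAfterInterning(string a, string b)
    {
        return Object.ReferenceEquals(String.Intern(a), String.Intern(b));
    }

    static void Main()
    {
        string literal = "Hello World!";
        // Same contents, but a separate object created at runtime (not via ldstr).
        string built = new string(literal.ToCharArray());

        Console.WriteLine(Object.ReferenceEquals(literal, built));   // False
        Console.WriteLine(SameObjectAfterInterning(literal, built)); // True
    }
}
```

Keep in mind that String.Intern returns the interned reference; the original object in built is left untouched, which is why the article assigns the result back to s1.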

Finally we allocate yet another string that contains "Hello World!", this time via ldstr directly:

//
// Another string, equal to the modified but interned s1 string ==> &s1 == &s5
//
string s5 = "Hello World!";
Console.WriteLine("s5 = {0}", s5);
Console.Write("&s5 = "); AddressOf(s5);

Console.WriteLine("(s5 == s1) == " + Object.ReferenceEquals(s1, s5));

which displays

s5 = Hello World!
&s5 = 01591CC8
(s5 == s1) == True

Going much deeper - enter the SSCLI

You're a geek? So you do want to know how this really works behind the scenes? The SSCLI will help you out. First download it here and extract it using some ZIP/RAR tool. Download the .vcproj file created by shawnfa too if you want some comfortable browsing through the massive code base.

Part 1 - The JITter interns strings

Let's show how strings get interned when strings are loaded in IL. It all starts by having a call to create a new string literal:

jithelpers.h

    CORINFO_HELP_STRCNS,            // create a new string literal
    ...
    JITHELPER(CORINFO_HELP_STRCNS, JIT_StrCns)

This relies on the JITHELPER macro defined in jitinterface.h. Just for your reference:

jitinterface.h

    // enum for dynamically assigned helper calls
    enum DynamicCorInfoHelpFunc {
    #define JITHELPER(code, pfnHelper)
    #define DYNAMICJITHELPER(code, pfnHelper) DYNAMIC_##code,
    #include "jithelpers.h"    
        DYNAMIC_CORINFO_HELP_COUNT
};

Back on the right track, we find that jithelpers.cpp specifies the JIT_StrCns operation:

jithelpers.cpp

HCIMPL2(Object *, JIT_StrCns, unsigned metaTok, CORINFO_MODULE_HANDLE scopeHnd)

    hndStr = ConstructStringLiteral(scopeHnd, metaTok);
    return *(Object**)hndStr;

This piece of code relies on the HCIMPL2 macro, defined in fcall.h ("fast call"):

fcall.h

    #define FC_COMMON_PROLOG(target, assertFn) FCALL_TRANSITION_BEGIN()
    #define HCIMPL_PROLOG(funcname) LPVOID __me; __me = 0; FC_COMMON_PROLOG(funcname, HCallAssert)
    #define HCIMPL2(rettype, funcname, a1, a2) rettype F_CALL_CONV funcname(a1, a2) { HCIMPL_PROLOG(funcname)

That being said, on to ConstructStringLiteral which we find further in the jithelpers.cpp file:

jithelpers.cpp

OBJECTHANDLE ConstructStringLiteral(CORINFO_MODULE_HANDLE scopeHnd, mdToken metaTok)

    Module* module = GetModule(scopeHnd);
    if (module->HasNativeImage() && module->IsNoStringInterning())
    {
        return module->ResolveStringRef(metaTok, module->GetAssembly()->Parent(), true);
    }
    return module->ResolveStringRef(metaTok, module->GetAssembly()->Parent(), false);

As you can see, the current module's ResolveStringRef method is called, which is defined in ceeload.cpp:

ceeload.cpp

OBJECTHANDLE Module::ResolveStringRef(DWORD token, BaseDomain *pDomain, bool bNeedToSyncWithFixups)

    string = (OBJECTHANDLE)pDomain->GetStringObjRefPtrFromUnicodeString(&strData);
    return string;

This one calls into the appdomain implementation:

appdomain.hpp

   //****************************************************************************************
    // Methods to retrieve a pointer to the COM+ string STRINGREF for a string constant.
    // If the string is not currently in the hash table it will be added and if the
    // copy string flag is set then the string will be copied before it is inserted.
    STRINGREF *GetStringObjRefPtrFromUnicodeString(EEStringData *pStringData);

The comment reveals the intentions of this long-named method and mentions the hash table used for interning. The implementation is rather straightforward too:

appdomain.cpp

STRINGREF *BaseDomain::GetStringObjRefPtrFromUnicodeString(EEStringData *pStringData)

    if (m_pStringLiteralMap == NULL)
    {
        LazyInitStringLiteralMap();
    }

    return m_pStringLiteralMap->GetStringLiteral(pStringData, TRUE, !CanUnload() /* bAppDOmainWontUnload */);

And finally we end up in the "string literal map" class:

stringliteralmap.cpp

StringLiteralEntry *GlobalStringLiteralMap::GetStringLiteral(EEStringData *pStringData, DWORD dwHash, BOOL bAddIfNotFound)

    HashDatum Data;
    StringLiteralEntry *pEntry = NULL;

    if (m_StringToEntryHashTable->GetValue(pStringData, &Data, dwHash))
    {
        pEntry = (StringLiteralEntry*)Data;
        // If the entry is already in the table then addref it before we return it.
        if (pEntry)
            pEntry->AddRef();
    }
    else
    {
        if (bAddIfNotFound)
            pEntry = AddStringLiteral(pStringData);
    }

    return pEntry;

The hashtable implementation itself is of no further interest to us.
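The lookup-or-add logic can be modelled in a few lines. What follows is a sketch under my own assumptions, not the SSCLI data structure: LiteralMapSketch and LiteralEntrySketch are hypothetical types, and I assume a freshly added entry starts with a reference count of one.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Simplified model of a literal entry: the text plus a reference count.
struct LiteralEntrySketch {
    std::string text;
    int refCount = 0;
};

// Hypothetical stand-in for the string literal map.
class LiteralMapSketch {
public:
    // Returns the existing entry (after bumping its count) or adds a new one.
    // Returns nullptr when the text is unknown and addIfNotFound is false.
    LiteralEntrySketch* GetStringLiteral(const std::string& text, bool addIfNotFound) {
        auto it = entries_.find(text);
        if (it != entries_.end()) {
            it->second.refCount++;
            return &it->second;
        }
        if (!addIfNotFound) return nullptr;
        LiteralEntrySketch& added = entries_[text];
        added.text = text;
        added.refCount = 1;  // assumption: a new entry starts with one reference
        return &added;
    }
private:
    std::unordered_map<std::string, LiteralEntrySketch> entries_;
};

// Small self-check: the same text always resolves to the same entry.
inline void CheckLiteralMapSketch() {
    LiteralMapSketch map;
    LiteralEntrySketch* first = map.GetStringLiteral("Hello", true);
    LiteralEntrySketch* second = map.GetStringLiteral("Hello", true);
    assert(first == second);
    assert(first->refCount == 2);
}
```

Two requests for the same text yield the same entry, which is the property that makes reference comparison of literals work.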

Part 2 - String.Concat doesn't take care of interning

As I told you before, String.Concat doesn't check whether the constructed string is interned. Doing so would require an expensive hashtable lookup, and consistency would demand that other methods like ToUpper, ToLower, etc. do the same. An overview of the stack:

string.cs

public static String Concat(String str0, String str1)

    ...
    String result = FastAllocateString(str0Length + str1.Length);
    ...
    return result;

The FastAllocateString method that's called is implemented elsewhere in unmanaged code. Therefore it's declared "extern" and decorated with a MethodImplOptions.InternalCall type of method implementation attribute as shown below:

string.cs

    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    private extern static String FastAllocateString(int length);

The real implementation lives in ecall.cpp which contains a big mapping from various "internal call" methods to the corresponding supporting methods in the back:

ecall.cpp

    ECClass gECClasses[] =
    {
        ...
        FCClassElement("String", "System", gStringFuncs)
        ...
    };

    FCFuncStart(gStringFuncs)
        ...
        FCDynamic("FastAllocateString", CORINFO_INTRINSIC_Illegal, ECall::FastAllocateString)
        ...
    FCFuncEnd()

This implements so-called "fast calls" (abbreviated FCall). Ecall.h comes to the rescue for dynamically assigned FCalls, as shown below:

ecall.h

    #define DYNAMICALLY_ASSIGNED_FCALLS() \
    DYNAMICALLY_ASSIGNED_FCALL_IMPL(FastAllocateString,                FramedAllocateString) \

The end is near. Now we end up in a JIT helper, just as we did in part 1. However, this time the implementation calls into SlowAllocateString:

jithelpers.cpp

HCIMPL1(StringObject*, FramedAllocateString, DWORD stringLength)
    ...
    result = SlowAllocateString(stringLength+1);
    result->SetStringLength(stringLength);

    return((StringObject*) OBJECTREFToObject(result));

The end of our journey: the GC implementation. Never far away in .NET as you can see. Basically, the gcscan piece of it is asked to allocate a string:

gcscan.cpp

STRINGREF SlowAllocateString( DWORD cchArrayLength )
    ...
    orObject = (StringObject *)Alloc( ObjectSize, FALSE, FALSE );
    ...
    return( ObjectToSTRINGREF(orObject) );

The Alloc method (which calls into the heap allocation mechanism and possibly triggers a garbage collection) would lead us too far outside the scope of our current investigation. There's much more in here (hidden behind the ellipses ...) such as method table initialization and profiling stuff. Check it out if you're curious :-).

Part 3 - How String.Intern works

In one of our experiments we made an explicit call to String.Intern. Let's see how this one works now:

string.cs

    public static String Intern(String str) {
        if (str==null) {
            throw new ArgumentNullException("str");
        }
        return Thread.GetDomain().GetOrInternString(str);
    }

appdomain.cpp

STRINGREF *BaseDomain::GetOrInternString(STRINGREF *pString)
{
    if (m_pStringLiteralMap == NULL)
    {
        LazyInitStringLiteralMap();
    }
    _ASSERTE(m_pStringLiteralMap);
    return m_pStringLiteralMap->GetInternedString(pString, TRUE, !CanUnload() /* bAppDOmainWontUnload */);
}

And finally back to the "string literal map" class as in part 1. Oh yes, the m_pStringLiteralMap type is:

    // The AppDomain specific string literal map.
    AppDomainStringLiteralMap   *m_pStringLiteralMap;
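The difference between parts 2 and 3 boils down to object identity. The sketch below models it with shared pointers: ConcatSketch always allocates a fresh object, whereas InternTableSketch hands back the one object it keeps per content. Both names are hypothetical and only illustrate the behavior; this is not how the runtime represents strings.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

using StringRefSketch = std::shared_ptr<const std::string>;

// Models String.Concat: always allocates a fresh object, no table lookup.
inline StringRefSketch ConcatSketch(const std::string& a, const std::string& b) {
    return std::make_shared<const std::string>(a + b);
}

// Models the per-domain table consulted by String.Intern.
class InternTableSketch {
public:
    // Returns the table's object for this content, adding the given one if absent.
    StringRefSketch GetOrIntern(const StringRefSketch& s) {
        auto result = table_.emplace(*s, s);  // no-op if the content is already present
        return result.first->second;
    }
private:
    std::unordered_map<std::string, StringRefSketch> table_;
};

// Small self-check: equal content, different objects, until Intern is called.
inline void CheckInternSketch() {
    InternTableSketch table;
    StringRefSketch literal =
        table.GetOrIntern(std::make_shared<const std::string>("Hello World"));
    StringRefSketch built = ConcatSketch("Hello ", "World");
    assert(*built == *literal);                   // same characters
    assert(built != literal);                     // but a different object
    assert(table.GetOrIntern(built) == literal);  // interning returns the original
}
```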

Other resources

Other articles on interning include a post on Chris Brumme's weblog and a VB2TheMax article on DevX by Francesco Balena. Even Wikipedia knows the concept. So why wouldn't you too? I hope this article has whetted your SSCLI appetite as well.

Enjoy your internal string kitchen :-)


You can find it over here.

Rather unexpectedly, the availability of a SP1 beta for Visual Studio 2005 was announced today by soma. As usual, everyone is welcome to join the beta programme on Microsoft Connect. Enjoy!

Introduction

Windows Vista comes with a new (comctl32 v6) API called TaskDialog. In marketing speak, this API should be seen as a part of the "clarity" pillar of Windows Vista's mission statement. Where classic dialogs à la MessageBox are the old-fashioned way to interact with end-users, TaskDialogs offer a lot more functionality. In this first post on TaskDialog, I'm showing you how to create a MessageBox-equivalent TaskDialog in C# using interop.

Where do we want to go?

In the end we want to be able to do something similar to the MessageBox.Show method in Windows Forms:

TaskDialog.Show(this, "The quick brown fox jumps over the lazy dog.", "Can I ask you a question?", "My application", TaskDialogButtons.Yes | TaskDialogButtons.No | TaskDialogButtons.Cancel, TaskDialogIcon.Warning);

A first remark concerns the number of arguments (I'll supply overloads further on). Next to a title ("My application") and a text ("The quick brown fox jumps over the lazy dog.") there's a so-called instruction too ("Can I ask you a question?"). Wait a minute to see where this pops up in the displayed dialog. But first ... the code.

Let the fun begin: interop coding time

The TaskDialog signature can be found in CommCtrl.h in the Windows SDK folder (%programfiles%\Microsoft SDKs\Windows\v6.0\Include):

WINCOMMCTRLAPI HRESULT WINAPI TaskDialog(__in_opt HWND hwndParent, __in_opt HINSTANCE hInstance, __in_opt PCWSTR pszWindowTitle, __in_opt PCWSTR pszMainInstruction, __in_opt PCWSTR pszContent, TASKDIALOG_COMMON_BUTTON_FLAGS dwCommonButtons, __in_opt PCWSTR pszIcon, __out_opt int *pnButton);

Or, in a cleaner fashion:

HRESULT TaskDialog(      
    HWND hWndParent,
    HINSTANCE hInstance,
    PCWSTR pszWindowTitle,
    PCWSTR pszMainInstruction,
    PCWSTR pszContent,
    TASKDIALOG_COMMON_BUTTON_FLAGS dwCommonButtons,
    PCWSTR pszIcon,
    int *pnButton
);

This translates to the following in C#:

using System;
using System.Runtime.InteropServices;

public class TaskDialog
{
   [DllImport("comctl32.dll", CharSet = CharSet.Unicode, EntryPoint = "TaskDialog")]
   static extern int _TaskDialog(IntPtr hWndParent, IntPtr hInstance, String pszWindowTitle, String pszMainInstruction, String pszContent, int dwCommonButtons, IntPtr pszIcon, out int pnButton);
}

Note: The name of the method was changed to _TaskDialog in order to make it compile, because the class itself is called TaskDialog too. Notice the parameter EntryPoint to the DllImportAttribute to make this name change work correctly.

The typical mapping of HWND to IntPtr should be no surprise. The same goes for HINSTANCE (which we won't use; it can be used to refer to a resource module to extract an icon from, see the pszIcon parameter). PCWSTRs map to System.String. The only surprise might be the mapping of PCWSTR pszIcon to IntPtr pszIcon. The reason for this is the definition of the icons in the CommCtrl.h file:

#define TD_WARNING_ICON         MAKEINTRESOURCEW(-1)
#define TD_ERROR_ICON           MAKEINTRESOURCEW(-2)
#define TD_INFORMATION_ICON     MAKEINTRESOURCEW(-3)
#define TD_SHIELD_ICON          MAKEINTRESOURCEW(-4)

MAKEINTRESOURCEW is a macro defined in WinUser.h:

#define MAKEINTRESOURCEW(i) ((LPWSTR)((ULONG_PTR)((WORD)(i))))

It's basically just a mapping to a pointer, therefore we use IntPtr.

Enough on this plumbing for now; time to do some useful work. We'll declare modal and non-modal overloads for a static Show method, just like the ones available on MessageBox. However, we'll start by defining the helper method that calls the imported TaskDialog method:

private static TaskDialogResult ShowInternal(IntPtr owner, string text, string instruction, string caption, TaskDialogButtons buttons, TaskDialogIcon icon)
{
   int p;
   if (_TaskDialog(owner, IntPtr.Zero, caption, instruction, text, (int)buttons, new IntPtr((int)icon), out p) != 0)
      throw new InvalidOperationException("Something weird has happened.");

   switch (p)
   {
      case 1: return TaskDialogResult.OK;      // IDOK
      case 2: return TaskDialogResult.Cancel;  // IDCANCEL
      case 4: return TaskDialogResult.Retry;   // IDRETRY
      case 6: return TaskDialogResult.Yes;     // IDYES
      case 7: return TaskDialogResult.No;      // IDNO
      case 8: return TaskDialogResult.Close;   // IDCLOSE
      default: return TaskDialogResult.None;
   }
}

In analogy to MessageBox, we've defined three helper enums: TaskDialogButtons, TaskDialogIcon and TaskDialogResult. Here are the definitions:

[Flags]
public enum TaskDialogButtons
{
   OK = 0x0001,
   Cancel = 0x0008,
   Yes = 0x0002,
   No = 0x0004,
   Retry = 0x0010,
   Close = 0x0020
}

public enum TaskDialogIcon
{
   Information = UInt16.MaxValue - 2,
   Warning = UInt16.MaxValue,
   Stop = UInt16.MaxValue - 1,
   Question = 0,

   SecurityWarning = UInt16.MaxValue - 5,
   SecurityError = UInt16.MaxValue - 6,
   SecuritySuccess = UInt16.MaxValue - 7,
   SecurityShield = UInt16.MaxValue - 3,
   SecurityShieldBlue = UInt16.MaxValue - 4,
   SecurityShieldGray = UInt16.MaxValue - 8
}

public enum TaskDialogResult
{
   None,
   OK,
   Cancel,
   Yes,
   No,
   Retry,
   Close
}

Don't worry about the enums' values, which are interop-related. For the buttons, for instance, you can consult the CommCtrl.h header file:

enum _TASKDIALOG_COMMON_BUTTON_FLAGS
{
    TDCBF_OK_BUTTON            = 0x0001, // selected control return value IDOK
    TDCBF_YES_BUTTON           = 0x0002, // selected control return value IDYES
    TDCBF_NO_BUTTON            = 0x0004, // selected control return value IDNO
    TDCBF_CANCEL_BUTTON        = 0x0008, // selected control return value IDCANCEL
    TDCBF_RETRY_BUTTON         = 0x0010, // selected control return value IDRETRY
    TDCBF_CLOSE_BUTTON         = 0x0020  // selected control return value IDCLOSE
};
typedef int TASKDIALOG_COMMON_BUTTON_FLAGS;           // Note: _TASKDIALOG_COMMON_BUTTON_FLAGS is an int
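To make the relation between the button flags and the returned IDs concrete, here is a small sketch. The flag values are copied from the header excerpt and the IDs are the ones handled by the switch statement in ShowInternal; the names ending in Sketch and the k-prefixed constants are made up for this illustration.

```cpp
#include <cassert>
#include <vector>

// Flag values as listed in the header excerpt.
enum ButtonFlagSketch {
    kOk = 0x0001, kYes = 0x0002, kNo = 0x0004,
    kCancel = 0x0008, kRetry = 0x0010, kClose = 0x0020
};

// Return IDs as handled by the switch statement in ShowInternal.
enum ButtonIdSketch {
    kIdOk = 1, kIdCancel = 2, kIdRetry = 4, kIdYes = 6, kIdNo = 7, kIdClose = 8
};

// Lists the IDs a dialog could return for a given combination of flags.
inline std::vector<int> PossibleResultsSketch(int flags) {
    std::vector<int> ids;
    if (flags & kOk)     ids.push_back(kIdOk);
    if (flags & kYes)    ids.push_back(kIdYes);
    if (flags & kNo)     ids.push_back(kIdNo);
    if (flags & kCancel) ids.push_back(kIdCancel);
    if (flags & kRetry)  ids.push_back(kIdRetry);
    if (flags & kClose)  ids.push_back(kIdClose);
    return ids;
}

// Small self-check: Yes | No | Cancel can produce IDs 6, 7 and 2.
inline void CheckButtonSketch() {
    std::vector<int> expected = {kIdYes, kIdNo, kIdCancel};
    assert(PossibleResultsSketch(kYes | kNo | kCancel) == expected);
}
```

Note that the flags are bit values meant to be OR'ed together, while the returned IDs are plain numbers. That's why the C# wrapper needs two separate enums.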

The icons are slightly more complex and were extracted based on a few defines as well as some experiments:

#define TD_WARNING_ICON         MAKEINTRESOURCEW(-1)
#define TD_ERROR_ICON           MAKEINTRESOURCEW(-2)
#define TD_INFORMATION_ICON     MAKEINTRESOURCEW(-3)
#define TD_SHIELD_ICON          MAKEINTRESOURCEW(-4)
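The link between these negative numbers and the UInt16.MaxValue-based values in the TaskDialogIcon enum is the (WORD) cast inside MAKEINTRESOURCEW. Here is a portable sketch of that arithmetic, with std::uint16_t standing in for WORD and std::uintptr_t for ULONG_PTR (MakeIntResourceSketch is a made-up name):

```cpp
#include <cassert>
#include <cstdint>

// Portable imitation of MAKEINTRESOURCEW: truncate to 16 bits,
// then widen to pointer size.
inline std::uintptr_t MakeIntResourceSketch(int i) {
    return static_cast<std::uintptr_t>(static_cast<std::uint16_t>(i));
}

// Small self-check against the values used in the TaskDialogIcon enum.
inline void CheckIconSketch() {
    const std::uintptr_t maxValue = 65535;               // UInt16.MaxValue
    assert(MakeIntResourceSketch(-1) == maxValue);       // Warning
    assert(MakeIntResourceSketch(-2) == maxValue - 1);   // Stop
    assert(MakeIntResourceSketch(-3) == maxValue - 2);   // Information
    assert(MakeIntResourceSketch(-4) == maxValue - 3);   // SecurityShield
}
```

The remaining enum values have no define in the header excerpt; as mentioned, those were found by experiment.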

What about the various overloads? Here they are:

public static TaskDialogResult Show(IWin32Window owner, string text)
{
   return Show(owner, text, null, null, TaskDialogButtons.OK);
}

public static TaskDialogResult Show(IWin32Window owner, string text, string instruction)
{
   return Show(owner, text, instruction, null, TaskDialogButtons.OK, 0);
}

public static TaskDialogResult Show(IWin32Window owner, string text, string instruction, string caption)
{
   return Show(owner, text, instruction, caption, TaskDialogButtons.OK, 0);
}

public static TaskDialogResult Show(IWin32Window owner, string text, string instruction, string caption, TaskDialogButtons buttons)
{
   return Show(owner, text, instruction, caption, buttons, 0);
}

public static TaskDialogResult Show(IWin32Window owner, string text, string instruction, string caption, TaskDialogButtons buttons, TaskDialogIcon icon)
{
   return ShowInternal(owner.Handle, text, instruction, caption, buttons, icon);
}

public static TaskDialogResult Show(string text)
{
   return Show(text, null, null, TaskDialogButtons.OK);
}

public static TaskDialogResult Show(string text, string instruction)
{
   return Show(text, instruction, null, TaskDialogButtons.OK, 0);
}

public static TaskDialogResult Show(string text, string instruction, string caption)
{
   return Show(text, instruction, caption, TaskDialogButtons.OK, 0);
}

public static TaskDialogResult Show(string text, string instruction, string caption, TaskDialogButtons buttons)
{
   return Show(text, instruction, caption, buttons, 0);
}

public static TaskDialogResult Show(string text, string instruction, string caption, TaskDialogButtons buttons, TaskDialogIcon icon)
{
   return ShowInternal(IntPtr.Zero, text, instruction, caption, buttons, icon);
}

A few examples

So, what did we create? A few examples to illustrate:


TaskDialog.Show(this, "The quick brown fox jumps over the lazy dog.", "Can I ask you a question?", "My application", TaskDialogButtons.Yes | TaskDialogButtons.No | TaskDialogButtons.Cancel, TaskDialogIcon.Warning);


TaskDialog.Show(this, "The quick brown fox jumps over the lazy dog.", "Can I ask you a question?", "My application", TaskDialogButtons.OK, TaskDialogIcon.Information);


TaskDialog.Show(this, "The quick brown fox jumps over the lazy dog.", "Can I ask you a question?", "My application", TaskDialogButtons.OK | TaskDialogButtons.Cancel, TaskDialogIcon.SecuritySuccess);


TaskDialog.Show(this, "The quick brown fox jumps over the lazy dog.", "Can I ask you a question?", "My application", TaskDialogButtons.Retry | TaskDialogButtons.Cancel, TaskDialogIcon.Stop);

Download the code

Over here. So, I'd say: time to bring clarity to your customers' lives. Stay tuned for more Vista fun!


Introduction

Everyone who has started with Windows PowerShell should know the where-object (aliased as where) cmdlet by now. A typical example:

get-process | where-object { $_.ProcessName.StartsWith("n") }

or, abbreviated using aliases:

gps | where { $_.ProcessName.StartsWith("n") }

Assuming you have a single notepad process running on your machine, the output will be very similar to the one displayed below:

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     55       3     1416       6764    52     0,08   3436 notepad

The part between the curly braces is a so-called script block. Readers of my blog have already learned about creating cmdlets themselves (see all my PowerShell posts too). Today, I'm showing you a possible way to consume a script block in your own cmdlet, by re-creating the where-object cmdlet.

Dissection of Where-Object

So, what does Where-Object do? The answer is really simple: every object that flows through the Where-Object cmdlet is evaluated against the specified script block. If the script block returns true, it's passed on in the pipeline. If not, the object is dropped.
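Stripped of everything PowerShell-specific, this is just a filter driven by a caller-supplied predicate. Here is a language-neutral sketch of that behavior in C++, where ProcessSketch and WhereSketch are hypothetical names and the std::function parameter plays the role of the script block:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record standing in for an object flowing through the pipeline.
struct ProcessSketch {
    std::string name;
    int id;
};

// Passes on every input for which the filter returns true and drops the rest.
inline std::vector<ProcessSketch> WhereSketch(
        const std::vector<ProcessSketch>& input,
        const std::function<bool(const ProcessSketch&)>& filter) {
    std::vector<ProcessSketch> output;
    for (const ProcessSketch& item : input) {
        if (filter(item)) output.push_back(item);
    }
    return output;
}

// Small self-check: keep only the names that start with "n".
inline void CheckWhereSketch() {
    std::vector<ProcessSketch> all = {{"notepad", 1}, {"explorer", 2}};
    std::vector<ProcessSketch> kept = WhereSketch(all, [](const ProcessSketch& p) {
        return p.name.rfind("n", 0) == 0;  // true when the name starts with "n"
    });
    assert(kept.size() == 1);
    assert(kept[0].name == "notepad");
}
```

The real cmdlet differs in one important way: it sees one object at a time as it streams through the pipeline, rather than a complete list.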

Considering this, what are the parameters? There are clearly two: one to take the script block and one to take the current object that has to be processed by the cmdlet. Let's concretize this using get-help where-object:

PARAMETERS
    -FilterScript <System.Management.Automation.ScriptBlock>
        The script block that is being applied to the object from the pipeline.

        Parameter required?           true
        Parameter position?           1
        Parameter type                System.Management.Automation.ScriptBlock
        Default value
        Accept multiple values?       false
        Accepts pipeline input?       false
        Accepts wildcard characters?  false

    -InputObject <System.Management.Automation.PSObject>
        The pipelined object that is being evaluated by the scriptBlock.

        Parameter required?           false
        Parameter position?           named
        Parameter type                System.Management.Automation.PSObject
        Default value
        Accept multiple values?       false
        Accepts pipeline input?       true (ByValue)
        Accepts wildcard characters?  false

The first parameter is the script block; the second one (which can be accepted from the pipeline) contains the input object.

Coding time

We know enough to create a similar cmdlet right now. Here it is:

using System.Management.Automation;

[Cmdlet("where", "object2")]
public class WhereObjectCmdlet : PSCmdlet
{
   private ScriptBlock filterScript;

   [Parameter(Mandatory = true, Position = 1)]
   public ScriptBlock FilterScript
   {
      get { return filterScript; }
      set { filterScript = value; }
   }

   private PSObject inputObject;

   [Parameter(Mandatory = true, Position = 2, ValueFromPipeline = true)]
   public PSObject InputObject
   {
      get { return inputObject; }
      set { inputObject = value; }
   }

   protected override void ProcessRecord()
   {
      // Run the script block with the current pipeline object as its argument.
      object o = filterScript.InvokeReturnAsIs(inputObject);
      // Pass the object on only when the result evaluates to true.
      if (LanguagePrimitives.IsTrue(o))
         WriteObject(inputObject);
   }
}

The two properties need no explanation. The ProcessRecord method isn't too difficult either. First, the inputObject is passed to the script block, which is called using the InvokeReturnAsIs method. There's another Invoke method too that returns a collection of PSObjects as return objects. You can use this one too, but we just need a scalar value indicating the true/false evaluation state of the script block.

Using LanguagePrimitives we check for a true value and in that case, the object is passed on via the pipeline using WriteObject.

Next, wrap the cmdlet in a snap-in as explained in a previous post. I'm not going to duplicate this code over here; I'll assume you "publish" the cmdlet as "where-object2". Don't forget to installutil the snap-in too.

Testing it

Open Windows PowerShell and add your snap-in using add-pssnapin:

PS C:\Users\Bart> add-pssnapin <snapinname>

Next, run the where-object2 cmdlet on the get-process (gps) output as shown below:

PS C:\Users\Bart> gps | where-object2 { $args[0].ProcessName.StartsWith("n") }

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     55       3     1416       6756    52     0,09   3436 notepad

Works fine, doesn't it? The only caveat is that we have to use the $args script block parameter collection instead of $_. A little workaround can help:

PS C:\Users\Bart> gps | where-object2 { $_ = $args[0]; $_.ProcessName.StartsWith("n") }

However, the goal of this post is not to nag about the $_ assignment, but rather to show you how to create a cmdlet that invokes a scriptblock. And that's what we did :-).

Happy scripting!


Four days that will take you months ahead of the game.

Time is ticking; only 5 days to go for TechEd (Europe) early bird registration. So in case you've not registered yet, do it now and get a € 300 discount. For groups of 6 or more people there are also very attractive discounts, check it out here.

Your interests reach from hardcore development to IT administration? So you want to attend both events? That's possible with a 30% discount for the second event registration!

As usual, TechEd features a lot of great speakers. Eric Rudder will do the keynote, make sure not to miss this one.

Not convinced yet? Experience TechEd 2005 from your seat by watching some of the top session videos from last year over here.

Don't forget: Early bird registrations (Save € 300) run till 29 September 2006.
