.NET Serialization Choices

Introduction

Serialization is not a trivial problem in any language. In .NET there are quite a few choices available for serializing and deserializing objects. Each available option has its strengths and weaknesses.

I've started a project on GitHub, SerializationTests, where I'm trying to determine what is supported by which library and to draw a few conclusions that should be helpful when designing serializable objects and, in general, when dealing with serialization.

I've started this project because I've been bitten too many times by issues related to serialization, either trying to send a message with a System.Uri property in NServiceBus, or trying to deserialize an object from JSON which had an ID public readonly field named differently from the corresponding parameter in the constructor. In the end I'm trying to put together a few tips that I hope will be useful to myself and others.

Tested Implementations

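The following serialization choices have been included in the SerializationTests project, each discussed in its own section below: BinaryFormatter, DataContractSerializer, NetDataContractSerializer, Newtonsoft Json.NET, the NServiceBus XML serializer, the Protocol Buffers serializer, ServiceStack JsonSerializer, SoapFormatter and XmlSerializer.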

Update: Added Raven.Json Serializer - see post

Tested Messages

When designing an object that is meant to be used as a message, event or command there are a few approaches you can take:

I prefer the first choice, since I find it the one that expresses my intentions best: I want immutable data transfer objects. I should also mention that the first time I saw this approach was in a video by Greg Young about event sourcing and CQRS architecture. To give the idea, an event is defined like this, with public readonly fields set through the constructor:

using System;

// Immutable event: public readonly fields, assigned once in the constructor.
public class PersonCreated
{
    public readonly Guid AggregateId;
    public readonly string Name;
    public readonly string Street;
    public readonly string StreetNumber;

    public PersonCreated(Guid id, string name, string street, string streetNumber)
    {
        this.AggregateId = id;
        this.Name = name;
        this.Street = street;
        this.StreetNumber = streetNumber;
    }
}

Serializers Conclusions

Disclaimer: I have not used all these serializers in real projects and I'm by no means an expert in any of them. Before adopting one of them, do your research and try to see if it has any other drawbacks that might not be acceptable in your projects. If I don't mention Cons for one, it only means that I have not been interested in it enough to research it more deeply.

BinaryFormatter

Pro: The BinaryFormatter included in the .NET Framework passes all tests except the DataContractOnly test. That failure is expected, since BinaryFormatter relies on the presence of the [Serializable] attribute.

Cons: Very platform dependent. Assembly version dependent. It's complicated to handle different versions of the same class. Requires the [Serializable] attribute, which you might not always be able to add.

DataContractSerializer & NetDataContractSerializer

Pro: Passes all the tests, XML based, used in WCF.

Cons: Requires attributes on the class ([DataContract]) and on all members ([DataMember]).
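To make the attribute requirement concrete, here is a minimal sketch of a round trip through DataContractSerializer. The PersonRenamed type and the DataContractExample helper are hypothetical names made up for this illustration; they are not taken from the SerializationTests project. The sketch uses properties with private setters rather than readonly fields, just to keep the example on ground I'm sure about.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical message type, for illustration only.
[DataContract]
public class PersonRenamed
{
    // Every member that should travel with the message needs [DataMember].
    [DataMember] public Guid AggregateId { get; private set; }
    [DataMember] public string Name { get; private set; }

    public PersonRenamed(Guid aggregateId, string name)
    {
        AggregateId = aggregateId;
        Name = name;
    }
}

public static class DataContractExample
{
    // Writes the message as XML into memory, then reads it back.
    public static PersonRenamed RoundTrip(PersonRenamed message)
    {
        var serializer = new DataContractSerializer(typeof(PersonRenamed));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, message);
            stream.Position = 0; // rewind before reading
            return (PersonRenamed)serializer.ReadObject(stream);
        }
    }
}
```

Calling DataContractExample.RoundTrip(new PersonRenamed(Guid.NewGuid(), "Ada")) should hand back a copy with the same AggregateId and Name. Removing a [DataMember] attribute is an easy way to see that member drop out of the output.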

Newtonsoft Json.NET

One of the most common choices when serializing for an AJAX call where the result is deserialized in JavaScript.

Pro: JSON based, human readable, platform independent, fast. Passes almost all the tests, except the ones where the message has public readonly fields that have different names from the constructor parameters.

Cons: Without automated testing, naming a constructor parameter differently from the field can cause hard-to-notice bugs where certain fields are not deserialized. It should, however, be possible to write a deserializer based on the existing one which throws an exception if there are doubts when deserializing.
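One cheap form of automated testing for this is a reflection check that can run in a unit test over all message types. The sketch below is not part of Json.NET; ConstructorNameCheck and MismatchedEvent are hypothetical names, and it assumes the serializer matches constructor parameters to members by name, ignoring case, which is my understanding of how Json.NET behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ConstructorNameCheck
{
    // Returns the names of public-constructor parameters of `type` that have no
    // public instance field or property with the same name (ignoring case).
    public static IList<string> FindUnmatchedParameters(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        var memberNames = new HashSet<string>(
            type.GetFields(flags).Select(f => f.Name)
                .Concat(type.GetProperties(flags).Select(p => p.Name)),
            StringComparer.OrdinalIgnoreCase);

        return type.GetConstructors()
            .SelectMany(c => c.GetParameters())
            .Select(p => p.Name)
            .Where(name => !memberNames.Contains(name))
            .Distinct()
            .ToList();
    }
}

// Same shape as the PersonCreated event above: `id` does not match `AggregateId`.
public class MismatchedEvent
{
    public readonly Guid AggregateId;
    public readonly string Name;

    public MismatchedEvent(Guid id, string name)
    {
        AggregateId = id;
        Name = name;
    }
}
```

For MismatchedEvent, FindUnmatchedParameters returns a list containing only "id". A test that fails whenever the list is non-empty catches the mismatch before any message is ever serialized.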

NServiceBus XMLSerialize

Integrated in NServiceBus. Has caused me a lot of problems when it was silently ignoring some properties.

Pro: XML based, integrated with NSB. Leaving aside the problems I've had with it, it has performed quite well in a few projects. Even if I don't personally love it, it's actively maintained by the NSB group and can be reliably used if you know its limitations.

Cons: You might end up with properties being silently not serialized/deserialized

ProtocolBuffers.NET Serializer

To quote the authors:

Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats.

Pro: Fast, platform independent, compact, throws exceptions when unable to serialize/deserialize, supports versioning.

Cons: Needs custom attributes on the class and members that need to be serialized.

Overall a good choice when size and performance really matter.

ServiceStack JsonSerializer

Pro: It's said to be the fastest JSON serializer.

Cons: Fails a lot of tests - you have to make sure you write your objects in ways supported by the serializer

SoapFormatter

Pro: XML based. Passes all the tests except the one without the [Serializable] attribute.

Cons: The output XML is big.

XmlSerializer

Pro: widely known?

Cons: I must be doing something wrong since it fails a lot of tests.
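A possible explanation, as far as I know: XmlSerializer needs a parameterless constructor and only handles public read/write members, so immutable messages with readonly fields don't fit it well. The sketch below shows a message shaped the way XmlSerializer expects. PersonCreatedXml and XmlSerializerExample are hypothetical names for this illustration, and I have not confirmed that this is the cause of the failures in the test project.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical mutable message, shaped for XmlSerializer.
public class PersonCreatedXml
{
    // Public properties with both getter and setter are serialized.
    public Guid AggregateId { get; set; }
    public string Name { get; set; }

    // XmlSerializer creates the instance through a parameterless constructor.
    public PersonCreatedXml() { }
}

public static class XmlSerializerExample
{
    // Serializes the message to an XML string, then reads it back.
    public static PersonCreatedXml RoundTrip(PersonCreatedXml message)
    {
        var serializer = new XmlSerializer(typeof(PersonCreatedXml));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, message);
            using (var reader = new StringReader(writer.ToString()))
            {
                return (PersonCreatedXml)serializer.Deserialize(reader);
            }
        }
    }
}
```

The price is that the message is no longer immutable, which is exactly the trade-off I was trying to avoid with readonly fields.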

Conclusion

My first conclusion is that when approaching an architecture that relies on serialization, like event sourcing and CQRS, you should carefully plan the way you write and persist the events in your system.

At the moment my choice is Json.NET with a Gzip filter. The main reason is that it has a minimal impact on how I must write events and is efficient in terms of speed and size, yet is still human readable.

I'm hoping that after I let this sink in a little I'll come back and update the conclusions.

Also, I would be glad if others shared their opinions and experiences related to serialization.
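The Gzip part can be sketched with nothing but the framework's GZipStream. The GzipJson helper below is a hypothetical name; it takes the JSON text as a plain string, so it works with whatever the serializer produces, and it is only a minimal illustration of the idea rather than the filter I actually use.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipJson
{
    // Compresses JSON text (encoded as UTF-8) with gzip.
    public static byte[] Compress(string json)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                var bytes = Encoding.UTF8.GetBytes(json);
                gzip.Write(bytes, 0, bytes.Length);
            }
            // The GZipStream is disposed first so that all compressed data
            // has been flushed into the buffer before it is copied.
            return output.ToArray();
        }
    }

    // Reverses Compress and returns the original JSON text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The stored bytes are no longer human readable, but Decompress gives back the exact JSON text, so the events stay inspectable with one extra step.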
