Wednesday, February 6, 2013

LINQ to SQL and Serialization


One of the important things that you will likely have to do at one point or another with entity objects is serialize them. Most commonly this is for Web Services, but it could be for any sort of storage or transport scenario.
There are a number of pitfalls with this scenario, however, primarily because LINQ to SQL will very likely generate circular references in your entity model from your data, and XML Serialization fails outright in that scenario. For example, say you have a Customers and a Projects table. If you let LINQ to SQL generate the one-to-many relationship, it will create a Customer entity with a Projects property and a Project entity with a Customer property.
For code scenarios this is probably a good thing - you want to be able to see all Projects and filter that list in your code.
Unfortunately in a serialization scenario this doesn't work because you essentially have a circular reference: Customer -> Projects -> Customer. So we have a 1 -> Many relationship and a 1 -> 1 relationship going back to the original object. To demonstrate, you can actually do wacky stuff like this:
CustomerEntity entity = Customer.Context.CustomerEntities.Single(c => c.Pk == 1);
string address = entity.Projects.ToArray()[0].Customer.Address;
And while that's a wacky example, this sort of setup makes sense. When you look at a Project you will likely want to know about the Customer that is associated with the Project. And when you're looking at a Customer you'll want to know about all the Projects associated with the Customer. A classic circular reference scenario.
At an abstract level that's fine, but when you want to serialize this arrangement there are problems because the serializer doesn't know what to do with the circular references. So if you run code like this:
// Requires: using System.IO; using System.Linq; using System.Text;
//           using System.Xml; using System.Xml.Serialization;
CustomerEntity entity = Customer.Context.CustomerEntities.Single(c => c.Pk == 1);

MemoryStream ms = new MemoryStream();
XmlTextWriter writer = new XmlTextWriter(ms, new UTF8Encoding());
writer.Formatting = Formatting.Indented;
writer.IndentChar = ' ';
writer.Indentation = 3;

XmlSerializer serializer = new XmlSerializer(typeof(CustomerEntity));

// Throws InvalidOperationException because of the Customer -> Projects -> Customer cycle
serializer.Serialize(writer, entity);
writer.Flush();

// Read the serialized bytes back as a string
string XmlResultString = Encoding.UTF8.GetString(ms.ToArray());
You'll get:
A circular reference was detected while serializing an object of type TimeTrakker.CustomerEntity.
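The failure doesn't depend on LINQ to SQL itself. Here's a minimal sketch that should reproduce it with nothing but XmlSerializer, assuming two hand-written stand-in classes (Customer, Project and CircularDemo are hypothetical names, not the generated entities):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for the generated entities; not LINQ to SQL output.
public class Customer
{
    public int Pk { get; set; }
    public string Company { get; set; }
    public List<Project> Projects { get; set; } = new List<Project>();
}

public class Project
{
    public int Pk { get; set; }
    public string ProjectName { get; set; }
    public Customer Customer { get; set; }   // back-reference closes the cycle
}

public static class CircularDemo
{
    // Returns the serializer's error message, or null if serialization succeeded.
    public static string TrySerialize(Customer customer)
    {
        var serializer = new XmlSerializer(typeof(Customer));
        try
        {
            using (var writer = new StringWriter())
            {
                serializer.Serialize(writer, customer);
            }
            return null;
        }
        catch (InvalidOperationException ex)
        {
            // The circular reference detail is reported on the inner exception.
            return (ex.InnerException ?? ex).Message;
        }
    }

    public static void Main()
    {
        var customer = new Customer { Pk = 1, Company = "West Wind" };
        var project = new Project { Pk = 10, ProjectName = "Time Trakker", Customer = customer };
        customer.Projects.Add(project);   // Customer -> Projects -> Customer

        Console.WriteLine(TrySerialize(customer) ?? "Serialized without error");
    }
}
```

A Customer with an empty Projects list serializes fine in this sketch; it's the back-reference from Project to Customer that trips the serializer.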
There are a couple of ways to get around this, but they're not exactly pretty as they involve changing the model.
So when you think about the relationship between Customers and Projects, for example, is it really appropriate to have a Projects property in your CustomerEntity? By itself this property doesn't really represent anything useful, especially if the system grows and there are lots of projects. So for serialization purposes you really wouldn't want to see this relationship.
So you can actually hide this relationship from serialization by marking the child property as internal. On the other hand, the ProjectEntity probably should have a CustomerEntity and that should always be visible because it's a 1-1 relationship, you'll likely need that data, and it would likely be OK to serialize. So that relationship stays public.
You can also set the ChildProperty option to false to completely cut off the relationship through the entity. Using Internal still makes the relationship available in the middle tier (if you use one, that is), but hides it from XmlSerialization (which works off public properties only).
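To illustrate the effect of the internal flag, here's a rough sketch with hand-written classes (hypothetical stand-ins; the real entities come from the designer). The cycle still exists in memory, but XmlSerializer only looks at public members, so it never walks back into Projects:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for the generated entities.
public class Customer
{
    public int Pk { get; set; }
    public string Company { get; set; }

    // internal: usable inside this assembly, not visible to XmlSerializer
    internal List<Project> Projects { get; } = new List<Project>();
}

public class Project
{
    public int Pk { get; set; }
    public string ProjectName { get; set; }
    public Customer Customer { get; set; }   // the 1-1 side stays public
}

public static class InternalPropertyDemo
{
    public static string ToXml(Project project)
    {
        var serializer = new XmlSerializer(typeof(Project));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, project);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var customer = new Customer { Pk = 1, Company = "West Wind" };
        var project = new Project { Pk = 10, ProjectName = "Time Trakker", Customer = customer };
        customer.Projects.Add(project);   // circular in memory, hidden from the serializer

        Console.WriteLine(ToXml(project));
    }
}
```

The resulting XML should contain the Project with its nested Customer, and no Projects element.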
As I said, this is not a great workaround, because by removing this relationship you're also removing the ability to run entity relationship queries through LINQ. So by doing the above, this sort of thing will no longer work:
CustomerEntity entity = Customer.Context.CustomerEntities.Single(c => c.Pk == 1);            
 
var query2 = from c in Customer.Context.CustomerEntities  
             from p in c.Projects
             where p.CustomerPk == 1
             select new { c.Company, p.ProjectName};
 
this.gdList.DataSource = query2;
this.gdList.DataBind();
If you remove the relationship or set it to internal (and you're working outside of the business object), you lose the implicit relationship and you have to define it explicitly like this:
var query2 = from c in Customer.Context.CustomerEntities  
             join p in Customer.Context.ProjectEntities on c.Pk equals p.CustomerPk
             select new { c.Company, p.ProjectName};
 
this.gdList.DataSource = query2;
this.gdList.DataBind();
 
return;
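To compare the two query shapes side by side, here's a small sketch that runs both against in-memory lists. That's LINQ to Objects rather than LINQ to SQL, so no SQL translation is involved, and the classes and the JoinDemo helper are hypothetical stand-ins. Unlike the join above, this version keeps the CustomerPk filter so that both queries return the same rows:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the two tables.
public class Customer
{
    public int Pk;
    public string Company;
    public List<Project> Projects = new List<Project>();
}

public class Project
{
    public int Pk;
    public int CustomerPk;
    public string ProjectName;
}

public static class JoinDemo
{
    // Uses the child collection, i.e. the relationship property.
    public static List<string> ViaRelationship(IEnumerable<Customer> customers)
    {
        var query = from c in customers
                    from p in c.Projects
                    where p.CustomerPk == 1
                    select c.Company + ": " + p.ProjectName;
        return query.ToList();
    }

    // Spells the relationship out with an explicit join on the key fields.
    public static List<string> ViaJoin(IEnumerable<Customer> customers, IEnumerable<Project> projects)
    {
        var query = from c in customers
                    join p in projects on c.Pk equals p.CustomerPk
                    where p.CustomerPk == 1
                    select c.Company + ": " + p.ProjectName;
        return query.ToList();
    }

    public static void Main()
    {
        var customer = new Customer { Pk = 1, Company = "West Wind" };
        var project = new Project { Pk = 10, CustomerPk = 1, ProjectName = "Time Trakker" };
        customer.Projects.Add(project);

        foreach (var line in ViaRelationship(new[] { customer }))
            Console.WriteLine(line);
        foreach (var line in ViaJoin(new[] { customer }, new[] { project }))
            Console.WriteLine(line);
    }
}
```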
So the above workaround of turning off the relationship mapping might get you by in SOME scenarios. Flagging the property as internal lets you hack around the problem by making it essentially invisible to XmlSerialization. However, that's not going to work for Binary Serialization, and it is also going to be a problem for WCF based Web Services that don't use the basicHttpBinding: all other formats use lower level variations of binary serialization that work off field state rather than property values.
There's a solution for WCF, however: the Entity designer has a Serialization option that generates the WCF [DataContract] and [DataMember] attributes on the generated entities.
This option controls only whether the entities are generated with a WCF [DataContract] attribute in the classes:
[Table(Name="dbo.Customers")]
[DataContract()]
public partial class CustomerEntity : INotifyPropertyChanging, INotifyPropertyChanged
and for each property:
[Column(Storage="_LastName", DbType="NVarChar(50)", UpdateCheck=UpdateCheck.Never)]
[DataMember(Order=4)]
public string LastName
If you specify a non-public access level (Internal, Protected or Private) for the relationship (or any property for that matter), the [DataMember] attribute is not generated. This gives you pretty good control over WCF serialization as well as XML Serialization and ASMX Web Services.
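The opt-in behavior can be sketched with DataContractSerializer directly. The classes below are hand-written approximations of what the designer might emit (hypothetical, not actual generated code). Only members carrying [DataMember] are written out, so the unmarked Projects collection is skipped:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hand-written approximation of designer output; names are assumptions.
[DataContract]
public class Customer
{
    [DataMember(Order = 1)] public int Pk { get; set; }
    [DataMember(Order = 2)] public string Company { get; set; }

    // No [DataMember] here, so the data contract serializer ignores it.
    internal List<Project> Projects { get; } = new List<Project>();
}

[DataContract]
public class Project
{
    [DataMember(Order = 1)] public int Pk { get; set; }
    [DataMember(Order = 2)] public string ProjectName { get; set; }
    [DataMember(Order = 3)] public Customer Customer { get; set; }
}

public static class DataContractDemo
{
    public static string ToXml(Project project)
    {
        var serializer = new DataContractSerializer(typeof(Project));
        using (var ms = new MemoryStream())
        {
            serializer.WriteObject(ms, project);
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }

    public static void Main()
    {
        var customer = new Customer { Pk = 1, Company = "West Wind" };
        var project = new Project { Pk = 10, ProjectName = "Time Trakker", Customer = customer };
        customer.Projects.Add(project);   // cycle exists, but is not part of the contract

        Console.WriteLine(ToXml(project));
    }
}
```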
For Binary Serialization that's not going to do anything however - but then again there shouldn't be a lot of need for binary serialization anymore with WCF filling that niche in most places.
Serialization - a Special Scenario
All of this also got me thinking about these object relationships. In code, having a Projects property is useful because you can apply LINQ filter expressions against it. But when you persist via serialization you don't get that flexibility. So if you were to serialize, you would get ALL of the child entities serialized, which is probably not at all what you want. For example, in the simple time tracking app I'm using as my playground, I have a customer entity with individual entries. I really don't want to serialize ALL entries. I MIGHT want to serialize a handful of new entries, but certainly not ALL of them once the application has been running for a few months.
The problem now is that we have a logical AND a physical relationship in this scenario. As a non-parameterized view, the Projects property doesn't make any sense in a Customer object. However, an Entries property on an Invoice object might make perfect sense, so in some cases it's right to expose the relationship directly. But if you decide not to express the relationship explicitly between, say, Customer and Projects, you lose some of the cool functionality that LINQ provides by automatically understanding relationships.
I'm not sure there's a solution to this. I think this same problem also exists with the ADO.NET Entity Framework (which uses a very similar approach although it's even more implicit about how relationships are defined by explicitly removing key fields).
Maybe some of the folks who've been doing OR/M development for some time with other tools like NHibernate can chime in. This is a thorny problem.
It seems to me that LINQ to SQL could actually solve these issues quite easily by providing a few more properties to set on entities and relationships, such as a Serialized property on relations and explicit serialization attributes on the entities. But even as it is, it seems you can bend LINQ to SQL to do what you need.