Problems with the Entity Framework, and how to overcome them

I am not a signatory of the Entity Framework Vote of No Confidence. Not because I have any confidence in the tool, but rather because my concerns are different. The concerns expressed in the VoNC are:

  • Focus on data over behavior
  • Lack of lazy loading
  • Shared canonical model
  • Lack of persistence ignorance
  • Excessive merge conflicts

These are valid concerns coming mostly from a community that uses ORM tools. I am not an ORM advocate, so some of these concerns mean little to me. In particular, persistence ignorance, while a cornerstone of ORM culture, has been of limited value in my experience. I've never had the need to port a data model unchanged into a different database. The database always imposes its constraints on the domain model. (Or maybe I just don't understand what PI really means.)

My concerns, on the other hand, are born out of trying to use EF on an enterprise project. My team has settled on a way of using it that works for us, so we will continue to use it. The problems that we've had to overcome are:

  • Tight coupling to the database schema
  • Lack of versioning support
  • Lack of visibility
  • Confusion between ownership and association

Tight coupling to the database schema
The easiest way to create an entity model is to reverse-engineer it from the database. The tool will create an entity type for each table that you select, and a relationship for each foreign key. It will even turn a two-column associative table into a many-to-many relationship. For simple schemas, this works pretty well.

There are many problems with taking this approach, however. The first is that foreign keys are properties of tables only, not of views. The import creates an entity for each selected view, but cannot create relationships between them. Relationships can be created manually, so this is easily overcome. However, this shortcoming encourages the use of tables over views, removing one useful layer of indirection.

We will reverse-engineer up to a point, then switch to views under the covers. Once that switch is made, we no longer refresh the EDMX from the database. This practice retains the layer of abstraction that we gain from views, but doesn't completely defeat the usefulness of the tool.

Lack of versioning support
We are building a service-oriented application. There are problems with SOA that I won't get into right now, but the biggest problem has been breaking changes in the service contract.

Entity Framework adds the [DataContract] attribute to all of the entities. This allows you to return an entity from a WCF service. If Microsoft made it this easy, it must be the right thing to do, right?

The problem is that there is just one version of each entity type, and therefore of each service signature. If someone adds a column to a table, the client proxy needs to be regenerated. Every change to the database schema is a breaking change to the clients. And there is no way to keep old versions of the service calls while simultaneously adding new versions. This is painful in development. This would be death to production.

The solution is to ignore the help that Microsoft gives you. Don't use EF entities in WCF service contracts. Yes, this means that you have to write left-hand-right-hand code, but at least it's completely under your control. You choose when to define new data types, and you can version them according to your own needs.
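
To make the left-hand-right-hand idea concrete, here is a minimal sketch of mapping one entity onto two separately versioned contract types. It is written in Python only to keep it short and self-contained; in a real WCF project these would be C# classes. Every name in it (CustomerEntity, CustomerV1, CustomerV2, the mapping functions) is hypothetical and not part of EF or WCF.

```python
from dataclasses import dataclass, asdict


# Hypothetical stand-in for a generated entity: its shape follows the table.
@dataclass
class CustomerEntity:
    customer_id: int
    name: str
    loyalty_tier: str  # imagine this column was added after V1 shipped


# Version 1 of the contract type. Once clients depend on it, it never changes.
@dataclass(frozen=True)
class CustomerV1:
    id: int
    name: str


# Version 2 exposes the new column. V1 is left exactly as it was.
@dataclass(frozen=True)
class CustomerV2:
    id: int
    name: str
    loyalty_tier: str


def to_customer_v1(entity: CustomerEntity) -> CustomerV1:
    """Hand-written mapping: copy only the fields V1 clients know about."""
    return CustomerV1(id=entity.customer_id, name=entity.name)


def to_customer_v2(entity: CustomerEntity) -> CustomerV2:
    """Hand-written mapping for the newer contract."""
    return CustomerV2(
        id=entity.customer_id,
        name=entity.name,
        loyalty_tier=entity.loyalty_tier,
    )


entity = CustomerEntity(customer_id=42, name="Acme", loyalty_tier="gold")
v1 = to_customer_v1(entity)  # what an old client receives
v2 = to_customer_v2(entity)  # what a new client receives
```

Adding the column changed the entity, but the V1 contract did not move, so old clients keep working while new clients opt in to V2.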

Lack of visibility
We have some talented DBAs on the team. They are the reason that I'm not much of an ORM advocate. I see the job that they do, and I don't want my developers to be responsible for it. They think about the health and design of the data. Every day. If you use an ORM, you either ask your developers to be just as concerned, or you get accidental data access. Frankly, my developers have too much to think about without also caring as deeply about the data as my DBAs do.

So since I lean so heavily upon my DBAs, I want to give them all of the information they need. One thing they ask for is a list of queries that the application will be performing. They need this in order to ensure that the tables are properly indexed. Extracting this information from EF is cumbersome. The best way I've discovered is to run SQL Profiler and start the integration tests.

Another thing they ask for is a list of tables and views that a particular build of the application requires. They need this to coordinate database releases with application releases. We don't bring down the whole datacenter in order to release. Database changes are rolled out on a live system in advance of application changes. They are applied in a way that will not break the previous application. Then, when the database is upgraded, the app servers are individually taken off-line and upgraded. During this process, there is zero down-time. Part of the datacenter is on the previous version, while the other part is on the new version. Entity Framework hides the dependency of code upon schema, and gives us absolutely no help in planning for this dance.

Confusion between ownership and association
I observe five axioms of software behavior. One of them is ownership. An object has exactly one owner. It's a parent-child relationship. If the child is removed from the parent, it is deleted.

Ownership is different from association. An association is just a reference. That reference can be changed without deleting the object.

In a relational database, ownership is manifested as a foreign key that you cannot change. A child is added to a parent when it is inserted, and removed when it is deleted. A one-to-one or one-to-many association is also a foreign key, but you are allowed to change it. A many-to-many association is represented as an associative table.
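
The three shapes described above can be sketched in a small schema. This example uses Python's built-in sqlite3 module purely so that it runs anywhere; the table names are hypothetical, and the trigger is only one possible way to enforce "a foreign key that you cannot change" (an assumption for illustration, not a description of how any particular team does it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.executescript("""
CREATE TABLE sales_rep (rep_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, label TEXT NOT NULL);

-- Association: an order refers to a sales rep, and the reference may change.
CREATE TABLE purchase_order (
    order_id INTEGER PRIMARY KEY,
    rep_id   INTEGER REFERENCES sales_rep(rep_id)
);

-- Ownership: a line belongs to exactly one order for its whole life.
CREATE TABLE order_line (
    line_id  INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES purchase_order(order_id),
    sku      TEXT NOT NULL
);

-- Assumed enforcement: refuse any attempt to move a line to another order.
CREATE TRIGGER order_line_owner_is_fixed
BEFORE UPDATE OF order_id ON order_line
BEGIN
    SELECT RAISE(ABORT, 'order_line.order_id cannot change');
END;

-- Many-to-many association: an associative table.
CREATE TABLE order_tag (
    order_id INTEGER NOT NULL REFERENCES purchase_order(order_id),
    tag_id   INTEGER NOT NULL REFERENCES tag(tag_id),
    PRIMARY KEY (order_id, tag_id)
);

INSERT INTO sales_rep VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO purchase_order VALUES (10, 1), (11, 1);
INSERT INTO order_line VALUES (100, 10, 'WIDGET');
""")

# Changing an association is an ordinary update.
conn.execute("UPDATE purchase_order SET rep_id = 2 WHERE order_id = 10")

# Changing ownership is refused by the trigger.
try:
    conn.execute("UPDATE order_line SET order_id = 11 WHERE line_id = 100")
    owner_changed = True
except sqlite3.DatabaseError:
    owner_changed = False

# Removing a child from its parent means deleting the child row.
conn.execute("DELETE FROM order_line WHERE line_id = 100")
remaining_lines = conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]
```

The point of the sketch is that the schema itself says which foreign keys are ownership and which are association, which is exactly the distinction a navigation property loses.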

Entity Framework only has navigation properties. It makes no distinction between ownership and association. In fact, it treats all navigation properties as associations, and does not do the right thing with parent-child relationships. Removing an object from a navigation property does not delete it, even if the parent is supposed to be the owner. Instead, it throws this exception:

A relationship is being added or deleted from an AssociationSet '***'. With cardinality constraints, a corresponding '***' must also be added or deleted.

You have to delete the child object from the container (the object context), not remove it from its owner's navigation property.

The way you code with objects is different from the way you code with relational databases. Entity Framework tries to hide these differences behind a layer of abstraction, but it fails completely. Not being an ORM user, I don't know the extent to which other frameworks succeed. But when I'm working with a relational database, I prefer to code relationally. It's really not that bad.

So when using Entity Framework in an enterprise environment:

  • Don't completely rely upon reverse-engineering the model.
  • Don't expose entities through WCF services.
  • Use SQL Profiler to see what EF is doing.
  • Know the difference between ownership and association, even if the tool doesn't.
