Insertion order of multiple records in Entity Framework


Solution 1

"What I'd like to know is how I can get these inserts to be ordered."

You cannot. The order of database commands is internal EF behavior. If you want to control the order of commands, don't use tools that abstract away low-level database interactions; use SQL directly.

Edit based on comment:

Yes, it is a low-level interaction, because you are placing expectations on the order of SQL commands while working with an abstraction you don't control. At a high level you are getting something different because your expectations don't hold for that abstraction. If you want control over the order of SQL commands, you must either force EF by saving items one by one (=> multiple SaveChanges calls and a TransactionScope) or write the SQL yourself. Otherwise, use a separate column for ordering.

By the way, EF doesn't save the entity as you see it. It has its own change tracker that holds references to all your attached instances. The references are held in multiple Dictionary instances, and a dictionary doesn't preserve insertion order. If these collections are used for generating SQL commands (and I guess they are), no order can be guaranteed.
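To illustrate the point about dictionaries, here is a small sketch in plain C#. It does not use EF and is not EF's actual change tracker; the class and method names are made up for this example. The .NET documentation leaves the enumeration order of Dictionary<TKey, TValue> undefined, so the method only returns the keys in whatever order the runtime happens to produce.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DictionaryOrderDemo
{
    // Adds three keys, removes one, adds a replacement, and returns the
    // keys in whatever order the dictionary enumerates them.
    public static List<int> EnumerateAfterChurn()
    {
        var tracked = new Dictionary<int, string>();
        tracked.Add(1, "first");
        tracked.Add(2, "second");
        tracked.Add(3, "third");

        tracked.Remove(2);
        tracked.Add(4, "fourth"); // added last, but not guaranteed to enumerate last

        return tracked.Keys.ToList();
    }
}
```

Whether key 4 comes out last is an implementation detail of the runtime. Code that needs a defined order has to sort explicitly.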

Solution 2

Tables in the database are sets. That means that the order is not guaranteed. I assume in your example that you want the results ordered by "Number". If that is what you want, what are you going to do if that number changes and it doesn't reflect the order in the database anymore?

If you really want to have the rows inserted in a specific order, multiple SaveChanges are your best bet.

The reason nobody wants to call SaveChanges multiple times is that it feels like exactly what it is: a dirty hack.

Since a primary key is a technical concept, it shouldn't make any functional sense to order your results on this key anyway. You can order the results by a specific field and use a database index for this. You probably won't see the difference in speed.

Making the ordering explicit has other benefits as well: it is easier to understand for the people who have to maintain it. Otherwise the maintainer has to know that ordering on the primary key is important and gives the correct results because, in another, completely unrelated section of your application, it accidentally matches the order of the number field.
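A minimal sketch of ordering by an explicit column instead of the primary key. It uses LINQ to Objects over an in-memory list so that it runs without a database; with EF, the same OrderBy call on context.Items would be translated to an ORDER BY clause. The Row and ExplicitOrdering names are hypothetical, and the sample data mirrors the output shown in the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Row
{
    public int ItemId { get; set; } // database-generated key; its order carries no meaning
    public int Number { get; set; } // explicit ordering column
}

public static class ExplicitOrdering
{
    // Returns the Number values sorted by the ordering column,
    // regardless of the order in which the rows were stored.
    public static List<int> ReadInOrder(IEnumerable<Row> rows)
    {
        return rows.OrderBy(r => r.Number).Select(r => r.Number).ToList();
    }

    // Rows as they ended up in the question: key order differs from Number order.
    public static List<Row> SampleRows()
    {
        return new List<Row>
        {
            new Row { ItemId = 1, Number = 2 },
            new Row { ItemId = 2, Number = 1 },
            new Row { ItemId = 3, Number = 3 },
        };
    }
}
```

Sorting on Number makes the intent visible at the call site, so nothing depends on how the keys happened to be assigned.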

Solution 3

I've found a way to do it. I just thought I'd let you know:

using (var dbContextTransaction = dbContext.Database.BeginTransaction())
{
  dbContext.SomeTables1.Add(object1);
  dbContext.SaveChanges();

  dbContext.SomeTables1.Add(object2);
  dbContext.SaveChanges();

  dbContextTransaction.Commit();
}
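The same add-then-save pattern can be generalized to any number of entities. The sketch below is not an EF API: OrderedSaver and SaveOneByOne are hypothetical names, and the add and save steps are passed in as delegates so the helper has no dependency on a particular EF version.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedSaver
{
    // Adds and saves each entity in turn, so every entity is persisted
    // before the next one is added. Returns the sum of the values
    // returned by the save delegate.
    public static int SaveOneByOne<T>(IEnumerable<T> entities, Action<T> add, Func<int> save)
    {
        int total = 0;
        foreach (T entity in entities)
        {
            add(entity);     // e.g. entity => dbContext.SomeTables1.Add(entity)
            total += save(); // e.g. dbContext.SaveChanges
        }
        return total;
    }
}
```

Inside the transaction above it might be called as OrderedSaver.SaveOneByOne(objects, o => dbContext.SomeTables1.Add(o), dbContext.SaveChanges). Note that this costs one SaveChanges call per entity, which is the price of forcing the order.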

Solution 4

Another way of doing this without a database round trip after each added entry (though heavily dependent on the application logic) is to combine entity state changes.

In my case, a hierarchy of nodes, I had to persist the root nodes first and then the rest of the hierarchy in order for automatic path calculations to work.

So I had root nodes without a parent ID and child nodes with a parent ID.

EF Core scheduled nodes for insertion randomly (or through complex and intelligent logic, as you prefer :), breaking the path calculation procedure.

So I went with overriding the SaveChanges method of the context and inspecting the entities of the set for which I need to maintain a certain order of inserts: detaching any child nodes first, then saving changes, then re-attaching the child nodes and saving changes again.

// requires: using System.Collections.Generic; using System.Linq;
// using Microsoft.EntityFrameworkCore; using Microsoft.EntityFrameworkCore.ChangeTracking;
public override int SaveChanges()
{
    // select child nodes first - these entities should be added last
    List<EntityEntry<NodePathEntity>> addedNonRoots = this.ChangeTracker
        .Entries<NodePathEntity>()
        .Where(e => e.State == EntityState.Added && e.Entity.NodeParentId.HasValue)
        .ToList();

    // select root nodes second - these entities should be added first
    List<EntityEntry<NodePathEntity>> addedRoots = this.ChangeTracker
        .Entries<NodePathEntity>()
        .Where(e => e.State == EntityState.Added && !e.Entity.NodeParentId.HasValue)
        .ToList();

    if (addedRoots.Count > 0 && addedNonRoots.Count > 0)
    {
        // detach child nodes, so they will be ignored on SaveChanges call
        // no database inserts will be generated for them
        addedNonRoots.ForEach(e => e.State = EntityState.Detached);

        // run SaveChanges - since root nodes are still there,
        // in ADDED state, inserts will be executed for these entities
        int detachedRowCount = base.SaveChanges();

        // re-attach child nodes to the context
        addedNonRoots.ForEach(e => e.State = EntityState.Added);

        // run SaveChanges second time, child nodes are saved
        return base.SaveChanges() + detachedRowCount;
    }

    // nothing to reorder - fall back to the default behavior
    return base.SaveChanges();
}

This approach does not let you preserve the order of individual entities, but if you can categorize entities into those that must be inserted first and those that can be inserted later, this hack may help.

Author: Richard

I am a software developer working for a data centre based in Reading, UK

Updated on October 14, 2022

Comments

  • Richard
    Richard over 1 year

    I'm having trouble with EF reordering my inserts when I try and add an entity with multiple children all at once. I've got a 3 level structure with one-to-many relationships between each (Outer 1--* Item 1--* SubItem). If I try and insert a new Outer with Items and Subitems, the Items which contain SubItems end up being inserted first.

    Sample Code (.NET 4.5, EF 5.0.0-rc):

    public class Outer
    {
        public int OuterId { get; set; }
        public virtual IList<Item> Items { get; set; }
    }
    
    public class Item
    {
        public int OuterId { get; set; }
        [ForeignKey("OuterId")]
        public virtual Outer Outer { get; set; }
    
        public int ItemId { get; set; }
        public int Number { get; set; }
    
        public virtual IList<SubItem> SubItems { get; set; }
    }
    
    public class SubItem
    {
        public int SubItemId { get; set; }
    
        [ForeignKey("ItemId")]
        public virtual Item Item { get; set; }
        public int ItemId { get; set; }
    }
    
    public class MyContext : DbContext
    {
        public DbSet<Outer> Outers { get; set; }
        public DbSet<Item> Items { get; set; }
        public DbSet<SubItem> SubItems { get; set; }
    }
    
    class Program
    {
        static void Main(string[] args)
        {
            Database.SetInitializer(new DropCreateDatabaseAlways<MyContext>());
            MyContext context = new MyContext();
    
            // Add an Outer object, with 3 Items, the middle one having a subitem
            Outer outer1 = new Outer { Items = new List<Item>() };
            context.Outers.Add(outer1);
            outer1.Items.Add(new Item { Number = 1, SubItems = new List<SubItem>() });
            outer1.Items.Add(new Item { Number = 2, SubItems = new List<SubItem>(new SubItem[] { new SubItem() }) });
            outer1.Items.Add(new Item { Number = 3, SubItems = new List<SubItem>() });
    
            context.SaveChanges();
    
            // Print the order these have ended up in
            foreach (Item item in context.Items)
            {
                Console.WriteLine("{0}\t{1}", item.ItemId, item.Number);
            }
            // Produces output:
            // 1       2
            // 2       1
            // 3       3
        }
    }
    

    I'm aware of this answer by Alex James which states that inserts may need to be reordered in order to satisfy relational constraints, but that is not the issue here. His answer also mentions that they can't track the order of items in order-preserving structures such as Lists.

    What I'd like to know is how I can get these inserts to be ordered. While I can rely on sorting my inserted items by a field other than the PK, it's a lot more efficient if I can rely on the PK order. I don't really want to have to use multiple SaveChanges calls to accomplish this.

    I'm using EF5 RC, but judging by the other unanswered questions around, this has been around for some time!

  • Richard
    Richard almost 12 years
    My question wasn't whether I should be using the Primary Key to sort - I just want to be able to do it. My example is greatly simplified, the actual implementation is similar to a Logfile - entries are added over time, and never updated or removed. Yes, I can stick an index on another column (Number in this example), but this means I'm using 2 indices to accomplish something which can be done with 1.
  • Richard
    Richard almost 12 years
    I don't regard this as a low level interaction. I'm asking for an ordered set to be stored, it's not doing it. At a high level, I'm getting something different out to what I put in.
  • Jeroen
    Jeroen almost 12 years
    In that case there is nothing you can do except calling SaveChanges multiple times.
  • Ladislav Mrnka
    Ladislav Mrnka almost 12 years
    I added some explanation directly to the answer.
  • Anthony Mason
    Anthony Mason almost 8 years
    There is plenty you can do, but it requires knowledge of the framework. You can handle the saving within a partial class for the context, assign your own change tracker, write a SQL query through the context, tie a sproc directly and have the result set as an object EF generates, or modify the tt file for even more modularity. It sounds like in this case simply overriding SaveChanges() would be enough. Look at the result set from the following, this is in my partial context: IEnumerable<DbEntityEntry> modifiedOrAddedEntities = this.ChangeTracker.Entries().Where(x => x.State == EntityState.Added);
  • Kurtis Jungersen
    Kurtis Jungersen over 2 years
    -1: saving each individual item of a series is an anti-pattern for EF. .AddRange() or AddRangeAsync() should be used instead (source). At n=2 we don't see a performance impact, but at n=1000 there could be a significant impact.