Entity Framework 6 inserting duplicate values

13,244

Solution 1

From EF point of view two entities are same if they are pointing to same row in database. I.e. two entities should have same non-zero keys.

If you want to have only one Genre entity with name "rock", then you should add exactly same genre entity to second artist genres collection or you can have two entities, but they should have same non-zero ids. I suppose you have some Add extension method which creates new genre and adds it to artist's genres:

public static void Add(this ICollection<Genre> genres, string name)
{
    genres.Add(new Genre { Name = name });
}

This will create independent instances of genres each time you call this method. Thus ids of created entities will be equal to zero, EF will treat them as different entities. E.g.

 a1.Genres.Add(new Genre { Name = "rock" });
 a1.Genres.Add(new Genre { Name = "rock" });

During saving changes EF will find two objects in genres collection. EF will check entities ids and generate appropriate SQL queries. If id is zero, it will generate INSERT query. For non-zero id EF will generate UPDATE query. In this case you will have two inserts (a little simplified - see comment below). How to fix that? You can use exactly same genre entity for both artists:

var rock = new Genre { Name = "rock" };
var a1 = new Artist { Name = "a1" };
a1.Genres.Add(rock);
var a2 = new Artist { Name = "a2" };
a2.Genres.Add(rock);

If you don't want to insert new 'rock' row to database, then you can use existing one instead of creating new:

var rock = db.Genres.FirstOrDefault(g => g.Name == "rock") ?? new Genre { Name = "rock" };

Solution 2

In the classes you linked Artist to Genre and then in the code you added 2 Genre using Name field.

If you had done, you would have remained in your artist and added 2

a1.Add("rock");
a1.Add("rock");

Solution 3

I think there are two possible reason that doesnt work:

  1. You are commiting just once and at the end of the code block. Thus, EF adds all genres as a new one.

  2. Your Base class PersistenceEntity may not include Id property that is Key. And your Add method accepts PersistenceEntity main class. This can effect all object as a new one.

Share:
13,244
xeraphim
Author by

xeraphim

SOreadytohelp

Updated on July 25, 2022

Comments

  • xeraphim
    xeraphim almost 2 years

    I have following two entities:

    public class Artist
    {
        [Key]
        public string ArtistId { get; set; }
        public string Name { get; set; }
    
        public virtual ICollection<Genre> Genres { get; set; }
    }
    
    public class Genre
    {
        [Key]
        public int GenreId { get; set; }
        public string Name { get; set; }
    
        public virtual ICollection<Artist> Artist { get; set; }
    }
    

    In my program I create some artists and want to save them:

    using (var context = new ArtistContext())
    {
        var artists = _fullArtists.Select(x => x.Artist);
    
        foreach (var artist in artists)
        {
            context.Artists.AddOrUpdate(artist);
        }
    
        context.SaveChanges();
    }
    

    Entity Framework correctly created the three tables:

    Artist (ArtistId, Name)
    Genre (GenreId, Name)
    ArtistGenre (ArtistId, GenreId)

    But unfortunately, when my artist sample data look like this:

    var a1 = new Artist { Name = "a1" };
    a1.Genres.Add(new Genre { Name="rock"});
    var a2 = new Artist { Name = "a2" };
    a2.Genres.Add(new Genre { Name="rock"});
    

    It will create 2 records in table Genre:

    IdName
    rock   
    rock   

    instead of creating it once and then reuse it.

    • Do you know if this is a configuration problem or how to tell EF to not insert duplicates and reuse existing instead?

    Thanks in advance


    Edit: Unfortunately, the solution of Sergey Berezovskiy didn't work (or probably i did something wrong :D)

    What I have now is the following:

    using (var workUnit = WorkUnitFactory.CreateWorkUnit())
    {
        var snapShot = new Snapshot { Date = DateTime.Now.Date };
    
         //ICollection<Genre> genres = _fullArtists.Select(x => x.ToArtist(snapShot)).SelectMany(x => x.Genres).Distinct(new GenreComparer()).ToList();
         //workUnit.Repository<IGenreRepository>().InsertEntities(genres);
         //workUnit.Commit();
    
         var artists = _fullArtists.Select(x => x.ToArtist(snapShot)).ToList();
    
         workUnit.Repository<IArtistRepository>().InsertEntities(artists);
         workUnit.Commit();
    }
    

    ArtistExtensions.cs

        public static class ArtistExtensions
        {
            public static Artist ToArtist(this FullArtistWrapper value, Snapshot snapShot)
            {
                if (value == null) { throw new ArgumentNullException(); }
    
                var artist = new Artist
                {
                    ArtistId = value.Id,
                    Name = value.Name
                };
    
                var genres = value.Genres.Select(x => x.ToGenre()).ToList();
                artist.Genres.AddRange(genres);
    
                return artist;
            }
        }
    

    GenreExtensions.cs

    public static class GenreExtensions
        {
            public static Genre ToGenre(this string value)
            {
                using (var workUnit = WorkUnitFactory.CreateWorkUnit())
                {
                    return workUnit.Repository<IGenreRepository>().GetGenres().FirstOrDefault(x => x.Name == value) ??
                                new Genre {Name = value};
                }
            }
        }
    

    Unfortunately, EF still inserts duplicate Genres in the database.

    InsertEntities(...) is this:

    public void InsertEntities<TPersistentEntity>(ICollection<TPersistentEntity> persistentEntitiesToBeInserted) where TPersistentEntity : PersistenceEntity
        {
            persistentEntitiesToBeInserted.ForEach(this.Add);
        }
    
    public void Add(PersistenceEntity entity)
        {
            var dbSet = this.Context.Set(entity.GetType());
            dbSet.Add(entity);
        }
    
    • Did I misunderstood the answer of Sergey or what could be another reason that EF still inserts duplicates?

    Thanks again

  • xeraphim
    xeraphim about 8 years
    thanks alot for your very comprehensible answer! :-)
  • Sergey Berezovskiy
    Sergey Berezovskiy about 8 years
    @xeraphim btw I have simplified a little the process of query generation - EF actually tracks state of entities. And new entities will have Detached state. After you add entity to some dbset it will change it's state to Added. State of entity is what EF actually looking at when generating queries. You can manually change state and set it to Modified even if entity has zero ID. And EF will generate UPDATE query. But it will fail with concurrency exception when you'll try to save changes :)
  • xeraphim
    xeraphim about 8 years
    I've tried to implement your answer but unfortunately it didn't work like i hoped it did :-( I've added an edit to my original post.. would you mind to look at it again and see if I misunderstood your answer and did something wrong? thanks alot :-)
  • Sergey Berezovskiy
    Sergey Berezovskiy about 8 years
    @xeraphim please do not edit question with follow-up questions - it's better to create new small questions instead. But this time I'll help you. Issue is workUnit.Commit() which is called AFTER you process all artists. So if you don't have some genre in db, you will create N times new Genre entity and assign new objects for each artist. You need to save changes when you are creating new genre. EF will assign ID to that entity and you will be able to add same genre entity to each artist.
  • xeraphim
    xeraphim about 8 years
    oh sorry didn't know that, next time I'll create a new question :-) that's what I thought too which is why I tried to first save all the genres (ICollection<Genre> genres = _fullArtists.Select(x => x.ToArtist(snapShot)).SelectMany(x => x.Genres).Distinct(new GenreComparer()).ToList(); workUnit.Repository<IGenreRepository>().InsertEntities(genre‌​s); workUnit.Commit();) and then creating the Artists in .ToArtist() but this still added duplicate genres. Do I have to save all genres, then save all artists, then add the relationships to the artist and save again?
  • Sergey Berezovskiy
    Sergey Berezovskiy about 8 years
    @xeraphim you can save genres, and then save artists with genres (relation will be saved automatically). Or you can make sure that you are using same 'rock' genre instance for all artists. I.e. you should use new Genre only once for each genre. You can achieve that by adding new genre entity when you creating it (e.g. db.Genres.Add(new Genre("rock"))) and when you next time look for 'rock' you should also check if entity is in change tracker (e.g. db.ChangeTracker.Entries<Genre>().FirstOrDefault(e => e.Entity.Name == "rock"))
  • xeraphim
    xeraphim about 8 years
    thanks I'll try this when I get home :-) thanks alot Sergey!
  • xeraphim
    xeraphim about 8 years
    You're right, PersistenceEntity is just an empty abstract class. I'll try to add the Id property and define it as key when I get home, thanks!
  • Engineert
    Engineert about 8 years
    Adding Id property as a key to main entity class provides a mechanism to filter entities. Your main goal should be to pass parameter T like Add( T entity ) instead of Add(PersistenceEntity entity).