C++ - Fastest way to add an item to a sorted array

14,787

Solution 1

the solution is to rewrite your code to use the stl, I don't understand why people write C code in C++.

You need a vector of User

std::vector<User> users;
//then you can keep it ordered at each insertion
auto it = upper_bound(users.begin(), users.end(), user_to_insert, 
    [](auto& lhs, auto& rhs ) { /* implementation left to the reader */});
users.insert(it, user_to_insert);

You now have the same functionality in a much nicer and clean way

Solution 2

Reinventing the wheel is fine if you want to learn how to code binary search, otherwise reusing is better.

std::lower_bound performs a binary search on a sorted range [first, last), returning an iterator to the searched element x if already present; otherwise the iterator would be pointing to the first element greater than x. Since standard containers' exposing an insert would insert before the iterator, this iterator can be used as-is. Here's a simple example.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main()
{
    std::list<int> data = { 1, 5, 7, 8, 12, 34, 52 };

    auto loc = std::lower_bound(data.begin(), data.end(), 10);
    // you may insert 10 here using loc
    std::cout << *loc << '\n';

    loc = std::lower_bound(data.begin(), data.end(), 12);
    // you may skip inserting 12 since it is in the list (OR)
    // insert it if you need to; it'd go before the current 12
    std::cout << *loc << '\n';
}

12

12

Solution 3

Easy , direct method cause binary searching is too mainstream. Just need a few lines:

int where_to_add(int array[], int element)
{
    int i;
    for (i = length; i >= 0 && array[i-1] > element; i--);
    return i;
}

Let me know if this is the answer you were looking for

Solution 4

Binary search will be of limited interest, as you will need to insert anyway and this will remain a time consuming operation (O(N)). So your first idea of a linear search followed by insertion is good enough; you can combine in a single backward loop. (This is a step of StraightInsertionSort.)

The truly efficient ways to handle dynamic sorted lists are by maintaining a balanced tree or using a hash table.

Solution 5

If you're sorting a sorted list with only a few new out of place trailing items then you should take advantage of the rare case in which insertion sort actually works efficiently. Implementing insertion sort on a sorted list with only a few trailing out of place values can sort in O(n) time. You're just inserting your few out of place values into place, while quick sort is picking a pivot and going through the entire quick sort process. Also, if you're not incorporating some type of efficient pivot selection process into your quick sort, and going with some "average of first 3 items" approach on an already sorted list you're going to be sorting in O(n^2) time.

Share:
14,787
Dan
Author by

Dan

Updated on June 26, 2022

Comments

  • Dan
    Dan about 2 years

    I've got a database with approximately 200 000 items, which is sorted by username. Now when I add an item to end of array and call my quick sort function to sort that array it takes almost a second to sort, which is not acceptable. There are definitely quite some optimisations that can be done. For example if I sequentially compare each string from n-1 to 0, and then move items accordingly performance is much greater.

    Other idea is that I could perform binary search from 0 to n-1, well not infact search, but something similar to take advantage of my already sorted array. However I've failed to write a proper function that would return an index where my new element should be placed.

    void quick_sort(int left, int right)
    {
        int i = left, j = right;
        if (left >= right) return;
        char  pivotC[128];
        DataEntry *tmp;
    
        strcpy_a(pivotC, sizeof pivotC, User[(left + right) / 2]->username);
    
        while (i <= j)
        {
            while (StringCompare(User[i]->username, pivotC))
                i++;
            while (StringCompare(pivotC, User[j]->username))
                j--;
            if (i <= j) 
            {
                tmp = User[i];
                User[i] = User[j];
                User[j] = tmp;
                i++;
                j--;
            }
        }
        if (left < j)
            quick_sort(left, j);
        if (i < right)
            quick_sort(i, right);
    }
    

    Any help is greatly appreciated.

  • oknsnl
    oknsnl over 9 years
    right(r) left(l) middle(m) container(c) t(the object ve find its place) And it returns the position of correct place u push that object
  • Sebastian Redl
    Sebastian Redl over 9 years
    The predicate needs to take two parameters.
  • Sebastian Redl
    Sebastian Redl over 9 years
    Also, I believe that you need to use upper_bound. insert inserts before the iterator, so you need the next element after the theoretical insert location.
  • dau_sama
    dau_sama over 9 years
    yeah, or reverse the comparison in the lambda, I guess upper_bound is the cleanest however
  • legends2k
    legends2k over 9 years
    @SebastianRedl In a list { 1, 4, 5 } trying to insert 4 (again) using lower_bound will lead to { 1, (new) 4, 4, 5 }; if we'd used uppoer_bound we'd have { 1, 4, (new) 4, 5 }. How does it make a difference? In fact using lower_bound on may skip insertion knowing the element already exists. Using upper_bound we'd've to make a -- on the iterator to verify if the element already exists.
  • Sebastian Redl
    Sebastian Redl over 9 years
    @legends2k You are right, I somehow thought that lower_bound({ 1, 3, 4 }, 2) would return the iterator pointing to 1. That's of course wrong, so lower_bound works, and as you pointed out, is better if you want to eliminate duplicates. (If, on the other hand, you have lots and lots of duplicates, upper_bound is a tiny bit faster since it has to move fewer elements.)
  • legends2k
    legends2k over 9 years
    upper_bound [...] has to move fewer elements How? Both would look for x, then upper_bound would've to move across all the dupes of x to point to x + 1 (i.e. element greater than x) so still lower_bound is better, no? Or perhaps equal since lower_bound has to work too to get to the beginning of xs.
  • mur
    mur over 5 years
    "I don't understand why people write C code in C++" - C is a lot easier and no magic, c++ loads of magic. raw pointers vs smart pointers, easiest decision of my life, raw - thug life.