How can I merge two sequences in Clojure?


Solution 1

I think andih's solution works great. Here is an alternative, because why not. It uses concat to join the sequences and distinct to drop the duplicates:

user> (distinct (concat '(1 2 3) '(2 3 4)))
=> (1 2 3 4)
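
As a minimal sketch of the same idea, the combination can be wrapped in a small helper that accepts any number of collections. The name merge-distinct is hypothetical, chosen only for illustration. Note that distinct keeps the first occurrence of each element, so the result follows the order of the inputs rather than being sorted:

```clojure
;; merge-distinct is a hypothetical helper name, not part of clojure.core.
(defn merge-distinct
  "Concatenates all colls and removes duplicates, keeping first occurrences."
  [& colls]
  (distinct (apply concat colls)))

(merge-distinct '(1 2 3) '(2 3 4))   ;; => (1 2 3 4)
(merge-distinct [3 1 2] '(2 5) [1])  ;; => (3 1 2 5)
```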

Solution 2

If what you want is actually distinct unsorted data (sets), you should be using Clojure's set data structure instead of vectors or lists. And as andih suggested indirectly, there is a core library for set operations: http://clojure.github.com/clojure/clojure.set-api.html

(require '[clojure.set :refer [union]])

(union #{1 2 3} #{3 4 5})
=> #{1 2 3 4 5}

If sets are for whatever reason not what you want, then read on. Be careful with concat when your sequences hold a significant amount of data, and consider using into, which is much better optimized for merging vectors. I don't know why concat is not implemented using into (or, better yet, why concat exists at all). By the way, while into is significantly faster than concat, it is still far slower than conj. Bagwell's RRB trees, compatible with both Clojure and Scala, will solve this problem, but were not yet implemented for Clojure at the time of writing.

To rephrase Omri's non-set solution in terms of 'into':

(distinct (into [1 2 3] [3 4 5]))
=> (1 2 3 4 5)
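
One thing worth checking before switching to into: it adds elements with conj, so the type and order of the result follow the target collection. The sketch below illustrates this with the same data. The last form assumes Clojure 1.7 or later, where into accepts a transducer and cat and the zero-argument (distinct) are available:

```clojure
;; A vector target appends at the end, so the input order is kept.
(into [1 2 3] [3 4 5])    ;; => [1 2 3 3 4 5]

;; A list target prepends, so the added elements come out reversed.
(into '(1 2 3) '(3 4 5))  ;; => (5 4 3 1 2 3)

;; Assumes Clojure 1.7+: concatenate and de-duplicate in one pass,
;; producing a vector instead of a lazy seq.
(into [] (comp cat (distinct)) [[1 2 3] [3 4 5]])  ;; => [1 2 3 4 5]
```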

Solution 3

One way to get the union of two lists is to use clojure.set/union:

Clojure> (into #{} (clojure.set/union '(1,2,3) '(3,4,5)))
#{1 2 3 4 5}

or, if you want a list (the element order follows the set's iteration order, which is not guaranteed):

(into '() (into #{} (clojure.set/union '(1,2,3) '(3,4,5))))
(5 4 3 2 1)
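
Since the clojure.set functions are designed for set arguments, a more defensive variant converts each list to a set before calling union. This is only a sketch; the alias set is an arbitrary choice, and sort is added here just to get a predictable order:

```clojure
(require '[clojure.set :as set])

;; Convert each list to a set first, then take the union.
(set/union (set '(1 2 3)) (set '(3 4 5)))
;; => a set containing 1 2 3 4 5 (printed order may vary)

;; Sort the result if a predictable order is needed.
(sort (set/union (set '(1 2 3)) (set '(3 4 5))))  ;; => (1 2 3 4 5)
```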

Solution 4

If you don't mind duplicates, you can try concat:

(concat '(1 2 3) '(4 5 6 1) '(2 3))
;;==> (1 2 3 4 5 6 1 2 3)
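
It may help to know that concat returns a lazy sequence: nothing is walked until the result is consumed. The short sketch below illustrates this; the var name merged is arbitrary:

```clojure
;; Calling concat only builds a lazy seq; no elements are visited yet.
(def merged (concat '(1 2 3) '(4 5 6 1) '(2 3)))
(realized? merged)  ;; => false

;; Because it is lazy, concat can even take an infinite sequence.
(take 4 (concat [1 2] (range)))  ;; => (1 2 0 1)
```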

Solution 5

One option is flatten:

(def colls '((1 2 3) (2 3 4)))
(flatten colls) ;; => (1 2 3 2 3 4)
(distinct (flatten colls)) ;; => (1 2 3 4)

One thing to be aware of is that it will flatten deeply nested collections:

(flatten [[1 2 [3 4 5]] [1 [2 [3 4]]]]) ;; => (1 2 3 4 5 1 2 3 4)

But it works well for collections of maps, because maps themselves are not flattened:

(flatten [[{} {} {}] [{} {} {}]]) ;; => ({} {} {} {} {} {})
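
If only one level should be removed, so that nested collections inside the elements survive, a sketch using apply concat does that. The var name nested is made up for this example:

```clojure
(def nested '((1 2 [3 4]) (2 [3 4] 5)))

;; apply concat joins the top-level collections and leaves inner vectors alone.
(apply concat nested)             ;; => (1 2 [3 4] 2 [3 4] 5)
(distinct (apply concat nested))  ;; => (1 2 [3 4] 5)

;; flatten, by contrast, descends into every nested sequential collection.
(flatten nested)                  ;; => (1 2 3 4 2 3 4 5)
```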

Author: user1639206

Updated on July 09, 2022

Comments

  • user1639206
    user1639206 over 1 year

    What is an idiomatic way to merge (or retrieve the union of) two lists (or sequences) in Clojure?

    (merge l1 l2)
    

    doesn't seem to be the solution:

    a=> (merge '(1 2 3) '(2 3 4))
    ((2 3 4) 1 2 3)
    
    • mikera
      mikera over 11 years
      how do you define "merge"? e.g. do duplicates exist, and if so how are duplicates handled? also do you know if the lists are already sorted?
    • tnoda
      tnoda over 11 years
      FYI. The function name, merge, has already been taken by clojure.core. To avoid confusion, you may choose another name for your merge function. See clojuredocs.org/clojure_core/clojure.core/merge
    • rplevy
      rplevy over 11 years
      The poster was in fact using clojure.core/merge, but not on hash-maps or otherwise associative data, and said function has undefined behavior in that context.
  • tnoda
    tnoda over 11 years
    (-> #{} (into [1 2 3]) (into [3 4 5]) seq)
  • Marko Topolnik
    Marko Topolnik over 11 years
    You aren't really achieving anything with union operating on list arguments. All it does is (reduce conj '(1,2,3) '(3,4,5)). clojure.set functions are designed to work with set arguments.
  • rplevy
    rplevy over 11 years
    I would just advise against concat for performance reasons, as it is surprisingly slow. See my answer below for further discussion.
  • Omri Bernstein
    Omri Bernstein over 11 years
    @rplevy thanks for the comment (and your answer below). I had not realized that concat has a performance problem.
  • Philip Potter
    Philip Potter over 11 years
    what's so bad about concat? It's constant-time because it's lazy, not the linear-time implementation you get in strict languages.
  • Philip Potter
    Philip Potter over 11 years
    also, how can into be slower than conj? it's implemented using conj. (into foo bar) is the same as (reduce conj foo bar), except that it will use transients if they are available.
  • rplevy
    rplevy over 11 years
    No, I said conj is faster! (but if you need to concatenate vectors and not merely conj one onto the other, these are different needs to have.)
  • rplevy
    rplevy over 11 years
    Philip: into uses transients, very similar to the transients for vector concatenation example in Joy of Clojure in fact, which is a good place to look for more discussion on this.
  • Petrus Theron
    Petrus Theron about 10 years
    @PhilipPotter surely concat cannot be constant time for all inputs? It has to be linear time as the input lengths grow?
  • Philip Potter
    Philip Potter over 9 years
    @pete sorry for late reply. concat is lazy, so yes it can be constant time because it doesn't really do much work. it returns something which when consumed as a seq, first fetches items from the first arg, then from the second when the first is exhausted.
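
Two points from the question and the comments above can be checked at a REPL. The sketch below assumes the current clojure.core behavior, where merge falls back to conj, which would explain the surprising result in the question. As noted in the comments, merge is only meant for maps, so the list case should not be relied on:

```clojure
;; On lists, conj adds the whole second list as a single element at the front,
;; which matches the output the question shows for merge.
(conj '(1 2 3) '(2 3 4))  ;; => ((2 3 4) 1 2 3)

;; On maps, merge does what its name suggests; later values win.
(merge {:a 1 :b 2} {:b 3 :c 4})  ;; => {:a 1, :b 3, :c 4}

;; into gives the same result as reducing with conj.
(= (into [1 2 3] [3 4 5])
   (reduce conj [1 2 3] [3 4 5]))  ;; => true
```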