Julia: Flattening array of array/tuples


Solution 1

Iterators.flatten(x) creates a lazy iterator that walks through the elements of each element of x, removing one level of nesting. It can handle some of the cases you describe, e.g.

julia> collect(Iterators.flatten([(1,2,3),[4,5],6]))
6-element Array{Any,1}:
 1
 2
 3
 4
 5
 6
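
For comparison, here is a minimal sketch of the homogeneous case, assuming every inner container holds the same element type (the variable names are illustrative only). The collected result then has a concrete element type instead of Any:

```julia
# All inner vectors hold Int, so the flattened element type is concrete.
homogeneous = [[1, 2, 3], [4, 5], [6]]
flat = collect(Iterators.flatten(homogeneous))

println(flat)          # the six integers, in order
println(eltype(flat))  # element type of the collected vector
```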

If you have arrays of arrays of arrays and tuples, you should probably reconsider your data structure, because it doesn't sound type stable. However, you can chain multiple calls to flatten, each of which removes one level of nesting, e.g.

julia> collect(Iterators.flatten([(1,2,[3,3,3,3]),[4,5],6]))
6-element Array{Any,1}:
 1
 2
 [3, 3, 3, 3]
 4
 5
 6

julia> collect(Iterators.flatten(Iterators.flatten([(1,2,[3,3,3,3]),[4,5],6])))
9-element Array{Any,1}:
 1
 2
 3
 3
 3
 3
 4
 5
 6

Note how all of my examples return an Array{Any,1}. That is a bad sign for performance, because it means the compiler could not determine a single concrete type for the elements of the output array. I chose these examples because, the way I read your question, it sounded like you may already have type-unstable containers.
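
If the nesting depth is not known in advance, chaining flatten calls by hand does not scale. The following is a rough sketch of a recursive alternative; deepflatten and deepflatten! are hypothetical helper names, not part of Base, and the sketch treats anything that is not an array or tuple (including strings) as a leaf:

```julia
# Hypothetical helpers (not part of Base): flatten arbitrarily nested
# arrays and tuples into one flat vector.
function deepflatten!(out, x)
    if x isa Union{AbstractArray,Tuple}
        for el in x
            deepflatten!(out, el)   # recurse into nested containers
        end
    else
        push!(out, x)               # leaf value: append as-is
    end
    return out
end

deepflatten(x) = deepflatten!(Any[], x)

nested = [(1, 2, [3, 3, 3, 3]), [4, 5], 6]
flat = deepflatten(nested)

# The result is a Vector{Any}; broadcasting identity narrows the
# element type when all leaves happen to share one type.
narrowed = identity.(flat)
println(narrowed)
```

The output vector is still built as Vector{Any}, so the performance caveat above applies here too.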

Solution 2

In order to flatten an array of arrays, you can simply use vcat() like this:

julia> A = [[1,2,3],[4,5], [6,7]]
Vector{Int64}[3]
    Int64[3]
    Int64[2]
    Int64[2]
julia> flat = vcat(A...)
Int64[7]
    1
    2
    3
    4
    5
    6
    7
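
Splatting a very long array into vcat can get expensive, because every inner array becomes a separate function argument. A possible alternative, sketched here on the same data, is reduce(vcat, A); recent Julia versions provide a specialized method for this pattern, though how much it helps will depend on your version and data:

```julia
A = [[1, 2, 3], [4, 5], [6, 7]]

# Concatenate all inner vectors without splatting them into arguments.
flat = reduce(vcat, A)

println(flat)
```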

Solution 3

The simplest way is to apply the ellipsis ... twice.

A = [[1,2,3],[4,5], [6,7]]
flat = [(A...)...]
println(flat)

The output would be [1, 2, 3, 4, 5, 6, 7].
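
To see what the two splats do, it can help to spell the steps out by hand. This is only an illustrative sketch; the names inner and by_hand are made up for the example:

```julia
A = [[1, 2, 3], [4, 5], [6, 7]]

# Inner splat: A... expands A into its three inner vectors.
inner = (A...,)   # a tuple holding the three inner vectors

# Outer splat: each inner vector is expanded into its own elements,
# and the brackets collect those elements into a new array.
by_hand = [A[1]..., A[2]..., A[3]...]

# Both steps at once.
flat = [(A...)...]

println(by_hand == flat)
```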

Solution 4

If you use VectorOfArray from RecursiveArrayTools.jl, it uses the indexing fallback to provide convert(Array,A) for a VectorOfArray A.

julia> using RecursiveArrayTools

julia> A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
3-element Array{Array{Int64,1},1}:
 [1, 2, 3]
 [4, 5, 6]
 [7, 8, 9]

julia> VA = VectorOfArray(A)
3-element Array{Array{Int64,1},1}:
 [1, 2, 3]
 [4, 5, 6]
 [7, 8, 9]

First off, it acts as a lazy wrapper that lets you index without any conversion:

julia> VA[1,3]
7

Note that each inner array is treated as a column, so the layout is still "column-major" (i.e. efficient to index down columns). It also provides a direct conversion:

julia> convert(Array,VA)
3×3 Array{Int64,2}:
 1  4  7
 2  5  8
 3  6  9

The other way to handle this conversion is to do something like hcat(A...), but that's slow if you have a lot of arrays you're splatting!
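
If you want to stay within Base and avoid the splat, one option is reduce(hcat, A). The sketch below assumes all inner vectors have the same length; recent Julia versions have a specialized method for this pattern, but it is worth benchmarking on your own data:

```julia
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Each inner vector becomes one column of the resulting matrix.
M = reduce(hcat, A)

println(size(M))
```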

Now, you may think: what about writing a function that pre-allocates the matrix, then loops through and fills it? That's almost how convert on the VectorOfArray works, except that the fallback convert uses here relies on Tim Holy's Cartesian machinery. At one point, I wrote that function:

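function vecvec_to_mat(vecvec)
  # Allocate an uninitialized matrix with one row per inner vector.
  # Assumes vecvec is non-empty and all inner vectors have equal length.
  mat = Matrix{eltype(eltype(vecvec))}(undef, length(vecvec), length(vecvec[1]))
  for i in eachindex(vecvec)
    mat[i,:] .= vecvec[i]  # copy the i-th vector into row i
  end
  mat
end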

but I have since gotten rid of it because the fallback was much faster. So, YMMV but that's a few ways to solve your problem.
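
Note that the function above fills rows, so its result is the transpose of the convert(Array,VA) output shown earlier. A column-wise variant might look like the sketch below; vecvec_to_mat_cols is a hypothetical name, and the code assumes a non-empty input whose inner vectors all have the same length:

```julia
# Hypothetical column-wise variant: inner vector j becomes column j.
function vecvec_to_mat_cols(vecvec)
    T = eltype(eltype(vecvec))
    mat = Matrix{T}(undef, length(vecvec[1]), length(vecvec))
    for (j, v) in enumerate(vecvec)
        mat[:, j] .= v   # write down column j, which is contiguous in memory
    end
    return mat
end

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
M = vecvec_to_mat_cols(A)
println(size(M))
```

Writing down columns follows Julia's column-major storage, which is the same reason the answer describes indexing down columns as efficient.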

Author: Pigna

Updated on May 15, 2020

Comments

  • Pigna
    Pigna almost 4 years

    In Julia, vec reshapes multidimensional arrays into one-dimensional arrays. However, it doesn't work for arrays of arrays or arrays of tuples. Apart from using an array comprehension, is there another way to flatten arrays of arrays/tuples? Or arrays of arrays/tuples of arrays/tuples? Or ...

  • Pigna
    Pigna over 6 years
    That's exactly what I was looking for! In my case it's only arrays, no tuples, so it's not my problem. However, after adding the "Iterators" package and calling "using Iterators", when I try to use flatten (inside the REPL) it tells me that the function does not exist. I am on Julia 0.6.1.
  • gggg
    gggg over 6 years
    Iterators is now built into Base, and the old Iterators package is deprecated. I don't know what happens if you install it now, but if you have any problems, uninstall it. You need using Base.Iterators to get the exported functions (including flatten), or import Base.Iterators: flatten to get just that one.
  • Timmmm
    Timmmm over 6 years
    Doesn't seem to work for arrays of arrays of arrays.
  • Hansang
    Hansang over 3 years
    This is great and really compact syntax, but I still can't for the life of me understand how the ... splat operator works - would you be so kind as to go into the details of how this expression is evaluated?
  • Ricoter
    Ricoter over 3 years
    The splatting operator is used inside a function call to split one single argument into many arguments (see the FAQ at docs.julialang.org/en/v1/manual/faq/…). Here it is used inside an array literal [ ], which also works. If the operator were used only once, it would put the elements of the splatted array straight back into an array, essentially doing nothing. With the parentheses, the splat operation is evaluated twice: the outer array is unpacked into its inner arrays, each inner array is unpacked into its elements, and the brackets pack those elements into a new array. Unpacking, unpacking, packing.
  • MentatOfDune
    MentatOfDune about 3 years
    This is very terse, but I'll warn you that the splat operator can introduce some really unexpected slowdowns. Try out x = [rand(n) for n in rand(1:20, 100000)]. Compare @btime [(x...)...] (21.215 ms and 1049049 allocations) with @btime vcat(x...) (2.836 ms and 3 allocations).
  • schneiderfelipe
    schneiderfelipe almost 3 years
    This seems to be the most performant option if your arrays are not arbitrarily nested.
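
Pulling the comments above together, here is a small sketch that imports only flatten from Base.Iterators and checks that the approaches discussed agree on a vector of vectors. The data follows the example in the comments but with fewer vectors, and no timings are claimed here:

```julia
using Base.Iterators: flatten   # bring just `flatten` into scope

# Random test data: 1000 vectors with lengths between 1 and 20.
x = [rand(n) for n in rand(1:20, 1_000)]

a = collect(flatten(x))   # lazy iterator, then collect
b = vcat(x...)            # splat into vcat
c = reduce(vcat, x)       # reduction, no splat

println(a == b == c)
```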
