How to chain map and filter functions in the correct order


before we know better

I really like chaining ...

I see that, and I'll appease you, but you'll come to understand that forcing your program through a chaining API is unnatural, and more trouble than it's worth in most cases.

// the asker's original Transduce, quoted from the question;
// it relies on the mapper and filterer helpers defined there
const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
  map: map => Transduce(mapper(map)(reducer)),
  filter: pred => Transduce(filterer(pred)(reducer)),
  run: arr => arr.reduce(reducer, [])
});

I think I understand why it happens, but I can't figure out how to fix it without changing the “interface” of my function.

The problem is indeed with your Transduce constructor. Your map and filter methods are stacking map and pred on the outside of the transducer chain, instead of nesting them inside.
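To see the inversion concretely, here's a minimal sketch using the question's own mapper helper. Chaining `.map(addOne).map(double)` in the original Transduce wraps each new transformer around the *outside*, so the last one chained runs first:

```javascript
// minimal sketch of the inversion, using the question's mapper helper
const mapper = map => reducer => (acc, x) => reducer(acc, map(x));
const concat = (acc, x) => acc.concat(x);

const addOne = x => x + 1;
const double = x => x * 2;

// .map(addOne).map(double) in the original Transduce builds
// mapper(double)(mapper(addOne)(concat)), so double runs first
const stacked = mapper(double)(mapper(addOne)(concat));
console.log([1, 2, 3].reduce(stacked, [])); // [ 3, 5, 7 ]

// what the chain order promises: addOne first, then double
const nested = mapper(addOne)(mapper(double)(concat));
console.log([1, 2, 3].reduce(nested, [])); // [ 4, 6, 8 ]
```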

Below, I've implemented your Transduce API so that it evaluates the maps and filters in the correct order. I've also added a log method so that we can see how Transduce is behaving:

const Transduce = (f = k => k) => ({
  map: g =>
    Transduce(k =>
      f ((acc, x) => k(acc, g(x)))),
  filter: g =>
    Transduce(k =>
      f ((acc, x) => g(x) ? k(acc, x) : acc)),
  log: s =>
    Transduce(k =>
      f ((acc, x) => (console.log(s, x), k(acc, x)))),
  run: xs =>
    xs.reduce(f((acc, x) => acc.concat(x)), [])
})

const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('\tsquare:')
    .map(x => x * x)
    .log('\t\tless than 30?')
    .filter(x => x < 30)
    .log('\t\t\tpass')
    .run(nums)
}

// keep square(n), forall n of nums
//   where n > 2
//   where square(n) < 30
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]

untapped potential

Inspired by this answer ...

In reading that answer I wrote, you overlook the generic quality of Trans as it was written there. Here, our Transduce only attempts to work with Arrays, but really it can work with any type that has an empty value ([]) and a concat method. These two properties make up a structure called a Monoid, and we'd be doing ourselves a disservice if we didn't take advantage of a transducer's ability to work with any type in this structure.

Above, we hard-coded the initial accumulator [] in the run method, but this should probably be supplied as an argument – much like we do with iterable.reduce(reducer, initialAcc)
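A minimal sketch of that change, assuming the Transduce from above with only run modified to accept the initial accumulator:

```javascript
// sketch: only `run` changes, taking the initial accumulator as an
// argument the way iterable.reduce(reducer, initialAcc) does
const Transduce = (f = k => k) => ({
  map: g =>
    Transduce(k =>
      f((acc, x) => k(acc, g(x)))),
  filter: g =>
    Transduce(k =>
      f((acc, x) => g(x) ? k(acc, x) : acc)),
  run: (init, xs) =>
    xs.reduce(f((acc, x) => acc.concat(x)), init)
})

console.log(Transduce().map(x => x * x).run([], [1, 2, 3]))
// [ 1, 4, 9 ]
```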

Aside from that, both implementations are essentially equivalent. The biggest difference is that the Trans implementation provided in the linked answer is itself a monoid, whereas our Transduce here is not. Trans neatly implements composition of transducers in its concat method, whereas Transduce (above) has composition mixed into each method. Making it a monoid allows us to reason about Trans the same way we do about all other monoids, instead of having to understand it as some specialized chaining interface with unique map, filter, and run methods.

I would advise building from Trans instead of making your own custom API
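To make that concrete, here's a small sketch of Trans as defined in the linked answer (and again further down): concat composes transducers left-to-right, and empty is a true identity for concat.

```javascript
// Trans and mapper repeated here so the sketch stands alone
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
})

Trans.empty = () => Trans(k => k)

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)))

// concat composes left-to-right: (x + 1) runs before (x * 2)
const t = mapper(x => x + 1).concat(mapper(x => x * 2))
console.log([1, 2, 3].reduce(t.runTrans((acc, x) => acc.concat(x)), []))
// [ 4, 6, 8 ]

// and empty is an identity for concat
const u = Trans.empty().concat(mapper(x => x + 1))
console.log([1, 2, 3].reduce(u.runTrans((acc, x) => acc.concat(x)), []))
// [ 2, 3, 4 ]
```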


have your cake and eat it too

So we learned the valuable lesson of uniform interfaces and we understand that Trans is inherently simple. But, you still want that sweet chaining API. OK, ok...

We're going to implement Transduce one more time, but this time we'll do so using the Trans monoid. Here, Transduce holds a Trans value instead of a continuation (Function).

Everything else stays the same – foo takes one tiny change and produces an identical output.

// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))

const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// magic chaining api made with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// when we run, we must specify the type to transduce
//   .run(Array, nums)
// instead of
//   .run(nums)

The complete implementation is below – of course you could skip defining a separate mapper, filterer, and logger, and instead define those directly on Transduce. I think this reads nicer, though.

// Trans monoid
const Trans = f => ({
  runTrans: f,
  concat: ({runTrans: g}) =>
    Trans(k => f(g(k)))
})

Trans.empty = () =>
  Trans(k => k)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

// complete Array monoid implementation
Array.empty = () => []

// generic transducers
const mapper = f =>
  Trans(k => (acc, x) => k(acc, f(x)))
  
const filterer = f =>
  Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)
  
const logger = label =>
  Trans(k => (acc, x) => (console.log(label, x), k(acc, x)))

// now implemented with Trans monoid
const Transduce = (t = Trans.empty()) => ({
  map: f =>
    Transduce(t.concat(mapper(f))),
  filter: f =>
    Transduce(t.concat(filterer(f))),
  log: s =>
    Transduce(t.concat(logger(s))),
  run: (m, xs) =>
    transduce(t, m, xs)
})

// this stays exactly the same
const foo = nums => {
  return Transduce()
    .log('greater than 2?')
    .filter(x => x > 2)
    .log('\tsquare:')
    .map(x => x * x)
    .log('\t\tless than 30?')
    .filter(x => x < 30)
    .log('\t\t\tpass')
    .run(Array, nums)
}

// output is exactly the same
console.log(foo([1,2,3,4,5,6,7]))
// => [ 9, 16, 25 ]
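And because run takes the monoid as an argument, the same machinery can fold into types other than Array. Here's a sketch transducing into String, repeating the definitions from above so it stands alone; the String.empty we attach is our own addition, mirroring the Array.empty above:

```javascript
// the same machinery as above, repeated so the sketch is self-contained
const Trans = f => ({
  runTrans: f,
  concat: ({ runTrans: g }) => Trans(k => f(g(k)))
})
Trans.empty = () => Trans(k => k)

const mapper = f => Trans(k => (acc, x) => k(acc, f(x)))
const filterer = f => Trans(k => (acc, x) => f(x) ? k(acc, x) : acc)

const transduce = (t, m, xs) =>
  xs.reduce(t.runTrans((acc, x) => acc.concat(x)), m.empty())

// String has an empty value ('') and a concat method, so it's a monoid too
String.empty = () => ''

const t = filterer(x => x !== ' ').concat(mapper(x => x.toUpperCase()))

console.log(transduce(t, String, [...'so cool'])) // 'SOCOOL'
```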

wrap up

So we started with a mess of lambdas and then made things simpler using a monoid. The Trans monoid provides distinct advantages in that the monoid interface is well-known and the generic implementation is extremely simple. But we're stubborn – or maybe we have goals to fulfill that are not set by us – so we decide to build the magic Transduce chaining API, but we do so using our rock-solid Trans monoid, which gives us all the power of Trans while keeping the complexity nicely compartmentalised.


dot chaining fetishists anonymous

Here are a couple of other recent answers I wrote about method chaining:

Author: user3297291
Updated on June 16, 2022

Comments

  • user3297291
    user3297291 almost 2 years

    I really like chaining Array.prototype.map, filter and reduce to define a data transformation. Unfortunately, in a recent project that involved large log files, I could no longer get away with looping through my data multiple times...

    My goal:

    I want to create a function that chains .filter and .map methods by, instead of mapping over an array immediately, composing a function that loops over the data once. I.e.:

    const DataTransformation = () => ({ 
        map: fn => (/* ... */), 
        filter: fn => (/* ... */), 
        run: arr => (/* ... */)
    });
    
    const someTransformation = DataTransformation()
        .map(x => x + 1)
        .filter(x => x > 3)
        .map(x => x / 2);
    
    // returns [ 2, 2.5 ] without creating [ 2, 3, 4, 5] and [4, 5] in between
    const myData = someTransformation.run([ 1, 2, 3, 4]); 
    

    My attempt:

    Inspired by this answer and this blogpost I started writing a Transduce function.

    const filterer = pred => reducer => (acc, x) =>
        pred(x) ? reducer(acc, x) : acc;
    
    const mapper = map => reducer => (acc, x) =>
        reducer(acc, map(x));
    
    const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
        map: map => Transduce(mapper(map)(reducer)),
        filter: pred => Transduce(filterer(pred)(reducer)),
        run: arr => arr.reduce(reducer, [])
    });
    

    The problem:

    The problem with the Transduce snippet above, is that it runs “backwards”... The last method I chain is the first to be executed:

    const someTransformation = Transduce()
        .map(x => x + 1)
        .filter(x => x > 3)
        .map(x => x / 2);
    
    // Instead of [ 2, 2.5 ] this returns []
    //  starts with (x / 2)       -> [0.5, 1, 1.5, 2] 
    //  then filters (x > 3)      -> []
    const myData = someTransformation.run([ 1, 2, 3, 4]);
    

    Or, in more abstract terms:

    Go from:

    Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, f(g(x)))
    

    To:

    Transducer(concat).map(f).map(g) == (acc, x) => concat(acc, g(f(x)))
    

    Which is similar to:

    mapper(f) (mapper(g) (concat))
    

    I think I understand why it happens, but I can't figure out how to fix it without changing the “interface” of my function.

    The question:

    How can I make my Transduce method chain filter and map operations in the correct order?


    Notes:

    • I'm only just learning about the naming of some of the things I'm trying to do. Please let me know if I've incorrectly used the Transduce term or if there are better ways to describe the problem.
    • I'm aware I can do the same using a nested for loop:

    const push = (acc, x) => (acc.push(x), acc);
    const ActionChain = (actions = []) => {
      const run = arr =>
        arr.reduce((acc, x) => {
          for (let i = 0, action; i < actions.length; i += 1) {
            action = actions[i];
    
            if (action.type === "FILTER") {
              if (action.fn(x)) {
                continue;
              }
    
              return acc;
            } else if (action.type === "MAP") {
              x = action.fn(x);
            }
          }
    
          acc.push(x);
          return acc;
        }, []);
    
      const addAction = type => fn => 
        ActionChain(push(actions, { type, fn }));
    
      return {
        map: addAction("MAP"),
        filter: addAction("FILTER"),
        run
      };
    };
    
    // Compare to regular chain to check if 
    // there's a performance gain
    // Admittedly, in this example, it's quite small...
    const naiveApproach = {
      run: arr =>
        arr
          .map(x => x + 3)
          .filter(x => x % 3 === 0)
          .map(x => x / 3)
          .filter(x => x < 40)
    };
    
    const actionChain = ActionChain()
      .map(x => x + 3)
      .filter(x => x % 3 === 0)
      .map(x => x / 3)
      .filter(x => x < 40)
    
    
    const testData = Array.from(Array(100000), (x, i) => i);
    
    console.time("naive");
    const result1 = naiveApproach.run(testData);
    console.timeEnd("naive");
    
    console.time("chain");
    const result2 = actionChain.run(testData);
    console.timeEnd("chain");
    console.log("equal:", JSON.stringify(result1) === JSON.stringify(result2));
    • Here's my attempt in a runnable snippet:

    const filterer = pred => reducer => (acc, x) =>
      pred(x) ? reducer(acc, x) : acc;
    
    const mapper = map => reducer => (acc, x) => reducer(acc, map(x));
    
    const Transduce = (reducer = (acc, x) => (acc.push(x), acc)) => ({
      map: map => Transduce(mapper(map)(reducer)),
      filter: pred => Transduce(filterer(pred)(reducer)),
      run: arr => arr.reduce(reducer, [])
    });
    
    const sameDataTransformation = Transduce()
      .map(x => x + 5)
      .filter(x => x % 2 === 0)
      .map(x => x / 2)
      .filter(x => x < 4);
      
    // It's backwards:
    // [-1, 0, 1, 2, 3]
    // [-0.5, 0, 0.5, 1, 1.5]
    // [0]
    // [5]
    console.log(sameDataTransformation.run([-1, 0, 1, 2, 3, 4, 5]));
    • Jonas Wilms
      Jonas Wilms almost 7 years
      map: map => Transduce(mapper(map)(reducer)), means: return a function that maps first, then does the original reducing.
    • Admin
      Admin almost 7 years
Transducers build a stack. Try run: arr => arr.reduceRight(reducer, [])
    • user3297291
      user3297291 almost 7 years
      @ftor I wanted to make the reducer in run be a composition of all map and filter methods previously passed. I think the problem is not the order of which I reduce my arr with data, but the way I'm failing to correctly return a new reducer method. E.g.: mapper(f) (mapper(g) (concat)); returns a reducer similar to (acc, x) => concat(acc, g(f(x))), whereas my Transduce(concat).map(f).map(g) returns (acc, x) => concat(acc, f(g(x)))
  • user3297291
    user3297291 almost 7 years
    I don't think this works since my pred is a function of x -> bool and map is x -> y. This makes the reducer run first, returning an array (at least in my current example). Or am I missing something here?
  • Jonas Wilms
    Jonas Wilms almost 7 years
    @user3297291 you were right. You also needed to change run and the default reducer jsbin.com/laqezabihe/edit?console
  • user3297291
    user3297291 almost 7 years
    The crux of the problem is really in supporting filter it seems. I agree that the for loop option beats having to resort to a try {} catch () {}...
  • user3297291
    user3297291 almost 7 years
    What a great answer! I had hoped to make things easier for myself by skipping some of the "abstract" stuff and making the code less general (i.e. only apply to arrays), but I see now that it doesn't always work that way... I'll have to admit that some of the concepts you're referring to (like η-conversion and Monoids) are still hard for me to grasp. But I really appreciate that you didn't just leave it at the quick fix ("but that's UGLY" – haha!) and took the time to explain, step-by-step, the concepts that help you solve such problems in a structured way.
  • Bergi
    Bergi almost 7 years
    Nice! Notice that the eta conversion works on the tupled function as well, no need to curry it
  • Mulan
    Mulan almost 7 years
    @Bergi, I thought there was something off with that. My intuition said it should've been possible, but when I tried it earlier, I must've made a mistake or something. Thanks for catching that, it cleaned up the answer significantly.