What is the Haskell response to Node.js?

multithreading haskell concurrency node.js

49,095

Solution 1

Ok, so having watched a little of the node.js presentation that @gawi pointed me at, I can say a bit more about how Haskell compares to node.js. In the presentation, Ryan describes some of the benefits of Green Threads, but then goes on to say that he doesn't find the lack of a thread abstraction to be a disadvantage. I'd disagree with his position, particularly in the context of Haskell: I think the abstractions that threads provide are essential for making server code easier to get right, and more robust. In particular:

using one thread per connection lets you write code that expresses the communication with a single client, rather than writing code that deals with all the clients at the same time. Think of it like this: a server that handles multiple clients with threads looks almost the same as one that handles a single client; the main difference is there's a fork somewhere in the former. If the protocol you're implementing is at all complex, managing the state machine for multiple clients simultaneously gets quite tricky, whereas threads let you just script the communication with a single client. The code is easier to get right, and easier to understand and maintain.
callbacks on a single OS thread are cooperative multitasking, as opposed to preemptive multitasking, which is what you get with threads. The main disadvantage of cooperative multitasking is that the programmer is responsible for making sure that there's no starvation. It loses modularity: make a mistake in one place, and it can screw up the whole system. This is really something you don't want to have to worry about, and preemption is the simple solution. Moreover, communication between callbacks isn't possible (it would deadlock).
concurrency isn't hard in Haskell, because most code is pure and so is thread-safe by construction. There are simple communication primitives. It's much harder to shoot yourself in the foot with concurrency in Haskell than in a language with unrestricted side effects.
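To make the "a fork somewhere" point concrete, here is a minimal sketch using only `base`. The `Conn` type, the toy name-then-echo protocol, and the scripted `runClient` are all hypothetical stand-ins for a real socket and a real client; they are assumptions made so the example runs without a network library.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- A hypothetical in-memory "connection": one channel per direction.
data Conn = Conn
  { fromClient :: Chan String
  , toClient   :: Chan String
  }

newConn :: IO Conn
newConn = Conn <$> newChan <*> newChan

-- The whole protocol for ONE client, written as a straight-line script:
-- ask for a name, then echo lines until the client says "quit".
serveClient :: Conn -> IO ()
serveClient conn = do
  writeChan (toClient conn) "name?"
  name <- readChan (fromClient conn)
  let loop = do
        line <- readChan (fromClient conn)
        if line == "quit"
          then writeChan (toClient conn) ("bye " ++ name)
          else do
            writeChan (toClient conn) (name ++ ": " ++ line)
            loop
  loop

-- A scripted client, used only to exercise the server.
-- It returns everything it received, in order.
runClient :: Conn -> String -> [String] -> IO [String]
runClient conn name msgs = do
  prompt <- readChan (toClient conn)
  writeChan (fromClient conn) name
  echoes <- forM msgs $ \m -> do
    writeChan (fromClient conn) m
    readChan (toClient conn)
  writeChan (fromClient conn) "quit"
  bye <- readChan (toClient conn)
  pure (prompt : echoes ++ [bye])

-- Serving many clients is the single-client script plus a fork.
demo :: IO [[String]]
demo = do
  let clients = [("alice", ["hi", "there"]), ("bob", ["yo"])]
  results <- forM clients $ \(name, msgs) -> do
    conn <- newConn
    _ <- forkIO (serveClient conn)   -- the only "multi-client" code
    done <- newEmptyMVar
    _ <- forkIO (runClient conn name msgs >>= putMVar done)
    pure done
  mapM takeMVar results

main :: IO ()
main = demo >>= mapM_ print
```

Note that `serveClient` holds the per-client state (`name`) in an ordinary local variable; there is no explicit state machine shared across clients.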

Solution 2

Can Haskell provide some of the benefits of Node.js, namely a clean solution to avoid blocking I/O without having recourse to multi-thread programming?

Yes, in fact events and threads are unified in Haskell.

You can program in explicit lightweight threads (e.g. millions of threads on a single laptop).
Or you can program in an async event-driven style, based on scalable event notification.

Threads are actually implemented in terms of events, and run across multiple cores, with seamless thread migration, with documented performance, and applications.
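As a rough illustration of how cheap these threads are, the sketch below forks one lightweight thread per work item and collects the results. The function name `spawnAll` and the thread count are arbitrary choices for this example, not anything taken from the linked material, and actual memory use and timing will depend on the machine and GHC version.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Fork n lightweight threads; thread i reports i * 2 through its own MVar.
spawnAll :: Int -> IO [Int]
spawnAll n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box (i * 2))
    pure box
  -- Collect the results in the order the threads were created.
  mapM takeMVar boxes

main :: IO ()
main = do
  results <- spawnAll 100000
  print (sum results)
```

To spread these threads over several cores, the program would typically be compiled with `-threaded` and run with `+RTS -N`.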

E.g. for

massively concurrent job orchestration
concurrent collections scaling on 32 or 48 cores
tool support for profiling and debugging multi-threaded/multi-event programs.
high performance event-driven web servers.
interesting users: such as high-frequency trading.

[Figure: Concurrent collections nbody on 32 cores]

In Haskell you have both events and threads, and it is all events under the hood.

Read the paper describing the implementation.

Solution 3

First up, I don't hold your view that node.js is doing the right thing exposing all of those callbacks. You end up writing your program in CPS (continuation passing style) and I think it should be the compiler's job to do that transformation.
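For readers who have not seen the two styles side by side, here is a small sketch. The operations `lookupUser` and `fetchGreeting` (and their `CPS` variants) are hypothetical and complete immediately; they only stand in for real asynchronous calls so that the shape of the code is visible.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)

-- Hypothetical operations in callback style: each takes a continuation
-- that receives the result instead of returning it.
lookupUserCPS :: Int -> (String -> IO ()) -> IO ()
lookupUserCPS uid k = k ("user" ++ show uid)

fetchGreetingCPS :: String -> (String -> IO ()) -> IO ()
fetchGreetingCPS name k = k ("hello, " ++ name)

-- Callback (CPS) version: control flow is nested inside continuations.
greetCPS :: Int -> (String -> IO ()) -> IO ()
greetCPS uid k =
  lookupUserCPS uid $ \name ->
    fetchGreetingCPS name $ \greeting ->
      k (greeting ++ "!")

-- The same operations in direct style.
lookupUser :: Int -> IO String
lookupUser uid = pure ("user" ++ show uid)

fetchGreeting :: String -> IO String
fetchGreeting name = pure ("hello, " ++ name)

-- Direct version: sequencing is expressed with ordinary do-notation.
greet :: Int -> IO String
greet uid = do
  name <- lookupUser uid
  greeting <- fetchGreeting name
  pure (greeting ++ "!")

-- Run a CPS action and capture whatever it passes to its callback.
runCPS :: ((a -> IO ()) -> IO ()) -> IO (Maybe a)
runCPS action = do
  ref <- newIORef Nothing
  action (writeIORef ref . Just)
  readIORef ref

main :: IO ()
main = do
  viaCallbacks <- runCPS (greetCPS 7)
  direct <- greet 7
  print (viaCallbacks == Just direct)
```

Both versions compute the same greeting; the difference is only in who writes the continuation plumbing, the programmer or the language.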

Events: No thread manipulation, the programmer only provides callbacks (as in Snap framework)

So with this in mind, you can write using an asynchronous style if you so wish, but by doing so you'd miss out on writing in an efficient synchronous style, with one thread per request. Haskell is ludicrously efficient at synchronous code, especially when compared to other languages. It's all events underneath.

Callbacks are guaranteed to be run in a single thread: no race condition possible.

You could still have a race condition in node.js, but it's more difficult.

Every request is in its own thread. When you write code that has to communicate with other threads, it's very simple to make it thread-safe thanks to Haskell's concurrency primitives.
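One of those primitives is `MVar`. The sketch below is an illustration, not code from any particular server: the names `handler` and `countHits` are made up, and each "request" is simulated by a thread that bumps a shared hit counter.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (MVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM, replicateM_)

-- Each simulated request handler bumps a shared counter `perThread` times.
-- modifyMVar_ takes the value out, applies the update, and puts it back,
-- so one thread's read-modify-write cannot interleave with another's.
handler :: MVar Int -> Int -> IO ()
handler counter perThread =
  replicateM_ perThread (modifyMVar_ counter (\n -> pure $! n + 1))

countHits :: Int -> Int -> IO Int
countHits threads perThread = do
  counter <- newMVar 0
  dones <- forM [1 .. threads] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (handler counter perThread >> putMVar done ())
    pure done
  mapM_ takeMVar dones   -- wait for every handler to finish
  readMVar counter

main :: IO ()
main = countHits 50 1000 >>= print
```

Software transactional memory (the `stm` package) is another option when several pieces of shared state must change together.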

Nice and simple UNIX-friendly API. Bonus: Excellent HTTP support. DNS also available.

Take a look in hackage and see for yourself.

Every I/O is by default asynchronous (this can be annoying sometimes, though). This makes it easier to avoid locks. However, too much CPU processing in a callback will impact other connections (in this case, the task should be split into smaller sub-tasks and rescheduled).

You have no such problems, ghc will distribute your work amongst real OS threads.
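A small sketch of what that looks like in practice, with the caveat that the details are assumptions of this example: `heavy` is an arbitrary CPU-bound function, and the "heartbeat" loop stands in for other connections that should keep being served. GHC preempts threads at allocation points, and using more than one core generally requires compiling with `-threaded` and running with `+RTS -N`.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, tryTakeMVar)
import Data.List (foldl')

-- A deliberately CPU-heavy pure computation.
heavy :: Integer -> Integer
heavy n = foldl' (+) 0 [1 .. n]

-- Run the heavy job in its own thread and count how many times the
-- waiting loop gets to run before the result arrives.
runWithHeartbeat :: Integer -> IO (Integer, Int)
runWithHeartbeat n = do
  result <- newEmptyMVar
  -- ($!) forces the sum inside the forked thread, not in the waiting loop.
  _ <- forkIO (putMVar result $! heavy n)
  let wait beats = do
        r <- tryTakeMVar result
        case r of
          Just v  -> pure (v, beats)
          Nothing -> threadDelay 1000 >> wait (beats + 1)
  wait 0

main :: IO ()
main = do
  caps <- getNumCapabilities
  (total, beats) <- runWithHeartbeat 5000000
  putStrLn ("capabilities: " ++ show caps)
  putStrLn ("sum: " ++ show total ++ ", heartbeats while waiting: " ++ show beats)
```

The heartbeat count varies from run to run; the point is only that the heavy job does not have to be manually chopped into sub-tasks for other threads to make progress.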

Same language for client-side and server-side. (I don't see too much value in this one, however. JQuery and Node.js share the event programming model but the rest is very different. I just can't see how sharing code between server-side and client-side could be useful in practice.)

Haskell can't possibly win here... right? Think again, http://www.haskell.org/haskellwiki/Haskell_in_web_browser .

All this packaged in a single product.

Download ghc, fire up cabal. There's a package for every need.

Solution 4

I personally see Node.js and programming with callbacks as unnecessarily low-level and a bit unnatural. Why program with callbacks when a good runtime such as the one found in GHC can handle callbacks for you, and do so pretty efficiently?

In the meantime, the GHC runtime has improved greatly: it now features a "new new IO manager" called MIO, where "M" stands for multicore, I believe. It builds on the foundation of the existing IO manager, and its main goal is to overcome the cause of the performance degradation on 4+ cores. The performance numbers provided in this paper are pretty impressive. See for yourself:

With Mio, realistic HTTP servers in Haskell scale to 20 CPU cores, achieving peak performance up to factor of 6.5x compared to the same servers using previous versions of GHC. The latency of Haskell servers is also improved: [...] under a moderate load, reduces expected response time by 5.7x when compared with previous versions of GHC

And:

We also show that with Mio, McNettle (an SDN controller written in Haskell) can scale effectively to 40+ cores, reach a throughput of over 20 million new requests per second on a single machine, and hence become the fastest of all existing SDN controllers.

Mio has made it into the GHC 7.8.1 release. I personally see this as a major step forward in Haskell performance. It would be very interesting to compare the performance of existing web applications compiled by the previous GHC version and by 7.8.1.

Solution 5

IMHO events are good, but programming by means of callbacks is not.

Most of the problems that make coding and debugging web applications special come from what makes them scalable and flexible, above all the stateless nature of HTTP. Statelessness enhances navigability, but it imposes an inversion of control in which the I/O element (the web server in this case) calls different handlers in the application code. This event model, or more accurately callback model, is a nightmare, since callbacks do not share variable scopes and an intuitive view of the navigation is lost. It is very difficult to anticipate all the possible state changes when the user navigates back and forth, among other problems.

It may be said that the problems are similar to GUI programming, where the event model works fine, but GUIs have no navigation and no back button. Those multiply the possible state transitions in web applications. The attempts to solve these problems have produced heavy frameworks with complicated configurations and plenty of pervasive magic identifiers, without questioning the root of the problem: the callback model, with its inherent lack of shared variable scopes and lack of sequencing, so that the sequence has to be constructed by linking identifiers.

There are sequential frameworks like ocsigen (OCaml), seaside (Smalltalk), WASH (discontinued, Haskell) and mflow (Haskell) that solve the problem of state management while maintaining navigability and RESTfulness. Within these frameworks, the programmer can express the navigation as an imperative sequence in which the program sends pages and waits for responses in a single thread, variables are in scope, and the back button works automatically. This inherently produces shorter, safer, more readable code in which the navigation is clearly visible to the programmer. (Fair warning: I'm the developer of mflow.)


gawi

Solution architect, formerly Java developer. 20 years experience in speech recognition telephony applications and call center solutions. Haskell wannabe. Main developer of the Rivr framework (http://rivr.nuecho.com)

Updated on July 30, 2020

Comments

gawi almost 4 years
I believe the Erlang community is not envious of Node.js as it does non-blocking I/O natively and has ways to scale deployments easily to more than one processor (something not even built-in in Node.js). More details at http://journal.dedasys.com/2010/04/29/erlang-vs-node-js and Node.js or Erlang

What about Haskell? Can Haskell provide some of the benefits of Node.js, namely a clean solution to avoid blocking I/O without having recourse to multi-thread programming?

There are many things that are attractive with Node.js
1. Events: No thread manipulation, the programmer only provides callbacks (as in Snap framework)
2. Callbacks are guaranteed to be run in a single thread: no race condition possible.
3. Nice and simple UNIX-friendly API. Bonus: Excellent HTTP support. DNS also available.
4. Every I/O is by default asynchronous. This makes it easier to avoid locks. However, too much CPU processing in a callback will impact other connections (in this case, the task should be split into smaller sub-tasks and rescheduled).
5. Same language for client-side and server-side. (I don't see too much value in this one, however. jQuery and Node.js share the event programming model but the rest is very different. I just can't see how sharing code between server-side and client-side could be useful in practice.)
6. All this packaged in a single product.
- Jonas over 13 years
  
  I think you should ask this question on Programmers instead.
- gawi over 13 years
  
  Not including a piece of code does not make it a subjective question.
- Jonas over 13 years
  
  True, but Programmers isn't only a site for subjective questions. It is a place for questions that aren't directly related to code e.g. language choice. Good question though. +1
- gawi over 13 years
  
  I did wonder whether to ask on Programmers or not. But: 1- The subject seems too technical compared to other questions. 2- No "erlang", "haskell" no "node.js" tags. 3- The answer "could" involve some code.
- Jonas over 13 years
  
  Yes, I agree with that. But it is possible to ask the same question on both sites.
- Simon Marlow over 13 years
  
  I don't know much about node.js, but one thing struck me about your question: why do you find the prospect of threads so unpleasant? Threads should be exactly the right solution to multiplexing I/O. I use the term threads broadly here, including Erlang's processes. Perhaps you're worried about locks and mutable state? You don't have to do things that way - use message-passing or transactions if that makes more sense for your application.
- gawi over 13 years
  
  @Simon Marlow One of the Node.js characteristics is that all callback code runs in one single thread, freeing the programmer from parallel mutation problems.
- Simon Marlow over 13 years
  
  @gawi I don't think that sounds very easy to program - without preemption, you have to deal with the possibility of starvation and long latencies. Basically threads are the right abstraction for a web server - there's no need to deal with asynchronous I/O and all the difficulties that go along with that, just do it in a thread. Incidentally, I wrote a paper about web servers in Haskell which you might find interesting: haskell.org/~simonmar/papers/web-server-jfp.pdf
- gawi over 13 years
  
  @Simon I haven't tried Node.js in a real-life test, so I remain skeptical about the promised ease of programming. Regarding threads, I should have said "OS thread". Node.js does not bring green/interpreter threads to Javascript but rather works on top of libev software.schmorp.de/pkg/libev.html which is an event loop and runs every user-provided callback exclusively in a single thread. See the first minutes of this presentation for a more detailed explanation of the motivations of Node.js youtube.com/watch?v=F6k8lTrAE2g
- Hassan Syed over 10 years
  
  Lol some of these types of questions are closed others are protected. SO is a circus.
- ArtOfWarfare over 9 years
  
  @HassanSyed - I'm not sure I would consider the two mutually exclusive. You protect a question to stop inexperienced users from attempting to answer it. You close a question because it doesn't belong on SO, but it already has useful content so it'd be a crime against humanity to delete it entirely.
- Erik Kaplun about 9 years
  
  Look at GHCJS or Haste if you want Haskell on both sides; or PureScript, Fay or Elm if a Haskell-like language is OK for you in the browser.
- Admin almost 9 years
  
  "Callbacks are guaranteed to be run in a single thread: no race condition possible." Wrong. You can easily have race conditions in Node.js; just assume that one I/O action will complete before another one, and BOOM. What is indeed impossible is one particular kind of race conditions, namely concurrent unsynchronised access to the same byte in memory.
gawi over 13 years

Thanks. I need to digest all this... This seems to be GHC-specific. I guess it's OK. The Haskell language is sometimes defined as anything GHC can compile. In a similar way, the Haskell "platform" is more or less the GHC run-time.
Robert Massaioli over 13 years

@gawi: That and all of the other packages that get bundled right into it so that it is useful right out of the box. And this is the same image that I saw in my CS course; and the best part is that it is not hard in Haskell to achieve similar awesome results in your own programs.
gawi over 13 years

Ok, so I get that node.js is a solution to 2 problems: 1- concurrency is hard in most languages, 2- using OS threads is expensive. Node.js's solution is to use event-based concurrency (w/ libev) to avoid communication between threads and to avoid the scalability problems of OS threads. Haskell does not have problem #1 because of purity. For #2, Haskell has lightweight threads + an event manager that was optimized recently in GHC for large-scale contexts. Also, using Javascript just can't be perceived as a plus for any Haskell developer. For some people using the Snap Framework, Node.js is "just bad".
gawi over 13 years

Request processing is most of the time a sequence of inter-dependent operations. I tend to agree that using callbacks for every blocking operation can be cumbersome. Threads are better suited than callback for this.
andreypopp over 13 years

Yep! And brand new I/O multiplexing in GHC 7 makes writing servers in Haskell even better.
Ricardo Tomasi almost 13 years

Your first point doesn't make much sense to me (as an outsider)... When processing a request in node.js your callback deals with a single client. Managing state only becomes something to worry about when scaling to multiple processes, and even then it's quite easy using available libraries.
Greg Weber almost 13 years

Hi Don, do you think you could link to the haskell web server that performs the best (Warp) when answering questions like these? Here is the quite relevant benchmark against Node.js: yesodweb.com/blog/2011/03/…
gawi almost 13 years

I was just playing devil's advocate. So, yes I agree on your points. Except the client-side and server-side language unification. While I think it's technically feasible, I don't think it may eventually replace all the Javascript ecosystem in place today (JQuery and friends). While it's an argument put forward by Node.js supporters, I don't think it's a very important one. Do you really need to share that much code between your presentation layer and your backend? Do we really aim having programmers knowing just one language?
dan_waterworth almost 13 years

The real win is that you can render pages on both the server and client side making real-time pages easier to create.
gawi almost 13 years

Ridiculous? The question is not "does Haskell have a response" but rather "what is the Haskell response". At the time the question was asked, GHC 7 was not even released so Haskell was not "in the game" yet (except maybe for frameworks using libev like Snap). Other than that, I agree.
Honza Pokorny over 12 years

The paper link is dead. Could you fix it? Thanks
Tim Gautier about 12 years

I don't know if this was true when you posted this answer, but now there are, in fact, node modules that allow node apps to easily scale across cores. Also, that link is comparing node.js running on a single core to haskell running on 4 cores. I'd like to see it run again in a fairer configuration, but alas, the github repo is gone.
Jesse Hallett almost 12 years

Voted down because this response does not answer the question, does Haskell support non-blocking IO? The advantages of multithreading, while relevant, are a separate issue.
mb21 almost 12 years

@dan_waterworth exactly, see meteor or derby.js
AndrewC over 11 years

It's not a separate issue. If this question is a genuine search for the best tools for the job in Haskell, or a check whether excellent tools for the job exist in Haskell, then the implicit assumption that multi-threaded programming would be unsuitable needs to be challenged, because Haskell does threads rather differently, as Don Stewart points out. Answers that explain why the Haskell community are also not jealous of Node.js are very much on-topic for this question. gawi's response suggests it was an appropriate answer to his question.
Kr0e almost 11 years

Haskell using more than 4 cores degrades the performance of the application. There was a paper on this issue, it's actively worked on but it is still an issue. So running 16 instances of Node.js on 16 core server will most likely be much better than a single ghc application using +RTS -N16 which indeed will be slower than +RTS -N1 because of this runtime bug. It's because they use just one IOManager which will slow down when used with many OS threads. I hope they will fix this bug but it exists since ever so I would have not much hope...
Kr0e almost 11 years

Just in theory. Haskell "lightweight threads" are not so lightweight as you think. It's much much much much cheaper to register a callback on an epoll interface than scheduling a so called green thread, they are of course cheaper than OS threads but they are not free. Creating 100.000 of them uses approx. 350 MB of memory and take some time. Try 100.000 connections with node.js. No problem at all . It would be magic if it were not faster since ghc uses epoll under the hood so they cannot be faster than using epoll directly. The programming with threads interface is quite nice, though.
Kr0e almost 11 years

In addition: The new IO manager (ghc) uses a scheduling algorithm which has (m log n) complexity (where m is the number of runnable threads and n the total number of threads). Epoll has complexity k (k is the number of readable/writeable fd's). So ghc has O(k * m log n) overall complexity, which is not very good if you face high-traffic connections. Node.js has just the linear complexity caused by epoll. And let's not even talk about Windows performance... Node.js is much faster because it uses IOCP.
dfeuer over 9 years

How does this answer the question?
Chawathe Vipul S over 9 years

@dfeuer The link must read as, Snap Haskell Web Framework has dropped libev, I don't know why formatting is failing. The node server runtime was all about Linux libev when it began, & so was Snap Web FrameWork. Haskell with Snap is like ECMAscript with nodejs, so how Snap evolves along with nodejs is more relevant than Haskell, which can more rightly be compared with ECMAscript in this context.
Robin Green over 9 years

In node.js callbacks are used for handling async I/O, e.g. to databases. You are talking about something different which, while interesting, does not answer the question.
agocorona almost 7 years

You are right. It took three years to have an answer that, I hope, meets your objections: github.com/transient-haskell
Eric Elliott over 6 years

@gawi We have production services where 85% of the code is shared between the client and server. This is known as universal JavaScript in the community. We're using React to dynamically render content on the server to decrease the time to first useful render in the client. While I'm aware that you can run Haskell in the browser, I'm not aware of any set of "universal Haskell" best practices that allow for server-side and client-side rendering using the same codebase.
Eric Elliott over 6 years

Anybody looking at this answer should be aware that Node can easily process 100k simple requests on a single core and it's trivially easy to scale a stateless Node application across many cores. pm2 -i max path/to/app.js will automatically scale to the optimum number of instances based on cores available. Additionally, Node is also non-blocking by default.
Eric Elliott over 6 years

Node now supports async functions, which means you can write imperative-style code that is actually asynchronous. It uses promises under the hood.