Can you explain the concept of streams?

java .net stream language-agnostic iostream

24,386

Solution 1

The word "stream" was chosen because its real-life meaning is very close to what we want to convey when we use it.

Let's forget about the backing store for a moment and think about the analogy to a stream of water. You receive a continuous flow of data, just as water flows continuously in a river. You don't necessarily know where the data is coming from, and most often you don't need to; whether it comes from a file, a socket, or any other source, it doesn't (and shouldn't) really matter. It is the same with a stream of water: you don't need to know where it is coming from; whether it comes from a lake, a fountain, or any other source, it doesn't (and shouldn't) really matter.

That said, once you start thinking that you only care about getting the data you need, regardless of where it comes from, the abstractions other people talked about become clearer. You start thinking that you can wrap streams, and your methods will still work perfectly. For example, you could do this:

int ReadInt(StreamReader reader) { return Int32.Parse(reader.ReadLine()); }

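// in another method:
// FileStream needs a FileMode; ZipDecompressorStream and DecryptionStream are
// illustrative names for wrapper streams, not real .NET classes.
Stream fileStream = new FileStream("My Data.dat", FileMode.Open);
Stream zipStream = new ZipDecompressorStream(fileStream);
Stream decryptedStream = new DecryptionStream(zipStream);
StreamReader reader = new StreamReader(decryptedStream);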

int x = ReadInt(reader);

As you can see, it becomes very easy to change your input source without changing your processing logic. For example, to read your data from a network socket instead of a file:

Stream stream = new NetworkStream(mySocket);
StreamReader reader = new StreamReader(stream);
int x = ReadInt(reader);

As easy as it can be. And the beauty continues, as you can use any kind of input source, as long as you can build a stream "wrapper" for it. You could even do this:

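public class RandomNumbersStreamReader : StreamReader {
    private Random random = new Random();

    // StreamReader has no parameterless constructor, so hand it an empty stream.
    public RandomNumbersStreamReader() : base(Stream.Null) { }

    // ReadLine is virtual, so it must be overridden for ReadInt to call this version.
    public override string ReadLine() { return random.Next().ToString(); }
}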

// and to call it:
int x = ReadInt(new RandomNumbersStreamReader());

See? As long as your method doesn't care what the input source is, you can customize your source in various ways. The abstraction allows you to decouple input from processing logic in a very elegant way.

Note that the stream we created ourselves does not have a backing store, but it still serves our purposes perfectly.

So, to summarize, a stream is just a source of input, hiding away (abstracting) another source. As long as you don't break the abstraction, your code will be very flexible.

Solution 2

The point is that you shouldn't have to know what the backing store is - it's an abstraction over it. Indeed, there might not even be a backing store - you could be reading from a network, and the data is never "stored" at all.

If you can write code that works whether you're talking to a file system, memory, a network or anything else which supports the stream idea, your code is a lot more flexible.
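As a rough illustration of that flexibility, here is a minimal sketch in Java (one of the languages the question is tagged with). The class name LineCounter and the method countLines are made up for this example; the method only ever sees an InputStream, so the same code runs against memory and against a file on disk.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class, not part of any library.
class LineCounter {
    // Counts lines from any InputStream; it never learns what is behind the stream.
    static int countLines(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "alpha\nbeta\ngamma\n".getBytes(StandardCharsets.UTF_8);

        // Source 1: a block of memory.
        try (InputStream memory = new ByteArrayInputStream(data)) {
            System.out.println("memory: " + countLines(memory));
        }

        // Source 2: a temporary file on disk holding the same bytes.
        Path file = Files.createTempFile("stream-demo", ".txt");
        try {
            Files.write(file, data);
            try (InputStream disk = Files.newInputStream(file)) {
                System.out.println("file:   " + countLines(disk));
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A socket's input stream could be passed to countLines in exactly the same way; nothing inside the method would have to change.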

In addition, streams are often chained together - you can have a stream which compresses whatever is put into it, writing the compressed form on to another stream, or one which encrypts the data, etc. At the other end there'd be the reverse chain, decrypting, decompressing or whatever.
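The chaining idea can be sketched in Java with the standard GZIP streams. This is only an illustration under some assumptions: the names ChainDemo, pack and unpack are invented here, an in-memory buffer stands in for the final destination, and only a compression layer is shown. An encryption layer (for example javax.crypto.CipherOutputStream) would be one more wrapper in the same chain.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class, not part of any library.
class ChainDemo {
    // Writing chain: characters -> UTF-8 bytes -> GZIP compression -> memory buffer.
    static byte[] pack(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(sink), StandardCharsets.UTF_8)) {
            writer.write(text);
        } // closing the outermost wrapper flushes and finishes every layer below it
        return sink.toByteArray();
    }

    // Reading chain, in reverse: memory buffer -> GZIP decompression -> characters.
    static String unpack(byte[] packed) throws IOException {
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(packed)),
                StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "hello, streams";
        byte[] packed = pack(original);
        System.out.println(unpack(packed).equals(original));
    }
}
```

Neither pack nor unpack would need to change if the innermost stream were a file or a socket instead of a byte array; only the object at the bottom of the chain differs.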

Solution 3

The point of the stream is to provide a layer of abstraction between you and the backing store. Thus a given block of code that uses a stream need not care whether the backing store is a disk file, memory, or something else.

Solution 4

It's not about streams - it's about swimming. If you can swim one Stream, then you can swim any Stream you encounter.

Solution 5

To add to the echo chamber, the stream is an abstraction so you don't care about the underlying store. It makes the most sense when you consider scenarios with and without streams.

Files are mostly uninteresting here, because streams don't do much beyond what the non-stream-based methods I'm familiar with already did. Let's start with internet files instead.

Without a stream, if I want to download a file from the internet, I have to open a TCP socket, make a connection, and receive bytes until there are no more bytes. I have to manage a buffer, know the size of the expected file, and write code to detect when the connection is dropped and handle this appropriately.

Now let's say I have some sort of TcpDataStream object. I create it with the appropriate connection information, then read bytes from the stream until it says there aren't any more bytes. The stream handles the buffer management, end-of-data conditions, and connection management.

In this way, streams make I/O easier. You could certainly write a TcpFileDownloader class that does what the stream does, but then you have a class that's specific to TCP. Most stream interfaces simply provide Read() and Write() methods, and any more complicated concepts are handled by the internal implementation. Because of this, you can use the same basic code to read from or write to memory, disk files, sockets, and many other data stores.
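The "read until the stream says there are no more bytes" loop might look like the following Java sketch. The names ReadLoop, readAll and trickle are hypothetical. The trickle helper is only a stand-in for a slow connection that hands over a few bytes per call; it is not a real socket, though a real one would expose the same kind of stream through Socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of any library.
class ReadLoop {
    // Reads until the stream reports end of data (read returns -1).
    // The caller never needs to know the total size in advance.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            collected.write(buffer, 0, n); // a read may return fewer bytes than requested
        }
        return collected.toByteArray();
    }

    // Stand-in for a slow network source: delivers at most 3 bytes per read call.
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "The quick brown fox".getBytes(StandardCharsets.UTF_8);

        String fromMemory = new String(
                readAll(new ByteArrayInputStream(data)), StandardCharsets.UTF_8);
        String fromTrickle = new String(
                readAll(trickle(data)), StandardCharsets.UTF_8);

        System.out.println(fromMemory);
        System.out.println(fromTrickle);
    }
}
```

Both calls go through the same readAll code, even though one source delivers everything at once and the other arrives in small pieces.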


Rob Sobers

I'm a software developer with a love for problem solving, design, and technology in general. I'm currently working at Fog Creek Software in New York.

Updated on July 08, 2022

Comments

Rob Sobers almost 2 years

I understand that a stream is a representation of a sequence of bytes. Each stream provides means for reading and writing bytes to its given backing store. But what is the point of the stream? Why isn't the backing store itself what we interact with?

For whatever reason this concept just isn't clicking for me. I've read a bunch of articles, but I think I need an analogy or something.
Jon Skeet over 15 years

That's useful, certainly, but I wouldn't say it's the "whole point". Even without chaining it's useful to have a common abstraction.
Craig over 15 years

Yeah, it allows you to interchange the type of stream without breaking your code. For example, you could read in from a file on one call and then a memory buffer on the next.
alxp over 15 years

I would add that the reason you would want to do this is that often you don't need file seek capability when reading or writing a file, and thus if you use a stream that same code can easily be used to read from or write to a network socket, for example.
vava over 15 years

Yeah, you're right. I've changed the wording to make this clear.
Rob Sobers over 15 years

I'm actually currently on chapter 1 of SICP. Thanks!
java.is.for.desktop about 14 years

Abstract thinking (and explaining) seems to be in your blood ;) Your analogy to water (and thus metaphorical references) reminded me of Omar Khayyam.
Rushino over 11 years

@HosamAly Your explanation is very clear, but something in the sample code confuses me a bit. Is the explicit conversion from string to int done automatically by ReadInt? I believe I could write a ReadString too?
Hosam Aly over 11 years

@Rushino There are no conversions in the code above. The method ReadInt is defined at the very top using int.Parse, which receives the string returned from reader.ReadLine() and parses it. Of course you could create a similar ReadString method. Is this clear enough?
user137717 almost 9 years

Don't the different types of stream readers used in @HosamAly's example above imply that you do know what the backing store is? I take it FileStream, NetworkStream, etc. are reading from those types of sources. Additionally, are there cases where you don't know what the backing store might be, and it is chosen dynamically while the program runs? I just haven't personally come across this and would like to know more.
user137717 almost 9 years

Also, can streams pipe data through some process as data is generated or do I need access to the full dataset I want to operate on when I begin the process?
Jon Skeet almost 9 years

@user137717: No, if you just take a StreamReader - or better, a TextReader - then your code doesn't know what kind of stream underlies the data flow. Or rather, it can use the BaseStream property to find out the type - but it may be a type that your code has never seen before. The point is that you shouldn't care. And yes, you can absolutely end up writing code which will sometimes be used for a network stream and sometimes be used for a file stream. As for streams piping data through a process - well, that wouldn't be done inside the process... it would be the stream provider.
象嘉道 about 8 years

One should distinguish SICP streams from the others. An important feature of SICP streams is laziness, while the generic stream concept emphasizes abstraction over data sequences.
Felype almost 7 years

Well put. To me, streams are the simplest and most powerful generic abstraction in the entirety of programming. Having Stream.CopyTo built into .NET makes life so much easier in a lot of applications.
Richie Thomas over 6 years

Great example of analogy-as-explanation.