What's the difference between objects and data structures?

Solution 1

The distinction between data structures and classes/objects is harder to explain in Java than in C++. In C, there are no classes, only data structures, which are nothing more than "containers" of typed and named fields. C++ inherited these "structs", so you can have both "classic" data structures and "real objects".

In Java, you can "emulate" C-style data structures using classes that have no methods and only public fields:

public class VehicleStruct
{
    public Engine engine;
    public Wheel[] wheels;
}

A user of VehicleStruct knows about the parts a vehicle is made of, and can directly interact with these parts. Behavior, i.e. functions, has to be defined outside of the class. That's why it is easy to change behavior: adding new functions won't require existing code to change. Changing data, on the other hand, requires changes in virtually every function interacting with VehicleStruct. It violates encapsulation!
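As a rough sketch of that trade-off, the functions below live outside the data structure and read its fields directly. The stand-in Engine and Wheel types, their fields, and the function names wheelCount and averagePressure are assumptions made up for this illustration; they are not part of the original answer.

```java
// Hypothetical demo: only the VehicleStruct shape comes from the answer above.
class VehicleStructDemo {
    // Minimal stand-ins for the part types, which the answer leaves undefined.
    static class Engine { int horsepower; }
    static class Wheel { double pressureBar; }

    // The data structure: public fields, no behavior.
    static class VehicleStruct {
        public Engine engine;
        public Wheel[] wheels;
    }

    // Behavior is defined outside the class and reads the fields directly.
    static int wheelCount(VehicleStruct vehicle) {
        return vehicle.wheels.length;
    }

    // Added later: no existing class or function had to change to make room for it.
    static double averagePressure(VehicleStruct vehicle) {
        if (vehicle.wheels.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (Wheel wheel : vehicle.wheels) {
            sum += wheel.pressureBar;
        }
        return sum / vehicle.wheels.length;
    }

    public static void main(String[] args) {
        VehicleStruct bike = new VehicleStruct();
        bike.engine = new Engine();
        bike.wheels = new Wheel[] { new Wheel(), new Wheel() };
        bike.wheels[0].pressureBar = 2.0;
        bike.wheels[1].pressureBar = 3.0;
        System.out.println(wheelCount(bike));
        System.out.println(averagePressure(bike));
    }
}
```

Renaming wheels, or replacing the array with a list, would force edits in both functions, which is the cost described above.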

The idea behind OOP is to hide the data and expose behavior instead. It focuses on what you can do with a vehicle without having to know whether it has an engine or how many wheels are installed:

public class Vehicle
{
    private Details hidden;

    public void startEngine() { ... }
    public void shiftInto(int gear) { ... }
    public void accelerate(double amount) { ... }
    public void brake(double amount) { ... }
}

Notice how the Vehicle could be a motorcycle, a car, a truck, or a tank -- you don't need to know the details. Changing data is easy -- nobody outside the class knows about the data, so no user of the class needs to be changed. Changing behavior is difficult: all subclasses must be adjusted when a new (abstract) function is added to the class.
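A minimal sketch of the same idea, assuming a cut-down Vehicle expressed as an interface. The names Car, Tank, currentSpeed and floorIt are hypothetical and exist only for this example.

```java
// A minimal sketch; Car, Tank, currentSpeed() and floorIt() are hypothetical names.
class VehicleDemo {
    // Callers see behavior only; nothing about engines, wheels or tracks leaks out.
    interface Vehicle {
        void accelerate(double amount);
        double currentSpeed();
    }

    static class Car implements Vehicle {
        private double speed; // hidden detail: a single speed value

        public void accelerate(double amount) { speed += amount; }
        public double currentSpeed() { return speed; }
    }

    static class Tank implements Vehicle {
        private double leftTrack;  // hidden detail: one speed per track
        private double rightTrack;

        public void accelerate(double amount) {
            leftTrack += amount;
            rightTrack += amount;
        }
        public double currentSpeed() { return (leftTrack + rightTrack) / 2; }
    }

    // Written once against the interface; adding another Vehicle type leaves it untouched.
    static double floorIt(Vehicle vehicle) {
        vehicle.accelerate(10.0);
        return vehicle.currentSpeed();
    }

    public static void main(String[] args) {
        System.out.println(floorIt(new Car()));
        System.out.println(floorIt(new Tank()));
    }
}
```

Tank can change how it stores its state without touching floorIt. Adding a method such as brake to the interface, however, would require edits to both Car and Tank.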

Now, following the "rules of encapsulation", you could understand hiding the data as simply making the fields private and adding accessor methods to VehicleStruct:

public class VehicleStruct
{
    private Engine engine;
    private Wheel[] wheels;

    public Engine getEngine() { return engine; }
    public Wheel[] getWheels() { return wheels; }
}

In his book, Uncle Bob argues that by doing this, you still have a data structure and not an object. You are still just modeling the vehicle as the sum of its parts, and exposing these parts using methods. It is essentially the same as the version with public fields and a plain old C struct -- hence a data structure. Hiding data and exposing methods is not enough to create an object; you have to consider whether the methods actually expose behavior or just the data!

When you mix the two approaches, e.g. exposing getEngine() along with startEngine(), you end up with a "hybrid". I don't have Martin's book at hand, but I remember that he did not recommend hybrids at all, as you end up with the worst of both worlds: objects where both data and behavior are hard to change.
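One way to see the cost of a hybrid is sketched below. This is an illustration only, not an example from the book: the running flag, the starts counter and the class name HybridVehicle are all assumptions. Because getEngine() hands out the part, callers can depend on it and step around the behavior that startEngine() is supposed to provide.

```java
// Hypothetical illustration of a hybrid class; none of these names come from the book.
class HybridDemo {
    static class Engine {
        boolean running;
    }

    // Hybrid: exposes a part (getEngine) and behavior (startEngine) side by side.
    static class HybridVehicle {
        private final Engine engine = new Engine();
        private int starts; // bookkeeping that only startEngine() maintains

        Engine getEngine() { return engine; }

        void startEngine() {
            engine.running = true;
            starts++;
        }

        int starts() { return starts; }
    }

    public static void main(String[] args) {
        HybridVehicle vehicle = new HybridVehicle();

        // The caller reaches into the exposed part and bypasses startEngine().
        vehicle.getEngine().running = true;

        // The engine is running, yet the vehicle never recorded a start.
        System.out.println(vehicle.getEngine().running);
        System.out.println(vehicle.starts());
    }
}
```

Any caller written this way is now coupled to Engine, so the data is hard to change, while the class still carries methods that every subclass would have to honor.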

Your questions concerning HashMaps and Strings are a bit tricky, as these are pretty low-level and don't quite fit in with the kinds of classes you will be writing for your applications. Nevertheless, using the definitions given above, you should be able to answer them.

A HashMap is an object. It exposes its behavior to you and hides all the nasty hashing details. You tell it to put and get data, and don't care which hash function is used, how many "buckets" there are, and how collisions are handled. Actually, you are using HashMap solely through its Map interface, which is quite a good indication of abstraction and "real" objects.
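To make the point about the Map interface concrete, here is a small sketch. The countWords helper is a made-up name for this example; HashMap, TreeMap and Map.merge are standard java.util APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// A small sketch; countWords is a hypothetical helper, not part of java.util.
class MapAbstractionDemo {
    // Depends only on the Map interface, so it cannot tell which implementation it was given.
    static int countWords(Map<String, Integer> counts, String text) {
        for (String word : text.split("\\s+")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>(); // hash table underneath
        Map<String, Integer> sorted = new TreeMap<>(); // red-black tree underneath

        countWords(hashed, "a b a");
        countWords(sorted, "a b a");

        // Same observable behavior despite completely different internals.
        System.out.println(hashed.equals(sorted)); // prints true
    }
}
```

Nothing in countWords mentions buckets, hash functions or tree nodes, which is the kind of hiding that makes HashMap an object in the book's sense.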

Don't get confused that you can use instances of a Map as a replacement for a data structure!

// A data structure
public class Point {
    public int x;
    public int y;
}

// A Map _instance_ used instead of a data structure!
Map<String, Integer> data = new HashMap<>();
data.put("x", 1);
data.put("y", 2);

A String, on the other hand, is pretty much an array of characters, and does not try to hide this very much. I guess one could call it a data structure, but to be honest I am not sure if much is to be gained one way or the other.

Solution 2

This is what, I believe, Robert C. Martin was trying to convey:

  1. Data Structures are classes that simply act as containers of structured data. For example:

    public class Point {
        public double x;
        public double y;
    }
    
  2. Objects, on the other hand, are used to create abstractions. An abstraction is understood as:

    "a simplification of something much more complicated that is going on under the covers" (The Law of Leaky Abstractions, Joel on Software)

    So, objects hide all their underpinnings and only let you manipulate the essence of their data in a simplified way. For instance:

    public interface Point {
        double getX();
        double getY();
        void setCartesian(double x, double y);
        double getR();
        double getTheta();
        void setPolar(double r, double theta);
    }
    

    Where we don't know how the Point is implemented, but we do know how to consume it.
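As a hedged illustration of that last sentence, one possible implementation of the interface stores polar coordinates internally. The class name PolarPoint is an assumption made up for this sketch; a caller that only uses getX() and getY() cannot tell the difference from a Cartesian implementation.

```java
// One possible implementation, sketched for illustration; PolarPoint is a hypothetical name.
class PointDemo {
    interface Point {
        double getX();
        double getY();
        void setCartesian(double x, double y);
        double getR();
        double getTheta();
        void setPolar(double r, double theta);
    }

    // Stores r and theta; Cartesian coordinates are computed on demand.
    static class PolarPoint implements Point {
        private double r;
        private double theta;

        public double getX() { return r * Math.cos(theta); }
        public double getY() { return r * Math.sin(theta); }

        public void setCartesian(double x, double y) {
            this.r = Math.hypot(x, y);
            this.theta = Math.atan2(y, x);
        }

        public double getR() { return r; }
        public double getTheta() { return theta; }

        public void setPolar(double r, double theta) {
            this.r = r;
            this.theta = theta;
        }
    }

    public static void main(String[] args) {
        Point point = new PolarPoint();
        point.setCartesian(3.0, 4.0);
        System.out.println(point.getR());
        System.out.println(point.getX());
        System.out.println(point.getY());
    }
}
```

Both coordinates are set together in a single call, so the point never passes through a half-updated state.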

Solution 3

As I see it, what Robert Martin tries to convey is that objects should not expose their data via getters and setters unless their sole purpose is to act as simple data containers. Good examples of such containers might be JavaBeans, entity objects (from object mapping of DB entities), etc.

The Java Collections Framework classes, however, are not a good example of what he's referring to, since they don't really expose their internal data (which in a lot of cases is basic arrays). They provide an abstraction that lets you retrieve the objects they contain. Thus (in my POV) they fit in the "Objects" category.

The reasons are stated by the quotes you added from the book, but there are more good reasons for refraining from exposing the internals. Classes that provide getters and setters invite breaches of the Law of Demeter, for instance. On top of that, knowing the structure of the state of some class (knowing which getters/setters it has) reduces the ability to abstract the implementation of that class. There are many more reasons of that sort.
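The Law of Demeter point can be sketched as follows. Order, Customer, Address and shippingCity are hypothetical names chosen for this example, not taken from the book or from any library.

```java
// Hypothetical types used only to contrast a getter chain with a behavior-oriented call.
class DemeterDemo {
    static class Address {
        private final String city;
        Address(String city) { this.city = city; }
        String getCity() { return city; }
    }

    static class Customer {
        private final Address address;
        Customer(Address address) { this.address = address; }
        Address getAddress() { return address; }
        // Answers the question without handing out the Address itself.
        String shippingCity() { return address.getCity(); }
    }

    static class Order {
        private final Customer customer;
        Order(Customer customer) { this.customer = customer; }
        Customer getCustomer() { return customer; }
        // Callers that use this need to know about Order only.
        String shippingCity() { return customer.shippingCity(); }
    }

    public static void main(String[] args) {
        Order order = new Order(new Customer(new Address("Oslo")));

        // Getter chain: the caller depends on Order, Customer and Address.
        String viaChain = order.getCustomer().getAddress().getCity();

        // Single call: the caller depends on Order alone.
        String viaBehavior = order.shippingCity();

        System.out.println(viaChain.equals(viaBehavior));
    }
}
```

If Customer later stores its city somewhere other than an Address, only the shippingCity() methods change. Every getter chain in calling code would have to be rewritten.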

Solution 4

An object is an instance of a class. A class can model various things from the real world. It's an abstraction of something (car, socket, map, connection, student, teacher, you name it).

A data structure is a structure which organizes certain data in a certain way. You can implement data structures in ways other than using classes; that's what you do in languages which don't support OOP (you can still implement a data structure in C, for example).

HashMap in Java is a class which models a map data structure using a hash-based implementation; that's why it's called HashMap.

Socket in Java is a class which doesn't model a data structure but something else (a socket).

Solution 5

A data structure is only an abstraction, a special way of representing data. Data structures are just human-made constructs that help reduce complexity by letting you work at a high level instead of a low level. An object may seem to mean the same thing, but the major difference between objects and data structures is that an object might abstract anything. It also offers behaviour. A data structure does not have any behaviour because it is just data-holding memory.

Library classes such as Map, List, etc. are classes which represent data structures. They implement and set up a data structure so that you can easily work with it in your programs by creating instances of them (i.e. objects).

Author: jantristanmilan

Updated on June 12, 2020

Comments

  • jantristanmilan
    jantristanmilan almost 4 years

    I've been reading the book Clean Code: A Handbook of Agile Software Craftsmanship, and chapter six (pages 95-98) explains the differences between objects and data structures:

    • Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.

    • Objects expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects.

    • Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.

    I'm a bit confused about whether some classes are objects or data structures. Take, for example, HashMap in java.util: is it an object (because with methods like put() and get(), we don't know its inner workings), or is it a data structure (I've always thought of it as a data structure because it's a Map)?

    Strings as well, are they data structures or objects?

    So far, the majority of the code I've been writing has been so-called "hybrid classes", which try to act as both an object and a data structure. Any tips on how to avoid them?

  • peter.petrov
    peter.petrov about 10 years
    String is a class in Java. String - OK, you can consider it a data structure too if you think about it as an ordered sequence of chars. Or you can take String as something simpler than a data structure - just a data type in Java. Probably the first treatment is better (in your terms).
  • lurker
    lurker about 10 years
    @TristanMilan a "string" is a data structure (which could also be a class in some languages). In C, an array of ASCII values terminating in a null. In Pascal, a size followed by that number of ASCII values. Etc...
  • Bernhard Barker
    Bernhard Barker about 10 years
    I probably would've said "HashMap in Java is a class which models a hash table data structure". IMO a map is an abstract data type (ADT). Some view ADTs and data structures as disjoint (although others view the one as a class of the other). It sounds like the book defines ADTs as data structures, though, so by that definition your statement is probably fine.
  • Bernhard Barker
    Bernhard Barker about 10 years
    "All data structures are Objects, but not all Objects are data structures" is close enough to true, but you lost me with the rest of your answer. Visibility has absolutely nothing to do with whether or not something is a data structure. It may be bad practice (and/or break things) to make member variables of a DS class public, but many consider making member variables of any class public bad practice. And "[a data structure???] is usually private within another class" just sounds wrong. I wouldn't call the node of a tree a DS in itself. No idea what "Objects are the Eve class" means.
  • Ferdinand Beyer
    Ferdinand Beyer over 8 years
    Don't confuse "static" data structures as created with the struct keyword in C with "dynamic" data structures as represented by instances of HashMap. The question here is if the class HashMap should be considered an object or a data structure following the definition of the quoted book, not if you could use Map instances as a replacement of custom data structures declared with the struct or class keywords!
  • Ferdinand Beyer
    Ferdinand Beyer over 8 years
    The subtle point of Martin's book is that your interface Point still exposes its parts, instead of behavior. None of its methods allow you to do something meaningful with the point! I don't say that this is bad, a point might be a classic example when to prefer a data structure over an object, but this might not be the best example.
  • Ferdinand Beyer
    Ferdinand Beyer over 8 years
    Sorry, you are missing the point completely. The book and this question are more about the theory of object-oriented programming, where the definition of an object is more than just "an instance of a class". Also, primitive types in Java are not objects! They can be "boxed" in objects, but a value of type int is something different from an object of type Integer!
  • Ferdinand Beyer
    Ferdinand Beyer over 8 years
    Well, the chapter in Clean Code describes when a class should be considered an object or a data structure, not whether it models a data structure, i.e. if its instances can be used as a data structure. There is a difference! The class HashMap should be considered an object since it exposes only behavior, not its internal data.
  • peter.petrov
    peter.petrov over 8 years
    @FerdinandBeyer I am not sure I get your point. Sure, any instance of any class is an object, but not every class models a data structure. E.g. the class Socket does not model a data structure. That was my point. Seems you just don't like the word "model" here for some reason.
  • Ferdinand Beyer
    Ferdinand Beyer over 8 years
    @peter.petrov I guess you have not read the chapter in "Clean Code" the question is referring to. The author does not define "object" as merely an instance of a class, but investigates the original OOP definition of an object hiding data and exposing behavior. A class with only public fields and no methods should be considered a data structure according to the book, similar to plain old C structs. A class modelling something that can be a replacement of a struct at runtime is not considered a data structure. A HashMap is not a data structure by the book's definition.
  • peter.petrov
    peter.petrov over 8 years
    @FerdinandBeyer I haven't read the book, you're right. So... OK, I see what you mean now.
  • sp1rs
    sp1rs over 8 years
    In simple words .. Data Structure has attributes whereas object has attributes and state
  • kay am see
    kay am see almost 6 years
    Great explanation @ferdinard, thank you! I found this post - hackernoon.com/objects-vs-data-structures-e380b962c1d2 which uses your whole explanation in a much more detailed example.
  • Ferdinand Beyer
    Ferdinand Beyer almost 6 years
    @kay am see: thank you for sharing the link, that is indeed a very good resource on this topic!
  • CShark
    CShark over 5 years
    Create a data-structure and then create a separate object class for each behaviour you want to expose. You can pass the data-structure into each object to operate on it as is required. You will end up with good separation between objects and data-structures. You will end up having many object classes, but you will also have proper separation of concerns!
  • Alphas Supremum
    Alphas Supremum over 4 years
    About mixing both, here's what Uncle Bob said in his book "Clean Code": "Such hybrids make it hard to add new functions but also make it hard to add new data structures. They are the worst of both worlds. Avoid creating them. They are indicative of a muddled design whose authors are unsure of—or worse, ignorant of—whether they need protection from functions or types."
  • Naga
    Naga over 4 years
    I am trying to understand this from the perspective of the MVC pattern: if the details encapsulated in the Model are not known to the controller, how can it serialise the details inside the Model and pass them back to the presenter?
  • jaco0646
    jaco0646 about 4 years
    @FerdinandBeyer, the code is copied from Martin's book (page 94). I initially had the same reaction as you upon seeing getX() and getY() exposed. Martin says, "The beautiful thing is that there is no way you can tell whether the implementation is in rectangular or polar coordinates. It might be neither! And yet the interface still unmistakably represents a data structure. But it represents more than just a data structure. The methods enforce an access policy. You can read the individual coordinates independently, but you must set the coordinates together as an atomic operation."
  • Bita Mirshafiee
    Bita Mirshafiee almost 4 years
    A good example of a common data structure that we use in the object-oriented world, as Uncle Bob said, is the DTO.
  • Nazgul
    Nazgul about 3 years
    @FerdinandBeyer does it make a difference? Whether the structure is static or dynamic is of no consequence IMO. A DS is a compile-time or code-time construct. We code it to hold some data. The actual runtime realization of that construct is what an object is. To interact with it in an executing context you need an object of it; your static or dynamic construct is only conceptual for a runtime execution context.