Repository vs database vs filesystem


Solution 1

When I worked on repository software, many years ago, the difference between (general-purpose) databases and repositories was the difference between "data" and "meta-data".

So, a database stores data. A repository is a special class of database which is designed to store meta-data, that is, data that describes other data.

Any general purpose database software could be used as a repository, but there are some characteristics of meta-data that make it desirable to use a special-purpose tool. Generally, the granularity of the data is small, with lots of cross-references to other data. The number of records is likely to be tractable. There is often a requirement for version control and/or diffs of the contents.
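To make those characteristics concrete, here is a minimal sketch in Python of a store with small records, cross-references between them, and versioning with diffs. The class names `MetaRecord` and `MetadataRepository` and the sample column names are hypothetical, chosen only for illustration; this is not how Microsoft Repository or UREP worked internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaRecord:
    """One small piece of meta-data, e.g. the description of a table column."""
    name: str
    attributes: tuple               # sorted (key, value) pairs
    refs: frozenset = frozenset()   # names of other records this one points to


class MetadataRepository:
    """Keeps every version of every record so that versions can be compared."""

    def __init__(self):
        self._history = {}  # record name -> list of MetaRecord, oldest first

    def put(self, name, attributes, refs=()):
        record = MetaRecord(name, tuple(sorted(attributes.items())), frozenset(refs))
        self._history.setdefault(name, []).append(record)
        return len(self._history[name])  # 1-based version number

    def get(self, name, version=None):
        versions = self._history[name]
        return versions[-1] if version is None else versions[version - 1]

    def diff(self, name, old, new):
        """Return {key: (old_value, new_value)} for attributes that differ."""
        a = dict(self.get(name, old).attributes)
        b = dict(self.get(name, new).attributes)
        keys = sorted(a.keys() | b.keys())
        return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

    def referrers(self, name):
        """Names of records whose latest version cross-references `name`."""
        return sorted(n for n, v in self._history.items() if name in v[-1].refs)


repo = MetadataRepository()
repo.put("customer.id", {"type": "INTEGER", "nullable": False})
repo.put("order.customer_id", {"type": "INTEGER"}, refs=["customer.id"])
repo.put("customer.id", {"type": "BIGINT", "nullable": False})

changes = repo.diff("customer.id", 1, 2)      # what changed between versions
dependents = repo.referrers("customer.id")    # who points at this record
```

A general-purpose database could hold the same information; the point of the sketch is only that history, diffs, and cross-reference lookups are first-class operations here rather than something each application has to build.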

Because of these special requirements, database manufacturers were tempted into writing special-purpose DBMSs to support the needs of repository builders. (Does anyone remember Microsoft Repository or Unisys's UREP?) I am no longer in that field, and couldn't tell you about the progress in the past decade.

Solution 2

"Repository" is just a descriptive term the authors chose.

I'm not sure why you'd ask what it means. It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff".

**What makes repository different from database, filesystem or any other kind of storage?**

Nothing. It's storage. It's a filesystem. It's a database. It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff". They shortened it to "repository".

Usually, we reserve "filesystem" for the underlying OS features that give us persistent storage. A repository probably has some more organization than just random files. But it might not.

Usually, we reserve "database" for a discrete product that has a more formal API, a query language, and locking and some reliability features like backups and logs.

How can I exactly tell that this or that is repository judging by some set of features that it has or does not have?

You can't. Something is a repository because the folks that wrote the software decided to call it a "repository". The application developers could call anything a repository -- database, filesystem, individual file. Anything "stateful" can be a repository.

It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff".

it's not really clear what exact differences does it have

Why does that matter? Who actually cares? What problem do you have?

Why does it matter which files are a "repository", which files are a "database" and which files are just files?

You can have files that are a "backup" or a "vault". You can have files that are a "collection" or anything the developers want to call it.

They're free to use any descriptive term they want to replace "the file system locations in which we keep your stuff".

Solution 3

I would complement "Places where you can store something" with "... for you and other people to retrieve it". Or maybe reword that as "Places where you can store a collection of related things for you and other people to retrieve them." The meaning is really that generic.

In contrast, file system and database have more technical definitions: "In computing, a file system is a method of storing and organizing computer files and the data they contain to make it easy to find and access them". See the Wikipedia entry. A database is a collection of logically related data structured in a way that is easily accessed, managed, and updated.

Solution 4

My background is RIM. When I think of a Database, I think of an SQL structure or something similar: all the data elements. When I think of a Repository, I think of storing scanned hardcopy documents, eDocuments, PDFs, photos, voice and video files, etc.

A DB is optimized for data. A Repository is optimized for storing objects.

Solution 5

From the perspective of a database designer, I tend to think of a database repository as a database used for holding the meta-data of a database: for example, the relationships between tables, which programs access those tables, and so on. This information can then be used for judging the impact of a change on your DB application.
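As a rough illustration of that kind of impact analysis, the sketch below keeps two pieces of meta-data (which tables reference which, and which programs access which tables) and walks them to find everything touched by a change. The table names, program names, and function names are all hypothetical; a real repository would hold this information in its own schema.

```python
# Hypothetical meta-data: table -> tables it references via foreign keys.
foreign_keys = {
    "orders": {"customers"},
    "order_items": {"orders", "products"},
    "invoices": {"orders"},
}

# Hypothetical meta-data: program -> tables it accesses.
program_access = {
    "billing_job": {"invoices"},
    "web_shop": {"orders", "products"},
    "crm_export": {"customers"},
}


def impacted_tables(changed):
    """Return `changed` plus every table that depends on it, directly or transitively."""
    affected = {changed}
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for table, targets in foreign_keys.items():
            if current in targets and table not in affected:
                affected.add(table)
                frontier.append(table)
    return affected


def impacted_programs(changed):
    """Return the sorted names of programs that access any impacted table."""
    tables = impacted_tables(changed)
    return sorted(p for p, used in program_access.items() if used & tables)


customers_impact = impacted_programs("customers")
products_impact = impacted_programs("products")
```

With this sample data, a change to `customers` reaches all three programs through the chain of foreign keys, while a change to `products` reaches only `web_shop`.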

Author by

altern

I'm a software engineer, configuration manager, instructor, musician. My passion is software configuration management and related stuff: version control, continuous integration, build management, deployment management, dependency management, merge management, release management.

Updated on June 15, 2022

Comments

  • altern, almost 2 years

    What makes repository different from database, filesystem or any other kind of storage? How can I exactly tell that this or that is repository judging by some set of features that it has or does not have?

    When I say 'repository', I mean version control first of all. But there are other examples of repositories, such as digital libraries. There might be more examples, of course, but all of them assume that a repository is 'the place where you can store something'. It's not really clear, though, what exact differences allow one to distinguish it from other 'places where you can store something'.