Document Collection

A document collection is a NoSQL database that stores data in JSON format (JavaScript Object Notation). Unlike traditional Relational Database Management Systems, document databases do not require a schema or a pre-defined structure with fixed tables and attributes. This is why they are also known as “non-relational” databases.

A document is based on the concept of a key-value store: every key has a corresponding value. Each document has a unique key, which is used for CRUD operations (Create, Read, Update, and Delete). No two documents in a collection can share a primary key. Multiple documents gathered in one structure are known as a document collection.

Document Attributes

A document can contain attributes that each store a value. A value can either be atomic (number, string, Boolean, or null), or compound (array or embedded document or object). Arrays and sub-objects can contain all of these types, so a single document can contain nested data structures.
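The distinction between atomic and compound values can be illustrated with a short Python sketch. It assumes the document has already been parsed into Python objects (JSON objects as dict, arrays as list). The helper name classify is hypothetical and is not part of any GDN client.

```python
def classify(value):
    """Return "atomic" or "compound" for a parsed JSON value."""
    # JSON null, Booleans, numbers, and strings are atomic.
    if value is None or isinstance(value, (bool, int, float, str)):
        return "atomic"
    # Arrays and embedded documents (objects) are compound.
    if isinstance(value, (list, dict)):
        return "compound"
    raise TypeError(f"not a JSON value: {value!r}")


document = {
    "firstName": "John",
    "address": {"street": "Road To Nowhere 1", "city": "Gotham"},
    "hobbies": [{"name": "swimming", "howFavorite": 10}],
}

# Classify every top-level attribute of the document.
kinds = {name: classify(value) for name, value in document.items()}
```

Here firstName is atomic, while address and hobbies are compound and can themselves hold further nested values.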

Each document has two identifying attributes: the _key identifies it within a single collection, and the document handle identifies it across the entire fabric. Additionally, the document revision attribute distinguishes individual revisions of a document. Transactions only ever see a single document revision.

For example:

{
  "_id" : "myusers/3456789",
  "_key" : "3456789",
  "_rev" : "14253647",
  "firstName" : "John",
  "lastName" : "Doe",
  "address" : {
    "street" : "Road To Nowhere 1",
    "city" : "Gotham"
  },
  "hobbies" : [
    {"name": "swimming", "howFavorite": 10},
    {"name": "biking", "howFavorite": 6},
    {"name": "programming", "howFavorite": 4}
  ]
}

All documents contain special attributes: the document handle is stored in _id, the document key in _key, and the document revision in _rev.

You can specify a _key value when you create a document. The _id and _key values cannot be changed once the document has been created. The _rev value is updated automatically.

Document Handle

A document handle is a string (_id) that identifies a document in the fabric database. The string value consists of the collection's name and the document's _key separated by a slash /.
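As a minimal sketch of this format, the following Python helpers build and split a handle. The function names are hypothetical, and the check that neither part contains a slash is an assumption made so that the split is unambiguous.

```python
from typing import Tuple


def make_handle(collection: str, key: str) -> str:
    """Join a collection name and a _key into a document handle (_id)."""
    # Assumption: neither part may contain the '/' separator.
    if "/" in collection or "/" in key:
        raise ValueError("collection names and keys must not contain '/'")
    return f"{collection}/{key}"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a document handle into (collection name, _key)."""
    collection, separator, key = handle.partition("/")
    if not separator or not collection or not key or "/" in key:
        raise ValueError(f"not a valid document handle: {handle!r}")
    return collection, key


handle = make_handle("myusers", "3456789")
collection, key = split_handle(handle)
```

With the example document shown earlier, the handle is "myusers/3456789", and splitting it yields the collection name "myusers" and the key "3456789".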

Document Key

A document key is an attribute (_key) that identifies a document in its collection and is primarily used for querying. Each document's _key is unchangeable.

If you do not specify a key, one is created automatically. An automatically generated key is guaranteed to be unique only within its collection, including when the collection is sharded across a cluster. Automatic keys might not be unique across different fabrics. Each collection has a keyOptions property that can disallow user-specified keys completely or define a specific template for creating keys automatically.

Document Revision

A document revision (_rev) is the MVCC (Multi-Version Concurrency Control) token that specifies a revision of a document. Revisions are read-only.

The _rev string is a timestamp that uses the local clock of the database. If different servers in a cluster have a time skew, their timestamps are not directly comparable. To compensate, the database automatically verifies that messages exchanged between two servers are timestamped consistently on both of them.

If there is causality between events on different servers, timestamps increase from cause to effect. The server might occasionally use a timestamp in the future to maintain consistency.

GDN uses 64-bit unsigned integer values to maintain document revisions internally. We do not document the exact format of the revision values. When returning document revisions to clients, we put them into a string to ensure that the revision is not clipped by clients that do not support large integers.
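The clipping problem can be illustrated with a short Python sketch. It assumes a client that stores every JSON number as a double-precision float, as many JavaScript-based clients do. The revision value used here is hypothetical and chosen only because it lies above 2^53.

```python
import json

# Hypothetical revision value just above 2**53, the point beyond which
# not every integer can be represented exactly as a double.
rev_as_int = 2**53 + 1

# What a client that keeps numbers as doubles would end up with.
clipped = int(float(rev_as_int))

# Sending the revision as a string survives a JSON round trip unchanged.
payload = json.dumps({"_rev": str(rev_as_int)})
restored = int(json.loads(payload)["_rev"])
```

In this sketch, clipped differs from rev_as_int, while restored is identical to it.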

Note

You can use the _rev attribute as a precondition for queries to avoid losing updates. Suppose a client reads a document and modifies it locally while still holding the revision value it originally read. If another client modifies the same document before the first client commits, the revisions no longer match and the server cancels the first client's operation. Otherwise, the first client would inadvertently overwrite the second client's changes.
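The revision check described in this note can be sketched with a small in-memory store in Python. This is an illustration of the idea only, not how GDN implements it: the TinyStore class, its methods, and the counter-based revisions are all hypothetical.

```python
class RevisionConflict(Exception):
    """Raised when the expected revision does not match the stored one."""


class TinyStore:
    """Hypothetical in-memory stand-in for a collection."""

    def __init__(self):
        self._docs = {}
        self._counter = 0

    def _next_rev(self):
        # Revisions are opaque strings; a counter is enough for the sketch.
        self._counter += 1
        return str(self._counter)

    def insert(self, key, body):
        self._docs[key] = dict(body, _key=key, _rev=self._next_rev())
        return dict(self._docs[key])

    def read(self, key):
        return dict(self._docs[key])

    def replace(self, key, body, expected_rev):
        # Precondition: the caller must hold the current revision.
        if self._docs[key]["_rev"] != expected_rev:
            raise RevisionConflict(f"{key}: revision {expected_rev} is stale")
        self._docs[key] = dict(body, _key=key, _rev=self._next_rev())
        return dict(self._docs[key])


store = TinyStore()
store.insert("3456789", {"firstName": "John"})

client_a = store.read("3456789")   # both clients read the same revision
client_b = store.read("3456789")

client_b["firstName"] = "Johnny"
store.replace("3456789", client_b, client_b["_rev"])  # succeeds

client_a["firstName"] = "Jon"
try:
    store.replace("3456789", client_a, client_a["_rev"])
    conflict = False
except RevisionConflict:
    conflict = True   # the stale write is rejected instead of overwriting
```

The second write fails because client_a still holds the revision from before client_b's update, so client_b's change is preserved.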

Multiple Documents in Single Call

GDN APIs can handle multiple documents in a single command, dramatically reducing client and server overhead. You do this by performing operations on a JSON array of objects instead of a single document. As a consequence, document keys, handles, and the revisions used as preconditions must be embedded in the individual documents. Multiple document operations are restricted to a single document collection or edge collection.
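A minimal Python sketch of preparing such an array follows. It only builds the JSON body; how the array is submitted (endpoint, client call, options) is not shown, and the build_batch helper is hypothetical.

```python
import json


def build_batch(updates):
    """Build a JSON array of documents for a multi-document operation.

    `updates` is an iterable of (key, expected_rev, fields) tuples.
    The key and, if given, the revision precondition are embedded in
    each document, because there is no single per-request document to
    attach them to.
    """
    batch = []
    for key, expected_rev, fields in updates:
        document = dict(fields)
        document["_key"] = key
        if expected_rev is not None:
            document["_rev"] = expected_rev
        batch.append(document)
    return json.dumps(batch)


body = build_batch([
    ("3456789", "14253647", {"lastName": "Doe"}),
    ("3456790", None, {"lastName": "Roe"}),
])
```

The first document carries a revision precondition and the second does not, so each document in the array can be checked independently.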

Monetary Data

GDN provides two ways to handle monetary data if you need to capture fractional units of currency and round decimals without precision loss.

  • Integer: You can use a general scale factor for values up to 2^52 without precision loss. For example, if you set the scale factor to 100, GDN automatically converts 19.99 to 1999 before performing calculations and converts it back afterward.

  • String: You can use strings if you only want to store and retrieve monetary data. You cannot perform calculations on monetary data in strings.
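The integer approach can be sketched in Python as follows. This shows the scaling arithmetic on the client side only and makes no claim about how GDN performs the conversion. The scale factor of 100 and the helper names are assumptions for illustration.

```python
from decimal import Decimal

SCALE = 100  # assumed scale factor: two fractional digits


def to_minor_units(amount: str) -> int:
    """Convert a decimal string such as "19.99" to a scaled integer."""
    scaled = Decimal(amount) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more precision than the scale allows")
    return int(scaled)


def to_display(minor_units: int) -> str:
    """Convert a scaled integer back to a decimal string (for SCALE = 100)."""
    sign = "-" if minor_units < 0 else ""
    whole, fraction = divmod(abs(minor_units), SCALE)
    return f"{sign}{whole}.{fraction:02d}"


price = to_minor_units("19.99")   # 1999
total = price * 3                 # exact integer arithmetic: 5997
label = to_display(total)         # "59.97"
```

Because all arithmetic happens on integers, no rounding error accumulates, which is the property that binary floating-point values such as 0.1 cannot guarantee.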

Data Retrieval

GDN provides the following methods of data retrieval:

  • Queries filter documents based on specified criteria, compute new data, and update or delete existing documents. Queries can be as simple as a "query by example" or as complex as "joins" using many collections or traversing graph structures. GDN queries are written in the C8 Query Language (C8QL).
  • Cursors iterate over the result of a query, so that you receive easily processable batches instead of one large result.
  • Indexes can speed up your searches. Refer to the Indexing section for more information.
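The batching idea behind cursors can be sketched with a plain Python generator. This is a conceptual illustration over a local iterable, not a GDN client call. The batches function and the batch size are hypothetical.

```python
from itertools import islice


def batches(results, batch_size):
    """Yield successive lists of at most `batch_size` items from `results`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(results)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the result is exhausted
        yield batch


# Ten result rows consumed in batches of four.
sizes = [len(batch) for batch in batches(range(10), 4)]  # [4, 4, 2]
```

The consumer only ever holds one batch in memory at a time, which is the benefit a cursor provides over fetching the full result at once.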