Global Cluster Sharding Reference¶
On this page
The following sections describe sharding behavior and how to enable sharding for Global Clusters.
Sharding Collections for Global Writes¶
Unsharded collections must meet the following compatibility requirements prior to sharding to utilize Global Writes when sharded:
- Every document in the collection must include a
location
field. - The value of the
location
field must be either an ISO-3166-1 alpha 2 country code ("US"
,"DE"
,"IN"
) or a supported ISO-3166-2 subdivision code ("US-DC"
,"DE-BE"
,"IN-DL"
). Documents that do not match this criteria can't be routed to any shard in the cluster. To view the complete list of currently supported country or subdivision codes, visit https://cloud.mongodb.com/static/atlas/country_iso_codes.txt.
For collections that meet the stated requirements, you must shard the collection using the following pattern:
{ "location" : 1, "<secondary_field>" : 1 }
A shard key on the location
field alone may result in bottlenecks,
especially for workloads where a subset of countries or subdivisions
receive the majority of write operations. Atlas Global Writes
requires a compound shard key to
facilitate efficient distribution of sharded data across the cluster.
Atlas Global Cluster shard keys share the same restrictions
as MongoDB shard keys. For example, the secondary shard key field
cannot be an array.
To learn more about:
- How to choose a secondary shard key field and the effect of shard key choice on data distribution, see Choosing a Shard Key.
- What shard key limitations are, see Shard Key Limitations.
After sharding, what you can modify depends upon the version of MongoDB that you run:
MongoDB Version | Modify Shard Key Keys | Modify Shard Key Values |
---|---|---|
MongoDB 4.0 | No | No |
MongoDB 4.2 | No | Yes |
MongoDB 4.4 | Yes, add fields to Key using Data Explorer only | Yes |
MongoDB 5.0 | Yes, add fields to Key using Data Explorer only | Yes |
The Atlas Perform CRUD Operations in Atlas supports creating sharded collections with specific validations for Global Writes.
To learn more, see Shard a Global Collection for Global Writes in Data Explorer.
You can also use mongosh
to execute the
sh.shardCollection()
. After sharding
the collection, you must use the Atlas Perform CRUD Operations in Atlas
to enable Global Writes for that collection.
To learn more about sharding collections via the Data Explorer, see Shard a Global Collection for Global Writes in Data Explorer.
Error Handling¶
If Atlas encounters an error while sharding a collection for global writes, a message appears in the banner at the top of the screen.
Click See Details to learn about the error and the namespace where the error occured. A modal window appears with the complete error message and a Fix Now button:
- Click the Close and navigate to the collection in Data Explorer. You can also click the Fix Now button to go to Data Explorer for that Atlas cluster.
Click the Global Writes tab for the collection mentioned in the error message. You will see an error similar to the following:
- Click Unmanage Collection to cancel the global writes
sharding operation. You must have the
Project Data Access Admin
role to cancel the sharding operation.
After making any necessary changes to the collection as indicated by the error message, you can start the sharding process again.
Possible errors include:
- An index already exists on the custom shard key.
- If the field chosen as the second part of the compound shard key is already indexed, the sharding operation may fail.
- The shard key field is not present.
- All documents in the collection must contain both the shard key fields. This error occurs only in versions earlier than MongoDB 4.4.
- The collection is already sharded.
- If the collection has already been manually sharded, the operation fails.
- The collection has a custom default collation.
- A custom default collation on the collection may cause a sharding error.
Hashed Sharding Options¶
Shard keys use hashed sharding and pre-split data for even distribution. This is only available on Atlas clusters running MongoDB 4.4 and above.
If you enable the use of hashed index shard keys by selecting
Use hashed index as the shard key in the User
Interface or by setting
isCustomShardKeyHashed
through the API,
Atlas distributes the sharded data evenly by hashing the second
field of the shard key.
You can optionally select Pre-split data for even
distribution in the Atlas User Interface or set presplitHashedZone
using the API to
specify whether to perform initial chunk creation and distribution for
an empty or non-existing collection based on the defined zones and zone
ranges for the collection.
When creating a sharded collection using a compound hashed shard key for Global Clusters, Atlas creates at least 1 chunk per location code and attempts to distribute chunks evenly across shards in the cluster. By default, Atlas creates one chunk per location code.
You can also specify the minimum number of chunks to create initially
when sharding an empty collection with a hashed shard key using the Atlas Data Explorer
User Interface or by
setting numInitialChunks
parameter through the API.
If you specify the number of chunks per shard, Atlas creates at least the minimum number of chunks that you specified, with the same number of chunks per location code. If you specify the minimum number of chunks, Atlas sets up zoned sharding quickly, especially if you already know how to geographically distribute your data before sharding.
Global Cluster Write Operations¶
Each write to a sharded collection must include the shard key for the
operation to succeed. For each document in a write operation, MongoDB
uses the location
field of the shard key to determine the zone to
which to route the data. MongoDB selects a shard associated to that
zone as the target for writing the document, facilitating
geographically isolated and segmented data storage.
MongoDB can only guarantee this behavior for inserted documents that
meet the criteria defined in Sharding Collections for Global Writes.
Specifically, MongoDB can route a document whose location
field
doesn't conform to ISO-3166-1 alpha 2 or ISO-3166-2 to any shard in the
cluster.
Global Cluster Read Operations¶
MongoDB query routing depends on whether the read operation includes
the full shard key and that the location
value corresponds to a
supported ISO-3166-1 alpha 2 country code ("US"
, "DE"
, "IN"
)
or a supported ISO-3166-2 subdivision code ("US-DC"
,
"DE-BE"
, "IN-DL"
).
- For queries that do include the full shard key and whose
location
value meets the requirements for Global Writes, MongoDB targets the read operation to the zone which maps to thelocation
value or values specified in the query. - For read operations that don't include the
location
value , or if thelocation
value doesn't correspond to a supported ISO-3166-1 alpha 2 country code or ISO-3166-2 subdivision code, MongoDB must broadcast the read operation to every zone in the cluster. - For Global Writes zones which have Read-only nodes in
geographically distant regions, clients in those regions can query
the local Read-only node for that zone by specifying the
full shard key as part of the query and issuing the read
operation with a Read Preference
of
nearest
.
Secondary reads may return stale data depending on the level of replication lag between the secondary node and the primary.
To learn more about:
- MongoDB read preference, see Read Preference.
- MongoDB query routing, see mongos.
Sharding Collections without Global Writes¶
Global Writes clusters support the same Ranged and Hashed sharding strategies as a standard Atlas sharded cluster. For sharded collections whose shard keys and document schema do not support Global Writes, MongoDB distributes the sharded data evenly across the available shards in the cluster with respect to the chosen shard key. Consider using a separate sharded cluster for data that can't take advantage of Global Writes.
You can't modify a collection to support Global Writes after sharding. We recommend that you choose a shard key that will allow you to use Global Writes for a collection the future.
To learn more about Global Writes sharding requirements, see Sharding Collections for Global Writes.
Unsharded Collections in Global Write Clusters¶
Global Clusters provide the same support for unsharded
collections as a standard Atlas sharded cluster. For each database
in the cluster, MongoDB stores its unsharded collections on a
primary shard. Use sh.status()
from the
mongosh
to determine the primary shard for the database.