Introduction
When using MongoDB, you have the ability to be flexible with the structure of your data. You are not locked into maintaining a certain schema that all of your documents must fit into. For any given field in a document, you are able to use any of the available data types supported by MongoDB. Despite this default way of working, you are able to impose a JSON Schema in MongoDB to add validation on your collections if desired. We won't go into the details of schema design in this guide, but it can have an effect on data typing if implemented.
Data types specify a general pattern for the data they accept and store. It is paramount to understand when to choose a certain data type over another when planning your database. The type chosen is going to dictate how you're able to operate on your data and how it is stored.
JSON and BSON
Before getting into the details of specific data types, it is important to have an understanding of how MongoDB stores data. MongoDB and many other document-based NoSQL databases use JSON (JavaScript Object Notation) to represent data records as documents.
There are many advantages to using JSON to store data. Some of them being:
- easiness to read, learn, and its familiarity among developers
- flexibility in format, whether sparse, hierarchical, or deeply nested
- self-describing, which allows for applications to easily operate with JSON data
- allows for the focus on a minimal number of basic types
JSON supports all the basic data types like string, number, boolean, etc. MongoDB actually stores data records as Binary-encoded JSON (BSON) documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON allows for additional data types that are not available to JSON.
What are the data types in MongoDB?
Before going into detail, let's take a broad view of what data types are supported in MongoDB.
MongoDB supports a range of data types suitable for various types of both simple and complex data. These include:
Text
String
Numeric
32-Bit Integer
64-Bit Integer
Double
Decimal128
Date/Time
Date
Timestamp
Other
Object
Array
Binary Data
ObjectId
Boolean
Null
Regular Expression
JavaScript
Min Key
Max Key
In MongoDB, each BSON type has both an integer and string identifiers. We'll cover the most common of these in more depth throughout this guide.
With MongoDB's document model, you are able to store data in embedded documents. Check out how you can find, create, update, and delete composite types with the Prisma Client, and model your data by defining fields in your Prisma Schema.
String types
The string type is the most commonly used MongoDB data type. Any value written inside of double quotes ""
in JSON is a string value. Any value that you want to be stored as text is going to be best typed as a String
. BSON strings are UTF-8 and are represented in MongoDB as:
Type | Number | Alias |------------------ | ------ | -------- |String | 2 | "string" |
Generally, drivers for programming languages will convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes BSON an attractive method for storing international characters with ease for example.
Inserting a document with a String
data type will look something like this:
db.mytestcoll.insertOne({first_name: "Alex"}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d15")}
Querying the collection will return the following:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d15"),first_name: "Alex"}
Using the $type
operator
Before moving on to our next data type, it is important to know how you can type check the value before making any insertions. We'll use the previous example to demonstrate using the $type
operator in MongoDB.
Let's say it has been awhile since we worked with the mytestcoll
collection from before. We want to insert some additional documents into the collection with the first_name
field. To check that we used String
as the data type stored as a value of first_name
originally, we can run the following using either the alias or number value of the data type:
db.mytestcoll.find( { "first_name": { $type: "string" } } )
Or
db.mytestcoll.find( { "first_name": { $type: 2 } } )
Both queries return an output of all documents that have a String
value stored for first_name
from our previous section's insertion:
[ { _id: ObjectId("614b37296a124db40ae74d15"), first_name: "Alex" } ]
If you query for a type that is not stored in the first_name
field of any document, you will get no result. This indicates that it is another data type stored in first_name
.
You are also able to query for multiple data types at a time with the $type
operator like the following:
db.mytestcoll.find( { "first_name": { $type: ["string", "null"] } } )
Because we did not insert any Null
type values into our collection, the result will be the same:
[ { _id: ObjectId("614b37296a124db40ae74d15"), first_name: "Alex" } ]
You can use the same method with all of the following types that we will discuss.
Numbers and numeric values
MongoDB includes a range of numeric data types suitable for different scenarios. Deciding which type to use depends on the nature of the values you plan to store and your use cases for the data. JSON calls anything with numbers a Number. That forces the system to figure out how to turn it into the nearest native data type. We'll start off by exploring integers and how they work in MongoDB.
Integer
The Integer
data type is used to store numbers as whole numbers without any fractions or decimals. Integers can either be positive or negative values. There are two types in MongoDB, 32-Bit Integer
and 64-Bit Integer
. They can be represented in the two ways depicted in the table below, number
and alias
:
Integer type | number | alias |------------ | ----- | ------------ |`32-bit integer`| 16 | "int" |`64-bit integer`| 18 | "long" |
The ranges a value can fit into for each type are the following:
Integer type | Applicable signed range | Applicable unsigned range |------------ | ------------------------------ | ------------------------------- |`32-bit integer`| -2,147,483,648 to 2,147,483,647| 0 to 4,294,967,295 |`64-bit integer`| -9,223,372,036,854,775,808 to | 0 to 18,446,744,073,709,551,6159,223,372,036,854,775,807
The types above are limited by their valid range. Any value outside of the range will result in an error. Inserting an Integer
type into MongoDB will look like the below:
db.mytestcoll.insertOne({age: 26}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d14")}
And finding the result will return the following:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d14"), age: 26}
As suggested by the names, a 32-Bit Integer
has 32 bits of integer precision which is useful for smaller integer values that you do not want to store as a sequence of digits. When growing in number size, you are able to bump up to the 64-Bit Integer
which has 64 bits of integer precision and fits the same use case as the former.
Double
In BSON, the default replacement for JSON's Number is the Double
data type. The Double
data type is used to store a floating-point value and can be represented in MongoDB like so:
Type | Number | Alias |------------------ | ------ | -------- |Double | 1 | "double" |
Floating point numbers are another way to express decimal numbers, but without exact, consistent precision.
Floating point numbers can work with a large number of decimals efficiently but not always exactly. The following is an example of entering a document with the Double
type into your collection:
db.mytestcoll.insertOne({testScore: 89.6}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d13")}
There can be slight differences between input and output when calculating with doubles that could potentially lead to unexpected behavior. When performing operations that require exact values MongoDB has a more precise type.
Decimal128
If you are working with very large numbers with lots of floating point range, then the Decimal128
BSON data type will be the best option. This will be the most useful type for values that require a lot of precision like in use cases involving exact monetary operations. The Decimal128
type is represented as:
Type | Number | Alias |------------------ | ------ | --------- |Decimal128 | 19 | "decimal" |
The BSON type, Decimal128
, provides 128 bits of decimal representation for storing numbers where rounding decimals exactly is important. Decimal128
supports 34 decimal digits of precision, or a sinificand with a range of -6143 to +6144. This allows for a high amount of precision.
Inserting a value using the Decimal128
data type requires using the NumberDecimal()
constructor with your number as a String
to keep MongoDB from using the default numeric type, Double
.
Here, we demonstrate this:
db.mytestcoll.insertOne({price : NumberDecimal("5.099")}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d12")}
When querying the collection, you then get the following return:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d12"),price: "5.099"}
The numeric value maintains its precision allowing for exact operations. To demonstrate the Decimal128
type versus the Double
, we can go through the following exercise.
How precision can be lost based on data type
Say we want to insert a number with many decimal values as a Double
into MongoDB with the following:
db.mytestcoll.insertOne({ price: 9999999.4999999999 }){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d24")}
When we query for this data, we get the following result:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d24"),price: 9999999.5}
This value rounds up to 9999999.5
, losing its exact value that we inputted it with. This makes Double
ill-suited for storage of numbers with many decimals.
The next example demonstrates where precision will get lost when passing a Double
implicitly with Decimal128
instead of a String
like in the previous example.
We start by inserting the following Double
again but with NumberDecimal()
to make it a Decimal128
type:
db.mytestcoll.insertOne({ price: NumberDecimal( 9999999.4999999999 ) }){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d14")}
Note: When making this insertion in the MongoDB shell, the following warning message is displayed:
Warning: NumberDecimal: specifying a number as argument is deprecated and may lead toloss of precision, pass a string instead
This warning message indicates that the number you are trying to pass could be subject to a loss of precision. They suggest to use a String
using NumberDecimal()
so that you do not lose any precision.
If we ignore the warning and insert the document anyways, the loss of precision is seen in the query results from the rounding up of the value:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d14"),price: Decimal128("9999999.50000000")}
If we follow the recommended NumberDecimal()
approach using a String
we'll see the following results with maintained precision:
db.mytestcoll.insertOne({ price: NumberDecimal( "9999999.4999999999" ) } )
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d14"),price: Decimal128("9999999.4999999999")}
For any use case requiring precise, exact values, this return could cause issues. Any work involving monetary operations is an example where precision is going to be extremely important and having exact values is critical to accurate calculations. This demonstration highlights the importance of knowing which numeric data type is going to be best suited for your data.
Date
The BSON Date
data type is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This data type stores the current date or time and can be returned as either a date object or as a string. Date
is represented in MongoDB as follows:
Type | Number | Alias |------------------ | ------ | ------------ |Date | 9 | "date" |
Note: BSON Date
type is signed. Negative values represent dates before 1970.
There are three methods for returning date values.
Date()
- returns a stringnew Date()
- returns a date object using theISODate()
wrapperISODate()
- also returns a date object using theISODate()
wrapper
We demonstrate these options below:
var date1 = Date()var date2 = new Date()var date3 = ISODate()db.mytestcoll.insertOne({firstDate: date1, secondDate: date2, thirdDate: date3}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d22")}
And when returning:
db.mytestcoll.find().pretty(){"_id" : ObjectId("614b37296a124db40ae74d22"),firstDate: 'Tue Sep 28 2021 11:28:52 GMT+0200 (Central European Summer Time)',secondDate: ISODate("2021-09-28T09:29:01.924Z"),thirdDate: ISODate("2021-09-28T09:29:12.151Z")}
Timestamp
There is also the Timestamp
data type in MongoDB for representing time. However, Timestamp
is going to be most useful for internal use and is not associated with the Date
type. The type itself is a sequence of characters used to describe the date and time when an event occurs. Timestamp
is a 64 bit value where:
- the most significant 32 bits are
time_t
value (seconds since Unix epoch) - the least significant 32 bits are an incrementing
ordinal
for operations within a given second
Its representation in MongoDB will look as follows:
Type | Number | Alias |------------------ | ------ | ------------ |Timestamp | 17 | "timestamp" |
When inserting a document that contains top-level fields with empty timestamps, MongoDB will replace the empty timestamp value with the current timestamp value. The exception to this is if the _id
field contains an empty timestamp. The timestamp value will always be inserted as is and not replaced.
Inserting a new Timestamp
value in MongoDB will use the new Timestamp()
function and look something like this:
db.mytestcoll.insertOne( {ts: new Timestamp() });{"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d23")}
When querying the collection, you'll return a result resembling:
db.mytestcoll.find().pretty(){"_id" : ObjectId("614b37296a124db40ae74d24"),"ts" : Timestamp( { t: 1412180887, i: 1 })}
Object
The Object
data type in MongoDB is used for storing embedded documents. An embedded document is a series of nested documents in key: value
pair format. We demonstrate the Object
type below:
var classGrades = {"Physics": 88, "German": 92, "LitTheoery": 79}db.mytestcoll.insertOne({student_name: "John Smith", report_card: classGrades}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d18")}
We can then view our new document:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d18"),student_name: 'John Smith',report_card: {Physics: 88, German: 92, LitTheoery: 79}}
The Object
data type optimizes for storing data that is best accessed together. It provides some efficiencies around storage, speed, and durability as opposed to storing each class mark, from the above example, separately.
Binary data
The Binary data
, or BinData
, data type does exactly what its name implies and stores binary data for a field's value. BinData
is best used when you are storing and searching data, because of its efficiency in representing bit arrays. This data type can be represented in the following ways:
Type | Number | Alias |------------------ | ------ | ------------ |Binary data | 5 | "binData" |
Here is an example of adding some Binary data
into a document in a collection:
var data = BinData(1, "111010110111100110100010101")db.mytestcoll.insertOne({binaryData: data}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d20")}
To then see the resulting document:
db.mytestcoll.find().pretty(){"_id" : ObjectId("614b37296a124db40ae74d20"),"binaryData" : BinData(1, "111010110111100110100010101")}
ObjectId
The ObjectId
type is specific to MongoDB and it stores the document's unique ID. MongoDB provides an _id
field for every document. ObjectId is 12 bytes in size and can be represented as follows:
Type | Number | Alias |------------------ | ------ | ------------ |ObjectId | 7 | "objectId" |
ObjectId consists of three parts that make up its 12-byte makeup:
- a 4-byte timestamp value, representing the ObjectId's creation, measured in seconds since the Unix epoch
- a 5-byte random value
- a 3-byte incrementing counter initialized to a random value
In MongoDB, each document within a collection requires a unique _id
to act as a primary key. If the _id
field is left empty for an inserted document, MongoDB will automatically generate an ObjectId for the field.
There are several benefits to using ObjectIds for the _id
:
- in
mongosh
(MongoDB shell), the creation time of theObjectId
is accessible using theObjectId.getTimestamp()
method. - sorting on an
_id
field that storesObjectId
data types is a close equivalent to sorting by creation time.
We've seen ObjectIds throughout the examples so far, and they will look similar to this:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d19")}
Note: ObjectId values should increase over time, however they are not necessarily monotonic. This is because they:
- Only contain one second of temporal resolution, so values created within the same second do not have guaranteed ordering
- values are generated by clients, which may have differing system clocks
Boolean
MongoDB has the native Boolean
data type for storing true and false values within a collection. Boolean
in MongoDB can be represented as follows:
Type | Number | Alias |------------------ | ------ | ------------ |Boolean | 8 | "bool" |
Inserting a document with a Boolean
data type will look something like the following:
db.mytestcoll.insertOne({isCorrect: true, isIncorrect: false}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d21")}
Then when searching for the document the result will appear as:
db.mytestcoll.find().pretty(){"_id" : ObjectId("614b37296a124db40ae74d21")"isCorrect" : true,"isIncorrect" : false}
Regular Expression
The Regular Expression
data type in MongoDB allows for the storage of regular expressions as the value of a field. MongoDB uses PCRE (Perl Compatible Regular Expression) as its regular expression language.
Its can be represented in the following way:
Type | Number | Alias |------------------ | ------ | ------- |Regular Expression | 11 | "regex" |
BSON allows you to avoid the typical "convert from string" step that is commonly experienced when working with regular expressions and databases. This type is going to be most useful when you are writing database objects that require validation patterns or matching triggers.
For instance, you can insert the Regular Expression
data type like this:
db.mytestcoll.insertOne({exampleregex: /tt/}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d16")}db.mytestcoll.insertOne({exampleregext:/t+/}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d17")}
This sequence of statements will add these documents to your collection. You are then able to query your collection to find the inserted documents:
db.mytestcoll.find().pretty(){_id: ObjectId("614b37296a124db40ae74d16"), exampleregex: /tt/,_id: ObjectId("614b37296a124db40ae74d17"), exampleregex: /t+/}
The regular expression patterns are stored as regex and not as strings. This allows you to query for a particular string and get returned documents that have a regular expression matching the desired string.
JavaScript (without scope)
Much like the previously mentioned Regular Expression
data type, BSON allows for MongoDB to store JavaScript functions without scope as their own type. The JavaScript
type can be recognized as follows:
Type | Number | Alias |------------------ | ------ | ------------ |JavaScript | 13 | "javascript" |
Adding a document to your collection with the JavaScript
data type will look something like this:
db.mytestcoll.insertOne({jsCode: "function(){var x; x=1}"}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d122")}
This functionality allows you to store JavaScript functions inside of your MongoDB collections if needed for a particular use case.
Note: With MongoDB version 4.4 and above, an alternative JavaScript type, the JavaScript with Scope
data type, has been deprecated
Conclusion
In this article, we've covered most of the common data types that are useful when working with MongoDB databases. There are additional types that are not explicitly covered in this guide that may be helpful depending on the use case. Getting started by knowing these types covers most use cases. It is a strong foundation to begin modelling your MongoDB database.
It is important to know what data types are available to you when using a database so that you're using valid values and operating on the data with expected results. There are risks you can run into without properly typing your data like demonstrated in the Double
versus Decimal128
exercise. It's important to think about this ahead of committing to any given type.
If you are interested in checking out Prisma with a MongoDB database, you can check out the data connector documentation.
If you're using MongoDB, checkout Prisma's MongoDB connector! You can use the Prisma Client to manage production MongoDB databases with confidence.
To get started working with MongoDB and Prisma, checkout our getting started from scratch guide or how to add to an existing project.
FAQ
Yes, MongoDB allows for the storage of regular expression with the Regular Expression
data type.
For instance, you can insert the Regular Expression
data type like this:
db.mytestcoll.insertOne({exampleregex: /tt/}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d16")}db.mytestcoll.insertOne({exampleregext:/t+/}){"acknowledged": true,"insertedId": ObjectId("614b37296a124db40ae74d17")}
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix Epoch.
There are three methods for returning date values.
Date()
- returns a string
new Date()
- returns a date object using the ISODate() wrapper
ISODate()
- also returns a date object using the ISODate() wrapper
The returned data's display depends on which method it is stored.
Yes, MongoDB supports both single document structure and multi-document structure for transactional data.
An API and other operations are also available to use with transactional data.
Yes, MongoDB can store arrays with the BSON Array type. It is represented in MongoDB as follows:
Type | Number | Alias |------------------ | ------ | ------------ |Array | 4 | "array" |