What is MongoDB?
In this blog post, I’ll provide an overview of MongoDB and some of its key features. MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling.
Its document-oriented data model makes it easier to split up data across multiple servers. MongoDB automatically takes care of balancing data and load across a cluster, redistributing documents automatically, and routing user requests to the correct machines.
What is Semi-Structured Data?
Semi-structured data is data that does not conform to the formal structure of the data models used by relational databases, but that still carries tags or other markers to separate its elements. In the semi-structured model, there is no separation between the data and the schema.
What is a Document-Oriented Database?
A document-oriented database is software designed for storing and retrieving semi-structured data in the form of documents. The document replaces the concept of a “row” with a more flexible model. By allowing embedded documents and arrays, the document-oriented approach makes it possible to represent complex hierarchical relationships with a single record.
Key Features
- High Performance: MongoDB provides high-performance data persistence. It supports embedded data models that reduce I/O activity on the database system, as well as indexes for faster queries; these indexes can include keys from embedded documents and arrays
- High Availability: To provide high availability, MongoDB’s replication facility (known as replica sets) provides both automatic failover and data redundancy. A replica set is a group of MongoDB servers that maintain the same data set, which gives both redundancy and increased data availability
- Automatic Scaling: MongoDB provides horizontal scalability as part of its core functionality. Automatic sharding distributes data across a cluster of machines, while replica sets can provide eventually-consistent reads for low-latency deployments
Core Components of MongoDB
- Mongod: As the primary daemon process for the MongoDB system, Mongod handles data requests, manages data access, and performs background management operations
- Mongos: Mongos is the routing service for sharded MongoDB configurations. This component processes queries from the application layer and determines the data’s location in the sharded cluster in order to complete the requested operations. From the perspective of the application, a Mongos instance behaves identically to any other MongoDB instance
- Mongo: Mongo is an interactive JavaScript shell interface for MongoDB and provides an interface to test queries and operations directly within the database. Mongo also provides a fully functional JavaScript environment to use with MongoDB
Document in MongoDB
A record in MongoDB is a document, which is a data structure composed of field and value pairs. The values of the fields may include other documents, arrays, and arrays of documents. A group of documents is a collection, which is analogous to a table in an RDBMS.
BSON and MongoDB
BSON is a binary-encoded serialization of JSON-like documents and is designed to be lightweight, traversable, and efficient. BSON, like JSON, supports the embedding of objects and arrays within other objects and arrays. MongoDB uses BSON as the data storage and network transfer format for its documents.
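To make the binary layout a little more concrete, here is a minimal sketch of a BSON encoder in Node.js. It is an illustration only, assuming a flat document whose values are all strings; the function name `encodeStringDocument` is hypothetical, and a real application would rely on the driver's BSON library instead. The layout it follows is a 4-byte little-endian total length, then one element per field (type byte 0x02 for a string, the NUL-terminated field name, a 4-byte string length, the NUL-terminated value), then a closing 0x00 byte.

```javascript
// Minimal BSON encoder for flat documents whose values are all strings.
// Hypothetical helper for illustration; not part of any MongoDB driver.
function encodeStringDocument(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const keyBytes = Buffer.from(key + "\0", "utf8");     // field name, NUL-terminated
    const valueBytes = Buffer.from(value + "\0", "utf8"); // string value, NUL-terminated
    const valueLength = Buffer.alloc(4);
    valueLength.writeInt32LE(valueBytes.length, 0);       // length includes the trailing NUL
    parts.push(Buffer.from([0x02]), keyBytes, valueLength, valueBytes); // 0x02 = string type
  }
  const body = Buffer.concat(parts);
  const header = Buffer.alloc(4);
  header.writeInt32LE(4 + body.length + 1, 0);            // total size: header + body + terminator
  return Buffer.concat([header, body, Buffer.from([0x00])]);
}

const encoded = encodeStringDocument({ hello: "world" });
console.log(encoded.length);          // 22
console.log(encoded.toString("hex"));
```

The length prefixes are what make BSON traversable: a reader can skip over a value or a whole embedded document without parsing its contents.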
Sample Representation of a Document in MongoDB
[Figures: a sample document, referenced documents, and embedded documents]
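As a rough illustration of the referenced and embedded styles, the sketch below writes the same data both ways using plain JavaScript objects rather than a live database. All field names and values (`addressId`, `address`, the city) and the `resolveReference` helper are hypothetical and chosen only for this example.

```javascript
// Referenced style: the employee stores only the _id of a separate address document.
const addresses = [
  { _id: "addr1", city: "Springfield", country: "USA" },
];
const employeeReferenced = { _id: "emp1", name: "Goofy", addressId: "addr1" };

// Embedded style: the address lives inside the employee document itself.
const employeeEmbedded = {
  _id: "emp1",
  name: "Goofy",
  address: { city: "Springfield", country: "USA" },
};

// Hypothetical helper: follow the reference with a second lookup and
// return a document shaped like the embedded version.
function resolveReference(employee, addressCollection) {
  const match = addressCollection.find((a) => a._id === employee.addressId);
  const { addressId, ...rest } = employee;  // drop the reference field
  const { _id, ...address } = match;        // drop the address's own _id
  return { ...rest, address };
}

const resolved = resolveReference(employeeReferenced, addresses);
console.log(JSON.stringify(resolved) === JSON.stringify(employeeEmbedded));
```

The embedded form can be read in a single lookup, while the referenced form needs a second lookup but avoids duplicating an address that several documents share.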
DB Objects Comparison
SQL Objects | MongoDB Objects |
Database (schema) | Database |
Table | Collection |
Index | Index |
Row | Document |
Column | Field |
Joining | Linking & Embedding |
Partition | Shard |
Replication in MongoDB
Replication is the practice of keeping identical copies of data on multiple servers to keep applications running and data safe. There are two set-up designs within MongoDB: a Replica Set with Replication Cluster and a Replica Set with Arbiter. All replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not return within ten seconds, the other servers mark the unresponsive server as inaccessible.
- Replica Set with Replication Cluster: A cluster of MongoDB servers that implements primary-secondary replication with automated failover. When the primary server becomes unavailable, the remaining members hold an election to choose one of the secondary servers as the new primary server.
- Replica Set with Arbiter: An arbiter is a Mongod instance that is a member of the replica set but does not hold data. The arbiter participates in elections as a tie-breaker if a replica set has an even number of data-bearing servers
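The heartbeat timeout and the arbiter's tie-breaking role can be sketched in a few lines of JavaScript. This is a simplified model, not MongoDB's actual election protocol: it assumes every member has one vote and ignores member priority and how up to date each secondary is. The function names and the member objects are hypothetical.

```javascript
const HEARTBEAT_TIMEOUT_MS = 10000; // ten seconds, as described above

// A member counts as reachable if its last heartbeat reply is recent enough.
function reachableMembers(members, nowMs) {
  return members.filter((m) => nowMs - m.lastHeartbeatMs <= HEARTBEAT_TIMEOUT_MS);
}

// An election can only succeed when a strict majority of all voting members is reachable.
function canElectPrimary(members, nowMs) {
  return reachableMembers(members, nowMs).length > members.length / 2;
}

// Only data-bearing members may become primary; arbiters vote but hold no data.
function electionCandidates(members, nowMs) {
  return reachableMembers(members, nowMs)
    .filter((m) => !m.arbiter)
    .map((m) => m.name);
}

const now = 60000;
const replicaSet = [
  { name: "db1", arbiter: false, lastHeartbeatMs: now - 15000 }, // old primary, silent for 15 s
  { name: "db2", arbiter: false, lastHeartbeatMs: now - 1000 },
  { name: "arb", arbiter: true, lastHeartbeatMs: now - 2000 },
];
console.log(canElectPrimary(replicaSet, now));    // true: 2 of 3 members are reachable
console.log(electionCandidates(replicaSet, now)); // [ 'db2' ]
```

Without the arbiter, the same failure would leave one reachable member out of two, which is not a majority, so no new primary could be elected.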
Sharding in MongoDB
MongoDB’s sharding is the ability to break up a collection into subsets of data and store them across multiple shards. This allows the application to grow beyond the resource limits of a standalone server or replica set.
Basic Recommended Sharded Cluster for MongoDB
- Sharded cluster: The router (Mongos process) acts as a gatekeeper for all requests coming from the client. It resolves the requests by using the configuration servers to get the metadata information of each shard, then routes the query to the appropriate shards, and finally combines the results given by the shard(s) to return to the client
- Router in sharded cluster: More than one Mongos process can run as a router. If utilizing more than one router, all of them should be placed behind a load balancer for an efficient production environment
- Configuration server in sharded cluster: The Mongod process runs as a configuration server that stores the cluster’s metadata, that is, the record of which data resides on which shard. The router uses this metadata to decide which shard to query
- Shard in sharded cluster: Each shard is essentially a replica set that holds a portion of data. To avoid losing any data, it maintains a copy of its data on all secondary servers. In this sense, a collection in MongoDB is stored in multiple parts across multiple shards; to access the full collection, the data must be retrieved from all shards
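The routing behaviour described in the list above can be sketched as follows. This is a simplified model under stated assumptions: the chunk ranges, the shard names, and the `empCode` shard key are all hypothetical, and the metadata is a plain array rather than anything read from a real configuration server.

```javascript
// Hypothetical metadata, as a configuration server might hold it: each chunk covers
// a half-open range [min, max) of shard key values and lives on one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 2000, shard: "shardB" },
  { min: 2000, max: Infinity, shard: "shardC" },
];

// Targeted query: the shard key value is known, so a single shard suffices.
function shardForKey(key) {
  return chunks.find((c) => key >= c.min && key < c.max).shard;
}

// Query without the shard key: every shard must be asked and the results merged.
function shardsForQuery(query) {
  if (query.empCode !== undefined) return [shardForKey(query.empCode)];
  return [...new Set(chunks.map((c) => c.shard))];
}

console.log(shardsForQuery({ empCode: 1500 }));    // [ 'shardB' ]
console.log(shardsForQuery({ empName: "Goofy" })); // [ 'shardA', 'shardB', 'shardC' ]
```

The second call shows why queries that omit the shard key are more expensive: the router has to contact every shard and combine the results.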
Difference Between Shard and Replication
Sharding partitions the data and stores it across multiple servers. This increases the hardware capacity of the cluster, meaning that resources are not limited to a single machine. Replication, on the other hand, keeps a full duplicate copy of the data to be used in the event of a hardware failure.
Default Ports for MongoDB
Default Port | Description |
27017 | The default port for Mongod and Mongos instances. Change this port with the port setting or the --port option |
27018 | The default port when running with the --shardsvr runtime option or the shardsvr value for the clusterRole setting in a configuration file |
27019 | The default port when running with the --configsvr runtime option or the configsvr value for the clusterRole setting in a configuration file |
28017 | The default port for the web status page. The web status page is always accessible at a port number that is 1000 greater than the port determined by the port setting |
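The port rules in the table can be restated as a small lookup. The sketch below is illustrative only: the `resolvePort` and `statusPagePort` helpers are hypothetical, and the `standalone` key is a label made up here for an instance started without a clusterRole.

```javascript
// Default ports from the table above. "standalone" is a hypothetical label for
// an instance with no clusterRole; the other two keys are clusterRole values.
const DEFAULT_PORTS = { standalone: 27017, shardsvr: 27018, configsvr: 27019 };

// An explicitly configured port always wins over the role's default.
function resolvePort(clusterRole = "standalone", explicitPort) {
  if (explicitPort !== undefined) return explicitPort;
  const port = DEFAULT_PORTS[clusterRole];
  if (port === undefined) throw new Error(`unknown role: ${clusterRole}`);
  return port;
}

// The web status page listens 1000 above the instance's port.
function statusPagePort(instancePort) {
  return instancePort + 1000;
}

console.log(resolvePort());                               // 27017
console.log(resolvePort("shardsvr"));                     // 27018
console.log(statusPagePort(resolvePort()));               // 28017
console.log(statusPagePort(resolvePort("configsvr", 30000))); // 31000
```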
Install/Uninstall MongoDB on Red Hat Linux
Use the following steps to set up a single-machine MongoDB database server for a developer/test environment.
Mongo Installation on Linux
- Copy the following command and run it on a Linux shell to create a yum repository for Mongo V3.0
echo '[mongodb-org-3.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.0/x86_64/
gpgcheck=0
enabled=1' | sudo tee /etc/yum.repos.d/mongodb-org-3.0.repo
- Run the following command to install Mongo
sudo yum install -y mongodb-org
- Run the following command to edit the mongod.conf file. Set the bind_ip attribute to the Linux machine’s IP address, then press [Ctrl+X], then [Y], then [Enter] to save and exit
nano /etc/mongod.conf
- Run the following commands to start the Mongod service and to log in to MongoDB
sudo service mongod start
mongo --host <IPADDRESS OF LINUX MACHINE> --port 27017
To uninstall Mongo, stop the service and remove the installed mongodb-org packages:
sudo service mongod stop
sudo yum erase $(rpm -qa | grep mongodb-org)
Important Files/Paths in Mongo
Path | Description |
/var/lib/mongo | Default Mongo database path |
/etc/mongod.conf | Mongo Configuration file |
/var/run/mongodb/mongod.pid | Mongo pid file path |
/var/log/mongodb/mongod.log | Mongo log path |
Enable Authentication
Authentication is the process of proving the identity of a user. Use the following commands to create two users, one with the root role and one with the readWrite role. The root user holds top-level permissions in MongoDB, while the readWrite role allows the user to perform CRUD operations on collections. For more details on built-in roles in Mongo, visit the official reference manual.
- Root role user: Apply the following commands in the Mongo shell to create a new user (“rootuser” with a password “12345”) who has the root role on the admin database
use admin
db.createUser({user:"rootuser", pwd:"12345", roles:[{role:"root", db:"admin"}]})
- readWrite user: Apply the following commands in the Mongo shell to create a new user (“webuser” with password “12345”) who has readWrite permissions on the application database (“companydb” used in this example)
use companydb
db.createUser({user:"webuser", pwd:"12345", roles:["readWrite"]})
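Once such users exist, a client typically supplies the credentials in a connection string of the form mongodb://user:password@host:port/database. The sketch below builds one in JavaScript. The `buildMongoUri` helper is hypothetical, and the host is simply the example address used elsewhere in this post; the point it illustrates is that the user name and password should be percent-encoded so that characters such as "@" or ":" do not break the URI.

```javascript
// Hypothetical helper: assemble a MongoDB connection string with credentials.
// encodeURIComponent percent-encodes characters that are reserved in a URI.
function buildMongoUri({ user, password, host, port = 27017, database }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  return `mongodb://${credentials}@${host}:${port}/${database}`;
}

const uri = buildMongoUri({
  user: "webuser",
  password: "12345",
  host: "192.168.1.10", // example address, as in the shell examples in this post
  database: "companydb",
});
console.log(uri); // mongodb://webuser:12345@192.168.1.10:27017/companydb
```

Because “webuser” was created in companydb, naming that database in the URI also tells the server where to look up the user.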
Basic MongoDB Commands
Apply the following command on the Linux shell to connect to MongoDB with the Mongo shell:
mongo IPAddress:PORT (e.g. mongo 192.168.1.10:27017)
SN# | Command | Purpose |
1 | db | Show current database |
2 | show dbs or show databases | Show all databases |
3 | use <databasename> | Switch to any database or create new database |
4 | db.createCollection("Movies") | Create collection |
5 | db.getCollectionNames() or show collections | Get collection names |
6 | db.Movies.insert({"Title": "Titanic","LeadActor": "Leonardo DiCaprio","LeadActress": "Kate Winslet", "Genre": ["Action","Family"]}) or, for multiple documents, db.Movies.insert([ {"Title": "Focus","Genre": ["Action"]}, {"Title": "Fright Night","Genre": ["Horror"]} ]) | Insert single or multiple documents into collection |
7 | db.Movies.update({"Title": "Focus"},{$set:{"Genre": ["Action","Drama"]}}) By default, Mongo updates only the first document that matches the search criteria. To update multiple documents, pass the multi option: db.Movies.update({"Title": "Focus"},{$set:{"Genre": ["Action","Drama"]}},{multi:true}) | Update document(s) in “Movies” collection |
8 | db.COLLECTION_NAME.save() | Insert/update any document |
9 | db.COLLECTION_NAME.remove() | Remove documents that fit certain search criteria |
10 | db.COLLECTION_NAME.drop() | Drop collection |
11 | db.COLLECTION_NAME.find() for example db.Movies.find({"Title":"Titanic"}) | Find documents that fit specific search criteria |
12 | db.COLLECTION_NAME.findOne() | Find the first document that fits specific search criteria |
13 | db.COLLECTION_NAME.find().limit(1) | Limit number of documents displayed that fit certain search criteria |
14 | db.COLLECTION_NAME.find().limit(10).skip(1) | Skip a certain number of documents displayed in output |
15 | db.COLLECTION_NAME.find().limit(1).pretty() | Display formatted output |
16 | db.serverCmdLineOpts() | Return a document that reports on the arguments and configuration options used to start the Mongod or Mongos instance |
17 | db.COLLECTION_NAME.find().sort({"FIELDNAME":1}) | Sort documents on the basis of any field. Use 1 for ascending order and -1 for descending order |
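When sort, skip, and limit are chained on a find, MongoDB applies them in that order (sort, then skip, then limit) regardless of the order in which they are written. The sketch below imitates that behaviour on a plain JavaScript array so the effect can be seen without a running server. The `findSorted` function is a hypothetical in-memory stand-in, not a driver API, and it supports only simple equality filters.

```javascript
const movies = [
  { Title: "Titanic", Genre: ["Action", "Family"] },
  { Title: "Focus", Genre: ["Action"] },
  { Title: "Fright Night", Genre: ["Horror"] },
];

// Hypothetical in-memory stand-in for find(filter).sort(...).skip(n).limit(m).
function findSorted(docs, filter, { sortField, direction = 1, skip = 0, limit = Infinity }) {
  // Equality match; an array field matches when it contains the expected value.
  const matches = docs.filter((doc) =>
    Object.entries(filter).every(([field, expected]) =>
      Array.isArray(doc[field]) ? doc[field].includes(expected) : doc[field] === expected
    )
  );
  // Sort first (1 = ascending, -1 = descending), then skip, then limit.
  const sorted = [...matches].sort((a, b) => {
    if (a[sortField] === b[sortField]) return 0;
    return (a[sortField] < b[sortField] ? -1 : 1) * direction;
  });
  return sorted.slice(skip, skip + limit);
}

const titles = findSorted(movies, { Genre: "Action" }, { sortField: "Title", skip: 1, limit: 10 })
  .map((doc) => doc.Title);
console.log(titles); // [ 'Titanic' ]
```

Two movies match the filter; sorted by title they are "Focus" and "Titanic", and skipping one leaves only "Titanic".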
HTTP Status Page
MongoDB provides a web interface that exposes diagnostic and monitoring information on a simple webpage. By default, the web interface is available on port 28017: http://IP-Address:28017. If the Mongod process is not running on its default port, add 1000 to the Mongod process port to get the port of the web interface.
Configuration to Enable HTTP Status Page
Run the following command on the Linux shell to open mongod.conf. Then find the httpinterface and rest keys and set the value of each to “true.”
nano /etc/mongod.conf
.NET Interaction with MongoDB (3.0.5)
MongoDB drivers for .NET can be downloaded here. Below is basic sample C# code for interacting with MongoDB. It writes a document to a Mongo collection and then reads matching documents back from that collection.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

class Program
{
    static void Main(string[] args)
    {
        MongoRepository mongo = new MongoRepository();
        // Block until each asynchronous operation completes.
        mongo.WriteToCollection().Wait();
        mongo.ReadFromCollection().Wait();
    }
}

// Named MongoRepository so the class does not clash with the driver's MongoDB namespace.
public class MongoRepository
{
    IMongoDatabase DB;

    public MongoRepository()
    {
        string MongoConnectionString = "mongodb://192.168.85.128:27017/dnc";
        IMongoClient _client = new MongoClient(MongoConnectionString);
        DB = _client.GetDatabase("dnc");
    }

    // Inserts one document into the Employee collection.
    public async Task WriteToCollection()
    {
        IMongoCollection<BsonDocument> _collection = DB.GetCollection<BsonDocument>("Employee");
        BsonDocument _document = new BsonDocument();
        BsonElement _elementEmpCode = new BsonElement("EmpCode", "1001");
        BsonElement _elementEmpName = new BsonElement("EmpName", "Goofy");
        _document.Add(_elementEmpCode);
        _document.Add(_elementEmpName);
        await _collection.InsertOneAsync(_document);
    }

    // Finds the Employee documents with EmpCode "1001" and prints each field.
    public async Task ReadFromCollection()
    {
        IMongoCollection<BsonDocument> _collection = DB.GetCollection<BsonDocument>("Employee");
        FilterDefinition<BsonDocument> filter = Builders<BsonDocument>.Filter.Eq("EmpCode", "1001");
        List<BsonDocument> Result = await _collection.Find(filter).ToListAsync();
        foreach (var document in Result)
        {
            foreach (BsonElement element in document.Elements)
            {
                Console.WriteLine(element.Name + ":" + element.Value);
            }
        }
    }
}