Here are my answers to the M101J final exam, which I worked out while solving it. Please use them wisely: my aim in discussing each question is that everyone can check where they are going wrong, and that I might also learn of a better solution than mine. So please use this as an extra check after you have solved each question once yourself, so that the explanations benefit you the most.
Question 1:
Here we need to query the Enron dataset and count the messages sent by Andrew Fastow, the CFO, to Jeff Skilling, the president. Andrew Fastow's email address was andrew.fastow@enron.com. Jeff Skilling's email was jeff.skilling@enron.com.
First download the Enron zip/tar and import it into MongoDB, with enron as the database name and messages as the collection name. The import command is:
mongoimport -d enron -c messages < enron.json
Now switch to the mongo shell:
use enron
db.messages.find({"headers.From":"andrew.fastow@enron.com","headers.To":"jeff.skilling@enron.com"}).count()
This produces the answer 3.
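To see why the direction of the two header fields matters, here is a minimal sketch in plain JavaScript that imitates the count on a few in-memory documents. The sample messages, the example.com address and the countSent helper are assumptions made up for illustration; they are not taken from the Enron dataset or from the MongoDB API.

```javascript
// Hypothetical sample messages; real Enron documents carry many more fields.
const messages = [
  { headers: { From: "andrew.fastow@enron.com", To: ["jeff.skilling@enron.com"] } },
  { headers: { From: "jeff.skilling@enron.com", To: ["andrew.fastow@enron.com"] } },
  { headers: { From: "andrew.fastow@enron.com", To: ["someone@example.com", "jeff.skilling@enron.com"] } },
];

// Count messages whose sender is `from` and whose To list contains `to`.
// This mirrors how an equality match on an array field matches any element.
function countSent(docs, from, to) {
  return docs.filter(
    (d) => d.headers.From === from && (d.headers.To || []).includes(to)
  ).length;
}

const sent = countSent(messages, "andrew.fastow@enron.com", "jeff.skilling@enron.com");
const reversed = countSent(messages, "jeff.skilling@enron.com", "andrew.fastow@enron.com");
console.log(sent, reversed);
```

With this toy data the two directions give different counts, so the sender has to go in headers.From and the recipient in headers.To.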
Question 2:
Please use the Enron dataset you imported for the previous problem. For this question you will use the aggregation framework to figure out pairs of people that tend to communicate a lot. To do this, you will need to unwind the To list for each message.
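The mongo shell command that retrieves the desired answer is:
db.messages.aggregate([
    // Keep only the sender and the recipient list.
    { $project: { from: "$headers.From", to: "$headers.To" } },
    // Produce one document per (message, recipient).
    { $unwind: "$to" },
    // Collapse duplicate recipients within the same message.
    { $group: { _id: { _id: "$_id", from: "$from", to: "$to" } } },
    // Count the messages for each (sender, recipient) pair.
    { $group: { _id: { from: "$_id.from", to: "$_id.to" }, count: { $sum: 1 } } },
    { $sort: { count: -1 } },
    { $limit: 2 }
])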
This gives the top two communicating pairs; the topmost one turns out to be:
{
    "result" : [
        {
            "_id" : {
                "from" : "susan.mara@enron.com",
                "to" : "jeff.dasovich@enron.com"
            },
            "count" : 750
        },
        {
            "_id" : {
                "from" : "soblander@carrfut.com",
                "to" : "soblander@carrfut.com"
            },
            "count" : 679
        }
    ],
    "ok" : 1
}
So the answer is clearly "susan.mara@enron.com" to "jeff.dasovich@enron.com".
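The reason for grouping twice can be sketched in plain JavaScript: the first group removes a recipient that is listed more than once in the same message, and only then are the pairs counted. The messages, the example.com addresses and the countPairs helper below are hypothetical and only imitate the pipeline in memory.

```javascript
// Hypothetical messages: the second one lists the same recipient twice.
const messages = [
  { _id: 1, headers: { From: "a@example.com", To: ["b@example.com"] } },
  { _id: 2, headers: { From: "a@example.com", To: ["b@example.com", "b@example.com"] } },
  { _id: 3, headers: { From: "c@example.com", To: ["a@example.com"] } },
];

// Count (from, to) pairs. With dedupe=true each recipient is counted at most
// once per message, which is what the first $group on _id/from/to achieves.
function countPairs(docs, dedupe) {
  const counts = new Map();
  for (const doc of docs) {
    const recipients = dedupe ? [...new Set(doc.headers.To)] : doc.headers.To;
    for (const to of recipients) {
      const key = doc.headers.From + " -> " + to;
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

const deduped = countPairs(messages, true);
const naive = countPairs(messages, false);
console.log(deduped.get("a@example.com -> b@example.com"));
console.log(naive.get("a@example.com -> b@example.com"));
```

A single group over the unwound list would behave like the dedupe=false case and overcount any message that repeats a recipient.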
Question 3:
In this problem you will update a document in the Enron dataset to illustrate your mastery of updating documents from the shell. Please add the email address "mrpotatohead@10gen.com" to the list of addresses in the "headers.To" array for the document with "headers.Message-ID" of "<8147308.1075851042335.JavaMail.evans@thyme>"
A simple update expression in the mongo shell does this:
db.messages.update({"headers.Message-ID":"<8147308.1075851042335.JavaMail.evans@thyme>"},{$addToSet:{"headers.To":"mrpotatohead@10gen.com"}})
Then run the validation and you get the validation code: 897h6723ghf25gd87gh28
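As a rough illustration of what $addToSet does to the array, here is a small in-memory imitation in plain JavaScript. The addToSet helper and the starting recipient list are assumptions for the sketch; the real document's recipients are not shown here.

```javascript
// In-memory imitation of $addToSet: append the value only if it is absent.
function addToSet(array, value) {
  if (!array.includes(value)) array.push(value);
  return array;
}

// Hypothetical recipient list standing in for headers.To.
const to = ["someone@example.com"];
addToSet(to, "mrpotatohead@10gen.com");
addToSet(to, "mrpotatohead@10gen.com"); // repeating the update changes nothing
console.log(to);
```

Because the value is added only when missing, running the update a second time by mistake does not create a duplicate address.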
Question 4:
Enhancing the blog to support viewers liking certain comments.
Here you need to work on BlogPostDAO.java, in the area marked XXXXXX:
postsCollection.update(new BasicDBObject("permalink", permalink), new BasicDBObject("$inc", new BasicDBObject("comments." + ordinal + ".num_likes", 1)));
This command finds the post by its permalink and increments the like counter by one for the comment that was clicked. The ordinal is the position of that comment in the comments array, so the path "comments." + ordinal + ".num_likes" ensures that only the clicked comment's counter is incremented.
After this change you can see that the Like button starts working.
Now run the validator, and you will get the code: 983nf93ncafjn20fn10f
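The dotted path built from the ordinal can be illustrated with a small plain JavaScript sketch that applies the same kind of increment to an in-memory post. The incPath helper and the sample post are hypothetical; they only imitate how $inc addresses one array element by position.

```javascript
// Apply an $inc-style increment to a dotted path such as "comments.1.num_likes".
function incPath(doc, path, amount) {
  const keys = path.split(".");
  let target = doc;
  // Walk down to the parent of the final key.
  for (const key of keys.slice(0, -1)) {
    target = target[key];
  }
  const last = keys[keys.length - 1];
  // Like $inc, treat a missing field as 0 before adding.
  target[last] = (target[last] || 0) + amount;
  return doc;
}

// Hypothetical post with two comments.
const post = {
  permalink: "hypothetical_post",
  comments: [{ body: "first" }, { body: "second", num_likes: 2 }],
};
const ordinal = 1;
incPath(post, "comments." + ordinal + ".num_likes", 1);
incPath(post, "comments.0.num_likes", 1);
console.log(post.comments[1].num_likes, post.comments[0].num_likes);
```

Only the comment at the given position changes, and a comment that has no num_likes field yet starts counting from 1.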
Question 5:
In this question a set of indexes is given, and we have to select the indexes that might have been used in the execution of
db.fubar.find({'a':{'$lt':10000}, 'b':{'$gt': 5000}}, {'a':1, 'c':1}).sort({'c':-1})
The find filters on a and b, projects a and c, and sorts on c in descending order.
_id_ -- This index is not used, as _id appears in neither the find nor the sort clause of the operation.
a_1_b_1 -- This index can be used for the find operation, as the filter is on a and b.
a_1_c_1 -- This index can be used for the find operation through its prefix a, which serves the range filter on a. (The {'a':1, 'c':1} argument is only the projection.)
c_1 -- This index can also be used: an index that is not utilized for the find can still be used for the sort, here sort({'c':-1}).
a_1_b_1_c_1 -- This index starts with a and b, so it can serve the filter and is also a valid choice.
Question 6
Suppose you have a collection of students of the following form:
{
    "_id" : ObjectId("50c598f582094fb5f92efb96"),
    "first_name" : "John",
    "last_name" : "Doe",
    "date_of_admission" : ISODate("2010-02-21T05:00:00Z"),
    "residence_hall" : "Fairweather",
    "has_car" : true,
    "student_id" : "2348023902",
    "current_classes" : [ "His343", "Math234", "Phy123", "Art232" ]
}
Now suppose that basic inserts into the collection, which only include the last name, first name and student_id, are too slow. What could potentially improve the speed of inserts? Check all that apply.
Add an index on last_name, first_name if one does not already exist.
Set w=0, j=0 on writes
Remove all indexes from the collection
Provide a hint to MongoDB that it should not use an index for the inserts
Build a replica set and insert data into the secondary nodes to free up the primary nodes.
Option 1 - Adding an index speeds up reads, not writes. Every index has to be updated on each insert, so an extra index can only slow the inserts down. So not this option.
Option 2 - This is valid: with w=0 and j=0 the client does not wait for any write acknowledgement or journal commit. The data is simply sent without verification, which speeds up the writes.
Option 3 - Removing indexes would actually help, as fewer indexes have to be maintained on each insert, which speeds up the writing process (the mandatory _id index stays in any case).
Option 4 - This seems absurd, as hints apply to queries, not to inserts.
Option 5 - This is not actually possible, as writes cannot be made to the secondary nodes, so it is not a valid option.
Question 7
You have been tasked to cleanup a photosharing database. The database consists of two collections, albums, and images. Every image is supposed to be in an album, but there are orphan images that appear in no album. Here are some example documents (not from the collections you will be downloading).
When you are done removing the orphan images from the collection, there should be 90,017 documents in the images collection.
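To remove the orphans I wrote a Java program:
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;

import java.io.IOException;

/**
 * Removes every image that is not referenced by any album.
 *
 * @author Ankur Gupta
 */
public class Test {
    public static void main(String[] args) throws IOException {
        MongoClient c = new MongoClient(new MongoClientURI("mongodb://localhost"));
        DB db = c.getDB("finaltask");
        DBCollection album = db.getCollection("albums");
        DBCollection image = db.getCollection("images");
        DBCursor cur = image.find();
        while (cur.hasNext()) {
            // Advance inside the loop so that every image, including the last, is checked.
            Object id = cur.next().get("_id");
            // An image is an orphan if no album lists its _id in "images".
            DBCursor curalbum = album.find(new BasicDBObject("images", id));
            if (!curalbum.hasNext()) {
                image.remove(new BasicDBObject("_id", id));
            }
        }
    }
}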
To verify the result after removing the orphans:
db.albums.aggregate({$unwind:"$images"},{$group:{_id:null,sum:{$sum:"$images"},count:{$sum:1}}})
The result looks like:
{
    "result" : [
        {
            "_id" : null,
            "sum" : NumberLong("4501039268"),
            "count" : 90017
        }
    ],
    "ok" : 1
}
To prove you did it correctly, what is the total number of images with the tag 'sunrises' after the removal of orphans?
db.images.find({"tags":"sunrises"}).count()
This gives the final answer, 45044.
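The orphan check itself can be sketched without a database. The plain JavaScript below collects every image id referenced by an album into a set and then filters the images in one pass, instead of querying the albums once per image. The miniature dataset and the findOrphans helper are assumptions for illustration, not the exam data.

```javascript
// Hypothetical miniature dataset: image 3 belongs to no album.
const albums = [
  { _id: 0, images: [1, 2] },
  { _id: 1, images: [2, 4] },
];
const images = [{ _id: 1 }, { _id: 2 }, { _id: 3 }, { _id: 4 }];

// Collect every referenced image id once, then keep the unreferenced images.
function findOrphans(albumDocs, imageDocs) {
  const referenced = new Set();
  for (const album of albumDocs) {
    for (const id of album.images) referenced.add(id);
  }
  return imageDocs
    .filter((img) => !referenced.has(img._id))
    .map((img) => img._id);
}

const orphans = findOrphans(albums, images);
console.log(orphans);
```

The ids returned here are the ones that would then be removed from the images collection.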
Question 8:
Suppose we executed the following Java code. How many animals will be inserted into the "animals" collection?
public class Question8 {
    public static void main(String[] args) throws IOException {
        MongoClient c = new MongoClient(new MongoClientURI("mongodb://localhost"));
        DB db = c.getDB("test");
        DBCollection animals = db.getCollection("animals");
        BasicDBObject animal = new BasicDBObject("animal", "monkey");
        animals.insert(animal);
        animal.removeField("animal");
        animal.append("animal", "cat");
        animals.insert(animal);
        animal.removeField("animal");
        animal.append("animal", "lion");
        animals.insert(animal);
    }
}
When you run the above, an error is thrown saying that there is a duplicate ID, because we keep modifying and re-inserting the same document object. The first insert stores {_id: xxx, "animal": "monkey"}, and the generated _id stays on that object. When ("animal", "cat") is then inserted, the _id is the same, so a duplicate key error is thrown. So the answer is that only one document gets inserted.
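The behaviour can be imitated with a small stand-in collection in plain JavaScript. FakeCollection below is a hypothetical class written for this sketch, not the MongoDB driver; it assumes, as the Java driver does, that insert assigns an _id to the caller's object when none is present and rejects an _id that already exists.

```javascript
// Minimal stand-in for a collection with a unique _id index (not the real driver).
class FakeCollection {
  constructor() {
    this.docs = new Map();
    this.nextId = 1;
  }
  insert(doc) {
    // Assign an _id to the caller's object if it has none yet.
    if (doc._id === undefined) doc._id = this.nextId++;
    if (this.docs.has(doc._id)) {
      throw new Error("E11000 duplicate key: " + doc._id);
    }
    // Store a copy so later changes to the caller's object do not leak in.
    this.docs.set(doc._id, { ...doc });
  }
}

const animals = new FakeCollection();
const animal = { animal: "monkey" };
let error = null;
try {
  animals.insert(animal);   // stored with a fresh _id
  animal.animal = "cat";    // same object, same _id
  animals.insert(animal);   // rejected as a duplicate
  animal.animal = "lion";
  animals.insert(animal);   // never reached
} catch (e) {
  error = e;
}
console.log(animals.docs.size, error && error.message);
```

Only the monkey document ends up stored, and the exception from the second insert stops the program before the third insert runs.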
Question 9:
Imagine an electronic medical record database designed to hold the medical records of every individual in the United States. Because each person has more than 16MB of medical history and records, it's not feasible to have a single document for every patient. Instead, there is a patient collection that contains basic information on each person and maps the person to a patient_id, and a record collection that contains one document for each test or procedure. One patient may have dozens or even hundreds of documents in the record collection.
We need to decide on a shard key to shard the record collection. What's the best shard key for the record collection, provided that we are willing to run scatter gather operations to do research and run studies on various diseases and cohorts? That is, think mostly about the operational aspects of such a system.
patient_id
_id
primary care physician (your principal doctor)
date and time when medical record was created
patient first name
patient last name
Among the options given, the most favourable shard key is patient_id. There is a large number of distinct patient_id values, so the records spread well across the shards, and all the records of one patient stay together, which suits the day-to-day operational queries for a single patient. Research across diseases and cohorts can then run as scatter gather operations, which the question says is acceptable.
The other options are less suitable for the operational workload.
Question 10:
Understanding the output of explain. We perform the following query on the Enron dataset:
db.messages.find({'headers.Date':{'$gt': new Date(2001,3,1)}},{'headers.From':1, _id:0}).sort({'headers.From':1}).explain()
and get the following explain output.
{
    "cursor" : "BtreeCursor headers.From_1",
    "isMultiKey" : false,
    "n" : 83057,
    "nscannedObjects" : 120477,
    "nscanned" : 120477,
    "nscannedObjectsAllPlans" : 120581,
    "nscannedAllPlans" : 120581,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 250,
    "indexBounds" : {
        "headers.From" : [
            [
                { "$minElement" : 1 },
                { "$maxElement" : 1 }
            ]
        ]
    },
    "server" : "Andrews-iMac.local:27017"
}
The query did not utilize an index to figure out which documents match the find criteria.
The query used an index for the sorting phase.
The query returned 120,477 documents
The query performed a full collection scan
Here the correct options are:
Option 1 is correct: "cursor" : "BtreeCursor headers.From_1" shows that the headers.From_1 index was used, and headers.From appears only in the sort, not in the find clause.
Option 2 is also correct: the same "cursor" : "BtreeCursor headers.From_1" shows that the index was used for the sorting phase.
Option 3 is wrong: the query returned 83057 documents, as n = 83057.
Option 4 is correct: nscannedObjects is 120477, so every document was scanned.
I hope the above explanations prove helpful. Please leave your comments and suggestions for better ways to solve any of the questions.
Are these answers correct?
Yes, I have solved them all, and these very answers were 10 out of 10 correct.
Thanks.
Thanks Ankur!! You've solved all my doubts.
I had this solution for Q.2:
db.messages.aggregate([
{$project: {
from: "$headers.From",
to: "$headers.To"
}},
{$unwind: "$to"},
{$project: {
pair: {
from: "$from",
to: "$to"
},
count: {$add: [1]}
}},
{$group: {
_id: "$pair",
count: {$sum: 1}
}},
{$sort: {
count: -1
}},
{$limit: 2},
{$skip: 1}
])
Which query is correct? The two give different results.
I can guarantee the correctness of the query I have given in the blog, along with the explanation.
Thanks
Question 5:
a_1_c_1 -- This index is used in the find operation as find is on a,c
----
The answer is correct, but the explanation is wrong. In find({...},{'a':1,'c':1}) the second argument is the projection, so it doesn't use an index, but a_1_c_1 can be partially used for the part {'a':{'$lt':10000}, 'b':{'$gt': 5000}}
Hello, I was very encouraged to find this site. The reason being that this is such an informative post. I wanted to thank you for this informative read of the subject.
Welcome!!
I am really glad that this post came in handy for you!!
Cheers!!
Can you explain question 2 a little more, how it works?
Question 6, "Remove all indexes from the collection"? You can't remove ALL indexes! (_id)
Not sure how question 10, option 4 can be correct if
(a) the cursor name is not BasicCursor, which is the sign of a full scan
(b) nscannedObjectsAllPlans is more than nscannedObjects, which means that we have more than 120477 objects in the collection
It's definitely NOT a full scan; your answer is misleading.
nscannedObjectsAllPlans simply shows the extra scans that the find operation did. nscannedObjects shows all of the objects scanned, while nscannedObjectsAllPlans shows all objects scanned across all the plans that were tried. Since the Enron dataset that was given only has 120477 documents in it, and nscanned is 120477, the entire collection was scanned.
I too agree with you.
"mongoimport -d enron -c messages > enron.json" - well, I think the operator is incorrect. It should be:
mongoimport -d enron -c messages < enron.json
Hi! I don't have the option to submit any validation code for question number 3. It just asks me to enter the query in the shell and submit. However, here you mention a validation code for question 3. Why is that so?
Your query for question 1 is wrong:
db.messages.find({"headers.To":"andrew.fastow@enron.com","headers.From":"jeff.skilling@enron.com"}).count()
The correct query is:
db.messages.find({"headers.From":"andrew.fastow@enron.com","headers.To":"jeff.skilling@enron.com"}).count()
headers.From is Andrew and To is Jeff.
Hi, the answer to Question 7 is wrong: the answer is 44787, and there is no option of 45044.
Hello Ankur, as far as I can see your answers are correct, but can you please post the new answer to question seven, as the data has been slightly changed by the MongoDB people?
The question 10 option 4 explanation is wrong. We can't know from nscanned whether it has scanned everything, since we don't know the total number of documents. But we can see from indexBounds that it has scanned from $minElement to $maxElement, which means it has scanned all values in the index, from min to max.
The nscannedAllPlans value (120581) is bigger than nscanned (120477), but that doesn't mean that there are >=120581 documents, as someone suggested in a comment. nscannedAllPlans is the total number of scanned items when all plans are summed. So this means that mongo scanned 120581-120477=104 items with other plans before it decided that the one shown by explain() is the best.
right, full collection scan means not using the default BasicCursor, not any other index.
Question 5: it's not possible to use the index on c, since the query is on the "a" and "b" attributes!
It's possible to use _id of course, because it takes less space in memory anyway, and there's no need to load the whole collection while doing a query either.
My answer is that all indexes could be used!
Better performance for question 7:
BasicDBObject query = new BasicDBObject();
BasicDBObject fields = new BasicDBObject("images", 1);
DBCursor albumCursor = albumsCollection.find(query, fields);
Set<Integer> existedImages = new HashSet<Integer>();
System.out.println("Start Fetching album images ... ");
while (albumCursor.hasNext()) {
    try {
        BasicDBList images = (BasicDBList) albumCursor.next().get("images");
        // Build album image ids
        for (Object oo : images) {
            existedImages.add((Integer) oo);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
// Fetch all image ids
Set<Integer> imageIds = new HashSet<Integer>(imagesCollection.distinct("_id"));
System.out.println("Start verifying orphan images ... ");
// Verify orphan images: drop every id that some album references
for (Integer existedImage : existedImages) {
    if (imageIds.contains(existedImage)) {
        imageIds.remove(existedImage);
    }
}
System.out.println("Start Deleting " + imageIds.size() + " images");
for (Integer imageId : imageIds) {
    imagesCollection.remove(new BasicDBObject("_id", imageId));
}
System.out.println("Finished");
Cool. Could you provide the full program and dataset to run your program? I use IntelliJ IDEA to compile. After adding a declaration for main() and adding the jar file for MongoDB, as in
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import java.util.HashSet;
import java.util.Set;
import java.io.IOException;
public class question7 {
public static void main(String[] args) throws IOException {
BasicDBObject query = new BasicDBObject();
BasicDBObject fields = new BasicDBObject("images", 1);
DBCursor albumCursor = albumsCollection.find(query, fields);
...
IntelliJ tells me it cannot find symbol albumsCollection.
Hi Ankur Gupta
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:188)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:116)
I do not know why (I put your file in the same folder as the two JSON files). In fact, I ran mongod first, and mongo in a second command window.
In Question 4:
Do I have to import the file posts.json?
Do I need to delete something?
Is this the correct answer for question 6?
.2
.3
Question 6, option 3: "Remove all indexes from the collection". It is impossible. Index { _id: 1 } can never be dropped. Otherwise, removing any index could improve write speed.
I learned so much, thanks.
This is cheating... and not helping anyone.
Good initiative taken by the author.
Thanks for this step-by-step walk-through! This article could help, since it gives a key to some problems...
Thank you. Your post helped me to understand MongoDB more.
Thank you, it helped me more in the exam.
For question 2 the answer is not susan.mara@enron.com to jeff.dasovich@enron.com. It should be susan.mara@enron.com to richard.shapiro@enron.com with 974 entries.
Here is the query I used:
db.messages.aggregate([{$project : { _id : 0, from : "$headers.From", to : "$headers.To" }}, {$unwind : "$to"}, {$group: {_id : {from: "$from", to: "$to"}, count: { $sum: 1 }}}, { $sort : { count : -1}}, { $limit: 3 }])
Removing the duplicates is the bit you forgot. Double grouping is required.
Will it work for the latest exam in 2015?
Are you doing this? I am working on it :)
No. The dataset for 2015 has changed a bit. Most questions will yield the same results as before. A few will have different numbers as answers.
How do I run the validation code? What is it and where is it available?
The Q. No. 2 answer by +Ankur Gupta will not work for large datasets. Since MongoDB 2.6, there is an allowDiskUse: true parameter for aggregate queries.
Hence the solution is modified now as:
db.runCommand({
    aggregate: "messages",
    pipeline: [
        { $project: { from: "$headers.From", to: "$headers.To" } },
        { $unwind: "$to" },
        { $group: { _id: { _id: "$_id", from: "$from", to: "$to" } } },
        { $group: { _id: { from: "$_id.from", to: "$_id.to" }, count: { $sum: 1 } } },
        { $sort: { count: -1 } },
        { $limit: 2 }
    ],
    allowDiskUse: true
})
Q. No. 7
Ankur's Java solution still works fine, though it seems the program hits the MongoDB database serially and very heavily. The orphan removal finished in 10 minutes or so!
I will try to work on a solution that can speed this up.
db.albums.aggregate({$unwind:"$images"},{$group:{_id:null,sum:{$sum:"$images"},count:{$sum:1}}}) returns
{
    "result" : [
        {
            "_id" : null,
            "sum" : 4486251271.0000000000000000,
            "count" : 89737.0000000000000000
        }
    ],
    "ok" : 1.0000000000000000
}
and
db.images.find({"tags":"sunrises"}).count() returns 44787.
The answer (for 2015 Q7) is a) 44787
Q. No. 8
Output
02:26:18.652 [main] INFO org.mongodb.driver.connection - Opened connection [connectionId{localValue:2, serverValue:29}] to 127.0.0.1:27017
02:26:18.677 [main] DEBUG org.mongodb.driver.protocol.insert - Inserting 1 documents into namespace test.animals on connection [connectionId{localValue:2, serverValue:29}] to server 127.0.0.1:27017
02:26:19.438 [main] DEBUG org.mongodb.driver.protocol.insert - Insert completed
02:26:19.439 [main] DEBUG org.mongodb.driver.protocol.insert - Inserting 1 documents into namespace test.animals on connection [connectionId{localValue:2, serverValue:29}] to server 127.0.0.1:27017
Exception in thread "main" com.mongodb.MongoWriteException: E11000 duplicate key error index: test.animals.$_id_ dup key: { : ObjectId('55a425f2cf9e082184c58d9c') }
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:487)
at com.mongodb.MongoCollectionImpl.insertOne(MongoCollectionImpl.java:277)
at com.mongodb.test.OrphanDataRemover.run(OrphanDataRemover.java:25)
at com.mongodb.test.OrphanDataRemover.main(OrphanDataRemover.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Hence the answer is that 1 record got inserted.
Hi Ankur, I have 5+ years of overall experience in Java/J2EE, Struts and JPA. Can you please guide me on how I can clear the MongoDB interviews?
Do you have the answers for the 2016 MongoDB for Java Developers exam?
Nice blog. Thanks for sharing such great information.Inwizards offers Mongo database services for our Mongodb Client. Start mongodb development with our skilled and experienced mongodb developers. Intrested click here - Mongo Database Services
ReplyDeleteThe most effective method to Solve MongoDB Error Message 11000 through MongoDB Technical Support
ReplyDeleteAt whatever point you confront this Error 11000 which basically signifies "Copy Key Error Index" it implies the file has two passages for a similar ID. To get very best arrangement with respect to this issue, contact to MongoDB Online Support or MongoDB Customer Support USA. We at Cognegic empower proactive checking and observing of your whole database and help to keep the issue before they happen. We just consider what is essential for the best practice and diminish single purpose of disappointment.
For More Info: https://cognegicsystems.com/
Contact Number: 1-800-450-8670
Email Address- info@cognegicsystems.com
Company’s Address- 507 Copper Square Drive Bethel Connecticut (USA) 06801
ReplyDeleteNice blog..! I really loved reading through this article. Thanks for sharing such
a amazing post with us and keep blogging...
mongo db online training in hyderabad
Thanks for sharing information. It’s quite interesting. You can also get complete info on MangoDB here
MongoDB Training in
Hyderabad
thanks for your information visit us at MongoDB Training in Hyderabad
ReplyDeleteThank you for sharing such an awesome post. Its very lovely to read and i like your sentence formation.
ReplyDeleteTableau Online Training In Hyderabad
Tableau Online Training In Bangalore, Pune, Noida.
It is very good and useful for students and developer .Learned a lot of new things from your post. Thank you so much.
ReplyDeleteMicroservices Training in Hyderabad
Good and nice article.Please visit our website.
ReplyDeleteMongoDB Training in Hyderabad
Thanks for sharing this blog post,Nice written skill Java online course Bangalore
ReplyDeleteThanks for sharing.
ReplyDeleteMachine Learning training in Pallikranai Chennai
Pytorch training in Pallikaranai chennai
Data science training in Pallikaranai
Python Training in Pallikaranai chennai
Deep learning with Pytorch training in Pallikaranai chennai
Bigdata training in Pallikaranai chennai
Mongodb training in Pallikaranai chennai
Digital marketing company in coimbatore and SEO company in coimbatore. We provide Website design, e-commerce website design , web application development ,mobile application development in android and ios, SEO services, digital marketing services ,content writing agency,graphic design ,branding service for all kinds of business.
ReplyDeleteWhich of the Following are Factors Related to Search Engine Optimization (SEO)
Adelaide Search Engine Optimisation
ecommerce website development company in coimbatore
Web development company in Coimbatore
Website Design Company in Erode
Content Writing for Agencies
Web design company in Tirupur
what is happiness quotes