We would have a list of documents sorted by user
and then subsorted by date, so it would look something like the following:
User 123 on March 13, 2010
User 123 on March 12, 2010
User 123 on March 11, 2010
User 123 on March 5, 2010
User 123 on March 4, 2010
User 124 on March 12, 2010
User 124 on March 11, 2010
...
This looks fine at this scale, but imagine that the application has millions of users, each
posting dozens of status updates per day. If the index entries for each user’s status messages
take up a page’s worth of space on disk, then every “latest statuses” query for a different
user will force the database to load a different page into memory. This will be very slow if
the site becomes popular enough that not all of the index fits into memory.
If we flip the index order to {date : -1, user : 1}, the most recent entries for all users
sit together at the front of the index. The database can then keep the last couple of days
of the index in memory, swap less, and thus query for the latest statuses for any user much
more quickly.
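The effect of key order on locality can be illustrated with a small simulation in plain JavaScript. This is only a sketch under stated assumptions, not how the database stores its index: the user count, the number of days, the fixed `ENTRIES_PER_PAGE` capacity, and the helper names `buildEntries`, `compoundCompare`, and `pagesHoldingLatestDay` are all hypothetical, chosen for illustration. It sorts the same entries under both key orders and counts how many "pages" hold the entries for the most recent day.

```javascript
// Hypothetical data set: 1000 users, one status per user per day for 30 days.
const USERS = 1000;
const DAYS = 30;
const ENTRIES_PER_PAGE = 100; // assumed page capacity, for illustration only

function buildEntries() {
  const entries = [];
  for (let user = 0; user < USERS; user++) {
    for (let day = 0; day < DAYS; day++) {
      entries.push({ user, day }); // a larger day number is more recent
    }
  }
  return entries;
}

// Build a comparator for a compound key such as [["user", 1], ["day", -1]].
function compoundCompare(keys) {
  return (a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] !== b[field]) {
        return direction * (a[field] - b[field]);
      }
    }
    return 0;
  };
}

// Count the distinct pages that hold entries from the most recent day.
function pagesHoldingLatestDay(keys) {
  const sorted = buildEntries().sort(compoundCompare(keys));
  const pages = new Set();
  sorted.forEach((entry, position) => {
    if (entry.day === DAYS - 1) {
      pages.add(Math.floor(position / ENTRIES_PER_PAGE));
    }
  });
  return pages.size;
}

const userFirst = pagesHoldingLatestDay([["user", 1], ["day", -1]]);
const dateFirst = pagesHoldingLatestDay([["day", -1], ["user", 1]]);
console.log({ userFirst, dateFirst });
```

With these assumed numbers, the user-first ordering spreads the latest day's entries across all 300 pages, while the date-first ordering packs them into the first 10, which is the locality argument made above.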
When deciding which indexes to create, there are several questions to keep in mind:
1. What queries are you running? Some of the keys they use will need to be indexed.
2. What is the correct direction for each key?
3. How is this going to scale? Is there a different ordering of keys that would keep
more of the frequently used portions of the index in memory?
If you can answer these questions, you are ready to index your data.
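The question about key direction can be made concrete with a small check. The sketch below is a simplified model, not a database API: `indexSupportsSort` is a hypothetical helper, and it assumes that a sort can use an index when the sort keys form a prefix of the index keys and every direction either matches the index or every direction is inverted. It ignores query filters entirely.

```javascript
// Simplified model: can this index return documents already in sort order?
// Assumption: the sort keys must be a prefix of the index keys, and the
// directions must either all match the index or all be the inverse of it.
function indexSupportsSort(index, sort) {
  const indexKeys = Object.entries(index);
  const sortKeys = Object.entries(sort);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) {
    return false;
  }
  let sameDirection = true;
  let reversedDirection = true;
  sortKeys.forEach(([field, direction], i) => {
    const [indexField, indexDirection] = indexKeys[i];
    if (field !== indexField) {
      // A different key at this position means the index order is unusable.
      sameDirection = false;
      reversedDirection = false;
      return;
    }
    if (direction !== indexDirection) sameDirection = false;
    if (direction !== -indexDirection) reversedDirection = false;
  });
  return sameDirection || reversedDirection;
}

const index = { user: 1, date: -1 };
console.log(indexSupportsSort(index, { user: 1, date: -1 }));  // same directions
console.log(indexSupportsSort(index, { user: -1, date: 1 }));  // fully inverted
console.log(indexSupportsSort(index, { user: 1, date: 1 }));   // mixed directions
```

Under this model the first two sorts can walk the index forward or backward, while the third cannot, which is why the direction of each key matters once an index covers more than one key.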