https://code.google.com/p/nsscache/wiki/BackgroundOnNameServiceSwitch
The POSIX API
POSIX is a standard that defines an operating system interface and its environment: the available library calls, utilities, environment variables, escape sequences, regexps, when to take coffee breaks (aka how long your code takes to compile), etc.
GNU/Linux is (generally) POSIX compliant.
The relevant component of POSIX is the definition of function calls to access directory information -- databases of people/groups/hosts/etc.
Here are some examples of the POSIX API functions, which are how applications access information in the system databases.
- get*nam() -> get a database entry by its human readable name
- get*id() -> get a database entry by its computer readable identifier (e.g. a numeric uid or gid)
- get*ent() -> get the next entry in a database; a mechanism for iterating over the entire database
(The asterisk replaces the short name of the database being accessed.)
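As a rough illustration of the three call shapes above, here is a minimal sketch using Python's standard pwd and grp modules, which wrap the C library's getpw*() and getgr*() calls. The helper names (lookup_user_by_name, lookup_user_by_uid, all_group_names) are made up for this example and are not part of POSIX or nsscache.

```python
import grp
import pwd


def lookup_user_by_name(name):
    """get*nam(): fetch a passwd entry by its human readable name."""
    try:
        return pwd.getpwnam(name)
    except KeyError:
        # The name is not present in any configured data source.
        return None


def lookup_user_by_uid(uid):
    """get*id(): fetch a passwd entry by its numeric uid."""
    try:
        return pwd.getpwuid(uid)
    except KeyError:
        return None


def all_group_names():
    """get*ent(): walk the whole group database, one entry at a time."""
    return [entry.gr_name for entry in grp.getgrall()]


if __name__ == "__main__":
    # Print the first few users, looked up again by name and by uid.
    for entry in pwd.getpwall()[:3]:
        print(lookup_user_by_name(entry.pw_name), lookup_user_by_uid(entry.pw_uid))
    print(all_group_names()[:5])
```

Which data source actually answers each call (a local file, LDAP, and so on) is invisible to the caller.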
These functions get called all the time, for example:
- at login (to find out who you are and what your groups are)
- ls -l (mapping uid/gid of a file to username/group)
- resolving hostnames to IP addresses
- many others: NIS netgroups, automount locations, rpc names, TCP and UDP protocol names
It doesn't matter for the most part that these API calls are made all the time, because when the API was designed, the database that stored this information was a plain text file on the local machine, and accessing that is both fast and 100% reliable (ignoring of course hardware issues on the local machine, at which point you have bigger problems :-).
As we got bigger networks and lots of shared computing infrastructure, we moved to directory services. /etc/hosts stopped scaling, so we got DNS, and it all went downhill from there.
System administrators wanted to get the system databases from other sources like NIS, NIS+, LDAP, Hesiod (gag), DNS, etc. To facilitate that, you want easy runtime configuration, because different types of data may need to be stored in different places -- users in /etc/passwd versus hosts in DNS.
First implemented by Sun, this was dubbed the name service switch, or NSS for short.
The Name Service Switch
Perhaps you're familiar with the Name Service Switch configuration file, /etc/nsswitch.conf:
passwd: compat files
group: compat files
shadow: compat files
hosts: files dns
On the right hand side of the colon are the data sources, where NSS will go to retrieve the system database. It progresses left to right, checking each source in turn until the data is found.
On the left hand side of the colon are the groupings of data, the databases themselves, which we are calling "maps" -- in this example, the passwd database API functions are mapped to the "compat" and "files" data sources.
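To make the left-hand/right-hand split concrete, here is a small sketch that parses nsswitch.conf-style text into a mapping from each map to its ordered list of data sources. It is a simplification and not how GNU libc parses the file: parse_nsswitch and SAMPLE are names invented for this example, and bracketed action items such as [NOTFOUND=return] are simply dropped.

```python
import re

# Sample text mirroring the configuration shown above.
SAMPLE = """\
# /etc/nsswitch.conf
passwd: compat files
group: compat files
shadow: compat files
hosts: files dns
"""


def parse_nsswitch(text):
    """Return {map name: [data sources in lookup order]}."""
    maps = {}
    for line in text.splitlines():
        # Strip comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        database, _, sources = line.partition(":")
        # Simplification: discard action items like [NOTFOUND=return].
        sources = re.sub(r"\[[^\]]*\]", " ", sources)
        maps[database.strip()] = sources.split()
    return maps


if __name__ == "__main__":
    for database, sources in parse_nsswitch(SAMPLE).items():
        print(database, "->", sources)
```

The order of each list matters, since sources are consulted left to right.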
For our own convenience, this document will refer to both the POSIX API described above and the GNU libc implementation of the Name Service Switch as "NSS".
# /etc/nsswitch.conf
passwd: files
When an NSS function is called, the NSS implementation reads its configuration file /etc/nsswitch.conf, which names the library that implements the data retrieval. NSS dynamically loads this library, in this example, libnss_files.so. The correct function within this library is then called, for example _nss_files_getpwuid_r().
libnss_files then opens and parses /etc/passwd, and returns the matching entry (typically as a struct, such as struct passwd).
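The parsing step can be approximated in a few lines. The sketch below is only an illustration of what a files-style backend does with passwd-format text, not the actual libnss_files code; PasswdEntry, parse_passwd, getpwuid_from_text and the sample data are all hypothetical names and values chosen for this example.

```python
from collections import namedtuple

# Field order follows the seven colon-separated fields of /etc/passwd.
PasswdEntry = namedtuple("PasswdEntry", "name passwd uid gid gecos home shell")

# Made-up sample data in passwd format.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice Example:/home/alice:/bin/bash
"""


def parse_passwd(text):
    """Parse passwd-format text into a list of PasswdEntry tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split(":")
        if len(fields) != 7:
            continue  # blank, comment, or malformed line
        name, passwd, uid, gid, gecos, home, shell = fields
        if not (uid.isdigit() and gid.isdigit()):
            continue  # e.g. compat-style "+" lines, ignored in this sketch
        entries.append(PasswdEntry(name, passwd, int(uid), int(gid), gecos, home, shell))
    return entries


def getpwuid_from_text(text, uid):
    """Return the first entry with the given uid, or None if absent."""
    for entry in parse_passwd(text):
        if entry.uid == uid:
            return entry
    return None


if __name__ == "__main__":
    print(getpwuid_from_text(SAMPLE_PASSWD, 1000))
```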
NSS + RFC 2307 LDAP
# /etc/nsswitch.conf
passwd: files ldap
Add in a directory service, and you get a situation familiar to many sysadmins. /etc/nsswitch.conf would now also list ldap in addition to files in this example.
If NSS were to load libnss_files.so, and find nothing, it would then load libnss_ldap.so. libnss_ldap.so would make a network connection to the LDAP server, perform a query, and convert the LDAP results into the right return structure.
This means that every query will translate into a TCP connection with handshake overhead, possibly over SSL with its crypto overhead, and then do various ASN.1 and BER en- and decodings within the LDAP protocol itself...
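The left-to-right fallthrough described above can be sketched as follows. This is a toy model under stated assumptions: the two dictionaries stand in for /etc/passwd and an LDAP server, no network traffic happens, and lookup, BACKENDS, FILES_DB and LDAP_DB are names invented for the example.

```python
# Stand-ins for the real data sources (hypothetical sample data).
FILES_DB = {"root": 0}
LDAP_DB = {"alice": 1000}

# Each backend is a callable that returns a value, or None when not found.
BACKENDS = {
    "files": FILES_DB.get,
    "ldap": LDAP_DB.get,
}


def lookup(sources, backends, key):
    """Try each configured source in order; stop at the first answer."""
    for source in sources:
        backend = backends.get(source)
        if backend is None:
            continue  # no module available for this source
        result = backend(key)
        if result is not None:
            return source, result
    return None


if __name__ == "__main__":
    order = ["files", "ldap"]  # as in "passwd: files ldap"
    print(lookup(order, BACKENDS, "root"))
    print(lookup(order, BACKENDS, "alice"))
    print(lookup(order, BACKENDS, "mallory"))
```

In the real system, the second backend is where the connection, crypto and encoding costs are paid, and they are paid on every lookup that falls through to it.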
Name Service Cache Daemon
So we also typically run a caching daemon, provided by GNU libc, called nscd.
It's accessed via a UNIX socket, and though poorly demonstrated by this diagram, loads the nss modules itself in order to act as a hit-and-miss cache.
It has several threads so that it can respond to several requests at the same time.
If the cache has the response, it returns it straight away. If not, it dlopens the NSS module, e.g. libnss_ldap.so, waits for the reply, caches it, and then returns it.
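The hit-and-miss behaviour can be modelled in a few lines. The sketch below is not nscd's implementation: LookupCache is a hypothetical class, the backend callable stands in for a loaded NSS module, and the expiry time (ttl_seconds) is an assumption added so that entries do not live forever.

```python
import time


class LookupCache:
    """Toy hit-and-miss cache in front of a slow lookup function."""

    def __init__(self, backend, ttl_seconds=600.0, clock=time.monotonic):
        self._backend = backend    # stands in for the NSS module
        self._ttl = ttl_seconds    # assumed expiry, not taken from nscd
        self._clock = clock
        self._entries = {}         # key -> (expiry time, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            # Hit: answer straight away from the cache.
            self.hits += 1
            return cached[1]
        # Miss: ask the backend, remember the reply, then return it.
        self.misses += 1
        value = self._backend(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    cache = LookupCache(lambda name: {"alice": 1000}.get(name))
    cache.get("alice")
    cache.get("alice")
    print(cache.hits, cache.misses)
```

Injecting the clock keeps the expiry logic testable without sleeping.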