[tools & tips] Generating a linked table of contents for the current page with Sphinx

The problem with generating a per-page table of contents in Sphinx

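When writing a blog in reStructuredText, each post becomes its own web page. I wanted Sphinx to generate a table of contents for each page, covering that page only, but after a long search it appears that Sphinx does not support this directly. Others have asked similar questions online.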

The rest of this post discusses solutions for two practical scenarios.

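Single-document mode: the Sphinx project consists of a single document, index.
Multi-level document mode: the Sphinx project contains several levels. Suppose the master document of the whole project is index, and the project has subdirectories A, B and C whose master documents are Aindex, Bindex and Cindex respectively.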

One workaround

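In versions of Sphinx after v1.1.3, the problem can be solved indirectly by creating a separate project for each blog post. Suppose the blog file is index.rst: list the file itself in its own toctree. This triggers a Sphinx warning, but it is enough for now to generate a table of contents for the page.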

Contents
================================

.. toctree::

   index

This approach does not solve the underlying problem, however, and it has several drawbacks:

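It only supports single-document mode, so every blog post needs its own project. In multi-level document mode, the recursive inclusion prevents the whole project from building its HTML.
The annoying warning is shown in glaring red.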

So this is only a stopgap, and another approach is needed.

The final solution

The official Sphinx documentation actually mentions a way to do this:

The special entry name self stands for the document containing the toctree directive. This is useful if you want to generate a “sitemap” from the toctree.

In other words, Sphinx has already anticipated this: the keyword "self" can be used to generate a table of contents for the current page.

Contents
================================

.. toctree::

   self

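In practice, however, self only produces an entry for the page's title. No links are generated for the page's other headings, such as its first- and second-level section headings.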

So the fix has to come from the source code. After getting a rough understanding of the Sphinx code, I found the following function, _entries_from_toctree, in sphinx-1.2.1-py2.7.egg\sphinx\environment.py.

        def _entries_from_toctree(toctreenode, parents,
                                  separate=False, subtree=False):
            """Return TOC entries for a toctree node."""
            refs = [(e[0], e[1]) for e in toctreenode['entries']]
            entries = []
            for (title, ref) in refs:
                try:
                    refdoc = None
                    if url_re.match(ref):
                        reference = nodes.reference('', '', internal=False,
                                                    refuri=ref, anchorname='',
                                                    *[nodes.Text(title)])
                        para = addnodes.compact_paragraph('', '', reference)
                        item = nodes.list_item('', para)
                        toc = nodes.bullet_list('', item)
                    elif ref == 'self':
                        # 'self' refers to the document from which this
                        # toctree originates
                        ref = toctreenode['parent']
                        if not title:
                            title = clean_astext(self.titles[ref])
                        reference = nodes.reference('', '', internal=True,
                                                    refuri=ref,
                                                    anchorname='',
                                                    *[nodes.Text(title)])
                        para = addnodes.compact_paragraph('', '', reference)
                        item = nodes.list_item('', para)
                        # don't show subitems
                        toc = nodes.bullet_list('', item)
                    else:
                        if ref in parents:
                            self.warn(ref, 'circular toctree references '
                                      'detected, ignoring: %s <- %s' %
                                      (ref, ' <- '.join(parents)))
                            continue
                        refdoc = ref
                        toc = self.tocs[ref].deepcopy()
                        self.process_only_nodes(toc, builder, ref)
                        if title and toc.children and len(toc.children) == 1:
                            child = toc.children[0]
                            for refnode in child.traverse(nodes.reference):
                                if refnode['refuri'] == ref and \
                                       not refnode['anchorname']:
                                    refnode.children = [nodes.Text(title)]
                     ...

One comment in this code states it plainly: when self is used, "don't show subitems".

So the task is clear: make it "show subitems".

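First, consider why writing index in the toctree (assuming the document is index.rst) causes recursion. While parsing the document, Sphinx treats it as the master document and builds its list of child documents. When it encounters index, it takes index to be a child document and adds a <toctree>index</toctree> node to the document itself.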

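When Sphinx then resolves <toctree>index</toctree>, it opens index, treats it as the master document, and adds another <toctree>index</toctree> inside the first one. This is recursion with no base case. Once it reaches a certain depth, Sphinx raises an error and breaks out of the recursion.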
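The runaway recursion described above can be reproduced outside Sphinx with a toy model. The sketch below is not Sphinx code: `naive_expand`, `expand` and the `SAMPLE` mapping are hypothetical names invented for illustration. It only shows why a document that lists itself has no base case, and how tracking the chain of parents supplies one.

```python
def naive_expand(doc, toctrees):
    """Expand every toctree entry of `doc`, with no cycle check."""
    return (doc, [naive_expand(entry, toctrees)
                  for entry in toctrees.get(doc, [])])


def expand(doc, toctrees, parents=()):
    """Expand `doc`, skipping entries that are already being expanded.

    `toctrees` maps a document name to the entries of its toctree;
    `parents` is the chain of documents currently under expansion.
    """
    children = []
    for entry in toctrees.get(doc, []):
        if entry == doc or entry in parents:
            continue  # circular reference: skip instead of recursing
        children.append(expand(entry, toctrees, parents + (doc,)))
    return (doc, children)


# Hypothetical project: index lists itself and one child document.
SAMPLE = {"index": ["index", "Aindex"], "Aindex": []}
```

With `SAMPLE`, calling `naive_expand("index", SAMPLE)` exhausts Python's recursion limit and raises `RecursionError`, while `expand("index", SAMPLE)` terminates because the self-reference is skipped.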

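What needs to be done is therefore simple: do not recurse when a document resolves itself. The method is also simple: walk the tree and delete the toctree nodes in it. Here I used a recursive depth-first search.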

...
elif ref == 'self':
    # 'self' refers to the document from which this
    # toctree originates
    ref = toctreenode['parent']
    if not title:
        title = clean_astext(self.titles[ref])
    reference = nodes.reference('', '', internal=True,
                                refuri=ref,
                                anchorname='',
                                *[nodes.Text(title)])
    para = addnodes.compact_paragraph('', '', reference)
    item = nodes.list_item('', para)
    # don't show subitems
    toc = nodes.bullet_list('', item)

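    # show subitems, unless this toctree is being expanded
    # as part of a parent document's toctree
    if ref not in parents:
        toc = self.tocs[ref].deepcopy()
        def clean_toctree(toc):
            # Depth-first: return True if `toc` is a childless
            # toctree node, so that the caller removes it.
            poplist = []
            if len(toc.children) > 0:
                for index, child in enumerate(toc.children):
                    if clean_toctree(child):
                        poplist.append(index)
            else:
                if isinstance(toc, addnodes.toctree):
                    return True
            # each pop shifts the remaining indices left by one
            for index, popindex in enumerate(poplist):
                toc.pop(popindex - index)
        clean_toctree(toc)
    else:
        # reached from a parent document: emit an empty entry
        para = addnodes.compact_paragraph('', '')
        item = nodes.list_item('', para)
        toc = nodes.bullet_list('', item)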

else:
    ...
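The pruning idea in the patch can be tried without Sphinx or docutils installed. The following is a simplified sketch, not the patch itself: `Node`, `prune` and `outline` are hypothetical stand-ins for docutils nodes. Unlike `clean_toctree`, which only removes toctree nodes that have no children, this version drops any node whose kind is `toctree`.

```python
class Node:
    """Hypothetical stand-in for a docutils node: a kind plus children."""

    def __init__(self, kind, *children):
        self.kind = kind
        self.children = list(children)


def prune(node, kind="toctree"):
    """Remove every descendant of `node` whose kind is `kind`, depth-first."""
    kept = []
    for child in node.children:
        if child.kind == kind:
            continue  # drop this node together with its subtree
        prune(child, kind)
        kept.append(child)
    node.children = kept


def outline(node):
    """Return the tree as nested (kind, children) tuples for inspection."""
    return (node.kind, [outline(child) for child in node.children])
```

Rebuilding the child list, rather than popping by index while iterating, avoids the index bookkeeping that the patch handles with `popindex - index`.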

There is one more check here, for the multi-level case in which Aindex also contains self (and Aindex has the subheadings a1, a2 and a3).

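When index, as the master document, generates its table of contents, the self entry in Aindex's toctree adds a1, a2 and a3 to the contents. When index then continues reading the body of Aindex below self, it adds a1, a2 and a3 again. The subheadings end up duplicated, which looks untidy in the table of contents.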

Hence the following check:

if ref not in parents:

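This line determines whether the self being resolved belongs to Aindex itself or is reached through a reference from the master document index. If it is reached from the master document, the subheadings a1, a2 and a3 are not added (because the master document reads the body of the document and adds them anyway).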
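The effect of the check can be illustrated with a small model. This is a sketch under assumptions, not Sphinx's actual resolution logic: `subitems_for_self`, `toc_seen_from_master` and the `headings` mapping are hypothetical. The model only assumes that the master document collects a child's headings once from its body and once more through self, unless the guard suppresses the second copy.

```python
def subitems_for_self(doc, parents, headings, guard=True):
    """Headings contributed by a `self` entry in `doc`'s toctree.

    With the guard on, nothing is contributed while `doc` is being
    expanded on behalf of a parent (that is, `doc` is in `parents`).
    """
    if guard and doc in parents:
        return []
    return list(headings[doc])


def toc_seen_from_master(doc, headings, guard=True):
    """Entries the master document ends up listing for `doc` (toy model)."""
    # Assumption: the master copies the headings from doc's body ...
    items = list(headings[doc])
    # ... and also resolves doc's toctree, where it meets `self`.
    items += subitems_for_self(doc, [doc], headings, guard)
    return items


# Hypothetical child document with three subheadings.
HEADINGS = {"Aindex": ["a1", "a2", "a3"]}
```

In this model, `toc_seen_from_master("Aindex", HEADINGS)` lists each subheading once, while passing `guard=False` lists each of them twice. Resolving Aindex on its own, with an empty `parents` list, still returns all three subheadings.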

Appendix: patch file for environment.py (Sphinx v1.2.1)