How to version-control docx documents

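Committing a docx file to git is easy; the hard part of version-controlling docx files is comparing the differences between versions. The approach, in brief: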

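  1. Use pandoc to convert the docx file to an md file (this works only for docx, not doc)
  2. Configure pre-commit and post-commit git hooks so that git commit automatically generates the md file and commits it to the repository
  3. The md file serves as a copy of the docx and can be inspected directly with diff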
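The conversion in step 1 can be sketched as a pair of small shell helpers. This is a minimal illustration, not part of the referenced setup: the function names `md_path_for` and `docx_to_md` are hypothetical, and it assumes pandoc is already installed and on the PATH.

```shell
# Sketch: convert one Word file to a Markdown copy next to it.
# md_path_for and docx_to_md are hypothetical helper names.

# Derive the Markdown path from a .docx path: paper/draft.docx -> paper/draft.md
md_path_for() {
  printf '%s\n' "${1%.docx}.md"
}

# Run pandoc on a single .docx file (legacy .doc files are not supported).
docx_to_md() {
  pandoc --from=docx --to=markdown --output="$(md_path_for "$1")" "$1"
}
```

For example, `docx_to_md paper/draft.docx` would write `paper/draft.md`, which plain `diff` or `git diff` can then compare line by line (`paper/draft.docx` is a placeholder path).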

Reference: https://github.com/vigente/gerardus/wiki/Integrate-git-diffs-with-word-docx-files

1. Install pandoc

2. Tell git how to handle diffs of .docx files.

  1. Create or edit file ~/.gitconfig (Linux, Mac) or "c:\Documents and Settings\user\.gitconfig" (Windows) to add

    [diff "pandoc"]
      textconv = pandoc --to=markdown
      prompt = false
    [alias]
      wdiff = diff --word-diff=color --unified=1
  2. In your paper directory, create or edit file .gitattributes (Linux, Windows and Mac) to add

    *.docx diff=pandoc
  3. You can commit .gitattributes so that it stays with your paper for use on other computers, but you’ll need to edit ~/.gitconfig on every new computer you want to use.
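The same settings can also be written from the command line instead of editing the files by hand. The function below is a sketch under two assumptions that are not in the original instructions: the name `setup_docx_diff` is hypothetical, and the settings go into the repository-local config (`.git/config`) rather than `~/.gitconfig`.

```shell
# Sketch: apply the docx diff settings to the current repository.
# Assumption: repo-local config is used; add --global to each
# "git config" call to write ~/.gitconfig instead.
setup_docx_diff() {
  # Convert .docx to Markdown text before diffing
  git config diff.pandoc.textconv "pandoc --to=markdown"
  git config diff.pandoc.prompt false
  # Word-level coloured diff with one line of context
  git config alias.wdiff "diff --word-diff=color --unified=1"
  # Route every .docx through the "pandoc" diff driver, unless already listed
  grep -qxF '*.docx diff=pandoc' .gitattributes 2>/dev/null ||
    printf '%s\n' '*.docx diff=pandoc' >> .gitattributes
}
```

After running `setup_docx_diff` from the paper directory, `git diff paper.docx` or `git wdiff paper.docx` should show changes as Markdown text (`paper.docx` is a placeholder file name).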

3. Configure git hooks

This only works on Linux/Mac, or on Windows when running git from a bash shell.

  1. Install pandoc. Pandoc is a program to convert between different file formats. It’s going to allow us to convert Word files (.docx) to Markdown (.md).

  2. Set up git hooks to enable automatic generation and tracking of Markdown copies of .docx files.

    Copy these hook files to your git project’s .git/hooks directory and rename them to pre-commit and post-commit (git only runs hooks with these exact names), or soft-link to them with ln -s, and make them executable (chmod u+x *.sh):

    Now every time you run git commit, the pre-commit hook will automatically run before you see the window to enter the log message. The hook is a script that makes a copy in Markdown format (.md) of every .docx file you are committing. The post-commit hook then amends the commit adding the .md files.
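The behaviour described above can be sketched as two shell functions. This is an illustration only and not the hook files from the referenced wiki: the function names and the marker file `.commit-amend-markdown` are assumptions of this sketch. The marker file is how the pre-commit step tells the post-commit step which .md files to add.

```shell
# Sketch of the two hooks; not the original scripts from the reference.
# Assumption: a marker file in the repository root (hypothetical name)
# passes the list of generated .md files from one hook to the other.
MARKER=.commit-amend-markdown

# Body for .git/hooks/pre-commit: convert every staged .docx to Markdown.
pre_commit_docx() {
  git diff --cached --name-only --diff-filter=ACM | grep '\.docx$' |
  while IFS= read -r docx; do
    md="${docx%.docx}.md"
    # Record the .md file only if the conversion succeeded
    pandoc --from=docx --to=markdown --output="$md" "$docx" &&
      printf '%s\n' "$md" >> "$MARKER"
  done
}

# Body for .git/hooks/post-commit: add the generated files and amend the commit.
post_commit_docx() {
  # Nothing to do when pre-commit generated no Markdown copies
  [ -f "$MARKER" ] || return 0
  while IFS= read -r md; do
    git add -- "$md"
  done < "$MARKER"
  # Remove the marker first: the amend triggers post-commit again,
  # and without the marker that second run is a no-op.
  rm -f "$MARKER"
  git commit --amend --no-edit --no-verify --quiet
}
```

In this sketch, `.git/hooks/pre-commit` would contain these definitions followed by a call to `pre_commit_docx`, and `.git/hooks/post-commit` the same definitions followed by `post_commit_docx`. Note that pandoc reads the working-tree copy of each file, so a .docx with unstaged changes would be converted in its unstaged state.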