/usr/share/pyshared/grokmirror-0.3.5.egg-info is in grokmirror 0.3.5-1.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
Metadata-Version: 1.0
Name: grokmirror
Version: 0.3.5
Summary: Smartly mirror git repositories that use grokmirror
Home-page: https://www.kernel.org/pub/software/network/grokmirror
Author: Konstantin Ryabitsev
Author-email: mricon@kernel.org
License: GPLv3+
Description: GROKMIRROR
==========
--------------------------------------------
Framework to smartly mirror git repositories
--------------------------------------------
:Author: mricon@kernel.org
:Date: 2013-05-27
:Copyright: The Linux Foundation and contributors
:License: GPLv3+
:Version: 0.3.5
DESCRIPTION
-----------
Grokmirror was written to make mirroring large git repository
collections more efficient. Grokmirror uses the manifest file published
by the master mirror in order to figure out which repositories to
clone, and to track which repositories require updating. The process is
extremely lightweight and efficient both for the master and for the
mirrors.
CONCEPTS
--------
Grokmirror master publishes a json-formatted manifest file containing
information about all git repositories that it carries. The format of
the manifest file is as follows::
    {
      "/path/to/bare/repository.git": {
        "description": "Repository description",
        "reference": "/path/to/reference/repository.git",
        "modified": timestamp,
        "symlinks": [
          "/location/to/symlink",
          ...
        ],
      }
      ...
    }
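As an illustration of the format above, here is a minimal Python 3 sketch that reads a manifest and summarizes its entries. The function names ``load_manifest`` and ``describe`` are hypothetical and are not part of grokmirror; the sketch only assumes the manifest is a JSON object keyed by repository path, optionally gzip-compressed.

```python
import gzip
import json


def load_manifest(path):
    """Read a manifest file, transparently handling gzip compression."""
    # Gzip streams start with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as fh:
        is_gzipped = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzipped else open
    with opener(path, "rb") as fh:
        return json.loads(fh.read().decode("utf-8"))


def describe(manifest):
    """Return one summary line per repository, sorted by path."""
    lines = []
    for path in sorted(manifest):
        entry = manifest[path]
        lines.append("%s modified=%s reference=%s" % (
            path, entry.get("modified"), entry.get("reference")))
    return lines
```

Sniffing the magic bytes rather than trusting the file extension lets the same helper read both ``manifest.js`` and ``manifest.js.gz``.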
The manifest file is usually gzip-compressed to save bandwidth.
Each time a commit is pushed to one of the git repositories, a git hook
automatically updates the manifest file, so the manifest.js file always
contains the most up-to-date information about the repositories
provided by the git server and their last-modified dates.
The mirroring clients poll the manifest.js file regularly and download
the updated manifest if it is newer than the locally stored copy (using
the ``Last-Modified`` and ``If-Modified-Since`` HTTP headers).
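A conditional request of this kind can be sketched in Python 3 as follows. This is only an illustration of the HTTP mechanism, not grok-pull's actual code; ``build_manifest_request`` and its parameters are hypothetical names.

```python
import email.utils
import urllib.request


def build_manifest_request(url, local_mtime=None):
    """Build a GET request that asks for the manifest only if it changed.

    local_mtime is the modification time (seconds since the epoch) of the
    locally stored manifest, or None if there is no local copy yet.
    """
    request = urllib.request.Request(url)
    if local_mtime is not None:
        # HTTP dates are always expressed in GMT.
        stamp = email.utils.formatdate(local_mtime, usegmt=True)
        request.add_header("If-Modified-Since", stamp)
    return request
```

When the server answers ``304 Not Modified``, ``urllib.request.urlopen`` raises ``urllib.error.HTTPError`` with ``code == 304``, which a caller can treat as "nothing to do".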
After downloading the updated manifest.js file, the mirrors will parse
it to find out which repositories have been updated and which new
repositories have been added.
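The comparison step can be pictured with a short Python 3 sketch. It assumes both manifests are already loaded as dictionaries and that a changed ``modified`` value is what marks a repository as updated; ``diff_manifests`` is a hypothetical helper, not a grokmirror function.

```python
def diff_manifests(old, new):
    """Compare two manifests and return (added, updated, removed) paths."""
    # Repositories present only in the new manifest need a fresh clone.
    added = sorted(path for path in new if path not in old)
    # Repositories present only in the old manifest are purge candidates.
    removed = sorted(path for path in old if path not in new)
    # Repositories whose timestamp differs need an update.
    updated = sorted(
        path for path in new
        if path in old
        and new[path].get("modified") != old[path].get("modified")
    )
    return added, updated, removed
```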
For all newly-added repositories, the clients will do::
    git clone --mirror git://server/path/to/repository.git \
        /local/path/to/repository.git
For all updated repositories, the clients will do::
    GIT_DIR=/local/path/to/repository.git git remote update
When run with ``--purge``, the clients will also purge any repositories
no longer present in the manifest file received from the server.
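For readers who want to script the same steps, the two git invocations above can be assembled in Python 3 roughly like this. The helper names and the ``site`` and ``toplevel`` parameters are assumptions made for this sketch; it only builds the argument lists and leaves running them (for example with ``subprocess.run``) to the caller.

```python
import os


def local_path(toplevel, repo_path):
    """Map a manifest path such as /path/to/repository.git under toplevel."""
    return os.path.join(toplevel, repo_path.lstrip("/"))


def clone_command(site, toplevel, repo_path):
    """Argument list for mirroring a newly added repository."""
    remote = site.rstrip("/") + repo_path
    return ["git", "clone", "--mirror", remote, local_path(toplevel, repo_path)]


def update_command(toplevel, repo_path):
    """Argument list and environment for updating an existing mirror."""
    # GIT_DIR points git at the bare repository, as in the shell example.
    env = dict(os.environ, GIT_DIR=local_path(toplevel, repo_path))
    return ["git", "remote", "update"], env
```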
Shared repositories
~~~~~~~~~~~~~~~~~~~
Grokmirror automatically recognizes when repositories share objects
via alternates. E.g. if repositoryB is a shared clone of repositoryA
(that is, it has been cloned using ``git clone -s repositoryA``), the
manifest entry for repositoryB will name repositoryA as its reference,
so grokmirror will mirror repositoryA first, and then mirror
repositoryB with the ``--reference`` flag. This greatly reduces
bandwidth and disk use for large repositories.
See man git-clone_ for more info.
.. _git-clone: https://www.kernel.org/pub/software/scm/git/docs/git-clone.html
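The "reference first" ordering can be sketched in Python 3 as a small depth-first walk over the manifest. This illustrates the idea only and is not grok-pull's implementation; ``order_by_reference`` is a hypothetical name, and the sketch assumes each entry's ``reference`` value is either empty or another manifest key.

```python
def order_by_reference(manifest):
    """Return repository paths so that each reference precedes its users."""
    ordered = []
    seen = set()

    def visit(path, chain):
        # Skip repositories already placed, and references that the
        # manifest does not carry at all.
        if path in seen or path not in manifest:
            return
        if path in chain:
            raise ValueError("reference loop involving %s" % path)
        reference = manifest[path].get("reference")
        if reference:
            # Place the referenced repository before this one.
            visit(reference, chain | {path})
        seen.add(path)
        ordered.append(path)

    for path in sorted(manifest):
        visit(path, frozenset())
    return ordered
```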
SERVER SETUP
------------
Install grokmirror on the server using your preferred method.
**IMPORTANT: Currently, only bare git repositories are supported.**
You will need to add a hook to each of your repositories that updates
the manifest whenever the repository is modified. This can be either a
post-receive hook or a post-update hook. The hook must call the
following command::
    /usr/bin/grok-manifest -m /repos/manifest.js.gz -t /repos -n `pwd`
The **-m** flag is the path to the manifest.js file. The git process must be
able to write to it and to the directory the file is in (it creates a
manifest.js.randomstring file first, and then moves it in place of the
old one for atomicity).
The **-t** flag is to help grokmirror trim the irrelevant toplevel disk
path. E.g. if your repository is in /var/lib/git/repository.git, but it
is exported as git://server/repository.git, then you specify ``-t
/var/lib/git``.
The **-n** flag tells grokmirror to use the current timestamp instead of the
exact timestamp of the commit (much faster this way).
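The write-then-rename pattern described for the **-m** flag can be sketched in Python 3 as below. This is a generic illustration of an atomic file replacement, not the code grok-manifest uses; ``write_manifest_atomically`` is a hypothetical helper, and the final file mode is an assumption.

```python
import gzip
import json
import os
import tempfile


def write_manifest_atomically(manifest, path):
    """Write the manifest to a temporary file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # as the target, otherwise the rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(
        prefix=os.path.basename(path) + ".", dir=directory)
    try:
        with os.fdopen(fd, "wb") as raw:
            payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
            if path.endswith(".gz"):
                with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
                    gz.write(payload)
            else:
                raw.write(payload)
        # Assumption: the manifest should be world-readable so that it can
        # be served to mirrors; mkstemp creates files readable by owner only.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Readers of the manifest see either the old file or the new one, never a partially written file, which is why the process needs write access to the directory as well as to the file.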
Before enabling the hook, you will need to generate the manifest.js for
all your git repositories. To do that, run the same command, but omit
the -n flag and the \`pwd\` argument. E.g.::
    /usr/bin/grok-manifest -m /repos/manifest.js.gz -t /repos
The last component you need to set up is the automatic purging of
deleted repositories from the manifest. As this can't be done from a
git hook, you can either run grok-manifest with the ``--purge`` flag
from cron::
    /usr/bin/grok-manifest -m /repos/manifest.js.gz -t /repos -p
Or add it to your gitolite's ``rm`` ADC using the ``--remove`` flag::
    /usr/bin/grok-manifest -m /repos/manifest.js.gz -t /repos -x $repo.git
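Conceptually, both operations drop entries from the manifest. A Python 3 sketch of the idea follows, with hypothetical helper names; it assumes manifest keys are repository paths relative to the toplevel and does not reproduce grok-manifest's own logic.

```python
import os


def purge_missing(manifest, toplevel):
    """Return a copy of the manifest without repositories gone from disk."""
    kept = {}
    for repo_path, entry in manifest.items():
        on_disk = os.path.join(toplevel, repo_path.lstrip("/"))
        if os.path.isdir(on_disk):
            kept[repo_path] = entry
    return kept


def remove_repo(manifest, repo_path):
    """Return a copy of the manifest without one explicitly named repository."""
    return {path: entry for path, entry in manifest.items()
            if path != repo_path}
```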
If you would like grok-manifest to honor the ``git-daemon-export-ok``
magic file and only add to the manifest those repositories specifically
marked as exportable, pass the ``--check-export-ok`` flag. See
``git-daemon(1)`` for more info on the ``git-daemon-export-ok`` file.
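To make the toplevel trimming and the export check concrete, here is a rough Python 3 sketch. The bare-repository test (an ``objects`` directory, a ``refs`` directory and a ``HEAD`` file) is a heuristic chosen for this example, and both function names are hypothetical; grok-manifest may detect repositories differently.

```python
import os


def manifest_key(toplevel, repo_dir):
    """Trim the toplevel disk path, e.g. /var/lib/git/x.git -> /x.git."""
    relative = os.path.relpath(repo_dir, toplevel)
    return "/" + relative.replace(os.sep, "/")


def find_bare_repos(toplevel, check_export_ok=False):
    """List manifest keys of bare repositories found under toplevel."""
    found = []
    for root, dirs, files in os.walk(toplevel):
        # Heuristic for a bare repository (an assumption of this sketch).
        if "objects" in dirs and "refs" in dirs and "HEAD" in files:
            dirs[:] = []  # do not descend into the repository itself
            if check_export_ok and "git-daemon-export-ok" not in files:
                continue
            found.append(manifest_key(toplevel, root))
    return sorted(found)
```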
MIRROR SETUP
------------
Install grokmirror on the mirror using your preferred method.
Locate repos.conf and modify it to reflect your needs. The default
configuration file is heavily commented.
Add a cronjob to run as frequently as you like. For example, add the
following to ``/etc/cron.d/grokmirror.cron``::
    # Run grok-pull every minute as user "mirror"
    * * * * * mirror /usr/bin/grok-pull -p -c /etc/grokmirror/repos.conf
Make sure the user "mirror" (or whichever user you specified) is able to
write to the toplevel, log and lock locations specified in repos.conf.
If you already have a bunch of repositories in the hierarchy that
matches the upstream mirror and you'd like to reuse them instead of
re-downloading everything from the master, you can pass the ``-r`` flag
to tell grok-pull that it's okay to reuse existing repos. This will
delete any existing remotes defined in the repository and set the new
origin to match what is configured in the repos.conf.
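The effect of reusing a repository can be approximated with plain git commands, sketched here in Python 3. The exact commands grok-pull issues are not documented in this file, so the use of ``git remote rm`` and ``git remote add --mirror=fetch`` is an assumption, as is the helper name ``reuse_commands``.

```python
def reuse_commands(git_dir, existing_remotes, origin_url):
    """Build git argument lists that reset a repository's remotes.

    existing_remotes is the list of remote names currently defined in the
    repository (for example, the output lines of "git remote").
    """
    base = ["git", "--git-dir=" + git_dir]
    # Drop every remote that is currently defined...
    commands = [base + ["remote", "rm", name] for name in existing_remotes]
    # ...then point a fresh mirror-style origin at the configured master.
    commands.append(
        base + ["remote", "add", "--mirror=fetch", "origin", origin_url])
    return commands
```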
GROK-FSCK
---------
Git repositories can get corrupted whether they are frequently updated
or not, which is why it is useful to routinely check them using "git
fsck". Grokmirror ships with a "grok-fsck" utility that runs "git
fsck" on all mirrored git repositories. It is meant to be run nightly
from cron, and does its best to randomly stagger the checks so only a
subset of repositories is checked each night. Any errors are sent to
the user set in MAILTO.
To enable grok-fsck, first locate the fsck.conf file and edit it to
match your setup -- e.g., it must know where you keep your local
manifest. Then, add the following to ``/etc/cron.d/grok-fsck.cron``::
    # Make sure MAILTO is set, for error reports
    MAILTO=root
    # Run nightly, at 2AM
    00 02 * * * mirror /usr/bin/grok-fsck -c /etc/grokmirror/fsck.conf
You can force a full run using the ``-f`` flag, but unless you only have
a few smallish git repositories, it's not recommended, as it may take
several hours to complete.
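One way to picture the random staggering is the Python 3 sketch below. It is not grok-fsck's algorithm; the ``frequency_days`` parameter, the day-number bookkeeping and the function name are all assumptions made for illustration.

```python
import random


def schedule_checks(repos, today, frequency_days, status=None, rng=random):
    """Pick the repositories due for a check on a given day.

    status maps each repository to the day number of its next check.
    Returns (due, new_status).
    """
    status = dict(status or {})
    due = []
    for repo in sorted(repos):
        if repo not in status:
            # First sighting: choose a random day inside the window so
            # that checks spread out instead of all landing on one night.
            status[repo] = today + rng.randint(1, frequency_days)
            continue
        if status[repo] <= today:
            due.append(repo)
            status[repo] = today + frequency_days
    return due, status
```

A forced full run, by contrast, would simply treat every repository as due regardless of its scheduled day.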
Before it runs, grok-fsck will put an advisory lock in the git-directory
being checked (repository.git/grokmirror.lock). Grok-pull will recognize
the lock and will postpone any incoming updates to that repository until
the next grok-pull run.
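An advisory lock of this kind can be sketched in Python 3 with ``fcntl.flock``. Whether grokmirror uses ``flock`` or another locking call is not stated here, so treat the mechanism and the helper names as assumptions; the sketch works on Unix-like systems only.

```python
import fcntl
import os


def try_lock(git_dir):
    """Try to take the advisory lock; return a handle, or None if busy."""
    lock_path = os.path.join(git_dir, "grokmirror.lock")
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def unlock(handle):
    """Release the advisory lock and close the lock file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()
```

A puller following this scheme would call ``try_lock`` before touching a repository and skip the repository until its next run when ``None`` comes back.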
FAQ
---
Why is it called "grok mirror"?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Because it's developed at kernel.org and "grok" is a mirror of "korg".
Also, because it groks git mirroring.
Why not just use rsync?
~~~~~~~~~~~~~~~~~~~~~~~
Rsync is extremely inefficient for the purpose of mirroring git trees
that mostly consist of a lot of small files that very rarely change.
Since rsync must examine each file during each run, it mostly results
in a lot of disk thrashing.
Additionally, if several repositories share objects with each other,
unless the disk paths are exactly the same on both the remote and local
mirror, this will result in broken git repositories.
It is also a bit silly, considering git provides its own extremely
efficient mechanism for specifying what changed between revision X and
revision Y.
Why not just run "git pull" from cron every minute?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is not a complete mirroring strategy, as it won't notify you when
the remote mirror adds new repositories. It is also not very nice to the
remote server, especially one that carries hundreds of repositories.
Additionally, this will not automatically take care of shared
repositories for you. See "Shared repositories" under "CONCEPTS".
Platform: UNKNOWN