Huge performance difference between cloning the same repo in git and hg versions

A repository (with some quite large files in it) was added to RhodeCode in both its git and hg versions. Cloning it without checkout takes 10 minutes with Mercurial and 20 seconds with git. This is a problem for us, as we are considering using Mercurial.

Any thoughts on this difference, and on how to reduce it?
I use RhodeCode CE 4.5.2.

Thanks,
Charles
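
A comparison like the one described above can be made repeatable with a small timing script. This is only a sketch: time_command and clone_commands are hypothetical helper names, the URLs and destination prefix are placeholders to fill in, and it assumes the hg and git clients are on the PATH. It uses hg clone --noupdate (as in this thread) and git clone --no-checkout so that neither side checks out a working copy.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.perf_counter() - start


def clone_commands(hg_url, git_url, dest_prefix="clone-test"):
    """Build clone commands that skip the working-copy checkout on both sides."""
    return {
        "hg": ["hg", "clone", "--noupdate", hg_url, dest_prefix + "-hg"],
        "git": ["git", "clone", "--no-checkout", git_url, dest_prefix + "-git"],
    }
```

To use it, loop over clone_commands(hg_url, git_url).items() and print the result of time_command for each entry. The destination directories must not exist beforehand, otherwise both clients refuse to clone.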

Hi Charles,

That’s very odd; the HG backend is usually much faster than the GIT one, since it uses more optimized internals. How many workers (workers = N) have you set in your rhodecode.ini configuration?

Also, is it possible to test this on our test instance (code.rhodecode.com)?

Hi Marcin,

Unfortunately, it isn’t a project I can share, even for testing purposes. I have a VM with 4 CPUs (out of 8 on the host), and I raised the number of workers from 2 to 9 on both VcsServer and Community, but it doesn’t help.

Puzzling. Have you tried running it with hg clone --debug?

Is your instance running behind an HTTP server? Maybe it’s limiting the speed. If so, I’d try cloning directly from the RhodeCode instance IP.

In the 4.4+ releases all clones should run via a streaming HTTP mode, which really should be fast. My current suspicion is that something might be blocking that.

Using --debug, I make the server crash (?): http://pastebin.com/jXKw5Ytg

Clone output:

$ hg clone --debug --noupdate http://user@10.67.6.69/repo
using http://10.67.6.69/repo
http auth: user user, password not set
sending capabilities command
query 1; heads
sending batch command
requesting all changes
sending getbundle command
abort: HTTP Error 500: Internal Server Error

I downloaded the RhodeCode EE OVA VM, upgraded to 4.5.2 and switched to Community, so there is no HTTP server in front of it. This runs inside an enterprise network, so I don’t know what kind of filtering there may be here. Let me try with a repo of moderate size.

Have you checked the vcsserver logs after the 500 error? There should be some trace of the error.

If you’re using our OVA, then also try cloning through the nginx server that comes pre-installed on it.

Best,

EDIT: sorry, I missed that there’s a traceback.

The interesting part is the vcsserver log output, with the error traceback from the last clone command: http://pastebin.com/6jVXsFxA

Are there any docs explaining how to switch the OVA to the nginx server? I only found https://docs.rhodecode.com/RhodeCode-Enterprise/admin/nginx-config-example.html but I don’t understand exactly what I should do.

There’s an NGINX pre-configured and installed on the OVA. You should just adjust the file in /etc/nginx/sites-enabled/ and bind it to the server_name that you have set via DNS or simply via the /etc/hosts file. Accessing the VM by that server_name should then route the connection through NGINX.

Also, I found this: settings ui from file: [paths] default=C:\Users\cbrossollet\dev\empreinte10

Is there a custom .hgrc file in this repo? Maybe it’s causing issues. RhodeCode reads such files, and settings from them propagate to the RhodeCode server.

Switched to NGINX. The hgrc file was just referencing a nonexistent path; I removed that.

Still the same behavior: the Gunicorn process eats 100% CPU for the first 5 minutes on the server side (and 40% of memory, out of a total of 4 GB), and then the transfer occurs, albeit quite slowly (the client’s Python process often eats CPU without any network or file activity).

Hmm, I’m running out of ideas here. It’s very odd; we have users with 100 GB Mercurial repos and no such problems.

It looks like my repo had files that were really too big: even a simple hg serve took forever to serve a clone. I have converted the repo to the largefiles extension, and now a clone is done in a minute.
I’m all well now.

Thanks for sharing. Interesting that you hit Mercurial limitations. By the way, how big were the files, and was the same repo fine when using GIT (with the same file sizes)?

The biggest file is 430 MB, with a lot of files around 10 MB. And yes, using Git with a replica of the repo, things go pretty fast (20 sec).
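
For anyone hitting the same symptom, it may help to list the biggest files in a working copy before deciding whether to convert to largefiles. The sketch below is not part of RhodeCode or Mercurial; find_large_files is a hypothetical helper name, and the 10 MB default threshold is only an assumption taken from the file sizes mentioned above.

```python
import os


def find_large_files(root, min_mb=10.0, skip_dirs=(".hg", ".git")):
    """Return (size_in_bytes, path) pairs for files of at least min_mb megabytes.

    Results are sorted largest first. The .hg and .git metadata
    directories are skipped so that only tracked content is considered.
    """
    threshold = int(min_mb * 1024 * 1024)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune metadata directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: ignore it
            if size >= threshold:
                found.append((size, path))
    found.sort(reverse=True)
    return found
```

Calling find_large_files(".") from the root of a checkout gives the candidates, largest first. If most of the repository's size comes from a handful of binary files, that suggests the largefiles conversion described above is worth trying.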