Gentoo Forums
Migrating away from http-replicator

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 6699
Posted: Fri Jun 07, 2019 10:57 pm    Post subject: Migrating away from http-replicator

This is one of the last packages on my systems that only works with python2. It's extra-painful because it also requires portage and all its deps to be built with PYTHON_TARGETS=python2_7.

I've been looking for alternatives, but would it be better to just get rid of a fetch proxy entirely and share $DISTDIR over NFS? What happens when two emerge processes try to download the same file in that setup?

Etal
Veteran
Joined: 15 Jul 2005
Posts: 1753
Posted: Sun Jun 23, 2019 2:06 pm

Some time ago, I had a VPS with spare disk space, and I had nginx set up to cache distfiles. Unfortunately it's long gone and I don't think I have a backup of the config.

The basic idea was that nginx would be set up as a caching proxy with gentoo mirrors set as upstream. So when I make a request to nginx, it goes out to the mirror and forwards the response while at the same time downloading it locally, so that the next time it gets a request for the same package, it just returns the local copy. I think that's basically what http-replicator does, and should be very simple to set up since nginx is essentially a proxy that can also serve web pages :)

(A small complication I ran into was that the proxy host would end up having 2 copies of each of its packages if it uses itself as the mirror (one in the nginx directory and one in /usr/portage/distfiles). So what I did was set DESTDIR to both directories (it actually takes a list), had nginx check /usr/portage/distfiles if it doesn't have it in its own directory before going out to the mirror, and the proxy host to connect to the mirrors (not itself) so that it downloads directly into distfiles. In retrospect, a cron job to just regularly wipe distfiles would have sufficed even if a little less efficient.)

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Sun Jun 23, 2019 2:54 pm    Post subject: Re: Migrating away from http-replicator

Ant P. wrote:
I've been looking for alternatives, but would it be better to just get rid of a fetch proxy entirely and share $DISTDIR over NFS? What happens when two emerge processes try to download the same file in that setup?

I've also felt that it was getting archaic.

My setup is that I download portage onto my central server. This is the only computer that points sync-uri to gentoo. The other machines point to the server.
Server:
Code:
sync-uri = rsync://rsync.gentoo.org/gentoo-portage
clients:
Code:
sync-uri = rsync://192.168.0.102/gentoo-portage


I already have apache running on the server (I used to use it to screen scrape my stock quotes). Perhaps I should sym-link /usr/portage/distfiles to a directory under the web page and make that the first entry in GENTOO_MIRRORS like http-replicator is now. Cronjobs could use rsync to update the server for packages that the clients use but the server doesn't, my server having about half the packages that the clients have. I think this is pretty much what http-replicator does minus the cronjobs.
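A rough sketch of that idea, not a tested setup. It assumes Apache's document root is /var/www/localhost/htdocs, that the vhost follows symlinks, and that the server is the 192.168.0.102 box from the sync-uri above. Portage appends /distfiles/<file> to every GENTOO_MIRRORS entry, so the symlink itself has to be called distfiles; the gentoo/ directory name and the fallback mirror are placeholders.

```shell
# On the server: expose the existing distfiles directory under the web root.
# publish_distfiles is a hypothetical helper; both arguments are assumptions.
publish_distfiles() {
    webroot="$1"
    distdir="$2"
    mkdir -p "$webroot/gentoo"
    ln -sfn "$distdir" "$webroot/gentoo/distfiles"
}
# Example call (as root):
#   publish_distfiles /var/www/localhost/htdocs /usr/portage/distfiles

# On each client, in /etc/portage/make.conf: local server first, real mirror as fallback.
GENTOO_MIRRORS="http://192.168.0.102/gentoo http://distfiles.gentoo.org"
```

Files the clients fetch from the fallback mirror still land only in the client's own distfiles directory, which is where the cron/rsync step mentioned above would come in.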

OR, how about mounting /usr/portage/distfiles as a SAMBA share? The Linux file system should pretty much handle multiple access?
EDIT: DUH! That's what you said in the original post. Very sorry. Haven't had my morning coffee yet!

The first method has the advantage of having multiple copies of distfiles instead of only one. Disk space is cheap.

Havin_it
Veteran
Joined: 17 Jul 2005
Posts: 1187
Location: Edinburgh, UK
Posted: Sat Feb 22, 2020 2:38 pm

Hope the necrobump is forgivable; as http-replicator has now been given the last rites, I feel this topic has currency again. (I'm probably late to that party, but my affected server had to go on update-freeze for a couple of months while I took some other old packages out behind the barn.)

For my part I'm not keen on the samba/nfs mount approach as it leaves the client box reliant on the mount staying up throughout the merge process (which in my case is usually a day plus). I prefer to get all the distfiles onto the client before commencing a big merge, to reduce the things that can clobber the process the moment I leave for work and waste 8 hours' build-time :evil:
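For reference, the "fetch everything first" step could look something like the sketch below. The emerge options are standard ones; the prefetch_world wrapper and its DRY_RUN switch are only there for illustration.

```shell
# Download every distfile a world update would need, without building anything.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
prefetch_world() {
    set -- emerge --fetchonly --update --deep --newuse @world
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
prefetch_world
```

Once this finishes without errors, the real merge no longer depends on the network or on the distfiles server staying up.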

My preferred approach would be similar to what @Etal did with nginx, although I'd use apache2 in my case since the server already runs it. I imagine it's possible: I've set up various proxies through my apache before, and I know it has extensive options for such. The stuff I've done was always guided though, and had a defined endpoint: in this case it would be proxying to any host requested, which I'm not at all familiar with.

If I can get something working I'll report it here. Meanwhile, if anyone has input or other suggestions, I'd welcome it.

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Sat Feb 22, 2020 2:54 pm

Havin_it wrote:
If I can get something working I'll report it here. Meanwhile, if anyone has input or other suggestions, I'd welcome it.

Please! I've been wondering about this, although I don't mind keeping python2 around if it doesn't cause complications.

cboldt
l33t
Joined: 24 Aug 2005
Posts: 899
Posted: Sat Feb 22, 2020 6:32 pm

I'm in the same boat. Been working with nginx because that is what I have. I'm able to get it to pass the download through. After quite a few attempts to get nginx to save what it is passing through, no joy. At this point don't have a working config to share. I do have plenty of frustration if anybody is running short.

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 6699
Posted: Sat Feb 22, 2020 9:34 pm

I just gave up on http-replicator after I saw it masked and made everything share a rw NFS mount. I already have it serving ebuilds over NFS, so if a problem were to happen it'd show up through those first.

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Sat Feb 22, 2020 11:35 pm

Ant P, maybe this isn't a bad way to go. My server that gets portage from the internet is up 24/7 anyway, barring reboots or maintenance problems.
Generally, I like backups in situations like this, but replacing portage is only an "emerge --sync" away, right?
Do you share /usr/portage? or one or more sub-directories?
How about using rsync with the server as source? Pros and cons? I'd be interested in your thoughts.
And VERY interested in the thoughts of ultra-prudent NeddySeagoon.

What DO others who have multiple machines but are not using http-replicator do? Sync one machine on Monday, another on Tuesday, ...?

EDIT:
BTW, I moved the ebuild directory to local and tried to re-emerge version 3. The emerge failed because of missing patches. I'm reluctant to try v 4.0 because why switch to an unsupported version?

Anon-E-moose
Advocate
Joined: 23 May 2008
Posts: 4805
Location: Dallas area
Posted: Sun Feb 23, 2020 12:16 am

Never used http-replicator, but have set my system up this way.

mini server (24x7) nas as well as mini web server and portage (ebuilds and distfiles) and the rsync server

I have shareable portage on a tmpfs, /tmp/portage (sync daily)
I also keep a compressed tarball of the whole portage directory minus the distfiles on a mirrored raid array as well as the distfiles in their own directory on the array.
Code:
$ rsync nas::
portage           Gentoo Portage tree
bdist             Gentoo Portage distfiles repository
backup            backup file system
wbkup             windows backup pics & stuff
media             media/movies,etc


My main box (24x7) and my laptop (on when needed)
Both are set up to do emerge --sync from my nas box (although I typically only update laptop once a week)
Both check the distfiles storage area (mirror file) before they go to the internet to get a file
The laptop also checks against my main box's local distfile for any copy (in case I have downloaded it but haven't copied it to backup yet) using the mirror file

Then every so often I rsync from the various local machines distfiles to the storage one (haven't automated it yet) that will pick up any new packages.
_________________
PRIME x570-pro, 3700x, RX 550 - 5.8 zen kernel
Acer E5-575 (laptop), i3-7100u - i965 - 5.5 zen kernel
---both---
gcc 9.3.0, profile 17.1 (no-pie & modified) amd64-no-multilib, eudev, openrc, openbox, palemoon

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Sun Feb 23, 2020 12:34 am

Anon-E-moose wrote:
Both are set up to do emerge --sync from my nas box (although I typically only update laptop once a week)
Both check the distfiles storage area (mirror file) before they go to the internet to get a file

Could you amplify this? i.e. I understand what you are saying but not how to do it.

Anon-E-moose
Advocate
Joined: 23 May 2008
Posts: 4805
Location: Dallas area
Posted: Sun Feb 23, 2020 12:54 am

I'll lay it out in more detail in the morning.

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Sun Feb 23, 2020 2:26 am

Thank You!

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 6699
Posted: Sun Feb 23, 2020 3:00 am

Tony0945 wrote:
Ant P, maybe this isn't a bad way to go. My server that gets portage from the internet is up 24/7 anyway, barring reboots or maintenance problems.
Generally, I like backups in situations like this, but replacing portage is only an "emerge --sync" away, right?
Do you share /usr/portage? or one or more sub-directories?
How about using rsync with the server as source? Pros and cons? I'd be interested in your thoughts.

At this point I've outsourced everything to NFS:
- $DISTDIR
- repositories (the server calls eix-sync, so this can be read-only)
- $PKGDIR (I use FEATURES=binpkg-multi-instance and add $CHOST in the path to be safe)
- parts of /etc/portage/ as symlinks (helps to keep things like env/ and package.license/ consistent)
- /var/cache/eix/
- /var/cache/edb/dep/ (this isn't supported, but I already did the rest so might as well. It's also exported read-only)
- /usr/src/linux/ for good measure, saving a few gigabytes

Apart from /etc/portage most of those can be shredded and re-populated trivially, and are covered by checksums everywhere, so I'm not too worried about data loss.
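As an illustration of the $DISTDIR and $PKGDIR items in the list above, the client side could be as small as the make.conf fragment below. The /mnt/portage mount point is an assumption, and CHOST is only set here so the fragment stands on its own; on a real system it is already defined.

```shell
# /etc/portage/make.conf on each client (sketch; /mnt/portage is a hypothetical NFS mount)
CHOST="x86_64-pc-linux-gnu"                  # normally already set by the profile or make.conf
DISTDIR="/mnt/portage/distfiles"             # shared read-write
PKGDIR="/mnt/portage/packages/${CHOST}"      # one binary package tree per CHOST
FEATURES="binpkg-multi-instance"             # FEATURES is incremental, so this adds to the defaults
```

Keeping $CHOST in the path means machines of different architectures never see each other's binary packages, while machines with the same CHOST can share them.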

Anon-E-moose
Advocate
Joined: 23 May 2008
Posts: 4805
Location: Dallas area
Posted: Sun Feb 23, 2020 11:29 am

Let me see if I can make more sense of what I posted earlier :lol:

Box A - mini server - nas, web, backup including /usr/portage, distfiles and packages.
- run rsyncd
- from /etc/rsyncd.conf
Code:
[portage]
   path = /tmp/portage
   comment = Gentoo Portage tree
   exclude = /distfiles /packages

[bdist]
   path = /mnt/backup/portage
   comment = Gentoo Portage distfiles repository
   include = /distfiles
   exclude = "*"
   read only = false


I have set portage up on a tmpfs as
1. I don't update the mini server so I didn't want to use the real /usr/portage directory.
2. easy enough to repopulate the tmp directory if the machine had to be rebooted.
3. I don't use any checking of manifest as I leave that for the client machine, it's a download point only

cron
@ ~2.30AM get portage into tmpfs
@ ~3AM backup all systems that are up, including /tmp/portage on this server, copy into backup directory.



Box B - main machine.
- also run rsyncd
- from /etc/rsyncd.conf
Code:
[portage]
   path = /usr/portage
   comment = Gentoo Portage tree
   exclude = /distfiles /packages

[distfiles]
   path = /usr/portage
   comment = Gentoo distfiles
   include = /distfiles
   exclude = "*"


- /etc/portage/mirrors
Code:
$ cat /etc/portage/mirrors
local rsync://192.168.1.11/bdist


- /etc/portage/repos.conf/gentoo.conf
Code:
$ cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
...
[gentoo]
location = /usr/portage
sync-type = rsync
sync-uri = rsync://192.168.1.11/portage
...


cron
@ 2:45AM "emerge --sync" (against nas box)

Box C - laptop
- not running rsyncd.

- /etc/portage/mirrors
Code:
$ cat /etc/portage/mirrors
local rsync://192.168.1.10/distfiles rsync://192.168.1.11/bdist


- /etc/portage/repos.conf/gentoo.conf
Code:
$ cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
...

[gentoo]
location = /usr/portage
sync-type = rsync
sync-uri = rsync://192.168.1.11/portage



---------------
Box A 192.168.1.11
Box B 192.168.1.10
Box C 192.168.1.17
--------------

So with this, doing "emerge --sync" on the clients syncs against the nas machine (/tmp/portage). When distfiles are needed:
box b (main client) checks the nas distfiles backup directory; if the tarball isn't there (new), it does a regular download
box c (laptop) checks the nas distfiles backup directory, then box b's distfiles dir; if the tarball isn't in either (new), it does a regular download

after emerge @world/pkg then any files downloaded get rsync'd against box a (backup distfiles directory)
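That last push step is not automated in this setup. A minimal sketch of it, assuming the bdist module on box a (192.168.1.11) accepts uploads as configured above; push_distfiles and DRY_RUN are hypothetical names used only here.

```shell
# Copy locally downloaded distfiles to the nas "bdist" module.
# --ignore-existing skips anything the nas already holds.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
push_distfiles() {
    src="${1:-/usr/portage/distfiles/}"
    dest="${2:-rsync://192.168.1.11/bdist/distfiles/}"
    set -- rsync -rtv --ignore-existing "$src" "$dest"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Run from cron shortly after the nightly sync, or by hand after a big emerge, this would keep the nas copy current without anyone having to remember it.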

---

Note: with this setup all machines have their own /usr/portage including a distfiles directory

If one didn't want to keep a local /usr/portage with or w/o distfiles and packages directories, it would be easy enough to delete any local files you don't want to keep after copying them to the nas server for backup

https://wiki.gentoo.org/wiki//etc/portage/mirrors
https://wiki.gentoo.org/wiki/Local_Mirror
https://wiki.gentoo.org/wiki/Local_distfiles_cache

---

So a quick look with rsync gives this

Box a
Code:
$ rsync nas::
portage           Gentoo Portage tree
bdist             Gentoo Portage distfiles repository


Box b
Code:
$ rsync don64::
portage           Gentoo Portage tree
distfiles         Gentoo distfiles


Havin_it
Veteran
Joined: 17 Jul 2005
Posts: 1187
Location: Edinburgh, UK
Posted: Sun Feb 23, 2020 7:12 pm

For those interested, here's an apache2 vhost config for a basic caching forward http proxy.
Code:

Listen 8088
<IfDefine DEFAULT_VHOST>

<VirtualHost *:8088>
    ServerName distproxy.mytld.dom
    ServerAlias distproxy

    ErrorLog /var/log/apache2/distproxy-error_log
    CustomLog /var/log/apache2/distproxy-access_log common

    ProxyRequests On
    ProxyVia On

    # Cache
    CacheEnable disk http://
    CacheRoot /path/to/distfiles/
    CacheIgnoreNoLastMod On
    CacheDefaultExpire 86400
    # Allow caching files up to 300MB
    CacheMaxFileSize 300000000

    <Proxy "*">
        # Make sure only local IPs can use the proxy
        Require ip 192.168
    </Proxy>
</VirtualHost>

</IfDefine>


I just got this working, so it will probably need tweaks to behave as desired over time wrt retention. YMMV.
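On the client side, pointing Portage at a proxy like this should only need something along these lines in make.conf. The hostname and port are taken from the vhost above; the mirror is just an example, and it has to be a plain-http one because the vhost only caches http:// URLs.

```shell
# /etc/portage/make.conf on each client (sketch)
http_proxy="http://distproxy.mytld.dom:8088"
# https downloads would be tunnelled through the proxy and never cached,
# so list http mirrors first. The mirror below is only an example.
GENTOO_MIRRORS="http://distfiles.gentoo.org"
```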

Not really what *I* was hoping for, as the on-disk cache isn't just a big ol' pile of files with their original names as was the case with http-replicator. That means the server has to "download" the distfiles itself into its $DISTDIR when emerging, meaning two copies of every file. Better if the cache was in a form you could just point the server's $DISTDIR towards.

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 6699
Posted: Sun Feb 23, 2020 7:57 pm

Havin_it wrote:
Not really what *I* was hoping for, as the on-disk cache isn't just a big ol' pile of files with their original names as was the case with http-replicator. That means the server has to "download" the distfiles itself into its $DISTDIR when emerging, meaning two copies of every file. Better if the cache was in a form you could just point the server's $DISTDIR towards.

Yeah, that was one nice thing about http-replicator's flat layout... just point PORTAGE_RO_DISTDIRS at it and it symlinks instead of fetching them.
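For anyone who hasn't used it, that is a single make.conf line on the cache server. The sketch below assumes the cache lives in /var/cache/http-replicator; any flat directory of distfiles works the same way.

```shell
# /etc/portage/make.conf on the machine that holds the flat cache (sketch)
DISTDIR="/usr/portage/distfiles"
# Space-separated list of read-only directories; Portage looks here before
# fetching and symlinks matching files into DISTDIR.
PORTAGE_RO_DISTDIRS="/var/cache/http-replicator"
```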

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Mon Feb 24, 2020 12:12 am

Ran across this Gentoo wiki Local Mirror
My /etc/rsyncd.conf is already laid out as shown. Except I have no max connections line.

Is this all I have to do? Then unmerge http-replicator?


Last edited by Tony0945 on Mon Feb 24, 2020 1:33 am; edited 1 time in total

Anon-E-moose
Advocate
Joined: 23 May 2008
Posts: 4805
Location: Dallas area
Posted: Mon Feb 24, 2020 12:31 am

Local mirror handles the sharing of /usr/portage/<files> but not caching distfiles

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Mon Feb 24, 2020 1:41 am

How about changing
Code:
exclude=distfiles/ packages/
to
Code:
exclude= packages/

Packages are unique to each CPU arch, but distfiles are not. Would this change allow distfiles to be rsync'ed? First I would copy them all to the server to prevent them from being lost.
I have noticed on common packages that the distfile is downloaded over the internet on each machine. Since my internet is only 30 MBps tops, downloading (syncing) from the LAN should be much faster. But if the server doesn't download them, they will get lost at the next sync, so that's not so good. Mounting from a common location is best of all. But that means yet another partition, doesn't it?

EDIT:
Code:
MSI ~ # du -hs /usr/portage/distfiles
40G     /usr/portage/distfiles
How big a partition would I need to have? 64G? OTOH, distfiles can be on HDD instead of SSD as they are on two machines now and more in the future?
On the third hand, HDD's are really cheap and I could add another somewhere.

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Mon Feb 24, 2020 3:32 am

OK, this seems to work for distfiles. it probably needs refinement:

On the server, named Trantor, I added this to /etc/samba/smb.conf
Code:
[distfiles]
        comment = Central System Portage Distfiles
        path = /usr/portage/distfiles
        force user = portage
        browseable = yes
        public = yes
        create mask = 0644
        # originally "07ff", which is not a valid octal mode; 0755 (Samba's default) is a guess at the intent
        directory mask = 0755
        writeable = yes
        read only = no
        hosts allow = 192.168.0.96/27  127.
        acl allow execute always = True

On the client, named MSI, I added this to /etc/fstab
Code:
//trantor/distfiles     /usr/portage/distfiles  cifs    auto,vers=1.0,users,user=guest,password=none,rw   0 0

It looks OK. I can create and destroy files on either side as root. The files are owned by portage:portage. There is added cost on the clients of reading over the LAN, but it's gigabit LAN.
So anything downloaded by a fetch is instantly available to the other boxes without syncing.

Anyone care to correct my code here? I'm sure it can use some correction.

Tony0945
Advocate
Joined: 25 Jul 2006
Posts: 4046
Location: Illinois, USA
Posted: Tue Feb 25, 2020 11:12 pm

There must be a permissions problem. The client could not download the tarball for gentoo-sources-5.4.X until I unmounted /usr/portage/distfiles. The server is still on gentoo-sources-4.x
All I see in /var/log/messages is:
Code:
MSI ~ # cat /var/log/messages
Executed nightly log rotation
Feb 25 02:15:01 MSI root[20042]: /home/tony/.VirtualBox: 311.6 GiB (334578769920 bytes) trimmed on /dev/sdc3
Feb 25 02:15:43 MSI root[20042]: /boot/efi: 96.5 MiB (101144064 bytes) trimmed on /dev/sda1
Feb 25 02:15:43 MSI root[20042]: /: 125.4 GiB (134606393344 bytes) trimmed on /dev/sda2
Feb 25 14:56:47 MSI kernel: cifs_setlk: 9 callbacks suppressed
Feb 25 17:01:08 MSI kernel: CIFS: Attempting to mount //trantor/distfiles
The last line surely comes from re-mounting /usr/portage/distfiles after the emerge.

mmogilvi
n00b
Joined: 13 May 2011
Posts: 52
Posted: Fri Feb 28, 2020 5:34 am

I recently updated a post I wrote several months ago describing several aspects of the current status of http-replicator in a fair amount of detail. See https://forums.gentoo.org/viewtopic-t-1101822-highlight-httpreplicator.html

New changes: I suspect that repcacheman will probably stop working before http-replicator itself, for reasons outlined in the above post.

----
@Anon-E-moose: Generally it is best to configure clients to use http-replicator via the "http_proxy" variable in make.conf, rather than pointing gentoo.conf's sync-uri to an incomplete proxy. This allows it to work transparently even for files that aren't downloaded at all yet.

Also see the original http-replicator forum thread: https://forums.gentoo.org/viewtopic-t-173226.html , and the instructions that version 3's ebuild logs after install (although the FETCHCOMMAND is no longer correct; see the status post linked above):
Code:
        ewarn "Before starting http-replicator, please follow the next few steps:"
        elog "- Modify /etc/conf.d/http-replicator if required."
        ewarn "- Run /usr/bin/repcacheman to set up the cache."
        elog "- Add http_proxy=\"http://serveraddress:8080\" to make.conf on"
        elog "  the server as well as on the client machines."
        elog "- Make sure FETCHCOMMAND adds the X-unique-cache-name header to"
        elog "  HTTP requests in make.conf (or maybe portage will add it to"
        elog "  the default make.globals someday).  Example:"
        elog '   FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" --header=\"X-unique-cache-name: \${FILE}\" \"\${URI}\""'
        elog '   RESUMECOMMAND="wget -c -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" --header=\"X-unique-cache-name: \${FILE}\" \"\${URI}\""'
        elog "- Arrange to periodically run repcacheman on this server,"
        elog "  to clean up the local /usr/portage/distfiles directory."
        elog "- Arrange to periodically run something like the following"
        elog "  on this server.  'eclean' is in app-portage/gentoolkit."
        elog "    ( export DISTDIR=/var/cache/http-replicator/"
        elog "      eclean -i distfiles )"
        elog "- Even with FETCHCOMMAND fixing most cases, occasionally"
        elog "  an older invalid version of a file may end up in the cache,"
        elog "  causing checksum failures when portage tries to fetch"
        elog "  it.  To recover, either use eclean (above), manually delete"
        elog "  the relevant file from the cache, or temporarily comment"
        elog "  out the http_proxy setting.  Commenting only requires"
        elog "  access to client config, not server cache."
        elog "- Make sure GENTOO_MIRRORS in /etc/portage/make.conf starts"
        elog "  with several good http mirrors."
        elog
        elog "For more information please refer to the following forum thread:"
        elog "  http://forums.gentoo.org/viewtopic-t-173226.html"
        elog

mmogilvi
n00b
Joined: 13 May 2011
Posts: 52
Posted: Fri Feb 28, 2020 5:36 am

I intend to keep running a local copy of the http-replicator 3 ebuild I saved off my personal overlay until it actually breaks more than I can live with. After that, if a better alternative doesn't pop up, I'll probably just stop trying to use any proxy server at all, even though I don't really want to increase load on other people's servers.

For version 3, @Tony0945 mentioned missing patches: The trick is to include the files from the net-proxy/http-replicator/files directory in your local overlay, not just the ebuild file. Also, technically neither version 3 nor version 4 has been supported for many years (see above linked forum post).

I don't really like any of the alternatives mentioned on this thread so far:

  • Other proxy servers like nginx seem much more general and bloated than the special purpose http-replicator. They definitely don't have anything like the repcacheman script to remove duplicates from distfiles that are easily re-downloadable from the cache. Nor are they likely to have the ability to use "(export DISTDIR=/var/cache/http-replicator/ ; eclean -i distfiles)" (from app-portage/gentoolkit) to clean up the cache directory. (On the other hand, figuring out how to set up some other proxy server is the best replacement strategy I've heard of so far.)
  • Periodically rsync'ing distfiles requires extra care in sequencing actions to really get the benefit. It may copy around files that are only needed on one machine, and/or won't use already-downloaded copies of files that don't happen to be needed by any of the "parent" rsync servers (especially since the "rootmost" server is likely headless and light).
  • Remote file systems (NFS or Samba) require always making sure it is mounted before using emerge. Or else auto-mount it at boot time, but that can be problematic if you aren't always on the same LAN with the server, or if the server isn't always up and running. Plus these tend to be really big/wide exposure points, increasing security risks.
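For the "make sure it is mounted" point in the last bullet, a small guard around emerge is one way to avoid filling the bare mount point by accident. This is only a sketch: emerge_if_mounted is a made-up wrapper name, the default path is an assumption, and mountpoint(1) comes from util-linux.

```shell
# Refuse to start emerge unless the shared distfiles directory is really mounted.
emerge_if_mounted() {
    distdir="${DISTDIR:-/usr/portage/distfiles}"
    if ! mountpoint -q "$distdir"; then
        echo "$distdir is not mounted, refusing to run emerge" >&2
        return 1
    fi
    emerge "$@"
}
# Example: emerge_if_mounted --update --deep @world
```

It does nothing for the other concerns in that bullet (a server that goes away mid-merge, or the extra exposure of an exported file system).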

cboldt
l33t
Joined: 24 Aug 2005
Posts: 899
Posted: Fri Feb 28, 2020 10:36 am

I have not been working on the nginx possibility, but think it may not even work in the long run. It saves the cached files named as the md5sum of a concatenation that includes the URI. This precludes cleaning with distclean. It also creates a likelihood of duplicate entries in the nginx cache, if the same source happens to be requested from different repositories.
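To make the naming issue concrete: nginx names each cache file after the MD5 of the cache key, and the default proxy_cache_key is "$scheme$proxy_host$request_uri". The sketch below just hashes two such keys by hand; the file name and both mirror hosts are made-up examples.

```shell
# Compute the name nginx would give a cache entry for a given cache key.
cache_name() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# The same tarball requested via two different mirrors gives two different
# keys, hence two unrelated cache files.
a=$(cache_name "httpdistfiles.gentoo.org/distfiles/foo-1.0.tar.gz")
b=$(cache_name "httpmirror.example.org/gentoo/distfiles/foo-1.0.tar.gz")
echo "$a"
echo "$b"
```

Neither name contains foo-1.0.tar.gz, which is why tools that match on distfile names cannot clean such a cache.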

There may be a way to command nginx to save the cached files as their actual filename, but I wasn't going to look for that until I got the caching to work at all.
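One candidate for that is nginx's proxy_store directive, which writes fetched files to disk under their request path instead of a hashed name. The snippet below writes an untested sketch of such a server block to a file; the port, all paths and the upstream mirror are assumptions, and files end up in whatever directory layout the upstream mirror uses.

```shell
# Write a hypothetical nginx server block that mirrors distfiles on demand.
conf="${NGINX_CONF:-${TMPDIR:-/tmp}/distfiles-mirror.conf}"
cat > "$conf" <<'EOF'
server {
    listen 8080;

    # Serve the file from disk if an earlier request already stored it.
    location /distfiles/ {
        root       /var/cache/distfiles-mirror;
        error_page 404 = @fetch;
    }

    # Otherwise fetch it from the upstream mirror and keep a copy under root.
    location @fetch {
        proxy_pass         http://distfiles.gentoo.org;
        proxy_store        on;
        proxy_store_access user:rw group:rw all:r;
        proxy_temp_path    /var/cache/distfiles-mirror/tmp;
        root               /var/cache/distfiles-mirror;
    }
}
EOF
echo "wrote $conf"
```

Clients would then put http://<server>:8080 first in GENTOO_MIRRORS. Unlike proxy_cache, proxy_store never expires or revalidates anything, so cleanup is left to whatever tool is pointed at that directory.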

Maybe easier to learn the http-replicator code and update that to python3 ;-)

Anon-E-moose
Advocate
Joined: 23 May 2008
Posts: 4805
Location: Dallas area
Posted: Fri Feb 28, 2020 12:06 pm

It would take a rewrite to move http-replicator to python3 and I suspect it's not exactly straightforward as the code hasn't been touched/updated since 2013.

All times are GMT
Page 1 of 2