apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Thu Aug 28, 2014 11:43 am Post subject: Migrating portage tree to git |
|
|
Why isn't it done yet? Using git instead of rsync allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, etc. It also greatly speeds up sync times.
So far I've managed to convert the portage cvs repo to git while saving the commit history.
Code: | ~/gentoo/git/gentoo-x86.git> du -sh .git
1.8G .git
|
While it seems to be quite a lot, it can be significantly reduced by dropping most of the history when it comes to public use.
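As a rough illustration of dropping history, here is a minimal sketch that compares a full clone with a depth-1 clone. It runs against a throwaway repository built in a temporary directory, not against the real tree; the file name `package.ebuild` and the identity `Example Dev` are made up for the demo.

```shell
#!/bin/sh
# Sketch: full clone vs. shallow (--depth 1) clone of a throwaway repository.
set -eu

# Hypothetical identity so that commits work on a machine without git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small "upstream" repository with three commits.
git init -q "$work/upstream"
for n in 1 2 3; do
    echo "revision $n" > "$work/upstream/package.ebuild"
    git -C "$work/upstream" add package.ebuild
    git -C "$work/upstream" commit -q -m "bump to revision $n"
done

# file:// is used on purpose: --depth is ignored for plain local paths.
git clone -q "file://$work/upstream" "$work/full"
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

full_count=$(git -C "$work/full" rev-list --count HEAD)
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)
echo "full clone:    $full_count commits"
echo "shallow clone: $shallow_count commit(s)"
du -sh "$work/full/.git" "$work/shallow/.git"
```

On a repository this small the size difference is negligible; the point is only that the shallow clone carries a single commit instead of the whole history.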
If somebody provides me with hosting, I can make a daily-updated mirror as an experiment.
UPD: check this post for the how-to on updating your portage tree with git.
Last edited by apathetic on Tue Sep 02, 2014 1:27 pm; edited 2 times in total |
|
|
|
Roman_Gruber Advocate
Joined: 03 Oct 2006 Posts: 3846 Location: Austro Bavaria
|
Posted: Thu Aug 28, 2014 2:18 pm Post subject: |
|
|
I like those gentoo changelog files.
Does git provide hand written changelog files? These changelogs give me some idea of what's going on. I read them for certain packages when hunting bugs.
Second thought. Is it possible to downgrade your portage tree from git?
I have already needed to downgrade the portage tree at least three times to urgently fix a broken box. It happened once because a GNOME upgrade pulled in too many packages, which for some reason broke my box, and I was in a rush to have a working box. Downgrading the tree by about a week solved it and I was able to work that day. Later I upgraded my box successfully.
I believe the system is set up the way it is now for some major reasons.
And I also believe there are other issues that are much more important to improve than switching to something new.
If you want to reduce the size of the portage tree on your box, read the article about squashfs and the portage tree. You may be able to save some disk space. |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10589 Location: Somewhere over Atlanta, Georgia
|
Posted: Thu Aug 28, 2014 2:29 pm Post subject: Re: Migrating portage tree to git |
|
|
apathetic wrote: | ...allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, ... | I very much doubt that moving the Portage tree to Git will be accompanied by any of those changes as other portions of Portage depend on those elements being present. Now, faster than rsync? Yes, that will be welcome, indeed.
- John _________________ I can confirm that I have received between 0 and 499 National Security Letters.
Last edited by John R. Graham on Thu Aug 28, 2014 2:34 pm; edited 1 time in total |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Thu Aug 28, 2014 2:32 pm Post subject: |
|
|
tw04l124 wrote: | Does git provide hand written changelog files? |
git log > Changelog
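To make that one-liner a little more concrete, here is a hedged sketch of generating a per-package ChangeLog by limiting `git log` to one directory. The repository, package paths, dates and the `Example Dev` identity are all invented for the demo; the output format is just one possibility, not an existing Gentoo convention.

```shell
#!/bin/sh
# Sketch: derive a per-package ChangeLog from git history in a throwaway repo.
set -eu

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
tree="$work/tree"

git init -q "$tree"
mkdir -p "$tree/app-editors/vim" "$tree/sys-apps/portage"

# First commit touches only app-editors/vim.
echo 'EAPI=5' > "$tree/app-editors/vim/vim-7.4.ebuild"
git -C "$tree" add app-editors
GIT_AUTHOR_DATE="2014-08-28 11:43:00 +0000" \
GIT_COMMITTER_DATE="2014-08-28 11:43:00 +0000" \
    git -C "$tree" commit -q -m "app-editors/vim: version bump"

# Second commit touches only sys-apps/portage.
echo 'EAPI=5' > "$tree/sys-apps/portage/portage-2.2.ebuild"
git -C "$tree" add sys-apps
GIT_AUTHOR_DATE="2014-08-29 08:12:00 +0000" \
GIT_COMMITTER_DATE="2014-08-29 08:12:00 +0000" \
    git -C "$tree" commit -q -m "sys-apps/portage: version bump"

# Limit the log to one package directory and write it out as a ChangeLog.
changelog=$(git -C "$tree" log --date=short \
    --format='%ad %an <%ae>: %s' -- app-editors/vim)
printf '%s\n' "$changelog" > "$tree/app-editors/vim/ChangeLog"
cat "$tree/app-editors/vim/ChangeLog"
```

Only the commit that touched `app-editors/vim` ends up in that package's ChangeLog. Note that this needs the relevant history to be present locally, so it would not work in a depth-1 clone.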
tw04l124 wrote: | Is it possible to downgrade your portage tree from git? |
Yes, with ease.
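A minimal sketch of such a downgrade, assuming the tree is a git checkout with enough history to reach the date you want. It uses a throwaway repository with two dated commits; the dates, file name and commit messages are invented for the demo.

```shell
#!/bin/sh
# Sketch: roll a git-based tree back to its state before a given date.
set -eu

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
tree="$work/tree"
git init -q "$tree"

# snapshot DATE TEXT: record one commit with the given commit date.
snapshot() {
    echo "$2" > "$tree/package.ebuild"
    git -C "$tree" add package.ebuild
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git -C "$tree" commit -q -m "$2"
}
snapshot "2014-08-20 12:00:00 +0000" "known-good tree"
snapshot "2014-08-27 12:00:00 +0000" "tree after the breaking upgrade"

# Remember the branch so it is easy to come back later.
branch=$(git -C "$tree" symbolic-ref --short HEAD)

# Newest commit made before the cutoff, then check it out (detached HEAD).
cutoff="2014-08-21 00:00:00 +0000"
good=$(git -C "$tree" rev-list -n 1 --before="$cutoff" HEAD)
git -C "$tree" checkout -q "$good"

restored=$(cat "$tree/package.ebuild")
echo "tree is now at: $restored (return with: git checkout $branch)"
```

Checking the branch out again and pulling brings the tree back to the current state once the box is fixed.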
tw04l124 wrote: | I believe the system is set up the way it is now for some major reasons. |
Or it might be just legacy.
tw04l124 wrote: | If you want to reduce the size of the portage tree on your box, read the article about squashfs and the portage tree. You may be able to save some disk space. |
I want to make the portage tree easier to maintain. |
|
|
|
a3li Retired Dev
Joined: 02 Sep 2008 Posts: 122 Location: 독일
|
Posted: Thu Aug 28, 2014 10:54 pm Post subject: |
|
|
Er, git isn't on the table to replace rsync. _________________ I am Confuism. Do not bother me. |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Thu Aug 28, 2014 11:29 pm Post subject: |
|
|
a3li wrote: | Er, git isn't on the table to replace rsync. |
Why not? The Funtoo experience shows that it's better in many ways. The only downside so far is the amount of space .git occupies, which can easily be reduced by not cloning the full commit history. |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Fri Aug 29, 2014 6:03 am Post subject: Re: Migrating portage tree to git |
|
|
John R. Graham wrote: | I very much doubt that moving the Portage tree to Git will be accompanied by any of those changes as other portions of Portage depend on those elements being present |
Portage has supported thin manifests (i.e. manifests covering only the tarballs) for ages, and they are common practice for git overlays.
I do not know whether it is planned to remove ChangeLogs - I hope not, because otherwise you can see them only online or if you download the full history.
To be honest, I am also surprised how long Gentoo needs for such a relatively small change: They have been working on it for several years now. It is a sign that Gentoo is seriously lacking manpower.
I am not a git expert, but I doubt it: Either you download shallow repositories, in which case you must download everything, or you must have stored the full history. This is the problem which you also have with the Linux kernel.
And portage has this problem to a much greater degree, since any trivial change in an eclass changes thousands of files.
This is why I have been wondering for years why Gentoo has no policy for collecting changes to eclasses (except for emergencies). |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Fri Aug 29, 2014 8:12 am Post subject: Re: Migrating portage tree to git |
|
|
mv wrote: | I am not a git expert, but I doubt it: Either you download shallow repositories, in which case you must download everything, or you must have stored the full history. |
Shallow clones have been reworked in Git 2.0; you can now safely pull data into a shallow clone. The details are here: https://github.com/git/git/commit/58babfffdeeecaa4d6edecaac1fb0c595218b801 |
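For anyone who wants to try this locally, here is a small sketch of pulling into a shallow clone. It is a throwaway setup in a temporary directory, with invented file names and a made-up `Example Dev` identity; it shows that the update works, not how much data goes over the wire.

```shell
#!/bin/sh
# Sketch: create a depth-1 clone, then pull a newer upstream commit into it.
set -eu

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
git init -q "$work/upstream"

# upstream_commit TEXT: add one commit to the upstream repository.
upstream_commit() {
    echo "$1" > "$work/upstream/package.ebuild"
    git -C "$work/upstream" add package.ebuild
    git -C "$work/upstream" commit -q -m "$1"
}
upstream_commit "revision 1"
upstream_commit "revision 2"

# The user's tree: only the newest commit, no older history.
git clone -q --depth 1 "file://$work/upstream" "$work/tree"

# Upstream moves on; the shallow clone is updated with an ordinary pull.
upstream_commit "revision 3"
git -C "$work/tree" pull -q --ff-only

tree_content=$(cat "$work/tree/package.ebuild")
tree_commits=$(git -C "$work/tree" rev-list --count HEAD)
echo "content: $tree_content, commits kept locally: $tree_commits"
```

After the pull the clone holds the original shallow commit plus the new one, so local history grows slowly from the point of the initial clone.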
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Fri Aug 29, 2014 12:08 pm Post subject: Re: Migrating portage tree to git |
|
|
apathetic wrote: | Shallow clones have been reworked in Git 2.0; you can now safely pull data into a shallow clone. |
But this sounds to me as if all changed files would need to be downloaded, and not only the "diffs".
OK, this is about the same as what rsync does with --whole-file which is what portage uses for the tree.
So, essentially you get the same traffic; probably some more since checksums need to be transferred for the new files - since all files are so short, this is a considerable factor. |
|
|
|
a3li Retired Dev
Joined: 02 Sep 2008 Posts: 122 Location: 독일
|
Posted: Fri Aug 29, 2014 12:19 pm Post subject: |
|
|
apathetic wrote: | a3li wrote: | Er, git isn't on the table to replace rsync. |
Why not? The Funtoo experience shows that it's better in many ways. The only downside so far is the amount of space .git occupies, which can easily be reduced by not cloning the full commit history. |
I think github would be rather angry if we piggybacked all of our syncing on them. Oh, and we have plenty of rsync mirrors all over the world.
That is not to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the amount of possibly redundant information in the tree. _________________ I am Confuism. Do not bother me. |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Fri Aug 29, 2014 1:28 pm Post subject: |
|
|
a3li wrote: | I think github would be rather angry if we piggybacked all of our syncing on them. Oh, and we have plenty of rsync mirrors all over the world. |
Nobody said anything about using github. Setting up a read-only git mirror shouldn't be any more difficult than setting up an rsync one. |
|
|
|
hasufell Retired Dev
Joined: 29 Oct 2011 Posts: 429
|
Posted: Fri Aug 29, 2014 11:34 pm Post subject: |
|
|
a3li wrote: |
That is not to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the amount of possibly redundant information in the tree. |
The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.
You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.
Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented. |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Sat Aug 30, 2014 5:34 am Post subject: |
|
|
hasufell wrote: | The rsync approach is bad and should be killed with fire. It's inherently insecure. |
mgorny was working on a squashfs-diff based update mechanism: This sounds like the lowest-traffic solution so far and could easily be implemented with gpg signing.
Does anybody know how far this has proceeded? AFAIK the solution is ready and just would need some servers to hold the diffs. |
|
|
|
hasufell Retired Dev
Joined: 29 Oct 2011 Posts: 429
|
Posted: Sat Aug 30, 2014 1:06 pm Post subject: |
|
|
mv wrote: | hasufell wrote: | The rsync approach is bad and should be killed with fire. It's inherently insecure. |
mgorny was working on a squashfs-diff based update mechanism: This sounds like the lowest-traffic solution so far and could easily be implemented with gpg signing.
Does anybody know how far this has proceeded? AFAIK the solution is ready and just would need some servers to hold the diffs. |
git only fetches the diff to an existing clone |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Sat Aug 30, 2014 3:28 pm Post subject: |
|
|
hasufell wrote: | git only fetches the diff to an existing clone |
Apparently, this is not true for shallow clones which is what the "normal" user will have.
Moreover, squashfs is better anyway than any other filesystem for the portage tree.
And it seems this solution already exists, while git will probably still take years. |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Sat Aug 30, 2014 3:42 pm Post subject: |
|
|
mv wrote: | Apparently, this is not true for shallow clones which is what the "normal" user will have. |
What? Read again about shallow clones, they have been reworked. I've posted the link earlier in this thread.
mv wrote: | Moreover, squashfs is better anyway than any other filesystem for the portage tree.
And it seems this solution already exists, while git will probably still take years. |
Are you insane? Portage already supports git repos, nothing prevents you from changing url and sync-type in /etc/portage/repos.conf/gentoo.conf right now. The only problem right now is hosting, which I don't have. This is what this thread is all about. |
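A rough sketch of what that change could look like. The mirror URL is a placeholder (no such public mirror is being claimed here), `location` is assumed to be the usual /usr/portage, and the file is written to a scratch directory rather than to the real /etc/portage/repos.conf so it is safe to run.

```shell
#!/bin/sh
# Sketch: a repos.conf entry that syncs the main tree via git instead of rsync.
set -eu

# Stand-in for /etc/portage/repos.conf so nothing on the system is touched.
conf_dir=$(mktemp -d)
trap 'rm -rf "$conf_dir"' EXIT

cat > "$conf_dir/gentoo.conf" <<'EOF'
[gentoo]
location = /usr/portage
sync-type = git
# Hypothetical URL: replace with a real git mirror of the tree.
sync-uri = https://git.example.org/portage.git
EOF

# Show which sync settings ended up in the file.
sync_type=$(sed -n 's/^sync-type *= *//p' "$conf_dir/gentoo.conf")
sync_uri=$(sed -n 's/^sync-uri *= *//p' "$conf_dir/gentoo.conf")
echo "sync-type: $sync_type"
echo "sync-uri:  $sync_uri"
```

An existing rsync-populated tree at that location is not a git checkout, so it would presumably have to be moved aside and replaced by a clone before the first sync.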
|
|
|
hasufell Retired Dev
Joined: 29 Oct 2011 Posts: 429
|
Posted: Sat Aug 30, 2014 4:58 pm Post subject: Re: Migrating portage tree to git |
|
|
apathetic wrote: | Why isn't it done yet? |
Because infra hasn't pushed for it hard enough.
Some people claim there are tools that need migration. I don't know of any (probably only some infra-specific stuff no one knows about). People have been using git overlays for years. The portage tree is basically no different.
AFAICS we only need
* policies for git usage (commit policy, branching model etc) with appropriate documentation
* someone to set up the repo(s) with an empty history, migrate ssh keys, access management and do the initial migration commit
But all we get is bikeshed about this topic every 3 months on dev ML. I don't have any hopes left. |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Sat Aug 30, 2014 5:09 pm Post subject: |
|
|
apathetic wrote: | What? Read again about shallow clones, they have been reworked. I've posted the link earlier in this thread. |
And I have replied to your posting with the link. If you think that my reply was wrong (which may be, since one has to be a git expert to understand from the link what is really transferred), please provide facts. Again: To me it seems from the link as if all modified files are transferred completely, and not only their "diff"s. If you understand the contrary, please cite the corresponding passage from the link.
Because squashfs is a better filesystem than any other for the portage tree? It certainly is.
Quote: | The only problem right now is hosting |
Yes, the "only" problem is the complete transfer of the infrastructure, which has already been worked on for many years. What makes you think that this will now suddenly change? |
|
|
|
mv Watchman
Joined: 20 Apr 2005 Posts: 6747
|
Posted: Sat Aug 30, 2014 5:18 pm Post subject: Re: Migrating portage tree to git |
|
|
hasufell wrote: | But all we get is bikeshed about this topic every 3 months on dev ML. I don't have any hopes left. |
Last time I heard about it, there was already some testing git repository set up. So I am quite optimistic that it will come, but even the people who had set up that testing repository did not claim that it is ready yet. It seems, one still has to be patient. |
|
|
|
Budoka l33t
Joined: 03 Jun 2012 Posts: 777 Location: Tokyo, Japan
|
Posted: Sat Aug 30, 2014 6:03 pm Post subject: |
|
|
hasufell wrote: | a3li wrote: |
That is not to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the amount of possibly redundant information in the tree. |
The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.
You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.
Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented. |
This concerns me and is the first time I have heard that rsync was insecure/vulnerable. rsync isn't gpg signed? Can you elaborate? I currently use eix: Code: | eix-sync && glsa-check -l | Should I change to Code: | emerge-webrsync && glsa-check -l | ??? |
|
|
|
hasufell Retired Dev
Joined: 29 Oct 2011 Posts: 429
|
Posted: Sat Aug 30, 2014 7:09 pm Post subject: |
|
|
Budoka wrote: | hasufell wrote: | a3li wrote: |
That is not to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the amount of possibly redundant information in the tree. |
The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.
You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.
Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented. |
This concerns me and is the first time I have heard that rsync was insecure/vulnerable. rsync isn't gpg signed? Can you elaborate? I currently use eix: Code: | eix-sync && glsa-check -l | Should I change to Code: | emerge-webrsync && glsa-check -l | ??? |
Ofc rsync is not gpg signed. You are syncing a file TREE, not pulling a single file.
Only the Manifests _may_ be signed. We haven't even implemented checking these automatically, so emerge-webrsync is (almost) the only secure way currently.
But it still sucks. We should use git with enforced gpg-signing of EVERY commit.
IMO, it's much easier to sneak in vulnerabilities on distro level instead of trying to do it on upstream level. But people still don't take this seriously enough. |
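As a hedged sketch of what checking for signed commits could look like on the receiving side: git's `%G?` log placeholder prints a one-letter signature status per commit (`G` for a good signature, `N` for none). The example below only creates an unsigned commit in a throwaway repository, so no gpg key is needed; the acceptance rule "only `G` passes" is an assumption made for illustration, not an existing Gentoo policy.

```shell
#!/bin/sh
# Sketch: inspect the signature status of the newest commit via %G?.
set -eu

export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
tree="$work/tree"

git init -q "$tree"
echo 'EAPI=5' > "$tree/package.ebuild"
git -C "$tree" add package.ebuild
# Deliberately unsigned; a developer would use "git commit -S" instead.
git -C "$tree" commit -q --no-gpg-sign -m "unsigned example commit"

# %G? yields a single status letter for the commit's signature.
status=$(git -C "$tree" log -1 --format='%G?')

# Hypothetical policy: accept only commits with a good signature.
if [ "$status" = "G" ]; then
    verdict="accepted"
else
    verdict="rejected"
fi
echo "signature status: $status, commit $verdict"
```

A real check would also have to decide which keys are trusted, which is outside the scope of this sketch.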
|
|
|
desultory Bodhisattva
Joined: 04 Nov 2005 Posts: 9410
|
Posted: Mon Sep 01, 2014 3:56 am Post subject: |
|
|
hasufell wrote: | Because infra hasn't pushed for it hard enough. | Either start giving them 36 hour days so that they might "push harder", or take up the work yourself. Until then such pronouncements come off as, at best, woefully poorly informed considering the information which is made available for anyone who cares to read it.
hasufell wrote: | Some people claim there are tools that need migration. I don't know of any (probably only some infra-specific stuff no one knows about). People have been using git overlays for years. The portage tree is basically no different. | Perhaps you could try reading the mailing lists; the topic comes up rather often on gentoo-dev, mostly due to similarly poorly informed opinions being repeatedly regurgitated for no evident reason other than to ignore good advice. Then there is the gentoo-scm mailing list, which is dedicated to discussion of such matters.
hasufell wrote: | IMO, it's much easier to sneak in vulnerabilities on distro level instead of trying to do it on upstream level. But people still don't take this seriously enough. | "We have met the enemy and he is us."? |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
|
|
|
steveL Watchman
Joined: 13 Sep 2006 Posts: 5153 Location: The Peanut Gallery
|
Posted: Mon Sep 01, 2014 11:02 am Post subject: Re: Migrating portage tree to git |
|
|
apathetic wrote: | Why isn't it done yet? Using git instead of rsync allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, etc. It also greatly speeds up sync times.
So far I've managed to convert the portage cvs repo to git while saving the commit history.
..
If somebody provides me with hosting, I can make a daily-updated mirror as an experiment. |
Well done :-)
Can I ask whether you've corrected the email addresses? When patrick posted his experimental results, that was the next step that needed to be carried out. #git on IRC: chat.freenode.net can help with the process.
I'm sure we can get hosting sorted between us, but it would be better done in collaboration with Patrick (bonsaikitten) imo, since he knows the Gentoo requirements very well, and runs a tinderbox against the tree.
Not sure about removing the hashes etc; perhaps commit hooks can add them when deploying the rsync tree. |
|
|
|
apathetic n00b
Joined: 28 Aug 2014 Posts: 36
|
Posted: Mon Sep 01, 2014 11:09 am Post subject: Re: Migrating portage tree to git |
|
|
steveL wrote: | Can I ask whether you've corrected the email addresses? |
What do you mean? |
|
|
|
|