Gentoo Forums
Migrating portage tree to git
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Thu Aug 28, 2014 11:43 am    Post subject: Migrating portage tree to git

Why isn't it done yet? Using git instead of rsync allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, etc. It also greatly speeds up sync times.

So far I've managed to convert the portage cvs repo to git while saving the commit history.

Code:
~/gentoo/git/gentoo-x86.git> du -sh .git
1.8G   .git


While it seems to be quite a lot, it can be significantly reduced by dropping most of the history when it comes to public use.

If somebody provides me with hosting, I can make a daily-updated mirror as an experiment.

UPD: check this post for the how-to on updating your portage tree with git.


Last edited by apathetic on Tue Sep 02, 2014 1:27 pm; edited 2 times in total
Roman_Gruber
Advocate

Joined: 03 Oct 2006
Posts: 3846
Location: Austro Bavaria

Posted: Thu Aug 28, 2014 2:18 pm

I like those gentoo changelog files.

Does git provide hand written changelog files? These changelogs give me some idea of what's going on. I read them for certain packages when hunting bugs.

Second thought. Is it possible to downgrade your portage tree from git?

I have already needed to downgrade the portage tree at least 3 times to urgently fix a broken box. It happened once because a GNOME upgrade pulled in too many packages, which for some reason broke my box, and I was in a rush to have a working box. Downgrading by about a week solved it and I was able to work that day. Later I upgraded my box successfully.

I believe the system how it is set up as of now has some major reasons why it is as it is.

And I also believe there are other issues which are much more important to improve than switching to something new.


If you want to reduce the portage tree on your box, read the article about squashfs and portage tree. You may be able to save some of your disk space.
John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10589
Location: Somewhere over Atlanta, Georgia

Posted: Thu Aug 28, 2014 2:29 pm    Post subject: Re: Migrating portage tree to git

apathetic wrote:
...allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, ...
I very much doubt that moving the Portage tree to Git will be accompanied by any of those changes as other portions of Portage depend on those elements being present. Now, faster than rsync? Yes, that will be welcome, indeed. ;)

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.


Last edited by John R. Graham on Thu Aug 28, 2014 2:34 pm; edited 1 time in total
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Thu Aug 28, 2014 2:32 pm

tw04l124 wrote:
Does git provide hand written changelog files?

git log > Changelog
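
A redirected git log gives one log for the whole tree; limiting it to a package directory gets closer to a per-package ChangeLog. A rough sketch, assuming a throwaway repository with made-up packages (app-misc/foo, app-misc/bar) and a hypothetical committer identity; the output format is just one possibility:

```shell
#!/bin/sh
# Sketch: derive a per-package changelog from git history.
# Repository, package names and commit messages are made up for illustration.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
repo="$work/tree"
git init -q "$repo"

# Helper: commit everything with a fixed, hypothetical identity.
commit() {
    git -C "$repo" add -A
    git -C "$repo" -c user.name="Example Dev" -c user.email="dev@example.org" \
        commit -q -m "$1"
}

mkdir -p "$repo/app-misc/foo" "$repo/app-misc/bar"
echo 'EAPI=5' > "$repo/app-misc/foo/foo-1.0.ebuild"
commit "app-misc/foo: initial import"
echo 'EAPI=5' > "$repo/app-misc/bar/bar-2.0.ebuild"
commit "app-misc/bar: initial import"
echo 'EAPI=5' > "$repo/app-misc/foo/foo-1.1.ebuild"
commit "app-misc/foo: version bump to 1.1"

# Limit the log to one package directory.
changelog=$(git -C "$repo" log --date=short \
    --format='%ad %an <%ae>: %s' -- app-misc/foo)
printf '%s\n' "$changelog"
```

Only the two app-misc/foo commits show up; the app-misc/bar commit is filtered out by the path argument.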

tw04l124 wrote:
Is it possible to downgrade your portage tree from git?

Yes, with ease.
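
As a sketch of what such a downgrade could look like: find the newest commit before a given date and check it out. The repository, dates and file contents below are invented for the demonstration; on a real tree only the rev-list and checkout commands matter.

```shell
#!/bin/sh
# Sketch: roll a git-based tree back to its state as of a given date.
# Dates, file names and contents are invented for illustration.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
repo="$work/tree"
git init -q "$repo"

# Helper: commit everything with a fixed date and a hypothetical identity.
commit_on() {
    git -C "$repo" add -A
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git -C "$repo" -c user.name="Example Dev" -c user.email="dev@example.org" \
        commit -q -m "$2"
}

echo 'gnome-3.10' > "$repo/snapshot.txt"
commit_on '2014-08-14T12:00:00' 'state before the upgrade'
echo 'gnome-3.12' > "$repo/snapshot.txt"
commit_on '2014-08-28T12:00:00' 'state after the upgrade'

# Find the newest commit made before the chosen date, then check it out.
rev=$(git -C "$repo" rev-list -n 1 --before='2014-08-21 00:00:00' HEAD)
git -C "$repo" checkout -q "$rev"
restored=$(cat "$repo/snapshot.txt")
echo "tree now at: $restored"
```

Checking out the branch again afterwards returns the tree to its latest state.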

tw04l124 wrote:
I believe the system how it is set up as of now has some major reasons why it is as it is.

Or it might be just legacy.

tw04l124 wrote:
If you want to reduce the portage tree on your box, read the article about squashfs and portage tree. You may be able to save some of your disk space.


I want to make the portage tree easier to maintain.
a3li
Retired Dev

Joined: 02 Sep 2008
Posts: 122
Location: 독일

Posted: Thu Aug 28, 2014 10:54 pm

Er, git isn't on the table to replace rsync.
_________________
I am Confuism. Do not bother me.
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Thu Aug 28, 2014 11:29 pm

a3li wrote:
Er, git isn't on the table to replace rsync.

Why not? The funtoo experience shows that it's better in many ways. The only downside so far is the amount of space .git occupies which can be easily reduced by not cloning the full commit history.
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Fri Aug 29, 2014 6:03 am    Post subject: Re: Migrating portage tree to git

John R. Graham wrote:
I very much doubt that moving the Portage tree to Git will be accompanied by any of those changes as other portions of Portage depend on those elements being present

Portage has been prepared for thin-manifests (i.e. manifests only for the tarballs) for ages, and they are common practice for git overlays.
I do not know whether it is planned to remove ChangeLogs - I hope not, because otherwise you can see them only online or if you download the full history.
To be honest, I am also surprised how long Gentoo needs for such a relatively small change: they have been working on it for several years now. It is a sign that Gentoo is seriously lacking manpower.
Quote:
faster than rsync

I am not a git expert, but I doubt it: Either you download shallow repositories, in which case you must download everything, or you must have stored the full history. This is the problem which you also have with the linux kernel.
And portage has this problem much more, since with any trivial change in an eclass thousands of files are changed.
This is why I have been wondering for years why Gentoo has no policy for collecting changes to eclasses (except for emergencies).
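
For reference, the thin manifests mentioned at the start of this post are switched on per repository in its metadata/layout.conf. A minimal sketch of what a git-hosted repository typically sets; whether the main tree would adopt exactly these values is an assumption, not something decided in this thread:

```
# metadata/layout.conf of a git-based repository (illustrative)
# Manifests list only distfile checksums; git itself tracks the ebuilds.
thin-manifests = true
# Do not GPG-sign the individual Manifest files.
sign-manifests = false
```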
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Fri Aug 29, 2014 8:12 am    Post subject: Re: Migrating portage tree to git

mv wrote:
I am not a git expert, but I doubt it: Either you download shallow repositories, in which case you must download everything, or you must have stored the full history.


Shallow clones have been reworked in Git 2.0, now you can safely pull data into a shallow clone. The details are here https://github.com/git/git/commit/58babfffdeeecaa4d6edecaac1fb0c595218b801
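
To illustrate the point, here is a small self-contained sketch that builds a throwaway upstream repository, makes a depth-1 clone of it and then pulls a new commit into that clone. All paths and commit messages are made up, and it says nothing about how much data a real portage mirror would transfer.

```shell
#!/bin/sh
# Sketch: update a shallow (depth 1) clone with a plain pull.
# The upstream repository and its commits are invented for illustration.
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
upstream="$work/upstream"
git init -q "$upstream"

# Helper: record a new upstream commit with a hypothetical identity.
commit() {
    echo "$1" > "$upstream/tree.txt"
    git -C "$upstream" add -A
    git -C "$upstream" -c user.name="Example Dev" -c user.email="dev@example.org" \
        commit -q -m "$1"
}
commit "first"
commit "second"
commit "third"

# --depth is ignored for plain local paths, hence the file:// URL.
git clone -q --depth 1 "file://$upstream" "$work/clone"
before=$(git -C "$work/clone" rev-list --count HEAD)

# Upstream moves on; the shallow clone follows with an ordinary pull.
commit "fourth"
git -C "$work/clone" pull -q --ff-only
after=$(git -C "$work/clone" rev-list --count HEAD)
echo "commits in shallow clone: $before before pull, $after after"
```

The clone starts with a single commit and gains only the new one on pull; the older upstream history is never fetched.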
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Fri Aug 29, 2014 12:08 pm    Post subject: Re: Migrating portage tree to git

apathetic wrote:
Shallow clones have been reworked in Git 2.0, now you can safely pull data into a shallow clone.

But this sounds to me as if all changed files would need to be downloaded, and not only the "diffs".
OK, this is about the same as what rsync does with --whole-file which is what portage uses for the tree.
So, essentially you get the same traffic; probably some more since checksums need to be transferred for the new files - since all files are so short, this is a considerable factor.
a3li
Retired Dev

Joined: 02 Sep 2008
Posts: 122
Location: 독일

Posted: Fri Aug 29, 2014 12:19 pm

apathetic wrote:
a3li wrote:
Er, git isn't on the table to replace rsync.

Why not? The funtoo experience shows that it's better in many ways. The only downside so far is the amount of space .git occupies which can be easily reduced by not cloning the full commit history.


I think github would be rather angry if we piggybacked all of our syncing on them. Oh, and we have plenty of rsync mirrors all over the world.

That doesn't go to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the number of possibly redundant information in the tree.
_________________
I am Confuism. Do not bother me.
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Fri Aug 29, 2014 1:28 pm

a3li wrote:
I think github would be rather angry if we piggybacked all of our syncing on them. Oh, and we have plenty of rsync mirrors all over the world.


Nobody said anything about using github. Setting up a read-only git mirror shouldn't be any more difficult than setting up an rsync one.
hasufell
Retired Dev

Joined: 29 Oct 2011
Posts: 429

Posted: Fri Aug 29, 2014 11:34 pm

a3li wrote:

That doesn't go to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the number of possibly redundant information in the tree.

The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.

You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.

Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented.
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Sat Aug 30, 2014 5:34 am

hasufell wrote:
The rsync approach is bad and should be killed with fire. It's inherently insecure.

mgorny was working on a squashfs-diff based update mechanism: This sounds to be the traffic-least solution so far and could easily be implemented gpg-signed.
Does anybody know how far this has proceeded? AFAIK the solution is ready and just would need some servers to hold the diffs.
hasufell
Retired Dev

Joined: 29 Oct 2011
Posts: 429

Posted: Sat Aug 30, 2014 1:06 pm

mv wrote:
hasufell wrote:
The rsync approach is bad and should be killed with fire. It's inherently insecure.

mgorny was working on a squashfs-diff based update mechanism: This sounds to be the traffic-least solution so far and could easily be implemented gpg-signed.
Does anybody know how far this has proceeded? AFAIK the solution is ready and just would need some servers to hold the diffs.

git only fetches the diff to an existing clone
Back to top
View user's profile Send private message
mv
Watchman
Watchman


Joined: 20 Apr 2005
Posts: 6747

PostPosted: Sat Aug 30, 2014 3:28 pm    Post subject: Reply with quote

hasufell wrote:
git only fetches the diff to an existing clone

Apparently, this is not true for shallow clones which is what the "normal" user will have.
Moreover, squashfs is better anyway than any other filesystem for the portage tree.
And it seems this solution exists already while git probably still takes years.
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Sat Aug 30, 2014 3:42 pm

mv wrote:
Apparently, this is not true for shallow clones which is what the "normal" user will have.

What? Read again about shallow clones, they have been reworked. I've posted the link earlier in this thread.

mv wrote:
Moreover, squashfs is better anyway than any other filesystem for the portage tree.
And it seems this solution exists already while git probably still takes years.


Are you insane? Portage already supports git repos, nothing prevents you from changing url and sync-type in /etc/portage/repos.conf/gentoo.conf right now. The only problem right now is hosting, which I don't have. This is what this thread is all about.
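
For anyone wondering what that change amounts to, a minimal sketch of such a gentoo.conf follows. The sync-uri is a placeholder (there is no public mirror yet, which is the whole problem), and /usr/portage is assumed as the tree location:

```ini
# /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /usr/portage
sync-type = git
# Placeholder URL, replace with an actual git mirror once one exists.
sync-uri = https://git.example.org/gentoo-portage.git
```

When switching an existing rsync-based tree, the location would have to be replaced by a git clone first, since it has no .git directory to pull into.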
hasufell
Retired Dev

Joined: 29 Oct 2011
Posts: 429

Posted: Sat Aug 30, 2014 4:58 pm    Post subject: Re: Migrating portage tree to git

apathetic wrote:
Why isn't it done yet?

Because infra hasn't pushed for it hard enough.

Some people claim there are tools that need migration. I don't know of any (probably only some infra-specific stuff no one knows about). People are using git overlays for years. The portage tree is basically no different.

Afais we only need
* policies for git usage (commit policy, branching model etc) with appropriate documentation
* someone to set up the repo(s) with an empty history, migrate ssh keys, access management and do the initial migration commit

But all we get is bikeshed about this topic every 3 months on dev ML. I don't have any hopes left.
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Sat Aug 30, 2014 5:09 pm

apathetic wrote:
What? Read again about shallow clones, they have been reworked. I've posted the link earlier in this thread.

And I have replied to your posting with the link. If you think that my reply was wrong (which may be, since one has to be a git expert to understand from the link what is really transferred), please provide facts. Again: To me it seems from the link as if all modified files are transferred completely, and not only their "diff"s. If you understand the contrary, please cite the corresponding passage from the link.
Quote:
Are you insane?

Because squashfs is a better filesystem than any other for the portage tree? It certainly is.
Quote:
The only problem right now is hosting

Yes, the "only" problem is the complete transfer of the infrastructure, which has already been worked on for many years. What makes you think that this will now suddenly change?
mv
Watchman

Joined: 20 Apr 2005
Posts: 6747

Posted: Sat Aug 30, 2014 5:18 pm    Post subject: Re: Migrating portage tree to git

hasufell wrote:
But all we get is bikeshed about this topic every 3 months on dev ML. I don't have any hopes left.

Last time I heard about it, there was already some testing git repository set up. So I am quite optimistic that it will come, but even the people who had set up that testing repository did not claim that it is ready yet. It seems, one still has to be patient.
Back to top
View user's profile Send private message
Budoka
l33t
l33t


Joined: 03 Jun 2012
Posts: 777
Location: Tokyo, Japan

PostPosted: Sat Aug 30, 2014 6:03 pm    Post subject: Reply with quote

hasufell wrote:
a3li wrote:

That doesn't go to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the number of possibly redundant information in the tree.

The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.

You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.

Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented.


This concerns me and is the first time I have heard that rsync was insecure/vulnerable. rsync isn't gpg signed? Can you elaborate? I currently use eix:
Code:
eix-sync && glsa-check -l
Should I change to
Code:
emerge-webrsync && glsa-check -l
???
hasufell
Retired Dev

Joined: 29 Oct 2011
Posts: 429

Posted: Sat Aug 30, 2014 7:09 pm

Budoka wrote:
hasufell wrote:
a3li wrote:

That doesn't go to say the current rsync approach can't be improved, and there are things being worked on (some of them you've mentioned) to reduce the number of possibly redundant information in the tree.

The rsync approach is bad and should be killed with fire. It's inherently insecure. Everyone who does not use "emerge-webrsync" with gpg signature check should think twice about what he is doing.

You are pulling totally unverified stuff. Not only unverified in the sense of who wrote the ebuilds, but also unverified in the sense of what you actually get from the server or the man in the middle.

Using git makes it easier to solve these problems. That's why the accepted GLEPs 58-61 are still unimplemented.


This concerns me and is the first time I have heard that rsync was insecure/vulnerable. rsync isn't gpg signed? Can you elaborate? I currently use eix:
Code:
eix-sync && glsa-check -l
Should I change to
Code:
emerge-webrsync && glsa-check -l
???

Ofc rsync is not gpg signed. You are syncing a file TREE, not pulling a single file.

Only the Manifests _may_ be signed. We don't even have implemented checking these automatically, so emerge-webrsync is (almost) the only secure way currently.

But it still sucks. We should use git with enforced gpg-signing of EVERY commit.

IMO, it's much easier to sneak in vulnerabilities on distro level instead of trying to do it on upstream level. But people still don't take this seriously enough.
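
For the "emerge-webrsync with gpg signature check" part, the check is opt-in via make.conf. A minimal sketch, assuming the keyring lives in /etc/portage/gpg and that the Gentoo snapshot signing key has already been imported and trusted there (neither step is shown):

```
# /etc/portage/make.conf (fragment)
# Make emerge-webrsync refuse snapshots whose GPG signature does not verify.
# Merge this with any FEATURES value you already have instead of overwriting it.
FEATURES="webrsync-gpg"
# Keyring directory holding the snapshot signing key; the path is an assumption.
PORTAGE_GPG_DIR="/etc/portage/gpg"
```

With this in place, emerge-webrsync && glsa-check -l should only proceed past the sync step when the snapshot signature checks out.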
desultory
Bodhisattva

Joined: 04 Nov 2005
Posts: 9410

Posted: Mon Sep 01, 2014 3:56 am

hasufell wrote:
Because infra hasn't pushed for it hard enough.
Either start giving them 36 hour days so that they might "push harder", or take up the work yourself. Until then such pronouncements come off as, at best, woefully poorly informed considering the information which is made available for anyone who cares to read it.
hasufell wrote:
Some people claim there are tools that need migration. I don't know of any (probably only some infra-specific stuff no one knows about). People are using git overlays for years. The portage tree is basically no different.
Perhaps you could try reading the mailing lists; the topic comes up rather often on gentoo-dev, mostly due to similarly poorly informed opinions being repeatedly regurgitated for no evident reason other than to ignore good advice. Then there is the gentoo-scm mailing list, which is dedicated to discussion of such matters.

hasufell wrote:
IMO, it's much easier to sneak in vulnerabilities on distro level instead of trying to do it on upstream level. But people still don't take this seriously enough.
"We have met the enemy and he is us."?
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Mon Sep 01, 2014 9:15 am

desultory wrote:
Then there is the gentoo-scm mailing list, which is dedicated to discussion of such matters.


Yeah, right, and monty python.

http://article.gmane.org/gmane.linux.gentoo.scm-migration/184

It looks dead, Jim.
steveL
Watchman

Joined: 13 Sep 2006
Posts: 5153
Location: The Peanut Gallery

Posted: Mon Sep 01, 2014 11:02 am    Post subject: Re: Migrating portage tree to git

apathetic wrote:
Why isn't it done yet? Using git instead of rsync allows removing most of the crap from metadata, getting rid of Changelog files (git log covers that), removing ebuild hashes from manifests, etc. It also greatly speeds up sync times.

So far I've managed to convert the portage cvs repo to git while saving the commit history.
..
If somebody provides me with hosting, I can make a daily-updated mirror as an experiment.

Well done :-)

Can I ask whether you've corrected the email addresses? When patrick posted his experimental results, that was the next step that needed to be carried out. #git on IRC: chat.freenode.net can help with the process.

I'm sure we can get hosting sorted between us, but it would be better done in collaboration with Patrick (bonsaikitten) imo, since he knows the Gentoo requirements very well, and runs a tinderbox against the tree.

Not sure about removing the hashes etc; perhaps commit hooks can add them when deploying the rsync tree.
apathetic
n00b

Joined: 28 Aug 2014
Posts: 36

Posted: Mon Sep 01, 2014 11:09 am    Post subject: Re: Migrating portage tree to git

steveL wrote:
Can I ask whether you've corrected the email addresses?


What do you mean?
All times are GMT